ggplot accessing data labels rather than data values


I've imported some SPSS data into R. The data is labelled, and I want to plot the label names rather than the values. I've tried the code below, but it plots the values:

library(labelled)
library(tidyverse)

cols <- c("White", "Red", "Black", "Green", "Brown", "Pink", "Orange")
letters <- c("D", "E", "F", "G", "H", "I", "J")

# add labels to the color values
tmp <- diamonds %>% 
 mutate(color = as.character(color)) %>%
 set_value_labels(color = setNames(letters, cols)) 

ggplot(tmp) + geom_bar(aes(x=print_labels(color)))


The graph still shows the color values on the x-axis rather than the labels. How can I plot the value labels stored in the data frame?

There are 2 best solutions below

Best answer
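library(tidyverse)
library(labelled)
library(haven)

cols <- c("White", "Red", "Black", "Green", "Brown", "Pink", "Orange")
letters <- c("D", "E", "F", "G", "H", "I", "J")

# attach value labels: the stored values are the letters, the labels are the color names
tmp <- diamonds %>% 
  mutate(color = as.character(color)) %>%
  set_value_labels(color = setNames(letters, cols)) 

# as_factor() swaps each value for its label before plotting
ggplot(tmp) + geom_bar(aes(x=as_factor(color)))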

With the haven package, as_factor(x) converts a labelled vector into a factor whose levels are the value labels, so ggplot shows the label names on the axis instead of the underlying values.
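To see what as_factor() does without the plot, here is a minimal sketch on a toy labelled vector. The vector x and its three labels are made up for illustration and only stand in for an SPSS-imported column. It assumes the levels argument of haven's as_factor() method for labelled vectors, which controls whether labels or values are used.

```r
library(haven)

# Toy stand-in for an SPSS column: stored values "D"/"E"/"F" with value labels
x <- labelled(
  c("D", "E", "D", "F"),
  labels = c(White = "D", Red = "E", Black = "F")
)

# Default behaviour: each value is replaced by its label
f_labels <- as_factor(x)
as.character(f_labels)

# levels = "values" keeps the underlying codes instead
f_values <- as_factor(x, levels = "values")
as.character(f_values)
```

Inside aes(), the default form is usually what you want, since the factor levels become the axis tick labels.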


Based on cols and letters, I would build a lookup data frame from the value labels and join it to tmp (via dplyr::inner_join). This adds a new column holding the label names that you want along the x-axis.

library(ggplot2)
library(dplyr)
library(tibble)

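# Build a lookup table from the "labels" attribute.
# The attribute's values are the codes stored in color (D, E, ...);
# its names, which become row names here, are the labels (White, Red, ...).
col_key <- attributes(tmp$color)$labels %>%
  as.data.frame() %>%
  dplyr::rename(color = 1) %>%
  tibble::rownames_to_column("cols")

# Join on the code so that each row gains its label name in cols.
tmp <- dplyr::inner_join(tmp,
                         col_key, by = "color")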

ggplot(tmp) +
  geom_bar(aes(x = cols))
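
The same lookup idea also works without a join. The sketch below uses base R only and a toy data frame (toy, codes and labs are hypothetical names, not objects from the question). labs has the same shape as attributes(tmp$color)$labels: the names are the labels and the values are the stored codes.

```r
# Toy codes and a labels vector shaped like attributes(tmp$color)$labels
codes <- c("D", "E", "D", "F")
labs <- c(White = "D", Red = "E", Black = "F")

# Invert the labels vector so it can be indexed by code: code -> label name
lookup <- setNames(names(labs), labs)

toy <- data.frame(color = codes, stringsAsFactors = FALSE)

# Look up each code; unname() drops the codes that indexing carries along as names
toy$cols <- unname(lookup[toy$color])
```

The new cols column can then be mapped to x in geom_bar() exactly as in the join version. A code with no label becomes NA here, while inner_join would drop that row.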