Sequence plot with ggplot2 and geom_raster


I want to create a sequence plot like the one shown here (first figure, panel b): http://traminer.unige.ch/preview-visualizing.shtml

But I want to use ggplot2 for this. I melted the data into long format because I don't like working with the wide format, then plotted the result with geom_raster. This is what I got:

[Figure: the resulting geom_raster plot]

I expected six horizontal blocks at the left edge of the plot, as in the linked figure, but the job variable is shuffled along the y axis. Here is my code; I think only the lines that set the ordering and the ggplot call are relevant to the problem:

library(TraMineR)
library(data.table)
library(magrittr)
library(zoo)
library(stringr)
library(purrr)
library(ggplot2)

Sys.setlocale("LC_ALL","English")

data(mvad)
Data <- as.data.table(mvad)
rm(mvad)

Data %<>%
  melt(measure.vars = c("Belfast", "N.Eastern", "Southern", "S.Eastern", "Western"),
       variable.name = "school", 
       value.name = "school.boolean") %>% 
  .[school.boolean == "yes"] %>% 
  .[, -"school.boolean"]

time.vars <- 
  names(Data) %>% 
  .[str_detect(., "[:alpha:]{3}\\.[:digit:]{2}")]

boolean.cols <- 
  c("male", "catholic", "Grammar", "funemp", "gcse5eq", "fmpr", "livboth")

Data %<>% 
  melt(measure.vars = time.vars,
       variable.name = "month",
       value.name = "job") %>%
  .[, month := as.yearmon(month, "%b.%y")] %>% 
  setorder(id, month) %>% 
  .[, (boolean.cols) := map(.SD, ~ {.x == "yes"}),
    .SDcols = boolean.cols] %>% 
  .[, Sex := ifelse(male == TRUE, "Male", "Female")] %>% 
  .[, -"male"] %>% 
  setnames(names(.), names(.) %>% str_to_title) %>% 
  .[, Id := factor(Id, levels = Id[order(Job, Month)] %>% unique)] 

Data %>% 
  ggplot(aes(x = Month, y = Id, fill = Job)) + 
  geom_raster() + 
  labs(y = NULL)

Edit: Replacing the last line of the pipeline with .[, Id := factor(Id, levels = Id[order(Month, Job)] %>% unique)] does not work either.

Answer:

Once time.vars is defined, and while the data is still in wide format (before the second melt), sort the data set by those columns. The resulting row order gives the ID ordering needed for the plot.

Data %<>% setorderv(time.vars)
ID.order <- Data[, id %>% unique]
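
As a rough illustration of why this works, here is a minimal sketch on made-up data (the table `wide`, its columns `m1` to `m3`, and the states are hypothetical, not taken from mvad). Sorting the wide table by all time columns orders the rows by their whole sequence, and that row order can then be reused as factor levels after melting:

```r
library(data.table)

# Hypothetical toy data: four ids, three monthly states, wide format
wide <- data.table(
  id = 1:4,
  m1 = c("school", "job", "school", "job"),
  m2 = c("job", "job", "school", "school"),
  m3 = c("job", "school", "job", "school")
)
time.vars <- c("m1", "m2", "m3")

# Sort rows by the whole sequence (m1 first, then m2, then m3)
setorderv(wide, time.vars)
id.order <- wide[, unique(id)]

# Reshape to long format and reuse that row order as the factor levels
long <- melt(wide, id.vars = "id", measure.vars = time.vars,
             variable.name = "month", value.name = "state")
long[, id := factor(id, levels = id.order)]
```

With these values `id.order` comes out as 2, 4, 1, 3, so ids that start in the same state sit next to each other on the y axis.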

Then, at the end of the long-format pipeline, this line

.[, Id := factor(Id, levels = Id[order(Job, Month)] %>% unique)] 

has to be replaced with this one:

.[, Id := factor(Id, levels = ID.order)]
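
To see why the original line shuffles the rows, a small sketch on made-up data may help (the tables `wide` and `long` and the states `a`, `b`, `c` are hypothetical). Taking `unique()` of the ids after `order(state, month)` ranks ids by when they first appear in the first-sorting state, rather than by their whole sequence:

```r
library(data.table)

# Hypothetical toy data: id 1 starts in "b", id 2 starts in "c"
wide <- data.table(
  id = 1:2,
  m1 = c("b", "c"),
  m2 = c("b", "a"),
  m3 = c("a", "a")
)
long <- melt(wide, id.vars = "id",
             variable.name = "month", value.name = "state")

# Levels as in the question: first appearance after sorting by state, then month
by.state <- long[order(state, month), unique(id)]

# Levels from the whole sequence: sort the wide rows by all time columns
by.sequence <- wide[order(m1, m2, m3), unique(id)]
```

Here `by.state` is 2, 1 because id 2 reaches state `a` one month earlier, even though it starts in `c`. `by.sequence` is 1, 2, which keeps ids grouped by their first state.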