I am attempting to plot multiple time series variables on a single line chart using ggplot. I have a data.frame which contains n time series variables and a column of time periods. Essentially, I want to loop through the data.frame and add exactly n geom_line layers to a single chart.
Initially I tried using the following code, where:
- df = data.frame containing n time series variables, and 1 column of time periods
- wid = n (number of time series variables)

    p <- ggplot() + scale_color_manual(values=c(colours[1:wid]))
    for (i in 1:wid) {
      p <- p + geom_line(aes(x=df$Time, y=df[,i], color=var.lab[i]))
    }
    ggplotly(p)

However, this only produces a plot of the final time series variable in the data.frame. I then investigated further and found that the following two sets of code produce completely different results.

Using a counter variable:

    p <- ggplot() + scale_color_manual(values=c(colours[1:wid]))
    i = 1
    p = p + geom_line(aes(x=df$Time, y=df[,i], color=var.lab[i]))
    i = 2
    p = p + geom_line(aes(x=df$Time, y=df[,i], color=var.lab[i]))
    i = 3
    p = p + geom_line(aes(x=df$Time, y=df[,i], color=var.lab[i]))
    ggplotly(p)

Using literal indices:

    p <- ggplot() + scale_color_manual(values=c(colours[1:wid]))
    p = p + geom_line(aes(x=df$Time, y=df[,1], color=var.lab[1]))
    p = p + geom_line(aes(x=df$Time, y=df[,2], color=var.lab[2]))
    p = p + geom_line(aes(x=df$Time, y=df[,3], color=var.lab[3]))
    ggplotly(p)

In my mind, these two sets of code are identical, so could anyone explain why they produce such different results?
I know this could probably be done quite easily using autoplot, but I am more interested in the behavior of these two snippets of code.
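What you're trying to do is a 'hack': adding one layer per series can work, but it's not ideal in ggplot terms. The loop misbehaves because `aes()` captures its arguments unevaluated, so `i` is only looked up when the plot is built, by which point it holds its final value; every layer then draws the same column. To do it successfully, I'd use `aes_string`. But it's a hack.

How to do it properly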
To make this plot properly, you need to pivot the data first, so that each aesthetic (aes) is mapped to a variable in your data frame. That means we need a single variable in our data frame to map to `color`. Hence, we `pivot_longer` and plot again:
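A minimal sketch of that pivot-and-plot step. The sample `df` below is a hypothetical stand-in for the asker's data (a `Time` column plus three made-up series `a`, `b`, `c`), and the names `series` and `value` are my own choices, not anything from the question:

```r
library(ggplot2)
library(tidyr)

# Hypothetical example data: one Time column and three time series.
set.seed(1)
df <- data.frame(
  Time = 1:10,
  a = cumsum(rnorm(10)),
  b = cumsum(rnorm(10)),
  c = cumsum(rnorm(10))
)

# Reshape to long format: one row per (Time, series) pair.
# "series" holds the old column names, "value" holds the measurements.
df_long <- pivot_longer(
  df,
  cols = -Time,
  names_to = "series",
  values_to = "value"
)

# A single geom_line layer; color is mapped to a real column,
# so ggplot draws one line per series and builds the legend itself.
p <- ggplot(df_long, aes(x = Time, y = value, color = series)) +
  geom_line()

p
# For the interactive version used in the question: plotly::ggplotly(p)
```

Because `color` is now mapped to a column rather than set layer by layer, no loop is needed, and the plot adapts to however many series the data frame contains. A `scale_color_manual(values = ...)` can still be added if specific colours are wanted.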