Following the guidance in a previous post, I attempted to extend the solution to a dataset where the significant "Change" in grades (of 3 or more points) may occur in different months for different students, and where a student could show a significant "Change" in more than two consecutive months.
This is the dataframe containing all the information required for plotting:
test = structure(list(Student = c("Ana", "Brenda", "Max", "Alan", "Ana",
"Brenda", "Max", "Alan", "Ana", "Brenda", "Max", "Alan"), Month = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("January",
"February", "March"), class = "factor"), Grade = c(7L, 6L, 7L,
7L, 8L, 10L, 7L, 10L, 5L, 8L, 10L, 7L), Change = list("February",
"January", "February", c("January", "February"), "February",
"January", "February", c("January", "February"), "February",
"January", "February", c("January", "February")), xend = structure(c(2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, NA, NA, NA, NA), .Label = c("January",
"February", "March"), class = "factor"), yend = c(8L, 10L, 7L,
10L, 5L, 8L, 10L, 7L, NA, NA, NA, NA)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -12L), groups = structure(list(
Student = c("Alan", "Ana", "Brenda", "Max"), .rows = structure(list(
c(4L, 8L, 12L), c(1L, 5L, 9L), c(2L, 6L, 10L), c(3L,
7L, 11L)), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))
What I attempted was to make the "Change" column a list, and to simply use the operator '%in%' instead of '==' like so:
ggplot(test, aes(x = Month, y = Grade, color = Student, group = Student)) +
  geom_point() +
  geom_segment(aes(xend = xend, yend = yend, linetype = Month %in% Change[[1]])) +
  scale_x_discrete(limits = unique(test$Month)) +
  scale_linetype_manual(values = c(`TRUE` = "solid", `FALSE` = "dashed"))
Here I am telling R to plot Grade against Month, to use a different color for each student, and to set the linetype of each segment according to whether the value in "Month" appears in "Change": solid where that student's grade shifted significantly, dashed otherwise. However, the resulting plot is not what I expected:
Since Brenda's grade, for example, changed significantly from January to February, one would expect that segment to be solid. For Alan, we would expect both segments to be solid. Instead, every student's January-to-February segment is dashed and every February-to-March segment is solid.
I don't understand what is wrong with my implementation, given that, for example:
test[test$Student == "Brenda", ]$Month == test[test$Student == "Brenda", ]$Change[[1]]
Returns:
TRUE FALSE FALSE
Any help is appreciated. I would really like to know why this is not working as intended.
Your approach looks a bit complicated to me. Instead of adding a list column, you could simply check whether there was a significant change using the absolute value of the difference between Grade and yend (the Grade in the next month):
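The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame `grades` and the column `Significant` are hypothetical names that rebuild the question's data compactly so the snippet runs on its own, and the threshold of 3 points is taken from the question.

```r
library(ggplot2)

# Hypothetical compact rebuild of the question's data (same students, months, grades)
months <- c("January", "February", "March")
grades <- data.frame(
  Student = rep(c("Ana", "Brenda", "Max", "Alan"), times = 3),
  Month   = factor(rep(months, each = 4), levels = months),
  Grade   = c(7, 6, 7, 7, 8, 10, 7, 10, 5, 8, 10, 7)
)

# xend / yend: the following month and its grade (NA for the last month)
grades$xend <- factor(c(rep(months[2:3], each = 4), rep(NA, 4)), levels = months)
grades$yend <- c(grades$Grade[5:12], rep(NA, 4))

# Assumed rule from the question: a change of 3 or more points is significant
grades$Significant <- abs(grades$Grade - grades$yend) >= 3

p <- ggplot(grades, aes(x = Month, y = Grade, color = Student, group = Student)) +
  geom_point() +
  # na.rm = TRUE silently drops the last-month rows, which have no next month
  geom_segment(aes(xend = xend, yend = yend, linetype = Significant), na.rm = TRUE) +
  scale_x_discrete(limits = months) +
  scale_linetype_manual(values = c(`TRUE` = "solid", `FALSE` = "dashed"))
```

Printing `p` draws the plot. Because the comparison is evaluated row by row, each segment's linetype depends only on that student's own grade and next grade, so no list column or `%in%` lookup is needed.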