My goal is to use a heat map to plot the data below. I have obtained the layout of the heat map I wanted, but I am struggling to find a way to color the tiles as intended. My guess is that the code below colors the tiles proportional to the number inside the tile itself. However, this is not what I need. I would like to:
- have the same color within each of the rectangles (yellow for the rectangle on the left and blue for the rectangle on the right);
- use a gradient from yellow to blue for the other tiles, proportional to how far each tile is from the blue area.
Thanks to anyone who will help!
library(tidyverse)
# First, I create the simulated dataset with 200 individuals
set.seed(1243) # set seed for reproducibility
#I simulate test 1 scores
test1_score <- sample(c(3.5, 3.8, 4), 200, replace = TRUE)
#I simulate test 2 scores
test2_score <- round(runif(200, 0, 38))
#I create diagnostic classes based on test 2 scores
test2_class <- ifelse(test2_score < 20, "non meet",
  ifelse(test2_score < 30, "below 40th",
    ifelse(test2_score < 35, "above 40th",
      ifelse(test2_score < 37, "meet", "master"))))
#I create exit_program variable based on test 1 scores and test 2 classes
exit_program <- ifelse(test1_score == 4 & test2_class %in% c("above 40th", "meet", "master"), 1, 0)
#I create data frame with simulated data
df <- data.frame(ID = 1:200, test1_score, test2_class, exit_program)
#I calculate the group size for each combination of test 1 scores and test 2 classes
df2 <- df |>
  group_by(test1_score, test2_class, exit_program) |>
  summarize(n = n(), .groups = "drop") |>
  mutate(test2_class = factor(test2_class, levels = c("non meet", "below 40th", "above 40th", "meet", "master"))) |>
  arrange(test2_class)
#plot data
df2 |>
  ggplot(aes(x = test2_class, y = as.factor(test1_score))) +
  theme(legend.position = "none", panel.background = element_rect(fill = "white"),
        axis.text = element_text(size = 12), axis.title = element_text(size = 14)) +
  geom_tile(aes(fill = n)) +
  geom_text(aes(label = n), size = 7, family = "sans") +
  labs(x = "classification", y = "scores", size = 10) +
  scale_fill_gradient(low = "#56B4E9", high = "#F0E442", guide = "none") +
  geom_rect(aes(xmin = 2.5, xmax = 5.5, ymin = 2.5, ymax = 3.5), linewidth = 2, color = "#0072B2", fill = NA) +
  geom_rect(aes(xmin = 0.5, xmax = 1.5, ymin = 0.5, ymax = 3.5), linewidth = 2, color = "#E69F00", fill = NA)
Maybe this is what you are looking for. You are right: mapping `n` on `fill` means to color the tiles by the value of `n`. As I understand the question, you want each "column" to have the same color, colored with a gradient from yellow on the left to blue on the right. To this end, you could add a column to your data with a measure of the distance from the left or the right end, which could then be mapped on the `fill` aes. One option would be to convert your `test2_class` to a numeric and then compute the absolute distance from the maximum value (which corresponds to the "master" level).
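A minimal sketch of that idea, assuming the factor levels are ordered as in the question. The `tiles` data frame, its placeholder counts, and the `dist` column name are made up for illustration and stand in for `df2`:

```r
library(ggplot2)

# Class levels in display order, as defined in the question
class_levels <- c("non meet", "below 40th", "above 40th", "meet", "master")

# Hypothetical stand-in for df2: one row per tile
tiles <- expand.grid(
  test1_score = c(3.5, 3.8, 4),
  test2_class = class_levels,
  stringsAsFactors = FALSE
)
tiles$test2_class <- factor(tiles$test2_class, levels = class_levels)
tiles$n <- seq_len(nrow(tiles))  # placeholder counts, not real data

# Convert the factor to its integer codes, then take the absolute distance
# from the maximum code (the "master" level): 0 for "master", 4 for "non meet"
class_num <- as.numeric(tiles$test2_class)
tiles$dist <- abs(class_num - max(class_num))

# Map the distance (not n) on fill: blue at distance 0, yellow at the far end
p <- ggplot(tiles, aes(x = test2_class, y = factor(test1_score))) +
  geom_tile(aes(fill = dist)) +
  geom_text(aes(label = n), size = 7) +
  scale_fill_gradient(low = "#56B4E9", high = "#F0E442", guide = "none") +
  labs(x = "classification", y = "scores")
p
```

With your own data, the same two lines computing `dist` should work on `df2` as long as `test2_class` is a factor with the levels in the intended order. Note that this colors whole columns; it does not treat the blue rectangle in the top row separately.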