How to handle multiple bar plot panels with variable sample sizes using ggplot?


I am designing a figure that has several groups that I am trying to display in facets. Each group has a different sample size. I would like to have the width of each bar be the same, which motivated my choice to use facets instead of plotting each group individually and then putting them together via patchwork.

The problem is that because the facet groupings are determined by one variable, I can only figure out how to lay them out horizontally. What I would like to do is display the facets on multiple rows to be more space efficient, while keeping the bar width and the y scale the same across all panels. I couldn't get a facet grid to work because each group has a different sample size. Is there a way to do this?

So far I have simply laid them out horizontally and then used Adobe to chop the figure into rows, but I would prefer to do it programmatically.

[Image: long version]

This is more or less what I want to achieve in R:

[Image: Adobe-chopped version with two rows]

The code I used to produce the first one:

library(ggplot2)
library(tidyr)
library(dplyr)

set.seed(123)

selected_pops <- data.frame(
  id = numeric(),
  pop = character(),
  val_high = numeric(),
  val_medium = numeric(),
  val_low = numeric()
)

pops <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
for (i in seq_along(pops)) {
  # Draw a random group size; keep at least one individual per group
  num_iterations <- max(1, floor(rnorm(1, mean = 9, sd = 3)))
  print(paste("num iterations for", i, "is:", num_iterations))
  
  pop_values <- rep(pops[i], num_iterations)
  id_values <- 0:(num_iterations - 1)
  val_high_values <- rnorm(num_iterations, mean = 3, sd = 1)
  val_medium_values <- rnorm(num_iterations, mean = 7, sd = 2)
  val_low_values <- rnorm(num_iterations, mean = 9, sd = 3)
  
  selected_pops <- rbind(selected_pops, data.frame(
    id = id_values,
    pop = pop_values,
    val_high = val_high_values,
    val_medium = val_medium_values,
    val_low = val_low_values
  ))
}



  # Sort individuals by the sum of their value columns (val_high, val_medium, val_low)
  selected_pops <- selected_pops %>%
    arrange(desc(rowSums(select(., starts_with("val_")))))
  
  # Reshape data to long format
  selected_long <- gather(selected_pops, key = "level", value = "value", -id, -pop)
  selected_long$level <- factor(selected_long$level)
  
  print(selected_long)
  
  # Define the order in which you want the facets
  populations_to_include <- pops
  desired_facet_order <- populations_to_include
  
  # Reorder the 'pop' variable
  selected_long$pop <- factor(selected_long$pop, levels = desired_facet_order)
  
  # Plot colors
  custom_colors <- c("val_high" = "darkred", "val_medium" = "red", "val_low" = "orange")
  legend_labels <- c("val_high" = "High",
                     "val_medium" = "Medium",
                     "val_low" = "Low"
                     )
  
  # Build the stacked bar plot
  big_bar <- ggplot(selected_long, aes(x = as.factor(id), y = value, fill = factor(level))) +
    geom_bar(stat = "identity", position = "stack", width = 1) +
    labs(
      title = "Summed values by group",
      fill = "Sub values",   # Change legend title
      x = "Individual ID",   # Change x-axis label
      y = "Summed Values"
    ) +
    scale_fill_manual(values = custom_colors, labels = legend_labels) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, size = 8)) +
    facet_grid(. ~ pop, scales = "free", space = "free")
  
big_bar
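
One possible way to get the multi-row layout, sketched here under some assumptions rather than offered as a definitive answer: build one `facet_grid()` plot per row and place the rows on a patchwork design grid with one column per bar, so that each row spans as many columns as it has bars. The toy data, the group sizes in `sizes`, and the row assignment in `row_of` are made up for illustration and are not the data from the question. Bar widths should come out approximately equal; they match most closely when every row holds the same number of groups, because the gaps between facet panels also take up space.

```r
library(ggplot2)
library(patchwork)

set.seed(42)

# Hypothetical group sizes and row assignment (illustration only)
sizes  <- c(A = 5, B = 9, C = 7, D = 4, E = 10, F = 6)
row_of <- c(A = 1, B = 1, C = 1, D = 2, E = 2, F = 2)

# Toy data: one bar per individual within each group
toy <- do.call(rbind, lapply(names(sizes), function(p) {
  data.frame(pop = p,
             id = seq_len(sizes[[p]]) - 1,
             value = runif(sizes[[p]], 5, 20))
}))
toy$pop <- factor(toy$pop, levels = names(sizes))
toy$row <- unname(row_of[as.character(toy$pop)])

row_ids <- sort(unique(toy$row))
# Number of bars in each row, in the same order as row_ids
bars_per_row <- vapply(row_ids, function(r) sum(toy$row == r), integer(1))
max_bars <- max(bars_per_row)
y_max <- max(toy$value)  # shared upper y limit so every row uses the same scale

# One facetted plot per row; space = "free_x" keeps bars equal within a row
row_plots <- lapply(row_ids, function(r) {
  ggplot(toy[toy$row == r, ], aes(x = factor(id), y = value)) +
    geom_col(width = 1) +
    facet_grid(. ~ pop, scales = "free_x", space = "free_x") +
    coord_cartesian(ylim = c(0, y_max)) +
    labs(x = "Individual ID", y = "Value") +
    theme_minimal()
})

# Design grid with max_bars columns: row i spans columns 1..bars_per_row[i]
design <- do.call(c, lapply(seq_along(row_ids), function(i) {
  area(t = i, l = 1, b = i, r = bars_per_row[i])
}))

two_rows <- wrap_plots(row_plots, design = design)
# print(two_rows) draws the figure; ggsave() can write it to a file
```

For the stacked version in the question, the shared upper limit would presumably need to be the largest per-individual sum of the three value columns rather than the largest single value.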