Problematic behaviour of either error bars or geom_text labels in ggplot with dodged bars of equal width

I can't get a plot in which both the geom_text labels and the error bars sit in the correct position over dodged bars. Mock example:


# Load necessary libraries
library(ggplot2)

# Mock data for ggplot (seed fixed so the random example is reproducible)
set.seed(42)
incidence_table <- data.frame(
  stage = sample(c("Stage1", "Stage2", "Stage3", "Stage4"), size = 21, replace = TRUE),
  ageGroup = factor(sample(c("AgeGroup1", "AgeGroup2", "AgeGroup3", 
                               "AgeGroup4", "AgeGroup5", "AgeGroup6"), size = 21, replace = TRUE)),
  Incidence = sample(500:1000, size = 21, replace = TRUE),
  patients = sample(30:100, size = 21, replace = TRUE)
)

# Upper and lower bounds for the error bars
incidence_table$IncUpper <- incidence_table$Incidence + sample(30:60, size = 21, replace = TRUE)
incidence_table$IncLower <- incidence_table$Incidence - sample(30:60, size = 21, replace = TRUE)


# Create the ggplot
b <- ggplot(incidence_table, aes(x = factor(stage), y = Incidence, fill = factor(ageGroup))) + theme_bw()

# Add columns
b <- b + geom_col(position = position_dodge2(width = 0.9, preserve = 'single'), colour = "black") +
  scale_fill_brewer("age group", palette = "Oranges") + xlab("stage") + ylab("Incidence of event") +
  geom_text(aes(label = patients, y = -60 ), size = 5, position = position_dodge2(width = .9, preserve = "single")) +
  geom_errorbar(data = incidence_table, aes(x = factor(stage), ymin = IncLower, ymax = IncUpper),
                position = position_dodge2(width = .9), width = 0.25, color = "black")


# Print the plot
print(b)

I have tried both position_dodge() and position_dodge2(), but could not get the error bars and the text labels below the bars into the correct positions at the same time.

There is 1 best solution below

I don't understand why the width argument to position_dodge2 doesn't work here (maybe somebody else can explain), but one way to do this is to give the error bars the same width as the columns, so that both are dodged identically, and then use the padding argument to shrink the error bars. So:

# Create the ggplot
b <- ggplot(incidence_table, aes(x = factor(stage), y = Incidence, fill = factor(ageGroup))) + theme_bw()

# Add columns
b <- b +
  geom_col(position = position_dodge2(width = 0.9, preserve = "single"), colour = "black") +
  scale_fill_brewer("age group", palette = "Oranges") + 
  xlab("stage") + 
  ylab("Incidence of event") +
  geom_text(aes(label = patients, y = -60), size = 4, position = position_dodge2(width = 0.9, preserve = "single")) +
  # width = 0.9 matches the columns; padding then narrows each error bar
  geom_errorbar(aes(ymin = IncLower, ymax = IncUpper),
                position = position_dodge2(preserve = "single", padding = 0.7), 
                width = 0.9, 
                color = "black")


# Print the plot
print(b)
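
If you want to check the alignment rather than judge it by eye, one option is to compare the dodged x positions of each layer with layer_data(). The sketch below uses a small hypothetical data frame d (not the asker's data) with an uneven number of age groups per stage. As far as I can tell, position_dodge2() works from the xmin/xmax that geom_col and geom_errorbar already carry, which would explain why the geom's own width matters more than the position's width argument; treat that as an assumption rather than documented behaviour.

```r
library(ggplot2)

# Hypothetical, deterministic stand-in for incidence_table
d <- data.frame(
  stage     = c("Stage1", "Stage1", "Stage1", "Stage2", "Stage2", "Stage3"),
  ageGroup  = c("AgeGroup1", "AgeGroup2", "AgeGroup3",
                "AgeGroup1", "AgeGroup2", "AgeGroup1"),
  Incidence = c(600, 750, 900, 650, 800, 700),
  patients  = c(40, 55, 70, 45, 60, 50)
)
d$IncLower <- d$Incidence - 40
d$IncUpper <- d$Incidence + 40

# Same layer settings as in the answer above
p <- ggplot(d, aes(x = stage, y = Incidence, fill = ageGroup)) +
  geom_col(position = position_dodge2(width = 0.9, preserve = "single"),
           colour = "black") +
  geom_text(aes(label = patients, y = -60),
            position = position_dodge2(width = 0.9, preserve = "single")) +
  geom_errorbar(aes(ymin = IncLower, ymax = IncUpper),
                position = position_dodge2(preserve = "single", padding = 0.7),
                width = 0.9)

# Horizontal centres of each layer after dodging (layers 1, 2, 3 in order added)
col_x  <- sort(as.numeric(layer_data(p, 1)$x))
text_x <- sort(as.numeric(layer_data(p, 2)$x))
err_x  <- sort(as.numeric(layer_data(p, 3)$x))

# Both comparisons should report that the centres agree
print(all.equal(col_x, err_x))
print(all.equal(col_x, text_x))
```

Setting the error bar width back to 0.25 and rerunning the comparison is a quick way to see how far the centres drift apart.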

Personally, though, I think I'd use facet_grid for this instead of dodging. With your new example data from the comments:

ggplot(incidence_table) + 
  aes(x = factor(ageGroup), y = Incidence, fill = factor(ageGroup)) + 
  # one panel per stage; the argument is spelled scales (scale only works via partial matching)
  facet_grid(~stage, scales = "free", space = "free") + 
  geom_col(color = "black", position = "dodge") + 
  scale_fill_brewer("age group", palette = "Oranges") + 
  xlab("stage") + 
  ylab("Incidence of event") +
  geom_text(aes(label = patients, y = -60), size = 4) +
  geom_errorbar(aes(ymin = IncLower, ymax = IncUpper), 
                width = 0.25, 
                color = "black") + 
  theme_bw() + 
  # the age groups are already shown in the fill legend, so hide the x axis labels
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())

The scales = "free" and space = "free" arguments let each panel show only the age groups present at that stage and size the panel accordingly.
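
To get a feel for how wide each panel will end up, you can count the distinct age groups per stage, since with a free discrete x scale the panel width should grow roughly with that count. A minimal sketch in base R, using a hypothetical data frame d in place of incidence_table:

```r
# Hypothetical stage/age-group combinations, standing in for incidence_table
d <- data.frame(
  stage    = c("Stage1", "Stage1", "Stage1", "Stage2", "Stage2", "Stage3"),
  ageGroup = c("AgeGroup1", "AgeGroup2", "AgeGroup3",
               "AgeGroup1", "AgeGroup2", "AgeGroup1")
)

# Number of distinct age groups observed in each stage
groups_per_stage <- tapply(d$ageGroup, d$stage, function(g) length(unique(g)))
print(groups_per_stage)
```

A stage with a single age group gets a narrow panel, so if the counts differ a lot you may prefer to keep space = "fixed" and accept empty slots instead.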