ggalluvial geom_text repeating

I would like to use geom_text() in the ggalluvial plots I am creating. The code below works as expected unless there is a 50/50 split between performance levels. Is there an argument to geom_text() that prevents the text from repeating on the geom_stratum(aes(fill = profEOY)) layer? In short, the results for 2018 look as expected, but the results for 2019, with the 50/50 split, are problematic.

I am displaying this plot for 160+ reporting sites in Shiny, so I do not think annotate() will work.

Example Plot

library(tidyverse)
library(ggalluvial)
df <-
  structure(
    list(
      profBOY = c(
        "At or Above",
        "At or Above",
        "Below or Well Below",
        "Below or Well Below",
        "At or Above",
        "At or Above",
        "Below or Well Below",
        "Below or Well Below"
      ),
      profEOY = c(
        "At or Above",
        "Below or Well Below",
        "At or Above",
        "Below or Well Below",
        "At or Above",
        "Below or Well Below",
        "At or Above",
        "Below or Well Below"
      ),
      EndYear = c(
        2018L,
        2018L,
        2018L,
        2018L,
        2019L,
        2019L,
        2019L,
        2019L
      ),
      sum = c(37, 0, 6, 21, 27, 3, 5, 22),
      totaln = c(74L,
                 74L, 74L, 74L, 80L, 80L, 80L, 80L),
      totalNBoy = c(68, 68, 68, 68, 64, 64, 64, 64),
      totalInBoyPB = c(
        "38",
        "38",
        "30",
        "30",
        "32",
        "32",
        "32",
        "32"
      ),
      percentInBoyPB = c(55.9, 55.9, 44.1, 44.1,
                         50, 50, 50, 50),
      totalNEoy = c(69, 69, 69,
                    69, 73, 73, 73, 73),
      totalInEoyPB = c(
        "44",
        "25",
        "44",
        "25",
        "34",
        "39",
        "34",
        "39"
      ),
      percentInEoyPB = c(63.8, 36.2, 63.8, 36.2, 46.6, 53.4, 46.6, 53.4),
      percent = c(
        "50%",
        "0%",
        "8%",
        "28%",
        "34%",
        "4%",
        "6%",
        "28%"
      )
    ),
    row.names = c(NA,-8L),
    class = c("tbl_df", "tbl", "data.frame")
  )

ggplot(df,
       aes(y = sum,
           axis1 = str_wrap(profBOY, 10),
           axis2 = str_wrap(profEOY, 10),
           fill = profBOY
       )) +
  geom_flow(color = '#e57a3c', curve_type = 'quintic') +
  scale_x_discrete(limits = c("Beginning \nof Year", "End \nof Year")) +
  scale_fill_manual(values = c("#315683","#6c7070")) +
  geom_stratum(aes(fill = profEOY), color = 'grey', width = 1/2) +
  geom_stratum(aes(fill = profBOY), color = 'grey', width = 1/2) +
  geom_text(stat = 'stratum', 
            aes(label = paste0(percentInBoyPB, '%')), 
            vjust = 1, size = 4, color = 'white')+
  labs(fill = 'Performance Level')+
  facet_wrap(vars(factor(EndYear)), 
             nrow = 1, 
             scales = 'free_y')+
  theme_minimal()+
  theme(axis.text.y = element_blank(), 
        axis.title.y = element_blank(), 
        axis.ticks = element_blank(),
        panel.grid = element_blank(), 
        legend.position = 'top', 
        legend.text = element_text(size = 12),
        legend.title = element_text(size = 14),
        axis.text.x = element_text(size = 16), 
        strip.text = element_text(size = 18), 
        strip.background = element_rect(fill = 'lightgrey', color = 'lightgrey')
  )

Thanks for any advice!


There is 1 best solution below


To solve this issue, I think the cleanest approach would be to convert the data to the "long" (lodes) format. This makes the stratum labels much easier to control.
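As a rough illustration of what the long format looks like, here is a minimal sketch that reshapes the four 2019 flows from the question with tidyr::pivot_longer(). The id column and the axis, stratum, and label column names are made up for this example, and the label content is only a placeholder; ggalluvial also provides its own helper, to_lodes_form(), for this kind of reshaping.

```r
library(tidyr)

# The four 2019 flows from the question, one row per flow.
# `id` is a hypothetical identifier added for this sketch.
wide <- data.frame(
  id      = 1:4,
  profBOY = c("At or Above", "At or Above",
              "Below or Well Below", "Below or Well Below"),
  profEOY = c("At or Above", "Below or Well Below",
              "At or Above", "Below or Well Below"),
  sum     = c(27, 3, 5, 22)
)

# One row per flow and axis: `axis` says which axis the row belongs to,
# `stratum` holds the performance level on that axis.
long <- pivot_longer(
  wide,
  cols      = c(profBOY, profEOY),
  names_to  = "axis",
  values_to = "stratum"
)

# A placeholder label column that is filled on the first axis only.
long$label <- ifelse(long$axis == "profBOY", long$stratum, "")
```

In this form the axis is an ordinary column, so a label can be prepared per axis before plotting instead of being worked out inside aes().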

Another way to control this is the after_stat() function, which gives access to the variables computed by the statistical transformation that builds the alluvial plot.

If you write aes(label = after_stat(x)) in geom_text(), the strata get the following labels:

Alluvial_plot_example_1

You can now see the x variable written on each stratum. You can also see that x corresponds to the alluvial axis: the first axis (Beginning of Year) is "1" and the second axis (End of Year) is "2".

Knowing this, you can now control the label you want to write with a simple ifelse() statement.
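To make the ifelse() logic concrete outside of ggplot, here is a small base R sketch. The vectors x and pct are hypothetical stand-ins for the axis index and the percentages, not values taken from the stat itself.

```r
# Stand-ins for illustration: x is the axis index of each stratum,
# pct is the percentage we would like to print on it.
x   <- c(1, 1, 2, 2)
pct <- c(55.9, 44.1, 63.8, 36.2)

# Keep the label where x is 1 (first axis), use an empty string elsewhere.
labels <- ifelse(x == 1, paste0(pct, "%"), "")
print(labels)
# labels is c("55.9%", "44.1%", "", "")
```

Because ifelse() is vectorised, the same expression can be used inside aes() with after_stat(x) as the test.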

In summary, my solution consists of modifying the aesthetics of geom_text() by writing aes(label = ifelse(test = after_stat(x) == "1", paste0(df$percentInBoyPB, '%'), "")) to obtain:

Alluvial_plot_example_2

The whole ggplot was:

ggplot(df,
       aes(y = sum,
           axis1 = str_wrap(profBOY, 10),
           axis2 = str_wrap(profEOY, 10),
           fill = profBOY
       )) +
  geom_flow(color = '#e57a3c', curve_type = 'quintic') +
  scale_x_discrete(limits = c("Beginning \nof Year", "End \nof Year")) +
  scale_fill_manual(values = c("#315683","#6c7070")) +
  geom_stratum(aes(fill = profEOY), color = 'grey', width = 1/2) +
  # Label only the strata on the first axis (after_stat(x) == 1); the second axis gets "".
  geom_text(stat = 'stratum',
            aes(label = ifelse(test = after_stat(x) == "1", paste0(df$percentInBoyPB, '%'), "")),
            vjust = 1, size = 4, color = 'white') +
  labs(fill = 'Performance Level')+
  facet_wrap(vars(factor(EndYear)), 
             nrow = 1, 
             scales = 'free_y')+
  theme_minimal()+
  theme(axis.text.y = element_blank(), 
        axis.title.y = element_blank(), 
        axis.ticks = element_blank(),
        panel.grid = element_blank(), 
        legend.position = 'top', 
        legend.text = element_text(size = 12),
        legend.title = element_text(size = 14),
        axis.text.x = element_text(size = 16), 
        strip.text = element_text(size = 18), 
        strip.background = element_rect(fill = 'lightgrey', color = 'lightgrey')
  )