How do I merge columns using gt on summary rows?

93 Views Asked by At

Background and Question

I'm using the gt package in R to create presentation quality tables. For the table I am creating, I am trying to show both the counts and the percentage of the total that the counts represent in a condensed format. How do I apply the results seen on the individual country lanes to the subtotal and grand total rows?

Data

library(tidyverse)
library(gt)

df <- tibble(country_lane = c("CA-US", "MX-CA", "MX-US", "US-CA", 
                                "US-MX", "US-US"), 
                     `Above Market` = c(2L, 3L, 33L, 3L, 3L, 54L
                                ), 
                     `Below Market` = c(18L, 0L, 2L, 14L, 3L, 370L), 
                     `In Market` = c(15L, 0L, 25L, 15L, 3L, 240L),
                 BU = c("US", "CAN", "US", "CAN", "US", "US"))
> df  
# A tibble: 6 × 5
  country_lane `Above Market` `Below Market` `In Market` BU   
  <chr>                 <int>          <int>       <int> <chr>
1 CA-US                     2             18          15 US   
2 MX-CA                     3              0           0 CAN  
3 MX-US                    33              2          25 US   
4 US-CA                     3             14          15 CAN  
5 US-MX                     3              3           3 US   
6 US-US                    54            370         240 US   

Working the Problem....

This part works:

I'm open to suggestions how to do this first part better, but I'm generally happy this works as is to create the percentage columns and groupings for subtotal and total.

mkt_cols <- c("Below Market", "In Market", "Above Market")  # Used for across() below

denom <- sum(df$`Below Market` + df$`In Market` + df$`Above Market`)  # Was hoping to find an easier way to employ mkt_cols here in case in the future I have more columns, but this is okay to get the denominator.

gt_out1 <- df %>% 
mutate(across(all_of(mkt_cols), ~ .x / denom , 
              .names = "{.col}_pct" )) %>%
gt(rowname_col = c("country_lane", "Subtotal"),  groupname_col = "BU") %>%
summary_rows(columns = c("Below Market", "In Market", "Above Market"),
             fns = list("Subtotal" = ~sum(.))) %>%
summary_rows(columns = ends_with("_pct"),
             fns = list("Subtotal" = ~sum(.)),
             fmt = ~ fmt_percent(., decimals = 1)) %>% 
grand_summary_rows(columns = c("Below Market", "In Market", "Above Market"),
                   fns = list("Total" = ~sum(.))) %>%
grand_summary_rows(columns = ends_with("_pct"),
                   fns = list("Total" = ~sum(.)),
                   fmt = ~ fmt_percent(., decimals = 1)) %>% 
fmt_percent(columns = ends_with("_pct"),
            decimals = 1)

Table that is incomplete, but directionally correct so far.

But here's my problem:

I wanted to do a cols_merge() to show the percentages inside parentheses directly next to the counts and condense the table. This works great for the individual rows, but I cannot get it to apply to the subtotal or total rows. Here's the code I have and the output I don't like.

gt_out2 <- gt_out1 %>% 
cols_merge(columns = c("Below Market", "Below Market_pct"),
           pattern = "{1} ({2})") %>% 
cols_merge(columns = c("In Market", "In Market_pct"),
           pattern = "{1} ({2})") %>% 
cols_merge(columns = c("Above Market", "Above Market_pct"),
           pattern = "{1} ({2})")

Percentages missing on the subtotal and total rows.

Closing

I did look at gtsummary which looks like it would do something similar to what I'm asking here, but I am just learning gt() and it seems like I might have better control over the formatting here, plus I couldn't quite figure out the aggregation I needed for the percentages and layout with gtsummary. Any help or suggestions are appreciated. Thank you.

1

There are 1 best solutions below

3
On BEST ANSWER

A data.frame is generated same way you pointed in your question. Note that the column names contains underscores instead spaces:

library(tidyverse)
library(gt)

df <- tibble(country_lane = c("CA-US", "MX-CA", "MX-US", "US-CA", 
                              "US-MX", "US-US"), 
             Above_Market = c(2L, 3L, 33L, 3L, 3L, 54L
             ), 
             Below_Market = c(18L, 0L, 2L, 14L, 3L, 370L), 
             In_Market = c(15L, 0L, 25L, 15L, 3L, 240L),
             BU = c("US", "CAN", "US", "CAN", "US", "US"))

Columns for the percentages from the total is calculated similar way you did it.

total <- sum(df %>% select(Above_Market,Below_Market,In_Market))
df.pct <- df %>% mutate(across(c("Above_Market","Below_Market","In_Market"), ~ .x/total, .names = "{.col}_pct"))

After a pivot_longer transformation of the data.frame, Subtotal and Total rows are calculated.

df.long <- df.pct %>% pivot_longer(-c(country_lane,BU),names_to = "Market", values_to = "values")
df.subtotal <- rbind(df.long, 
                     df.long %>% group_by(Market,BU) %>% summarise(values=sum(values)) %>% mutate(country_lane="Subtotal"),
                     df.long %>% group_by(Market) %>% summarise(values=sum(values)) %>% mutate(country_lane="Total",BU=NA_character_)) %>%
                     pivot_wider(names_from = "Market", values_from = "values") 

Here I show how would be the last rows of df.subtotal data.frame

tail(df.subtotal)
# A tibble: 6 × 8
  country_lane BU    Above_Market Below_Market In_Market Above_Market_pct Below_Market_pct In_Market_pct
  <chr>        <chr>        <dbl>        <dbl>     <dbl>            <dbl>            <dbl>         <dbl>
1 US-CA        CAN              3           14        15          0.00374          0.0174        0.0187 
2 US-MX        US               3            3         3          0.00374          0.00374       0.00374
3 US-US        US              54          370       240          0.0672           0.461         0.299  
4 Subtotal     CAN              6           14        15          0.00747          0.0174        0.0187 
5 Subtotal     US              92          393       283          0.115            0.489         0.352  
6 Total        NA              98          407       298          0.122            0.507         0.371  

gt object is built removing Total row and considering BU column as groupname_col:

gt_out1 <- df.subtotal %>% filter(country_lane != "Total") %>% 
  gt(rowname_col = c("country_lane"),  groupname_col = "BU") %>%
  rows_add(.list=as.list(df.subtotal %>% filter(country_lane=="Total"))) %>%
  fmt_percent(columns = ends_with("_pct"),
              decimals = 1)

Finally, cols_merge transformation is done

gt_out2 <- gt_out1 %>% 
  cols_merge(columns = c("Below_Market", "Below_Market_pct"),
             pattern = "{1} ({2})") %>% 
  cols_merge(columns = c("In_Market", "In_Market_pct"),
             pattern = "{1} ({2})") %>% 
  cols_merge(columns = c("Above_Market", "Above_Market_pct"),
             pattern = "{1} ({2})")