modelsummary not reading columns from custom tidy and glance functions

460 Views Asked by At

Apologies for not having a full reproducible example here, but the models take a fairly long time to run and the model objects are fairly large.

I have a chunk of code that to generate a table from three multilevel multinomial logit models run using brms. This is a chunk of code that worked previously, and it's been a couple of years since I first assembled the code, but when I try to run it I'm running into problems getting modelsummary to read the contents of the custom tidy and glance functions.

There's a lot here, but here's the code. I've tried to use notes to break down what's going on at each stage. The gist of it is that I read in the models, make a list, assign a custom class to the three models, create the custom tidy and glance functions, and then run the modelsummary function at the end. It should generate a modelsummary table with 9 columns, 3 for each model.


# Read in three files
m.c.t1 <- readRDS(here::here("../Book/Output/Chapter-Contact/m.c.t1.rds"))
m.c.g1 <- readRDS(here::here("../Book/Output/Chapter-Contact/m.c.g1.rds"))
m.c.p1 <- readRDS(here::here("../Book/Output/Chapter-Contact/m.c.p1.rds"))

# Create the list of models. Omit model names because modelsummary pastes them in the table header for every column and makes it too wide. Will add single header to span three relevant columns manually later.
mod.list <- list(" " = m.c.t1,
                 " " = m.c.p1,
                 " " = m.c.g1)

# Assign custom class to each model in the list. Maybe not necessary anymore?
for (i in seq_along(mod.list)){
  class(mod.list[[i]]) <- c("custom", class(mod.list[[i]]))
}

#### Pay close attention to how brms and/or parameters is treating the % symbols in the income variable.Depending on the output the code below will need to be updated so the income variable isn't dropped.

# brms collapes the coefficient/variable name in with the outcome level of multinomial models. Use a custom tidy method to extract level names and put them into a separate columns.
tidy_custom.brmsfit <- function(x, conf.level=.95, ...) {
  out = parameters::parameters(x, ci=conf.level, verbose=FALSE) %>%
        parameters::standardize_names(style="broom") %>%
        mutate(group = case_when(grepl("mudk", term) ~ "Don't Know",
                                 grepl("muneg", term) ~ "Negative",
                                 grepl("mupos", term) ~ "Positive"),
               term = gsub("dk|neg|pos", "", term)) %>% # remove outcome level prefix brms attaches to variable names
    as.data.frame()

  return(out)
}

# Run custom glance function to limit gof output.
glance_custom.brmsfit <- function(x, ...) {
  ret <- tibble::tibble(N = summary(x)$nobs)
  return(ret)
}

# Run through the model list and extract varying intercept/effect components and dispersion metrics. Creates a new set of rows that can be inserted at the bottom of the table.
rows <- lapply(mod.list, function(x){

  temp.sd <- parameters::parameters(x, effect = "random")  %>% 
    filter(grepl(".*sd.*", Parameter)) %>% 
    filter(grepl(".*persYes.*|.*nonpersYes.*|.*Intercept.*", Parameter)) %>% 
    dplyr::select(Parameter, Median) %>% 
    dplyr::mutate(Median = as.character(round(Median, 2)),
      y.level = case_when(
      grepl(".*mudk.*", Parameter) ~ "dk",
      grepl(".*mupos.*", Parameter) ~ "pos",
      grepl(".*muneg.*", Parameter) ~ "neg",
    ),
    Parameter = gsub("_mudk_|_mupos_|_muneg_", "_", Parameter)) %>% 
    pivot_wider(id_cols = Parameter,
                values_from = Median,
                names_from = y.level) %>% 
    mutate(Parameter = case_when(
      grepl(".*Intercept.*", Parameter) ~ "sd(Intercept)"
    ))
  
  temp.obs <- tibble::tribble(~Parameter, ~dk, ~neg, ~pos,
                              "N", as.character(nobs(x)), "", "",
                              "Group", "Country", "", "",
                              "\\# Groups", as.character(length(unique(x$data$country))), "", "")
  

  temp.com <- bind_rows(temp.obs, temp.sd)

  return(temp.com)
  
  }
)

# Combine the three outputs from the previous lapplycommand and filter out the relevant columns.
rows.com <- bind_cols(rows[[1]], rows[[2]], rows[[3]]) %>% 
  dplyr::select(1, 2, 3, 4, 6, 7, 8, 10, 11, 12) 

# Name the columns to facilitate appending later.
names(rows.com) <- c("term", "col1", "col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9")

# Turn off GOF stuff
gof.check <- modelsummary::gof_map
gof.check$omit <- TRUE

# Create a list of coefficient names and labels
coef.list <- tibble::tribble(~raw, ~clean,
                            "b_mu_contact_persYes", "Personal Contact: Yes",
                            "b_mu_contact_persDontknowDdeclinetoanswer", "Personal Contact: DK/Decline",
                            "b_mu_contact_nonpersYes", "Network Contact: Yes",
                            "b_mu_contact_nonpersDontknowDdeclinetoanswer", "Network Contact: DK/Decline",
                            "b_mu_benefit_persYes", "Personal Benefit: Yes",
                            "b_mu_benefit_persDontknowDdeclinetoanswer", "Personal Benefit: DK/Decline",
                            "b_mu_benefit_nonpersYes", "Network Benefit: Yes",
                            "b_mu_benefit_nonpersDontknowDdeclinetoanswer", "Network Benefit: DK/Decline",
                            "b_mu_age25to34years", "25-34",
                            "b_mu_age35to44years", "35-44",
                            "b_mu_age45to54years", "45-54",
                            "b_mu_age55to64years", "55-65",
                            "b_mu_ageAge65orolder", ">65",
                            "b_mu_income.5.cat21M40%", "21-40",
                            "b_mu_income.5.cat41M60%", "41-60",
                            "b_mu_income.5.cat61M80%", "61-80",
                            "b_mu_income.5.cat81M100%", "81-100",
                            "b_mu_genderFemale", "Female",
                            "b_mu_genderNonMbinary", "Non-binary",
                            "b_mu_genderNoneoftheabove", "None of the above",
                            "b_mu_minorityYes", "Minority: Yes",
                            "b_mu_minorityDeclinetoanswer", "Minoriy: Decline to answer",
                            "b_mu_religCatholicism", "Catholic",
                            "b_mu_religChristianityprotestant", "Protestant",
                            "b_mu_religBuddhism", "Buddhism",
                            "b_mu_religHinduism", "Hindu",
                            "b_mu_religIslam", "Islam",
                            "b_mu_religJudaism", "Judaism",
                            "b_mu_religShinto", "Shinto",
                            "b_mu_religMormonism", "Mormonism",
                            "b_mu_religLocal", "Local Religion",
                            "b_mu_religOther", "Other",
                            "b_mu_religDeclinetoanswer", "Religion: Decline to answer",
                            "b_mu_ed_z", "Education",
                            "b_mu_ideology_z", "Ideology",
                            "b_mu_troops_crime_persYes", "Personal Crime Experience: Yes",
                            "b_mu_american_inf_1DontknowDdeclinetoanswer", "Influence 1: DK/Decline",
                            "b_mu_american_inf_1Alittle", "Influence 1: A little",
                            "b_mu_american_inf_1Some", "Influence 1: Some",
                            "b_mu_american_inf_1Alot", "Influence 1: A lot",
                            "b_mu_american_inf_2DontknowDdeclinetoanswer", "Influence 2: DK/Decline",
                            "b_mu_american_inf_2Veryative", "Influence 2: Very negative",
                            "b_mu_american_inf_2Negative", "Influence 2: Negative",
                            "b_mu_american_inf_2Positive", "Influence 2: Positive",
                            "b_mu_american_inf_2Veryitive", "Influence 2: Very positive",
                            "b_mu_basecount_z", "Base count",
                            "b_mu_gdp_z", "GDP",
                            "b_mu_pop_z", "Population",
                            "b_mu_troops_z", "Troop deployment size",
                            "b_mu_Intercept", "Intercept")


# Create a list of coefficient group labels to be applied in the kableExtra portion of the table formatting.
coef.list <- coef.list %>% 
  mutate(group = case_when(
           grepl(".*ontact.*", raw) ~ "Contact Status",
           grepl(".*enefit.*", raw) ~ "Economic Benefits",
           grepl(".*age.*", raw) ~ "Age",
           grepl(".*ncome.*", raw) ~ "Income Quintile",
           grepl(".*gender.*", raw) ~ "Gender Identification",
           grepl(".*minority.*", raw) ~ "Minority Self-Identification",
           grepl(".*relig.*", raw) ~ "Religious Identification",
           grepl(".*ed_z.*", raw) ~ "Education",
           grepl(".*ideology_z.*", raw) ~ "Ideology",
           grepl(".*crime.*", raw) ~ "Crime Experience",
           grepl(".*inf_1.*", raw) ~ "American Influence (Amount)",
           grepl(".*inf_2.*", raw) ~ "American Influence (Quality)",
           TRUE ~ "Group-Level Variables"
         )) 


# Find how long the coefficient list is for the final table hline
last.line <- length(coef.list[[1]]) * 2

# Create a coefficient map object and index objects to streamline the row groupings in kableExtra formatting below.
coef_map <- setNames(coef.list$clean, coef.list$raw)
idx <- rle(coef.list$group)
idx <- setNames(idx$lengths  * 2, idx$values)

# Run modelsummary of the model list and format output with kableExtra. Original was in latex but this has been modified to facilitate bookdown output.
panel.1 <- modelsummary::modelsummary(mod.list,
                  estimate = "{estimate}",
                  statistic = "conf.int",
                  metrics = c("LOOIC", "ELPD"),
                  fmt = 2,
                  shape = term ~ model + group,
                  gof_map = gof.check,
                  coef_map = coef_map,
                  add_rows = rows.com,
                  stars = FALSE,
                  longtable = TRUE,
                  output = "kableExtra",
                  caption = "Bayesian multilevel multinomial logistic regressions. Population level effects.") %>% 
  kable_styling(font_size = 4, position = "left") %>%
  add_header_above(c(" ", "US Presence" = 3, "US People" = 3, "US Government" = 3)) %>% 
  group_rows(index = idx, bold = TRUE, background = "gray", color = "white", hline_after = TRUE) %>% 
  row_spec(last.line, hline_after = TRUE) 

When I run the code chunk I get the following warnings and messages:

Warning: `modelsummary could not extract goodness-of-fit statistics from a model
of class "custom". The package tried a sequence of 2 helper functions:

performance::model_performance(model)
broom::glance(model)

One of these functions must return a one-row `data.frame`. The `modelsummary` website explains how to summarize unsupported models or add support for new models yourself:

https://vincentarelbundock.github.io/modelsummary/articles/modelsummary.htmlWarning: Term name mismatch. Make sure all `tidy_custom` method returns a data frame
  with proper and matching term names.Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.9 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmupWarning: `modelsummary could not extract goodness-of-fit statistics from a model
of class "custom". The package tried a sequence of 2 helper functions:

performance::model_performance(model)
broom::glance(model)

One of these functions must return a one-row `data.frame`. The `modelsummary` website explains how to summarize unsupported models or add support for new models yourself:

https://vincentarelbundock.github.io/modelsummary/articles/modelsummary.htmlWarning: Term name mismatch. Make sure all `tidy_custom` method returns a data frame
  with proper and matching term names.Warning: `modelsummary could not extract goodness-of-fit statistics from a model
of class "custom". The package tried a sequence of 2 helper functions:

performance::model_performance(model)
broom::glance(model)

One of these functions must return a one-row `data.frame`. The `modelsummary` website explains how to summarize unsupported models or add support for new models yourself:

https://vincentarelbundock.github.io/modelsummary/articles/modelsummary.htmlWarning: Term name mismatch. Make sure all `tidy_custom` method returns a data frame
  with proper and matching term names.Error in map_estimates(tmp, coef_rename = coef_rename, coef_map = coef_map,  : 
  At least one of the term names in each model must appear in `coef_map`.

I've tried a few potential solutions. First, I updated {modelsummary} and reviewed the online documentation and updated a couple of things that appear to have changed since I first assembled the code.

I updated some of the arguments from {modelsummary}, including the metrics and shape arguments. The latter used to be group and the former was not included in the version I ran.

I updated the syntax for the custom tidy and glance functions to tidy_custom.brmsfit from tidy.custom and glance_custom.brmsfit from glance.custom. I've also removed any redundant/incorrect versions of these functions from the environment to ensure it's not trying to use an incorrect function.

I changed the grouping variable name for the outcome variable from y.level to group in the custom tidy function to try to match what I understand get_estimates() would generate.

I've also run the custom functions on a single model to ensure that the output is matching, as the final reference to the term name mismatch in the warnings above suggests that there is either something wrong with the column name or a varaible name. The usual suspects in terms of variable names that have caused errors (e.g. income) appear to be consistent in the output and in the variable names/labels that I've listed.

Finally, an example of the custom tidy function's output is as follows, which appears to be consistent with the casing and words modelsummary uses.

> test
                                              term      estimate conf.level      conf.low    conf.high        pd      rhat       ess      group
1                                   b_mu_Intercept  -2.213180000       0.95 -2.617802e+00 -1.816212750 1.0000000 1.0008532  9956.176 Don't Know
2                                   b_mu_Intercept  -0.461685500       0.95 -9.368145e-01  0.017789295 0.9709375 1.0002866  7535.687   Negative
3                                   b_mu_Intercept  -0.236163500       0.95 -5.795692e-01  0.116061925 0.9161875 1.0001725  9122.818   Positive

It's entirely possible I'm overlooking something or have failed to address something from an update, but any advice people can give is very welcome!

0

There are 0 best solutions below