Want to preface this with heaps of appreciate for gtsummary -- wonderful package.

After using tidymodels, GLM, and gtsummary for a while, I've been trying to understand gtsummary's computations for GLM model performance and confidence intervals.

Can the anyone and/or Dr. Sjoberg + gtsummary team explain the following questions 1 & 2

Question 1: Why are standard errors different when using broom::tidy() vs. parameters::model_parameters() functions to extract model residual data? (Bolded text in print outs shows differences)

library(gtsummary)
library(parameters)
library(rsample)
library(broom)


trial2 <- trial %>% select(age, grade, response, trt) %>%
  drop_na()


model_trial2 <- glm(response ~ age + grade + trt, 
                    data = trial2, 
                    family=binomial(link="logit"))

broom::tidy(model_trial2, exponentiate = TRUE)

# # A tibble: 5 × 5
# term            estimate  std.error statistic  p.value
# <chr>           <dbl>     <dbl>     <dbl>     <dbl>
# 1 (Intercept)    0.184    **0.630**    -2.69      0.00715
# 2 age            1.02     0.0114    1.67      0.0952 
# 3 gradeII        0.852    **0.395**    -0.406     0.685  
# 4 gradeIII       1.01     0.385     0.0199    0.984  
# 5 trtDrug B      1.13     **0.321**     0.387     0.699  

preadmission_model_parameters <- model_trial2 %>% parameters::model_parameters(exponentiate = TRUE) 


preadmission_model_parameters
# Parameter    | Odds Ratio |   SE |       95% CI |     z |     p
# ---------------------------------------------------------------
# (Intercept)  |       0.18 | **0.12** | [0.05, 0.61] | -2.69 | 0.007
# age          |       1.02 | 0.01 | [1.00, 1.04] |  1.67 | 0.095
# grade [II]   |       0.85 | **0.34** | [0.39, 1.85] | -0.41 | 0.685
# grade [III]  |       1.01 | 0.39 | [0.47, 2.15] |  0.02 | 0.984
# trt [Drug B] |       1.13 | **0.36** | [0.60, 2.13] |  0.39 | 0.699


Question 2: (a) What method does gtsummary use to produce confidence intervals? (b) can the user define (stratified or unstratified) k-fold cross-validation or bootstraps to produce confidence intervals?

(Bolded differences in confidence intervals for the reg_intervals() bootstrapped confidence intervals and the unknown method gtsummary tbl_regression() confidence intervals.)

library(gtsummary)
library(parameters)
library(rsample)
library(broom)

trial2 <- trial %>% select(age, grade, response, trt) %>%
  drop_na()

bootstraps(trial2, times = 10)

trial_bootrapped_confidence_intervals <- reg_intervals(response ~ age + grade + trt, 
                                                  data = trial2,
                                                  model_fn = "glm",
                                                  keep_reps = TRUE, 
                                                  family=binomial(link="logit"))

trial_bootrapped_confidence_intervals_exp <- trial_bootrapped_confidence_intervals %>%
  select(term:.alpha) %>%
  mutate(across(.cols = c(.lower, .estimate, .upper), ~exp(.))) %>%
  as_tibble()

trial_bootrapped_confidence_intervals_exp
# # A tibble: 4 × 5
# term       .lower .estimate .upper .alpha
# <chr>       <dbl>     <dbl>  <dbl>  <dbl>
# 1 age        0.997     1.02    1.04   0.05
# 2 gradeII    **0.400**     0.846   **1.86**   0.05
# 3 gradeIII   0.473     1.01    2.10   0.05
# 4 trtDrug B  0.600     1.14    2.22   0.05

model_trial2_tbl_regression <- 
  glm(response ~ age + grade + trt, 
      data = trial2,
      family=binomial(link="logit")) %>%
  tbl_regression(
    exponentiate = T
  ) %>%
  add_global_p() 

model_trial2_tbl_regression_metrics <- model_trial2_tbl_regression$table_body %>%
  select(
    label, 
    estimate, 
    std.error, 
    statistic, 
    conf.low , 
    conf.high, 
    p.value
  )

model_trial2_tbl_regression_metrics
# A tibble: 8 × 7
# label                  estimate std.error statistic conf.low conf.high p.value
# <chr>                     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>   <dbl>
# 1 Age                       1.02     0.0114    1.67      0.997      1.04  0.0909
# 2 Grade                    NA       NA        NA        NA         NA     0.894 
# 3 I                        NA       NA        NA        NA         NA    NA     
# 4 II                        0.852    0.395    -0.406     **0.389**      **1.85** NA     
# 5 III                       1.01     0.385     0.0199    0.472      2.15 NA     
# 6 Chemotherapy Treatment   NA       NA        NA        NA         NA     0.699 
# 7 Drug A                   NA       NA        NA        NA         NA    NA     
# 8 Drug B                    1.13     0.321     0.387     0.603      2.13 NA     

1

There are 1 best solutions below

0
On

The issue is with the exponentiation (applied as the family is binomial). Broom::tidy does not exponentiate the standard errors but parameters does. You can see this with broom::tidy(model_trial2, exponentiate = TRUE) and broom::tidy(model_trial2, exponentiate = FALSE), which return the same standard errors. parameters::model_parameters(exponentiate = TRUE) and parameters::model_parameters(exponentiate = FALSE) return different standard errors. When exponentiate is FALSE for parameters, the standard errors match. This is discussed in Check exponentiate behavior in tidy methods #422

To create a custom tidier for gtsummary, see FAQ + Gallery