I have collected count data from a number of subjects over time, before and after a treatment was applied. There are two treatment/group levels (placebo/control) and 3 time points (pre/post1/post2).I have built a generalised linear mixed effects model using GLMMadaptive in R to estimate and compare treatment effects over time.
I would like some advice regarding my interpretation of marginal_coefs() vs. predict(). I have provided some background to my problem below, and then my questions at the end.
The model I have fitted is as follows:
fm1 <- mixed_model(count ~ time*group,
random = ~ 1 | id,
data = data_in,
family = poisson())
The output from marginal_coefs() looks like this:
mcoef <- marginal_coefs(fm1, std_errors = TRUE, cores = 5)
mcoef
Estimate Std.Err z-value p-value
(Intercept) 2.9492 0.0542 54.4191 < 1e-04
time2 0.4186 0.0578 7.2459 < 1e-04
time3 0.4536 0.0538 8.4259 < 1e-04
group 0.0243 0.0728 0.3343 0.73814
time2:group -0.0233 0.0788 -0.2961 0.76715
time3:group -0.0167 0.0734 -0.2279 0.81976
Exponentiating these back to the response scale:
mcoef$betas %>% exp()
(Intercept) time2 time3 group time2:group time3:group
19.0902344 1.5197587 1.5739196 1.0246412 0.9769296 0.9834229
It is my understanding that in order to obtain estimates of the counts at each time point for each group (i.e., response scale), I can multiply these numbers together:
group0_pre = (Intercept)
group0_post1 = (Intercept) * time2
group0_post2 = (Intercept) * time3
group1_pre = (Intercept) * group
group1_post1 = (Intercept) * group * time2 * time2:group
group1_post2 = (Intercept) * group * time2 * time3:group
The exponentiated marginal_coefs() estimates also provide the interpretation for the group, time, and interaction effects in terms of ratios of counts across different time points. It is my understanding that these are the average effects at the population level, so they should be appropriate to use for inference and hypothesis testing (i.e., differences in treatment effects over time). I also believe the model-scale confidence intervals can be exponentiated back to the response scale, to provide confidence intervals for the ratios.
I would also like to provide readers with a plot of the summary statistics, showing the means at each time point for each group +/- SE. The only way I have found to obtain SE estimates for the counts on the response scale is by using the predict() function, as per below:
nDF <- expand.grid(time=as.factor(1:3),
group=1:2)
pred <-
predict(fm1,
newdata = nDF,
type_pred = "response",
type = "marginal",
se.fit = TRUE)
pred
$pred
1 2 3 4 5 6
19.56064 29.04163 30.27652 20.04264 29.07074 30.50830
$se.fit
1 2 3 4 5 6
0.05056784 0.04782917 0.04156674 0.11305274 0.11100000 0.09725224
However, I am wondering why these estimates are different to those obtained from marginal_coefs()? Is this is due to the simulation component, or have I misunderstood how to use these 2 different functions?
And is my use of the response estimates from predict() and their SE's appropriate to report as summary statistics for each group at each time point?