I have a dataset from an RCT with two groups and continuous outcomes assessed at three timepoints (baseline, follow-up 1, and follow-up 2). The data were in long format. After reading guides on how to set up analyses for RCT data involving 3+ timepoints (e.g., https://link.springer.com/book/10.1007/978-3-030-81865-4), I created a data frame with the baseline assessment rows removed and added each participant's baseline score as a new variable (the same value on both follow-up rows for a given participant) so that it could be used as a covariate in a linear mixed model.
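The restructuring described above can be sketched in base R roughly as follows. This is only an illustration: the toy values and the helper names baseline_scores and followups are made up, and only data, data_no_base, and the column names match the models below.

```r
# Toy long-format data; all values here are invented for illustration.
data <- data.frame(
  participantid = rep(1:4, each = 3),
  timepoint     = rep(0:2, times = 4),
  outcome       = c(4.1, 5.0, 4.7, 5.3, 6.1, 5.8, 3.9, 4.4, 4.2, 6.0, 6.5, 6.2)
)

# Pull out each participant's baseline (timepoint 0) score as its own column.
baseline_scores <- data[data$timepoint == 0, c("participantid", "outcome")]
names(baseline_scores)[2] <- "baseline"

# Keep only the follow-up rows and attach the baseline score to each of them.
followups    <- data[data$timepoint != 0, ]
data_no_base <- merge(followups, baseline_scores, by = "participantid")
data_no_base <- data_no_base[order(data_no_base$participantid,
                                   data_no_base$timepoint), ]
```

Each participant then has two rows (follow-up 1 and 2) carrying the same baseline value.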
There are several other covariates in the dataset, but a simplified version of the model is:
mod <- lmer(outcome ~ timepoint + treatmentgroup + baseline + covariate + (1|site/participantid), data = data_no_base)
The general question was whether the intervention works. Since the model only involves follow-up 1 and 2 in the outcome data and has baseline scores as a covariate, I would have thought that the treatmentgroup main effect would address this.
However, I've also been asked to contrast scores between the groups at all the different timepoints, including baseline. Since baseline scores aren't in the model as an outcome, I can't use emmeans to contrast the marginal means at baseline. So would it be appropriate to run a new model on the full original data to do this? e.g.,
mod2 <- lmer(outcome ~ timepoint + treatmentgroup + covariate + (1|site/participantid), data = data)
When I do this, the between-group contrasts come out identical at every timepoint (e.g., see below: Control 0 vs Treatment 0 and Control 1 vs Treatment 1), which doesn't seem right:
emmeans(mod2, specs = pairwise ~ treatmentgroup:timepoint, adjust = "none")
$emmeans
 treatmentgroup timepoint emmean    SE    df lower.CL upper.CL
 Control        0           4.58 0.368  6.97     3.71     5.45
 Treatment      0           5.62 0.359 18.36     4.86     6.37
 Control        1           5.19 0.372  7.72     4.33     6.05
 Treatment      1           6.23 0.357 18.81     5.48     6.98
 Control        2           4.80 0.374  7.89     3.94     5.67
 Treatment      2           5.84 0.359 19.02     5.09     6.59
Results are averaged over the levels of: covariate
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
$contrasts
 contrast                                    estimate    SE     df t.ratio p.value
 Control timepoint0 - Treatment timepoint0     -1.039 0.343   2.39  -3.026  0.0754
 Control timepoint0 - Control timepoint1       -0.612 0.254 191.26  -2.409  0.0170
 Control timepoint0 - Treatment timepoint1     -1.651 0.422   5.50  -3.911  0.0093
 Control timepoint0 - Control timepoint2       -0.224 0.255 191.58  -0.877  0.3818
 Control timepoint0 - Treatment timepoint2     -1.263 0.422   5.49  -2.992  0.0271
 Treatment timepoint0 - Control timepoint1      0.427 0.432   6.27   0.987  0.3601
 Treatment timepoint0 - Treatment timepoint1   -0.612 0.254 191.26  -2.409  0.0170
 Treatment timepoint0 - Control timepoint2      0.815 0.434   6.36   1.881  0.1063
 Treatment timepoint0 - Treatment timepoint2   -0.224 0.255 191.58  -0.877  0.3818
 Control timepoint1 - Treatment timepoint1     -1.039 0.343   2.39  -3.026  0.0754
 Control timepoint1 - Control timepoint2        0.389 0.258 186.69   1.509  0.1329
 Control timepoint1 - Treatment timepoint2     -0.650 0.429   5.97  -1.518  0.1801
 Treatment timepoint1 - Control timepoint2      1.428 0.430   6.07   3.320  0.0157
 Treatment timepoint1 - Treatment timepoint2    0.389 0.258 186.69   1.509  0.1329
 Control timepoint2 - Treatment timepoint2     -1.039 0.343   2.39  -3.026  0.0754
Results are averaged over the levels of: covariate
Degrees-of-freedom method: kenward-roger
I noticed that when I add a timepoint*treatmentgroup interaction to the model, all the emmeans contrasts become unique, but I'm not sure whether this is appropriate.
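For illustration, a minimal sketch of the interaction version on simulated data. The simulated values, the name mod_int, the numeric covariate, and the choice to request between-group contrasts within each timepoint via `| timepoint` are all assumptions for the example, not the real dataset or a recommended analysis.

```r
library(lme4)
library(emmeans)

set.seed(1)

# Simulated stand-in data: 4 sites, 40 participants, 3 timepoints (all made up).
n_id <- 40
ids <- data.frame(
  participantid  = factor(seq_len(n_id)),
  site           = factor(rep(1:4, each = 10)),
  treatmentgroup = factor(rep(c("Control", "Treatment"), times = n_id / 2)),
  covariate      = rnorm(n_id),
  u_id           = rnorm(n_id, sd = 1)             # participant random intercept
)
ids$u_site <- rnorm(4, sd = 0.5)[as.integer(ids$site)]  # site random intercept

# Cross join: one row per participant per timepoint.
data <- merge(ids, data.frame(timepoint = factor(0:2)), by = NULL)

# Treatment effect of 1 at follow-ups only, none at baseline (an assumption).
data$outcome <- 5 + 0.5 * data$covariate + data$u_id + data$u_site +
  ifelse(data$treatmentgroup == "Treatment" & data$timepoint != "0", 1, 0) +
  rnorm(nrow(data), sd = 0.5)

# Same structure as mod2, but with the timepoint-by-group interaction.
mod_int <- lmer(outcome ~ timepoint * treatmentgroup + covariate +
                  (1 | site/participantid), data = data)

# Control vs Treatment contrast computed separately at each timepoint.
emm <- emmeans(mod_int, pairwise ~ treatmentgroup | timepoint, adjust = "none")
emm$contrasts
```

With the interaction term in the model, the Control vs Treatment difference is estimated separately at each timepoint rather than being constrained to a single common value.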
In general, I would appreciate any guidance to set me straight with these analyses. I've been cobbling suggestions together from various sources, including guides about how to do it in Stata, but I've yet to find a guide for this specific kind of analysis in R.