Fix variance components in a mixed model in R (glmmTMB or sommer)

71 Views Asked by At

I have a dataset with which I would like to predict response log(Y) based on treatment variables A, B and C, and grouping variable X, where each level is a different trial. For each observation of Y, I have an associated variance V (it’s a meta-analysis).

The formula for the conditional/fixed effect part of the model would be log(Y) ~ A + A:B + A:C

In case it’s relevant, treatment variables A and B are categorical and C is continuous numeric.

In the model, I need to include:

  • The V for each observation of Y, for which the variance component is fixed/kept constant
  • A random effect for the ‘trial by treatment interaction’ (X by A, B, C), for which the variance component is estimated

I understand that fixing variance components can be done in glmmTMB with the start and map arguments, and in also in sommer … but I don’t know how to specify this model in either package. No other questions on StackOverflow that I can find (e.g. here or here) seem to ask the question in quite the right way for me to make sense of it well enough to specify this model. Any guidance on how to specify this model would be much appreciated.

Update: now with a reproducible example incorporating Ben Bolker's suggested solution

#some made-up example data
data <- structure(list(A = c("H", "H", "H", "H", "H", "H", "H", "H", "H", "H", "J", "J", "J", "J", "J", "J", "J", "J", "J", "J"),
B = c("i", "i", "k", "k", "k", "i", "i", "i", "k", "i", "k", "i", "i", "k", "k", "i", "i", "k", "k", "k"),
C = c(5L, 5L,7L, 10L, 2L, 6L, 5L, 4L, 2L, 3L, 10L, 4L, 3L, 9L, 8L, 5L,1L, 10L, 3L, 9L),
X = c("x1", "x2", "x3", "x4", "x5", "x1","x2", "x3", "x6", "x7", "x3", "x3", "x1", "x8", "x9", "x9","x10", "x10", "x11", "x12"),
Y = c(7.7, 2.9, 24, 3.6, 7.8,7.1, 73, 5.7, 18, 5.2, 43, 4.7, 3.4, 12, 5.8, 88, 6.9, 9.4, 1.1, 31),
SD = c(2.566666667, 0.966666667, 8, 1.2, 1.95, 2.366666667, 18.25, 2.85, 9, 1.3, 10.75, 1.566666667, 1.133333333,4, 1.45, 44, 1.725, 2.35, 0.275, 15.5)), 
row.names = c(NA,-20L), class ="data.frame")

#make the observation ID
data$obs <- factor(seq(nrow(data)))

#run suggested solution (random effect tweaked to include C)
model <- glmmTMB(log(Y) ~ A + A:B + A:C + (A + A:B + A:C|X),
        family = gaussian,
        dispformula = ~ 0 + obs,
        start = list(betad = log(data$SD)),
        map = list(betad = factor(rep(NA, length(data$SD)))),
        data=data
        )
#the model does not converge - I expect because example dataset is too small for this model structure
1

There are 1 best solutions below

1
On BEST ANSWER

I believe you would want something like this:

Set up an observation-level factor (we'll need this to set the variance separately for each observation):

data$obs <- factor(seq(nrow(data)))

Fit the model with dispformula = ~0 + obs, set the starting values equal to known log(SD) values, and use map to fix them to those values:

glmmTMB(log(Y) ~ A + A:B + A:C + (A+A:B|X),
        family = gaussian,
        dispformula = ~ 0 + obs,
        start = list(betad = log(sdvec)),
        map = list(betad = factor(rep(NA, length(sdvec)))
)

(I'm not sure I got the random-effect term right, I found your statement about the desired RE structure unclear ...)

This may not be quite right, as there was no reproducible example ...