The bias of the PQL estimation for GLMM

177 Views Asked by At

Hi I recently run into the problem of GLMM estimation, specifically, the bias of the variance components (and the fixed-effects) estimation using the PQL method. Two key references are Brewlow&Clayton 1993 and Lin&Breslow 1995, which both claims that PQL under-estimates the variance of the random-effect, see Section 4.3 and Table 1 of the first paper, and Eq (3) and Fig 4 of the second paper. Both papers are highly cited so this should be well accepted.

However when I tried to replicate the simulation, the results are totally different. I coded in R and used MASS::glmmPQL(). Below are some quick code.

## Brewlow&Clayton 1993, Table 1, D=D1, m=1
require(MASS)
require(nlme)
require(lme4)

set.seed(123)
sim.breslow <- matrix(NA, 200, 4+1)
expit <- function(a) 1/(1+exp(-a))    
K = 100         # 100 clusters
nn = rep(7, K)  # size of each cluster
N = sum(nn)
pid <- rep(1:K, nn)  # cluster id
xx = rep(c(1,0), each=K*7/2)
tt = rep(seq(-3,3), K)
X = cbind(tt, xx, tt*xx)
for(ii in 1:200){     # 200 reps
  g.t <- rnorm(K, 0, 1)   # random intercept
  lp = -2.5 + X %*% c(1,-1, -0.5) + rep(g.t, each=7)  # linear predictor
  y = rbinom(N, 1, expit(lp))
  dd = data.frame(y,X,pid=pid)
  names(dd)[2:4] <- c('tt', 'xx', 'xt')
  fit.pql = tryCatch(glmmPQL(fixed = y ~ tt+xx+xt, random = ~1|pid, family = 'binomial', data = dd, verbose=F),
                     error = function(e) NA)
  if(length(fit.pql)!=1){
    b.pql <- fit.pql$coef$fixed
    D.pql <- as.numeric(VarCorr(fit.pql)[1,1])
  }else{
    b.pql <- rep(NA, 4)
    D.pql <- rep(NA, 1)
  }
  sim.breslow[ii,] <- c(b.pql, D.pql)
cat('.')
}

res = apply(sim.breslow, 2, mean) 
res = cbind(c(-2.5, 1,-1, -0.5, 1), res)
row.names(res) <- c('a0', 'a1', 'a2', 'a3', 's00')
colnames(res) <- c('true', 'PQL')
round(res, 2)
#     true   PQL
# a0  -2.5 -2.61
# a1   1.0  1.04
# a2  -1.0 -0.93
# a3  -0.5 -0.52
# s00  1.0  1.72

Notice that the var component s00 is over-estimated (1.72 compared to the truth 1.0), this is different from the paper claims (under-estimated as 0.68 while the truth is 1.0). I also tested other settings in the two papers and the same conclusion, I can post more simulation code if necessary. I checked the source code of glmmPQL and seems no problem.

So my questions are

  1. Why glmmPQL result differs from the conclusion the references claimed? Did I miss anything?
  2. Are there any other functions I can use to cross-validate? MASS::glmmPQL is the only one I can find to do PQL.
  3. BTW, anyone aware of any packages that do the bias correction of Lin&Breslow 1995?

Thanks!

0

There are 0 best solutions below