I am working on a mediation analysis using survival analysis techniques, which requires bootstrapping in a for loop. My challenge is that the model takes a long time to run, and I would like to optimize it if possible. I have heard that converting the loop to lapply, or parallelizing the code, may help.
My code is below. You can ignore system.time.
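library(survival)  # Surv()
library(coxme)     # coxme(), fixef()
ntrials <- 10000   # number of bootstrap replicates (set before the loop uses it)
npat <- 4413       # patients drawn per replicate
indirect <- NULL   # accumulates one row of estimates per replicate
system.time(
for(i in 1:ntrials) {
# resample patients with replacement; both models must be fit to this sample
sampledata <- mgfipdma1[sample(nrow(mgfipdma1), npat, replace=TRUE),]
# model including gf
direct <- fixef(coxme(Surv(deathcont/365.25,dth)~ sex + gf + agec10 + ht + smk + diab + offpump + gsum + agsum + statin + LAD + seq + (1|trial), data= sampledata))
# same model without gf
total <- coxs <- fixef(coxme(Surv(deathcont/365.25,dth)~ sex + agec10 + ht + smk + diab + offpump + gsum + agsum + statin + LAD + seq + (1|trial), data= sampledata))
# subtract by name, since direct has one more coefficient (gf) than total
indirect2 <- exp(total - direct[names(total)])
indirect <- rbind(indirect, indirect2)
}
)
quantile(indirect, c(0.025, 0.50, 0.975))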
Thank you for any help or suggestions you can provide.
So far, I have tried various mediation packages such as mediate and lavaan, but as far as I can tell none of them work with a mixed-effects Cox proportional hazards model with covariates like the one I am using. As a result, I am using this for loop, but it takes a very long time to run (weeks for 10k iterations). Is there any way to optimize my code and speed this up without resorting to the cloud?
To parallelize the task, you can use mclapply from the parallel package.
Your code might look like the following:
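A minimal sketch of that approach, assuming the data frame mgfipdma1 and the model formulas from the question. The helper names boot_once and run_bootstrap, the seed value, and the default core count are hypothetical choices made for illustration, not part of the original code. Each replicate is wrapped in a function so that mclapply can hand replicates out to worker processes, and failed fits are dropped before the results are combined.

```r
library(parallel)  # mclapply(), detectCores()
library(survival)  # Surv()
library(coxme)     # coxme(), fixef()

# One bootstrap replicate (hypothetical helper): resample rows, fit both
# models to the resample, and return exp(total - direct) for shared terms.
boot_once <- function(i, data, npat = nrow(data)) {
  sampledata <- data[sample(nrow(data), npat, replace = TRUE), ]
  direct <- fixef(coxme(Surv(deathcont/365.25, dth) ~ sex + gf + agec10 + ht +
                          smk + diab + offpump + gsum + agsum + statin + LAD +
                          seq + (1 | trial), data = sampledata))
  total <- fixef(coxme(Surv(deathcont/365.25, dth) ~ sex + agec10 + ht +
                         smk + diab + offpump + gsum + agsum + statin + LAD +
                         seq + (1 | trial), data = sampledata))
  exp(total - direct[names(total)])
}

# Run all replicates in parallel (hypothetical helper). The default leaves
# one core free; on Windows pass cores = 1, since mclapply relies on forking.
run_bootstrap <- function(data, ntrials = 10000,
                          cores = max(1L, detectCores() - 1L, na.rm = TRUE)) {
  res <- mclapply(seq_len(ntrials), boot_once, data = data, mc.cores = cores)
  # mclapply returns "try-error" objects for replicates that failed
  ok <- !vapply(res, inherits, logical(1), what = "try-error")
  do.call(rbind, res[ok])
}

# Only runs when the question's data frame is available in the session.
if (exists("mgfipdma1")) {
  RNGkind("L'Ecuyer-CMRG")  # RNG with independent streams per worker
  set.seed(2024)            # arbitrary seed, for reproducibility only
  indirect <- run_bootstrap(mgfipdma1)
  print(quantile(indirect, c(0.025, 0.50, 0.975)))
}
```

The speedup is bounded by the number of cores, so it is worth timing a small run first, for example run_bootstrap(mgfipdma1, ntrials = 20), to estimate the total run time. Collecting results in a list and calling rbind once also avoids growing indirect inside the loop. On Windows, where forking is not available, parLapply with a cluster created by makeCluster is the usual alternative.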