How can I optimize this for loop to run faster using lapply and parallelization?


I am working on a mediation analysis using survival analysis techniques, which requires bootstrapping in a for loop. The model takes a long time to run, and I would like to optimize it if possible. I have heard that rewriting the loop with lapply, or parallelizing the code, may help.

This is my code below. You can ignore system.time.

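library(survival)  # Surv()
library(coxme)     # coxme() and fixef()

ntrials <- 10000  # number of bootstrap replicates
npat <- 4413      # patients drawn per replicate
indirect <- NULL  # one row of estimates is appended per replicate

system.time(
        for (i in seq_len(ntrials)) {
                # bootstrap sample: draw npat rows with replacement
                sampledata <- mgfipdma1[sample(nrow(mgfipdma1), npat, replace = TRUE), ]
                # both models are fitted to the bootstrap sample, not the full data
                direct <- fixef(coxme(Surv(deathcont/365.25, dth) ~ sex + gf + agec10 + ht + smk + diab + offpump + gsum + agsum + statin + LAD + seq + (1 | trial), data = sampledata))
                total <- fixef(coxme(Surv(deathcont/365.25, dth) ~ sex + agec10 + ht + smk + diab + offpump + gsum + agsum + statin + LAD + seq + (1 | trial), data = sampledata))
                # direct has one extra coefficient (gf), so align by name before subtracting
                indirect2 <- exp(total - direct[names(total)])
                indirect <- rbind(indirect, indirect2)
        }
)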

quantile(indirect, c(0.025, 0.50, 0.975))

Thank you for any help or suggestions you can provide.

So far, I have tried various mediation packages like mediate, lavaan, but none have been able to work with a mixed effects cox proportional hazards model with covariates like the one I am using (as far as I have been able to tell). As a result, I am using this for loop but it takes a significant amount of time to run (weeks for 10k iterations). I'm just wondering if there is any way I can optimize my code to speed up this process without needing to use the cloud.

Ivalbert Pereira On

To parallelize the task, you can use mclapply from the parallel package.

Your code might look like the following:

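library(parallel)
library(survival)  # Surv()
library(coxme)     # coxme() and fixef()

# These must exist before the call, because mclapply needs ntrials up front
ntrials <- 10000  # number of bootstrap replicates
npat <- 4413      # patients drawn per replicate
ncores <- detectCores()

indirect <- mclapply(seq_len(ntrials),
             function(i) {
               # bootstrap sample: draw npat rows with replacement
               sampledata <- mgfipdma1[sample(nrow(mgfipdma1), npat, replace = TRUE), ]
               # both models are fitted to the bootstrap sample, not the full data
               direct <- fixef(coxme(Surv(deathcont/365.25, dth) ~ sex + gf + agec10 + ht + smk + diab + offpump + gsum + agsum + statin + LAD + seq + (1 | trial), data = sampledata))
               total <- fixef(coxme(Surv(deathcont/365.25, dth) ~ sex + agec10 + ht + smk + diab + offpump + gsum + agsum + statin + LAD + seq + (1 | trial), data = sampledata))
               # direct has one extra coefficient (gf), so align by name before subtracting
               exp(total - direct[names(total)])
             }, mc.cores = ncores)

# Stack the per-replicate vectors into a matrix, one row per replicate
indirect <- do.call(rbind, indirect)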
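
One caveat: mclapply relies on forking, so on Windows it only works with mc.cores = 1. A socket cluster with parLapply is one alternative that runs on all platforms. The sketch below shows the pattern on a toy data frame with a stand-in lm fit so that it runs anywhere. The names toy_data, boot_once and indirect_toy and the lm formulas are hypothetical placeholders; in the real analysis the body of boot_once would hold the two coxme fits on the resampled data. clusterSetRNGStream gives each worker its own reproducible random-number stream.

```r
library(parallel)

# Toy stand-in for mgfipdma1 (hypothetical data, for illustration only)
set.seed(1)
toy_data <- data.frame(x = rnorm(200), m = rnorm(200))
toy_data$y <- 0.5 * toy_data$x + 0.3 * toy_data$m + rnorm(200)

# One bootstrap replicate: resample rows, fit both models on the resample,
# and return the difference in the coefficient of x.
# In the real analysis the two lm() calls would be the two coxme() fits.
boot_once <- function(i, data) {
  d <- data[sample(nrow(data), nrow(data), replace = TRUE), ]
  direct <- coef(lm(y ~ x + m, data = d))
  total  <- coef(lm(y ~ x, data = d))
  unname(total["x"] - direct["x"])
}

ntrials <- 200                        # kept small here; assumption for the demo
cl <- makeCluster(2)                  # socket cluster, also works on Windows
clusterSetRNGStream(cl, iseed = 123)  # reproducible per-worker RNG streams
res <- parLapply(cl, seq_len(ntrials), boot_once, data = toy_data)
stopCluster(cl)

# Collapse the list of per-replicate results into a numeric vector
indirect_toy <- unlist(res)
quantile(indirect_toy, c(0.025, 0.50, 0.975))
```

With the real models, each worker would also need the packages loaded before the parLapply call, for example with clusterEvalQ(cl, library(coxme)). Running a few hundred replicates first gives a rough estimate of how long the full 10,000 would take.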