I need to do a model estimation using the MCMC method on a SLURM cluster (the system is CentOS). The estimation takes a very long time to finish.
Within each MCMC iteration, there is one step that takes particularly long. This step is an lapply loop (around 100,000 iterations, about 30s to finish all of them in serial), so as far as I understand, I should be able to use parallel computing to speed it up.
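For reference, the serial version of this step looks roughly like the following (a minimal, self-contained sketch: `some_operation` and the toy data here are stand-ins for the real per-element update and the real list):

```r
# Hypothetical stand-in for the real per-element update
some_operation <- function(d) {
  d$beta <- d$beta + 1
  d
}

# Toy data; in the real problem this is a list of ~100,000 elements
data <- replicate(5, list(beta = 0, sigma = 1), simplify = FALSE)

# The slow serial step: one lapply pass over the whole list
data2 <- lapply(data, function(data_i) {
  data_i <- some_operation(data_i)
  list(beta = data_i$beta, sigma = data_i$sigma)
})
```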
I tried several packages (doMC, doParallel, doSNOW) together with the foreach framework. The setup is
parallel_cores <- 8

# doParallel (only one backend is registered at a time)
library(doParallel)
cl <- makeCluster(parallel_cores)
registerDoParallel(cl)

# doMC
library(doMC)
registerDoMC(parallel_cores)

# doSNOW
library(doSNOW)
cl <- makeCluster(parallel_cores)
registerDoSNOW(cl)

# foreach framework
# data is a list
data2 <- foreach(
  data_i = data,
  .packages = c("somePackage")
) %dopar% {
  data_i <- some_operation(data_i)
  list(beta = data_i$beta, sigma = data_i$sigma)
}
Using doMC, the time for this step can be reduced to about 9s. However, since doMC uses shared memory and I have a large array to store estimation results, I quickly ran out of memory (i.e. slurmstepd: error: Exceeded job memory limit).
Using doParallel and doSNOW, the time for this step actually increased, to about 120s, which sounds ridiculous. The mysterious thing is that when I tested the code on both my Mac and Windows machines, doParallel and doSNOW gave similar speed to doMC.
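In case it matters, one workaround I have been considering (not yet verified on the cluster) is chunking: the clusters made by doParallel/doSNOW's makeCluster communicate over sockets and serialize every task, so dispatching ~100,000 tiny tasks might dominate the runtime. A sketch that sends one chunk per worker instead, using base parallel::parLapply (`some_operation` and the toy data are again stand-ins):

```r
library(parallel)

# Hypothetical stand-in for the real per-element update
some_operation <- function(d) {
  d$beta <- d$beta * 2
  d
}

# Toy data; the real list has ~100,000 elements
data <- replicate(8, list(beta = 1, sigma = 1), simplify = FALSE)

parallel_cores <- 2

# Split the list into one chunk per worker, so each task ships one
# large object instead of many tiny ones
chunks <- split(data, cut(seq_along(data), parallel_cores, labels = FALSE))

cl <- makeCluster(parallel_cores)
clusterExport(cl, "some_operation")
res <- parLapply(cl, chunks, function(chunk) {
  lapply(chunk, function(data_i) {
    data_i <- some_operation(data_i)
    list(beta = data_i$beta, sigma = data_i$sigma)
  })
})
stopCluster(cl)

# Flatten the per-worker results back into one list
data2 <- unlist(res, recursive = FALSE)
```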
I'm stuck and not sure how to proceed. Any suggestions will be greatly appreciated!