How to reliably get a complete sessionInfo in mclapply/pbmclapply?

I'm using pbmcapply::pbmclapply (in Rscript, i.e. non-interactively, if that matters) to show a progress bar during multithreading. I tried to store sessionInfo() as an attribute and noticed that "other attached packages" and "loaded via a namespace (and not attached)" are not shown correctly. What is the reason for this? Can it be fixed, and if so, how?

Example

Since this does not seem to depend on whether multiple cores are used, the examples use mc.cores=1L (which also makes them runnable under Windows).

f1 <- \(x) {
  pbmcapply::pbmclapply(x, \(x) {
    library(Matrix)
    var(MASS::mvrnorm(n=1000, rep(0, 2), matrix(c(10,3,3,2),2,2)))
  }, mc.cores=1L)
  `attr<-`(x, 's_info', sessionInfo())
}

f2 <- \(x) {
  parallel::mclapply(x, \(x) {
    library(Matrix)
    var(MASS::mvrnorm(n=1000, rep(0, 2), matrix(c(10,3,3,2),2,2)))
  }, mc.cores=1L)
  `attr<-`(x, 's_info', sessionInfo())
}

pbmcapply::pbmclapply skips the Matrix and MASS packages,

> attributes(f1(1))$s_info
[...]
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.3.2  parallel_4.3.2  tools_4.3.2     pbmcapply_1.5.1

whereas parallel::mclapply shows their use correctly.

> attributes(f2(1))$s_info
[...]
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Matrix_1.6-4

loaded via a namespace (and not attached):
[1] MASS_7.3-60     compiler_4.3.2  parallel_4.3.2  tools_4.3.2     grid_4.3.2     
[6] pbmcapply_1.5.1 lattice_0.22-5 

Info: Using pbmcapply_1.5.1 on R version 4.3.2 (2023-10-31)/Ubuntu 22.04.3+6 LTS and R version 4.3.1 (2023-06-16)/AlmaLinux 9.2 (Turquoise Kodkod).

Update

I just noticed that this can apparently also happen with parallel::mclapply. How can I get a reliable sessionInfo stored in my object when multithreading?
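One direction that might work, shown here only as a minimal sketch: with mc.cores > 1, parallel::mclapply forks child processes, and packages loaded inside a child are not visible in the parent session afterwards. sessionInfo() could therefore be called inside the worker function and returned together with each result. The helper name with_session_info and the choice to keep only the first worker's record are assumptions made for illustration; the sketch also assumes that x is non-empty and that no worker fails.

```r
# Hypothetical helper (not part of parallel or pbmcapply): run `fun` over `x`
# and record sessionInfo() in the process that actually evaluated `fun`.
with_session_info <- function(x, fun, mc.cores = 1L) {
  res <- parallel::mclapply(x, function(xi) {
    value <- fun(xi)
    # Evaluated after `fun`, so packages it loaded are already registered.
    list(value = value, s_info = utils::sessionInfo())
  }, mc.cores = mc.cores)

  # Separate the actual results from the per-worker session records.
  values <- lapply(res, `[[`, "value")

  # Assumption: the first worker's record is representative of all workers.
  attr(values, "s_info") <- res[[1L]]$s_info
  values
}

out <- with_session_info(1:2, function(xi) {
  var(MASS::mvrnorm(n = 1000, rep(0, 2), matrix(c(10, 3, 3, 2), 2, 2)))
})

# Namespaces that were loaded (but not attached) where `fun` ran.
print(names(attr(out, "s_info")$loadedOnly))
```

If different workers may load different packages, the per-worker records in res could be merged instead of keeping only the first one.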
