I am building an R package that I want to be cross-platform. I am developing under Linux, and the package will use the function mclapply from the parallel package. This package is not supported on Windows (which uses doParallel). I really like the parallel package for its simplicity and speed, though, and I do not know whether this is a reason to maintain two different versions of the package on CRAN, one per OS (which seems like extra work to maintain), or whether that is even allowed.
Thoughts?
Also, for now I am regarding parallel's
mclapply(ldata, function(x), mc.cores=cores)
to be the equivalent of doParallel's
cl <- makeCluster(cores)
parLapply(cl, ldata, function(x))
Is that correct?
First, both mclapply and parLapply are in the parallel package, although mclapply doesn't actually run in parallel on Windows. parLapply runs in parallel on all supported platforms, but isn't always as efficient as mclapply. The doParallel package is used with the foreach package, and acts as an adapter to the parallel package.
To write a package that works on both Windows and non-Windows platforms, you have a variety of reasonable options:
- parLapply, since it works everywhere
- parLapply on Windows and mclapply elsewhere
- doParallel with foreach
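The second option can be sketched roughly as follows. This is only a minimal illustration, not code from any package: the helper name portable_lapply and its cores argument are made up for the example, and it assumes that checking .Platform$OS.type is an acceptable way to detect Windows.

```r
library(parallel)

# Hypothetical helper: apply `fun` to each element of `x`, forking with
# mclapply() where that works and falling back to a PSOCK cluster with
# parLapply() on Windows.
portable_lapply <- function(x, fun, ..., cores = 2L) {
  if (.Platform$OS.type == "windows") {
    cl <- makeCluster(cores)
    on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
    parLapply(cl, x, fun, ...)
  } else {
    mclapply(x, fun, ..., mc.cores = cores)
  }
}

# Usage: the same call works on either kind of platform.
squares <- portable_lapply(1:4, function(i) i^2)
```

Note that on the parLapply branch the workers are fresh R processes, so any packages or global objects that fun relies on would have to be loaded or exported on the cluster first; the forked mclapply workers inherit them automatically.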
The doParallel package is convenient because it makes use of mclapply on non-Windows platforms. For example, registering it with a number of cores rather than a cluster object uses mclapply on Linux and Mac OS X, but will automatically create a PSOCK cluster object behind the scenes on Windows. The use of preschedule=TRUE (added in doParallel 1.0.3) will cause doParallel to preschedule the tasks using clusterApply internally, much like parLapply.
Note that if you explicitly create and register a cluster object, then mclapply will not be used, regardless of the platform. It will work fine, but may not be as efficient. To use mclapply, you must call registerDoParallel with a numeric argument, or no argument at all.
You can look at the source code for the boot package for an example of how to use either mclapply or parLapply depending on your platform.
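As a rough sketch of the doParallel route described above: the data in ldata and the use of sqrt are placeholders, and passing preschedule through .options.snow reflects my understanding of how the cluster-based backend accepts that option, so check it against the doParallel documentation for your version.

```r
library(foreach)
library(doParallel)

# Passing a number (rather than a cluster object) lets doParallel choose the
# mechanism: mclapply on Linux/Mac OS X, an implicit PSOCK cluster on Windows.
registerDoParallel(cores = 2)

ldata <- list(1, 4, 9)  # placeholder input

# preschedule = TRUE asks the cluster-based backend to split the tasks up
# front; it is assumed to have no effect when mclapply is in use.
roots <- foreach(x = ldata, .options.snow = list(preschedule = TRUE)) %dopar% {
  sqrt(x)
}

# Shut down the implicit cluster if one was created (harmless otherwise).
stopImplicitCluster()
```

With no .combine argument, foreach returns a list, so roots here has the same shape as the result of mclapply or parLapply on the same input.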