I’ve successfully parallelised a function (let’s call it AddOne) via the doParallel package, foreach and %dopar%, and I’m familiar with the .packages and .export arguments to foreach.
My problem is that I would like AddOne, instead of being a “stand-alone” function, to be an element of a list, and in this case I can’t get things working. Specifically, if AddOne calls a subroutine AddOneSubroutine, then AddOneSubroutine is not found in the “worker” environments even though it is “exported”.
I’m using Windows 10 and R.version yields:
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 4.1
year 2017
month 06
day 30
svn rev 72865
language R
version.string R version 3.4.1 (2017-06-30)
nickname Single Candle
The doParallel version I have is 1.0.10. Here’s some code that demonstrates the problem as succinctly as I could.
library(doParallel)
if (!exists("Registered")) {
  registerDoParallel(cores = detectCores(logical = TRUE))
  Registered = TRUE
}
AddOne<-function(x){AddOneSubroutine(x)}
AddOneSubroutine <-function(x){x+1}
MyList<-list()
MyList$f<-AddOne
# Not using parallel environments, works correctly when calling AddOne 3 times
Result1 = foreach(i = 1:3) %do% AddOne(i)
Result1
# Not using parallel environments, works correctly when calling MyList$f 3 times
Result2 = foreach(i = 1:3) %do% MyList$f(i)
Result2
# Using parallel environments, works correctly when calling AddOne 3 times,
# despite not explicitly using the .export argument to export AddOneSubroutine
Result3 = foreach(i = 1:3) %dopar% AddOne(i)
Result3
# Using parallel environments, fails when calling MyList$f with error
# "could not find function "AddOneSubroutine"", even though that function is "exported"
Result4 = foreach(i = 1:3,.export = "AddOneSubroutine") %dopar% MyList$f(i)
Result4
What am I failing to understand?
For full reproducibility everywhere, let us make sure we use workers in background R sessions:
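A minimal sketch of such a setup, assuming two workers (an arbitrary number chosen for illustration; `cl` is a name introduced here). Creating the cluster explicitly gives background R sessions on every platform, whereas `registerDoParallel(cores = ...)` may use forked processes on Unix-like systems:

```r
library(doParallel)

# Start two background R sessions (the worker count is an assumption
# made for this sketch; pick whatever suits your machine)
cl <- parallel::makeCluster(2L)

# Tell foreach to send %dopar% work to this cluster
registerDoParallel(cl)

# Sanity check: how many workers did foreach register?
getDoParWorkers()
```

When you are done, shut the workers down with `parallel::stopCluster(cl)`.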
Now, I haven't dug into the doParallel backend code in much detail, so I'm not 100% sure what causes this problem. But we know that AddOneSubroutine is indeed exported, which you can see if you use foreach(..., .verbose = TRUE), or by simply checking for it directly on the workers.

However, when MyList$f() is called, the function is not found, which can be confirmed by running the call on the workers and inspecting the error.

So, why is AddOneSubroutine not in the frames searched from within MyList$f? This could be because doParallel does not get the environment for MyList$f correct. A workaround that seems to work is a hack that fixes up that environment by hand. Unfortunately, it's neither very neat nor very convenient.
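The three steps above (checking the export, confirming the failure, and working around it) might look roughly as follows. This is a sketch, not a definitive recipe: the names `cl`, `Visible` and `Errors`, the two-worker cluster, and the use of `exists()`, `tryCatch()` and `environment<-` are assumptions made for illustration, and whether step 2 reports an error at all may depend on your foreach/doParallel versions.

```r
library(doParallel)

cl <- parallel::makeCluster(2L)   # two workers: arbitrary choice
registerDoParallel(cl)

AddOne <- function(x) { AddOneSubroutine(x) }
AddOneSubroutine <- function(x) { x + 1 }
MyList <- list(f = AddOne)

# Step 1: is the exported function visible from the loop body?
Visible <- foreach(i = 1:3, .export = "AddOneSubroutine") %dopar% {
  exists("AddOneSubroutine")
}

# Step 2: call MyList$f() and keep any error message instead of failing
Errors <- foreach(i = 1:3, .export = "AddOneSubroutine") %dopar% {
  tryCatch(MyList$f(i), error = function(e) conditionMessage(e))
}

# Step 3 (the hack): point the function's environment at the one the
# loop body is evaluated in, so that it can see the exported objects
Result4 <- foreach(i = 1:3, .export = "AddOneSubroutine") %dopar% {
  environment(MyList$f) <- environment()
  MyList$f(i)
}

parallel::stopCluster(cl)
```

The environment reset has to be repeated inside every foreach body that calls a function stored in the list, which is why it is inconvenient.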
As an alternative, the doFuture backend (I'm the author) seems to work slightly better:
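A minimal sketch of what that might look like, assuming the doFuture and future packages are installed. The `multisession` plan and the two-worker count are choices made here for illustration; they stand in for "workers in background R sessions" and are not necessarily the exact setup originally used:

```r
library(doFuture)
library(future)

# Route %dopar% through the future framework
registerDoFuture()

# Background R sessions as workers (worker count is an assumption)
plan(multisession, workers = 2L)

AddOne <- function(x) { AddOneSubroutine(x) }
AddOneSubroutine <- function(x) { x + 1 }
MyList <- list(f = AddOne)

# Same call as in the question, with the subroutine exported explicitly
Result4 <- foreach(i = 1:3, .export = "AddOneSubroutine") %dopar% {
  MyList$f(i)
}

# Return to sequential processing, which shuts down the workers
plan(sequential)
```

No environment hack is needed in the loop body here; the explicit `.export` is kept as in the question.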
PS. Your particular use case interested me because ideally AddOneSubroutine should have been exported automatically when using doFuture, but it wasn't. I've found a fix for this in the underlying globals package (I'm the author), but I need to think more about it before publishing it.

My details: