Include a '2nd-level' user function in carrier::crate that can be used in furrr::future_map


I followed this article to create a crate function that calls a user function, which in turn calls another user function. I supplied both user functions to crate. This seems to work, since both user functions show up in the printout of the crate function. But when I use the crate function in future_map, the inner user function cannot be found.

library(furrr)

future::plan(multisession, workers = 3)

inner_foo <- function(x){
  x^2
}

outer_foo <- function(x){
  inner_foo(x)
}

outer_crate <- carrier::crate(
  function(x) outer_foo(x),
  outer_foo = outer_foo,
  inner_foo = inner_foo
)

# this works
outer_crate(3)

# shows that inner_foo is packaged with outer_crate
outer_crate

# fails when workers are used: inner_foo is not found
future_map(1:3, outer_crate,
           .options = furrr_options(globals = FALSE))

# works if inner_foo is manually supplied
future_map(1:3, outer_crate,
           .options = furrr_options(globals = "inner_foo"))

The problem occurs only when workers are set with plan(); without them, it works as expected.

HenrikB On BEST ANSWER

This is a known issue. It happens because inner_foo() lives in R's global environment, which is where outer_foo() looks it up. The global environment is special: its content is not carried along when objects are exported to parallel workers, so the copy of outer_foo() that runs on a worker cannot find inner_foo().
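To see the dependency for yourself, here is a minimal sketch using codetools (a recommended package that ships with R). It only illustrates the lookup and is not what furrr or carrier do internally. It assumes the two functions are defined at top level, as in the question.

```r
inner_foo <- function(x) x^2
outer_foo <- function(x) inner_foo(x)

# inner_foo is a free variable of outer_foo: it is not defined inside
# outer_foo, so R looks it up in outer_foo's enclosing environment
# each time outer_foo is called.
free_vars <- codetools::findGlobals(outer_foo)
print(free_vars)

# The enclosing environment of outer_foo. When this is run at top level
# in a regular session, that is the global environment.
print(environment(outer_foo))
```

Anything listed in free_vars that lives only in the global environment has to reach the worker some other way.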

The solution is to have inner_foo() live in the environment of the function that uses it, i.e. the environment of outer_foo().

There are two ways to achieve this. The first approach is:

outer_foo <- local({
  inner_foo <- function(x){
    x^2
  }
  
  function(x){
    inner_foo(x)
  }
})
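One way to check that this version is self-contained, without starting any workers, is a serialize()/unserialize() round trip in base R. This is only a rough stand-in for shipping the function to a worker, not the exact mechanism future uses. The name outer_copy is made up for this illustration.

```r
outer_foo <- local({
  inner_foo <- function(x) {
    x^2
  }

  function(x) {
    inner_foo(x)
  }
})

# Round trip through serialization. A local (non-global) environment is
# serialized together with the function, so inner_foo travels along.
outer_copy <- unserialize(serialize(outer_foo, connection = NULL))

# The copy has its own enclosing environment that contains inner_foo.
print(ls(environment(outer_copy)))
print(outer_copy(3))
```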

The second approach is to "fix it up" after creating the functions:

inner_foo <- function(x){
  x^2
}

outer_foo <- function(x){
  inner_foo(x)
}

environment(outer_foo) <- new.env(parent = environment(outer_foo))
environment(outer_foo)$inner_foo <- inner_foo
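If you need this fix-up for several functions, the two lines above can be wrapped in a small helper. This is just a sketch: attach_helpers() is a hypothetical name, not something provided by future, furrr, or carrier.

```r
inner_foo <- function(x) {
  x^2
}

outer_foo <- function(x) {
  inner_foo(x)
}

# Hypothetical helper: returns a copy of `fn` whose enclosing environment
# is a new environment holding the named helpers passed via `...`.
attach_helpers <- function(fn, ...) {
  helpers <- list(...)
  env <- new.env(parent = environment(fn))
  list2env(helpers, envir = env)  # copy each named helper into env
  environment(fn) <- env
  fn
}

outer_foo <- attach_helpers(outer_foo, inner_foo = inner_foo)

# inner_foo is now found directly in outer_foo's own environment.
print(exists("inner_foo", envir = environment(outer_foo), inherits = FALSE))
print(outer_foo(3))
```

Note that this only attaches one level. If inner_foo() itself called another function from the global environment, that function would need the same treatment.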

Ideally, the Futureverse would do this automagically, but it's a complex problem with its own issues, so that's still on the roadmap.