attaching packages to a "temporary" search path in R


Inside a function, I am sourcing a script:

f <- function(){
  source("~/Desktop/sourceme.R") # source someone else's script
  # do some stuff to the variables read in
}
f()
search() # library sourceme.R attaches is all the way in the back!

and unfortunately, the scripts that I am sourcing are not fully under my control. They make calls to library(somePackage), and it pollutes the search path.

This is mostly a problem if the author of sourceme.R expects the package that he/she is attaching to be at the top level/close to the global environment. If I myself have attached some package that masks some of the function names he/she is expecting to be available, then that's no good.

Is there a way I can source scripts but somehow make my own temporary search path that "resets" after the function is finished running?

Answer 1 (accepted)

I would consider sourcing the script in a separate R process using the callr package and then return the environment created by the sourced file.

Using a separate R process prevents your search path from being polluted. I'm guessing there may be some side effects (such as defining new functions or variables) that you do want in your global environment. The local argument of the source function lets you specify the environment in which the parsed script is evaluated. If you return this environment from the other R process, you can access any result you need.
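As a minimal sketch of what local does on its own (the temp file and the variable names below are made up for illustration), the objects a script creates end up in the environment you pass rather than in the global environment. Note that local only controls where assignments go: a library() call inside the script still changes the search path of the current session, which is why the separate process is needed.

```r
# Hypothetical stand-in for a third-party script, written to a temp file
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:3", "total <- sum(x)"), script)

# Evaluate the script inside its own environment
env <- new.env()
source(script, local = env)

ls(env)      # names of the objects the script created
env$total    # value computed by the script
```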

Not sure what yours looks like but say I have this file that would modify the search path:

# messWithSearchPath.R

library(dplyr)

a <- data.frame(groupID = rep(1:3, 10), value = rnorm(30))

b <- a %>% 
  group_by(groupID) %>% 
  summarize(agg = sum(value))

From my top level script, I would write a wrapper function to source it in a new environment and have callr execute this function:

RogueScript <- function(){
  
  rogueEnv <- new.env()
  
  source("messWithSearchPath.R", local = rogueEnv)
  
  rogueEnv
  
}

before <- search()

scriptResults <- callr::r(RogueScript)

scriptResults$b
#>   groupID       agg
#> 1       1 -2.871642
#> 2       2  3.368499
#> 3       3  1.159509

identical(before, search())
#> [1] TRUE

If the scripts have other side effects (such as setting options or establishing external connections), those effects happen in the child process and are gone when it exits, so this method probably won't work. There may be workarounds depending on what they are intended to do, but this should work if you just want the variables/functions created. It also prevents the scripts from conflicting with each other, not just with your top-level script.
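If there are several such scripts, one option is to pass the path through the args argument of callr::r. The following is only a sketch under assumptions: source_isolated and the temp script are hypothetical names, and it returns a plain list (via as.list) rather than the environment itself, which is a design choice and not something the scripts require. The requireNamespace guard is only there so the example does nothing if callr is not installed.

```r
# Hypothetical helper: source one script in a throwaway R process and
# return the objects it created as a plain list.
source_isolated <- function(path) {
  callr::r(
    function(path) {
      env <- new.env()
      source(path, local = env)
      as.list(env)
    },
    args = list(path = normalizePath(path))
  )
}

# Stand-in for a third-party script (made up for this example);
# stats4 ships with R and is not attached by default.
script <- tempfile(fileext = ".R")
writeLines(c("library(stats4)", "x <- 1:3", "total <- sum(x)"), script)

if (requireNamespace("callr", quietly = TRUE)) {
  before <- search()
  res <- source_isolated(script)
  res$total                     # value computed in the child process
  identical(before, search())   # did the parent search path change?
}
```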

Answer 2

One way would be to "snapshot" your current search path and try to return to it later:

search.snapshot <- local({
  .snap <- NULL  # NULL means no snapshot has been taken yet
  function(restore = FALSE) {
    if (restore) {
      if (is.null(.snap)) {
        # nothing to restore to, so detach nothing
        return(character(0))
      } else {
        # entries attached since the snapshot was taken
        extras <- setdiff(search(), .snap)
        # may not work if DLLs are loaded
        for (pkg in extras) {
          suppressWarnings(detach(pkg, character.only = TRUE, unload = TRUE))
        }
        return(extras)
      }
    } else .snap <<- search()  # store the current search path
  }
})

In action:

search.snapshot()                                  # store current state
get(".snap", envir = environment(search.snapshot)) # view snapshot
#  [1] ".GlobalEnv"        "ESSR"              "package:stats"    
#  [4] "package:graphics"  "package:grDevices" "package:utils"    
#  [7] "package:datasets"  "package:r2"        "package:methods"  
# [10] "Autoloads"         "package:base"     
library(ggplot2)
library(zoo)
# Attaching package: 'zoo'
# The following objects are masked from 'package:base':
#     as.Date, as.Date.numeric
library(dplyr)
# Attaching package: 'dplyr'
# The following objects are masked from 'package:stats':
#     filter, lag
# The following objects are masked from 'package:base':
#     intersect, setdiff, setequal, union
search()
#  [1] ".GlobalEnv"        "package:dplyr"     "package:zoo"      
#  [4] "package:ggplot2"   "ESSR"              "package:stats"    
#  [7] "package:graphics"  "package:grDevices" "package:utils"    
# [10] "package:datasets"  "package:r2"        "package:methods"  
# [13] "Autoloads"         "package:base"     

search.snapshot(TRUE)                              # returns detached packages
# [1] "package:dplyr"   "package:zoo"     "package:ggplot2"

search()
#  [1] ".GlobalEnv"        "ESSR"              "package:stats"    
#  [4] "package:graphics"  "package:grDevices" "package:utils"    
#  [7] "package:datasets"  "package:r2"        "package:methods"  
# [10] "Autoloads"         "package:base"     

I am somewhat confident (without verification) that this will not work with every package, perhaps due to dependencies and/or loaded DLLs. You can try adding force = TRUE to the detach call, but I'm not sure whether that works better or has other undesirable side effects.
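To make the reset happen automatically when a function finishes, the same snapshot idea can be combined with on.exit. This is a rough sketch, not a tested general solution: with_search_restored is a hypothetical name, it only detaches entries (namespaces stay loaded), and the caveats above about dependencies and DLLs still apply. The tools package is used in the demo only because it ships with R and is not attached by default.

```r
# Hypothetical wrapper: evaluate `code`, then detach anything that was
# attached to the search path while it ran.
with_search_restored <- function(code) {
  before <- search()
  on.exit({
    for (entry in setdiff(search(), before)) {
      # try() so that one failed detach does not stop the others
      try(detach(entry, character.only = TRUE), silent = TRUE)
    }
  }, add = TRUE)
  force(code)
}

before <- search()
result <- with_search_restored({
  library(tools)            # attaches package:tools for the duration
  file_ext("report.csv")    # function from tools
})
result
identical(before, search())   # search path is back to what it was
```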