Good practice on how to store the result of a function for later use in R

I have written an R function, ComplexResult, that computes an expensive result which two separate functions, LaterFuncA and LaterFuncB, will use later.

I want to store the result of ComplexResult somewhere so that both LaterFuncA and LaterFuncB can use it, and it does not need to be recalculated. The result of ComplexResult is a large matrix that only needs to be calculated once, then re-used later on.

R is my first foray into the world of functional programming, so I am interested to understand what is considered good practice. My first line of thinking is as follows:

# run ComplexResult and get the result
cmplx.res <- ComplexResult(arg1, arg2)

# store the result in the global environment. 
# NB this would not be run from a function 
assign("CachedComplexResult", cmplx.res, envir = .GlobalEnv)

Is this at all the right thing to do? The only other approach I can think of is having a large "wrapper" function, e.g.:

MyWrapperFunction <- function(arg1, arg2) {
    cmplx.res <- ComplexResult(arg1, arg2)

    res.a <- LaterFuncA(cmplx.res)
    res.b <- LaterFuncB(cmplx.res)

    # do more stuff here ...
}

Thoughts? Am I heading at all in the right direction with either of the above? Or is there an Option C which is more cunning? :)

There are 3 best solutions below

1
On

The general answer is that you should serialize your big object to a file and deserialize it when you need it again. The R way to do this is with saveRDS/readRDS:

## save a single object to file
saveRDS(cmplx.res, "cmplx.res.rds")
## restore it under a different name
cmplx2.res <- readRDS("cmplx.res.rds")
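
To avoid recomputing on later runs, the save/restore pair can be wrapped in a "compute only if the file is missing" check. The following is only a rough sketch: ComplexResult here is a trivial stand-in for the real function, and GetComplexResult and cache.file are hypothetical names that do not appear in the question.

```r
# Hypothetical stand-in for the expensive function in the question.
ComplexResult <- function(arg1, arg2) {
  matrix(seq_len(arg1 * arg2), nrow = arg1, ncol = arg2)
}

# Hypothetical helper: compute once, then reuse the file on later calls.
GetComplexResult <- function(arg1, arg2, cache.file) {
  if (file.exists(cache.file)) {
    return(readRDS(cache.file))     # cheap path: load the stored object
  }
  res <- ComplexResult(arg1, arg2)  # expensive path: compute ...
  saveRDS(res, cache.file)          # ... and store it for next time
  res
}

cache.file <- file.path(tempdir(), "cmplx.res.rds")
unlink(cache.file)                            # start clean for this demo
first  <- GetComplexResult(3, 4, cache.file)  # computes and writes the file
second <- GetComplexResult(3, 4, cache.file)  # reads the file back
identical(first, second)
```

Note that this sketch does not check whether arg1 and arg2 have changed since the file was written, so the file would need to be deleted (or the arguments encoded in the file name) whenever the inputs change.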
3
On

Run at the top level, this assigns to the global environment:

CachedComplexResult <- ComplexResult(arg1, arg2)

To store I would use:

write.table(CachedComplexResult, file = "complex_res.txt")

And then to use it directly (read.table returns a data frame, so convert it back to a matrix):

LaterFuncA(as.matrix(read.table("complex_res.txt")))
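
One thing to watch with the text-file route is that read.table hands back a data frame with row and column names attached, not the original matrix. A small round-trip check, using a made-up 2 x 3 matrix and a temporary file path as assumptions, might look like this:

```r
# Hypothetical small matrix standing in for the real result.
CachedComplexResult <- matrix(c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5), nrow = 2)

txt.file <- file.path(tempdir(), "complex_res.txt")
write.table(CachedComplexResult, file = txt.file)

restored.df <- read.table(txt.file)  # a data.frame, not a matrix
restored <- as.matrix(restored.df)   # convert back to a numeric matrix
dimnames(restored) <- NULL           # drop names added by the round trip

is.data.frame(restored.df)
all.equal(restored, CachedComplexResult)
```

For a large numeric matrix the text format is also bulkier and slower to parse than the binary file written by saveRDS, so it is mainly useful when the file has to be readable outside R.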
0
On

Your wrapper approach keeps the result in a local variable; other answers have explained saving it to the global environment or to a file. Here are some thoughts on why you would do one or the other.

Save to file: this is the slowest option, so only use it if your process is volatile and you expect it to crash and need to pick up where it left off, or if you only need to save the state once in a while and speed is not a concern.

Save to global: if you need access from multiple spots in a large R program.
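
If you go the global route, one possible pattern is to guard the computation so it only runs when the cached object is not there yet. This is a sketch under assumptions: ComplexResult, LaterFuncA and LaterFuncB below are trivial stand-ins for the real functions, and only the name CachedComplexResult comes from the question.

```r
# Hypothetical stand-in for the expensive function in the question.
ComplexResult <- function(arg1, arg2) {
  matrix(arg1 + arg2, nrow = 2, ncol = 2)
}

# Hypothetical consumers that only read the cached matrix.
LaterFuncA <- function(m) sum(m)
LaterFuncB <- function(m) dim(m)

# Compute only if the global object does not exist yet.
if (!exists("CachedComplexResult", envir = .GlobalEnv, inherits = FALSE)) {
  assign("CachedComplexResult", ComplexResult(1, 2), envir = .GlobalEnv)
}

# Fetch the cached matrix once and pass it to both consumers as an argument.
cached <- get("CachedComplexResult", envir = .GlobalEnv)
res.a <- LaterFuncA(cached)
res.b <- LaterFuncB(cached)
```

Passing the matrix in as an argument, rather than having LaterFuncA and LaterFuncB reach into the global environment themselves, keeps those functions usable on their own and easier to test.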