Sourcing multiple R files in


I currently run R on my local machine, where I can source multiple R scripts conditionally because they sit on my local drive.

However, when I try to use the same scripts with Renjin on Google Cloud Dataflow to achieve parallelism, I am unable to source the files.

I have multiple R script files with the .R extension. I need to read the main R script and pass it into Dataflow at run time, but the main script contains source() calls that reference the other R script files. When I read the main script's content from Java and pass it to Google Cloud Dataflow, Java cannot resolve the source() calls that point to those other files.

As a workaround I could use one untidy solution: keep all the code in a single file, with different function names.

Is there any way in Renjin to bundle all the R script files that are needed and pass them to Google Cloud Dataflow at run time?


There are 2 best solutions below

1

The most logical solution here would be to use a package. I assume that you can install custom packages on Google Cloud (having only base R would be painful). I would put these functions and code inside an R package and install that package. This removes the need for source() and lets you include documentation and tests.

3

If your sources are included as resources in a JAR you are deploying to Google Cloud Dataflow, then you can source them using a 'res' URL:

source("res:com/acme/scripts/myscript.R")

If you can't change the paths in the script, then make sure they are at least relative, for example:

source("myscript.R")

And then set the working directory when you create a new ScriptEngine.

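import javax.script.ScriptEngine;
import org.renjin.script.RenjinScriptEngineFactory;
// Create a Renjin engine, then point its working directory at the classpath directory holding the scripts
RenjinScriptEngineFactory factory = new RenjinScriptEngineFactory();
ScriptEngine engine = factory.getScriptEngine();
engine.eval("setwd('res:com/acme/scripts')");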
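Putting the two steps together, a minimal sketch might look like the following. It assumes Renjin is on the classpath and looks the engine up through the standard javax.script API by the name "Renjin". The class name SourceFromClasspath, the helper bootstrap, and the entry script main.R are hypothetical names used only for illustration; com/acme/scripts is the example path from above.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

class SourceFromClasspath {

    /**
     * Builds the R code that sets the working directory to a classpath
     * directory (via a 'res:' URL) and then sources one script from it.
     * Both arguments are illustrative; substitute your own paths.
     */
    static String bootstrap(String resourceDir, String mainScript) {
        return "setwd('res:" + resourceDir + "')\n"
             + "source('" + mainScript + "')\n";
    }

    public static void main(String[] args) throws ScriptException {
        // Returns null when no engine named "Renjin" is registered,
        // i.e. when the Renjin JAR is missing from the classpath.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("Renjin");
        if (engine == null) {
            throw new IllegalStateException("Renjin script engine not found on the classpath");
        }
        // main.R (hypothetical) can now use relative source("other.R") calls.
        engine.eval(bootstrap("com/acme/scripts", "main.R"));
    }
}
```

Keeping the R bootstrap code in one small helper means the scripts themselves need no changes beyond using relative paths.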

Note that setting the working directory to somewhere on the classpath only works reliably if there is only one JAR on the classpath with that path. If, for example, I evaluate:

> setwd("res:org/renjin")
> getwd()
[1] "jar:file:///usr/share/renjin/lib/compiler-0.8.2337.jar!/org/renjin"

The above sets the working directory to the first 'org/renjin' directory found on the classpath, which might not be the one you want.
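One way to check for this ambiguity before relying on setwd is to ask the class loader how many classpath entries provide the directory. The sketch below uses only the standard ClassLoader.getResources call. The class name ClasspathCheck and the method locate are hypothetical, and it rests on the assumption that Renjin resolves 'res:' paths through the same class loader. Note also that some JARs omit directory entries, in which case a directory lookup like this may come back empty even though the files exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class ClasspathCheck {

    /** Returns every classpath location that provides the given resource path. */
    static List<URL> locate(String path) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default to the example path used in this answer.
        String path = args.length > 0 ? args[0] : "com/acme/scripts";
        List<URL> hits = locate(path);
        System.out.println(hits.size() + " classpath location(s) provide " + path);
        for (URL url : hits) {
            System.out.println("  " + url);
        }
        if (hits.size() > 1) {
            System.out.println("Ambiguous: setwd('res:" + path + "') may pick an unexpected JAR.");
        }
    }
}
```

If more than one location is reported, moving the scripts under a path unique to your own JAR avoids the problem.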

In any case, I would definitely encourage you to put the files together in a package as suggested above, but maybe this will help get things moving.