I am using an RStudio project to work with confidential data (i.e. a project associated with a working directory, not under version control). I want to share my script confidential_script.R and project confidential_project.Rproj with a collaborator without sharing any real data, including temporary files or metadata. I am making sure not to save or share any .RData files. However, RStudio on Windows automatically creates the hidden .Rproj.user folder, which appears to contain project metadata.
Can I share the RStudio project file(s) without compromising any confidential information?
The best way to manage confidential dependencies is to declare them as R objects at the top of the script, and to eliminate the need to share metadata files such as an RStudio project file.
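As a rough sketch of what declaring dependencies at the top of a script could look like, the snippet below gathers every machine-specific or confidential setting into a few objects. All names and paths here (use_test_data, test_file, live_file, input_file) are hypothetical placeholders and are not taken from the question.

```r
# --- Parameters: the only section a collaborator should need to edit ---
# Every name and path below is a hypothetical placeholder.
use_test_data <- TRUE                                 # switch to FALSE for live data
test_file     <- "test_data.csv"                      # anonymized file shared with the script
live_file     <- "path/to/confidential/records.csv"   # never shared; set locally

# The rest of the script refers only to input_file, never to a hard-coded path.
input_file <- if (use_test_data) test_file else live_file
```

With this layout only the script and the test file need to change hands, and each collaborator points live_file at their own copy of the confidential data.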
Ideally one would create a test version of the confidential information that contains random / anonymized data, develop a few tests / reports for validation, and include these items with the R script so the other collaborators can ensure it works before using it with live data.
Together, the script, parameters, test data and test cases make the analysis completely reproducible.
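A minimal illustration of the test-data idea, assuming the confidential data has an identifier, a grouping column and a numeric column: the column names and the summarise_by_group() function are hypothetical stand-ins for whatever the real script does.

```r
set.seed(42)  # fixed seed so every collaborator generates the same test data

# Random stand-in for the confidential data (hypothetical column names)
n <- 20
test_data <- data.frame(
  id    = sprintf("ID%03d", seq_len(n)),
  group = sample(c("A", "B"), n, replace = TRUE),
  value = round(rnorm(n, mean = 50, sd = 10), 1),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the analysis step of the real script
summarise_by_group <- function(df) {
  aggregate(value ~ group, data = df, FUN = mean)
}

result <- summarise_by_group(test_data)

# Simple validation checks a collaborator can run before using live data
stopifnot(
  is.data.frame(result),
  all(c("group", "value") %in% names(result)),
  nrow(result) == length(unique(test_data$group)),
  !anyNA(result$value)
)
```

If the checks pass silently, the collaborator knows the script runs end to end on their machine without ever having seen the real data.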
Example: download and combine Pokémon stats files
The following example script downloads statistics for the first seven generations of Pokémon and combines them into a single data frame for subsequent analysis.