Do RStudio projects store any temporary data?


I am using an RStudio project to work with confidential data (i.e. a project associated with a working directory, not with version control). I want to share my script confidential_script.R and project file confidential_project.Rproj with a collaborator without sharing any real data, including temporary files or metadata. I am making sure not to save or share any .RData files. However, RStudio on Windows automatically creates the hidden .Rproj.user folder, which appears to contain project metadata.

Can I share the RStudio project file(s) without compromising any confidential information?

Best answer

The best way to manage confidential dependencies is to declare them as R objects at the top of the script, and to avoid sharing metadata files such as the RStudio project altogether.
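As a minimal sketch of that layout, the script below keeps every location-specific value in a parameter block at the top. The folder name `testData` and the file name `records.csv` are hypothetical placeholders, not part of the original question; each collaborator would point `dataDir` at their own copy of the data.

```r
# --- parameters: the only lines a collaborator should need to edit ---
dataDir   <- "./testData"                        # hypothetical folder name
inputFile <- file.path(dataDir, "records.csv")   # hypothetical file name

# --- analysis: refers only to the objects declared above ---
if (file.exists(inputFile)) {
  records <- read.csv(inputFile)
  print(summary(records))
} else {
  message("No file at ", inputFile, "; set dataDir to a folder that contains it.")
}
```

Because nothing below the parameter block hard-codes a path, the script alone can be shared and no project metadata needs to travel with it.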

Ideally one would create a test version of the confidential information that contains random or anonymized data, develop a few tests or reports for validation, and include these items with the R script so that collaborators can confirm it works before running it on live data.
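One way to build such a test version is sketched below. The column names (`id`, `age`, `score`), the value ranges, and the file name are invented for illustration; a real stand-in would mirror the structure of the confidential data without copying any of its values. The sketch writes to a temporary folder so it can be run anywhere.

```r
set.seed(42)  # fixed seed so every collaborator generates the same test data
n <- 50

# Synthetic records: hypothetical columns standing in for the real ones
testData <- data.frame(
  id    = sprintf("TEST%03d", seq_len(n)),
  age   = sample(18:90, n, replace = TRUE),
  score = round(rnorm(n, mean = 100, sd = 15), 1),
  stringsAsFactors = FALSE
)

# Write the test file to a temporary folder
testDir <- file.path(tempdir(), "testData")
dir.create(testDir, showWarnings = FALSE)
testFile <- file.path(testDir, "records.csv")
write.csv(testData, testFile, row.names = FALSE)

# A few validation checks a collaborator can rerun before using live data
roundTrip <- read.csv(testFile, stringsAsFactors = FALSE)
stopifnot(
  nrow(roundTrip) == n,
  identical(names(roundTrip), c("id", "age", "score")),
  all(roundTrip$age >= 18 & roundTrip$age <= 90),
  !anyNA(roundTrip$score)
)
```

If any check fails, `stopifnot()` halts with an error naming the failed condition, which gives a quick signal that the script and the data structure have drifted apart.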

The script, parameters, test data and test cases make the script completely reproducible.

Example: download and combine Pokémon stats files

The following example script downloads statistics for the first seven generations of Pokémon and combines them into a single data frame for subsequent analysis.

# URL of the zip file, assigned to theZipFile
theZipFile <- "https://raw.githubusercontent.com/lgreski/pokemonData/master/pokemonData.zip"

# download in binary mode; method="curl" requires curl to be installed
download.file(theZipFile,
              "pokemonData.zip",
              method="curl",mode="wb")
unzip("pokemonData.zip")

# paths of the extracted files
thePokemonFiles <- list.files("./pokemonData",
                              full.names=TRUE)
thePokemonFiles

# read each file into its own data frame
pokemonData <- lapply(thePokemonFiles,read.csv)

# a list of 7 data frames
summary(pokemonData)

# stack the data frames into a single data frame
pokemonData <- do.call(rbind,pokemonData)

summary(pokemonData)
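
The combine step relies on every file having the same columns, since rbind() on data frames matches columns by name. The sketch below illustrates a check that could accompany such a script, using two small made-up data frames in place of the downloaded files; the names `gen1`, `gen2` and `parts` and all values are hypothetical.

```r
# Two toy data frames standing in for the per-generation files (made-up values)
gen1  <- data.frame(name = c("A", "B"), hp = c(45, 60))
gen2  <- data.frame(name = "C", hp = 39)
parts <- list(gen1, gen2)

# Check that all parts share the same column names before combining
sameColumns <- vapply(parts,
                      function(d) identical(names(d), names(parts[[1]])),
                      logical(1))
stopifnot(all(sameColumns))

combined <- do.call(rbind, parts)

# The combined data frame should have one row per input row
stopifnot(nrow(combined) == sum(vapply(parts, nrow, integer(1))))
```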