I have a large and complicated workflow in R (lots of initial inputs, recoding, merges, dropped observations, etc.), and I do that work within many isolated functions, each specific to one input type, merge, or data manipulation step. Right now only the final "analysis dataset" is returned to the global environment.
I now want to write a knitr document that documents the data assembly process, but all of the intermediate objects (data frames/tibbles) are local to the functions in which they are assembled, which I take to be good practice.
The options seem to be:
I could return lots of interim data objects to the global environment, but that would clutter it, and I would like to keep it neat.
I could return lists of interesting attributes (N, merge success info, structures, etc.) from each function to the global environment. That is a little neater, but still not very efficient.
This is clearly not a new problem. I would welcome suggestions on the best way(s) forward.
Return objects with a class attribute, and define a print method for those classes. In the main document, print the objects. That's the standard R approach to this problem.
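As a rough illustration of that idea, here is a minimal sketch using base R's S3 system. The function `merge_inputs`, the class name `merge_step`, the stored fields, and the toy data frames are all hypothetical, made up for this example; your own steps would record whatever diagnostics matter to you.

```r
# Hypothetical assembly step: merge two inputs and keep some diagnostics
# alongside the result, all wrapped in one classed object.
merge_inputs <- function(x, y, by) {
  merged <- merge(x, y, by = by)
  structure(
    list(
      data        = merged,                       # the merged data frame
      n_x         = nrow(x),                      # rows in the first input
      n_y         = nrow(y),                      # rows in the second input
      n_merged    = nrow(merged),                 # rows after the merge
      unmatched_x = sum(!x[[by]] %in% y[[by]])    # rows of x with no match in y
    ),
    class = "merge_step"
  )
}

# S3 print method: this is what runs when the object is printed.
print.merge_step <- function(x, ...) {
  cat("Merge step\n")
  cat("  rows in x:      ", x$n_x, "\n")
  cat("  rows in y:      ", x$n_y, "\n")
  cat("  rows merged:    ", x$n_merged, "\n")
  cat("  unmatched in x: ", x$unmatched_x, "\n")
  invisible(x)
}

# Toy inputs, for illustration only.
patients <- data.frame(id = 1:4, age = c(34, 51, 29, 62))
visits   <- data.frame(id = c(2, 3, 5), visits = c(1, 4, 2))

step <- merge_inputs(patients, visits, by = "id")
print(step)                 # shows the summary, not the whole data frame
analysis_data <- step$data  # the data itself is still available
```

In the knitr document, printing `step` in a chunk then shows the summary for that stage, and only one object per step needs to exist outside the function.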