R Function 'box::help()' Cannot Generate Help File: "Invalid Argument"


Motivation

My colleagues and I routinely create ad hoc scripts in R, to perform ETL on proprietary data and generate automated reports for clients. I am attempting to standardize our approach, for the sake of consistency, modularity, and reusability.

In particular, I want to consolidate our most commonly used functions in a central directory, and to access them as if they were functions from a proprietary R package. However, I am quite raw as an R developer, and my teammates are even less experienced in R development. As such, the development of a formal package is unfeasible for the moment.

Approach

Fortunately, the box package, by Stack Overflow's very own Konrad Rudolph, provides (among other modularity features) an accessible way to approximate the behavior of an R package. Unlike the rigorous development process outlined by the RStudio team, box requires only that one create a regular .R file, in a meaningful location, with roxygen2 documentation (#') and explicit @export tags:

Writing modules

The module bio/seq, which we have used in the previous section, is implemented in the file bio/seq.r. The file seq.r is, by and large, a normal R source file, which happens to live in a directory named bio.

In fact, there are only three things worth mentioning:

  1. Documentation. Functions in the module file can be documented using ‘roxygen2’ syntax. It works the same as for packages. The ‘box’ package parses the documentation and makes it available via box::help. Displaying module help requires that ‘roxygen2’ is installed.

  2. Export declarations. Similar to packages, modules explicitly need to declare which names they export; they do this using the annotation comment #' @export in front of the name. Again, this works similarly to ‘roxygen2’ (but does not require having that package installed).

At the moment, I am tinkering around with a particular module, as "imported" into a script. While the "import" itself works seamlessly, I cannot seem to access the documentation for my functions.

Code

I am experimenting with box on a Lenovo ThinkPad running Windows 10 Enterprise. I have created a script, aptly titled Script.R, whose location serves as my working directory. My module exists in the relative subdirectory ./Resources/Modules as the humble file time.R, reproduced here:

###########################
## Relative Date Windows ##
###########################

#' @title Past Day of Week
#' @description Determine the date of the given weekday that fell a given number
#'   of weeks before the given date.
#' @param from \code{Date} object. The point of reference, from which we go
#'   backwards. Defaults to current \code{Sys.Date()}.
#' @param back \code{integer}. The number of weeks to go backward from the point
#'   of reference; negative values go forward. Defaults to \code{1}, for last
#'   week. Weeks begin on \code{"Monday"}.
#' @param weekday \code{character}. The weekday within the week targeted by
#'   \code{back}; one of \code{c("Monday", "Tuesday", "Wednesday", "Thursday",
#'   "Friday", "Saturday", "Sunday")}. Defaults to \code{"Monday"}.
#' @export
#' @return The date of the \code{weekday} falling in the week \code{back} weeks
#'   prior to the week in which \code{from} falls.
past_weekday <- function(from = Sys.Date(),
                         back = 1,
                         weekday = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
                         ) {
  cycle <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
  
  from <- as.Date(from)
  back <- as.integer(back)
  
  weekday_index <- (which(cycle == weekday[1]) - 1) %% 7
  
  from_index <- (which(cycle == "Sunday") + as.POSIXlt(from)$wday - 1) %% 7
  
  weekdate <- as.Date(from) - lubridate::days(from_index) - lubridate::weeks(as.numeric(back)) + lubridate::days(weekday_index)
  
  return(as.Date(weekdate))
}

Observe the roxygen2 documentation, as indicated by the special #' comments and @ tags. The documentation for past_weekday() is of far greater interest to me than the function itself.

Last and certainly least, here is reproduced Script.R itself:

# Set the working directory to the location of this very script.
setwd(this.path::this.dir())

# Access the functions in 'time.R' by relative location.
box::use(Resources/Modules/time)

# Run the function with its default values.
time$past_weekday()

# View the help page for the function.
box::help(time$past_weekday)

In theory, that final line will display the documentation for past_weekday(), via box::help():

[screenshot: the expected help page]

The box vignette gives a simple example to that effect:

We can also display the interactive help for individual names using the box::help function, e.g.:

 box::help(seq$revcomp)

Problem

The first three commands of Script.R give me exactly what I desire. That is, they load time into R as an environment, from which I can access past_weekday() via time$past_weekday(). This module$function() syntax is analogous to the qualification of functions from formal packages: package::function(). Indeed, past_weekday() itself works just as expected:

time$past_weekday()
# [1] "2021-07-19"

However, when I attempt to interactively access the documentation

box::help(time$past_weekday)

the console displays the following warnings

Warning messages:
1: In utils::packageDescription(package, fields = "Version") :
  no package 'PKG' was found
2: In file.create(to[okay]) :
  cannot create file 'C:\Users\greg\AppData\Local\Temp\RtmpYBTTyG/.R/doc/html/module:Resources/Modules/time.html', reason 'Invalid argument'

and the interactive help window is empty but for this error message:

[screenshot: help window containing only the error message]

For my team, this could prove a serious issue. Since we often rely on useful functions written by each other, it is crucial that any user on our team be able to easily access clear documentation by the author of the function...just as the user is accustomed to doing with formal R packages. Without this ability, the user must either bug the author for clarification, or blunder ahead without a clear understanding of the function's purpose and limitations.
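As a stopgap while box::help() fails, the roxygen2 comments can be read straight from the module's source file. Below is a minimal base R sketch of that idea. show_roxygen() is a hypothetical helper, not part of box, and it assumes the function is defined as `name <- function` with its #' block directly above:

```r
# Hypothetical helper (not part of 'box'): print the roxygen2 block that sits
# directly above the definition of `name` in the source file at `path`.
show_roxygen <- function(path, name) {
  lines <- readLines(path, warn = FALSE)

  # Locate the definition; assumes the form `name <- function` or `name = function`.
  pattern <- paste0('^[[:space:]]*', name, '[[:space:]]*(<-|=)[[:space:]]*function')
  def <- grep(pattern, lines)
  if (length(def) == 0L) stop('No definition found for ', name)

  # Walk upwards for as long as the preceding line is a roxygen2 comment.
  first <- def[1]
  while (first > 1L && grepl("^[[:space:]]*#'", lines[first - 1L])) {
    first <- first - 1L
  }
  doc <- lines[seq_len(def[1] - first) + first - 1L]

  # Strip the "#' " prefix for readability.
  doc <- sub("^[[:space:]]*#' ?", '', doc)
  cat(doc, sep = '\n')
  invisible(doc)
}

# Self-contained demonstration with a throwaway module file.
module_file <- tempfile(fileext = '.R')
writeLines(c(
  "#' @title Add One",
  "#' @param x A number.",
  "#' @export",
  "add_one <- function(x) x + 1"
), module_file)

doc <- show_roxygen(module_file, 'add_one')
```

For the module in this question, the call would presumably be show_roxygen('Resources/Modules/time.R', 'past_weekday'). The output is the raw tag text rather than a rendered help page, so it is a fallback, not a replacement for box::help().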

Suspicions

When I read the warning

In file.create(to[okay]) :
cannot create file 'C:\Users\greg\AppData\Local\Temp\RtmpYBTTyG/.R/doc/html/module:Resources/Modules/time.html', reason 'Invalid argument'

I was drawn to the filepath

C:\Users\greg\AppData\Local\Temp\RtmpYBTTyG/.R/doc/html/module:Resources/Modules/time.html

as the cause for an Invalid argument to file.create(). To my knowledge, a directory name .../module:Resources/... containing a colon : is illegal on Windows and elsewhere.

Indeed, when I supply another illegal filepath ./illegal:directory:name/missing.txt to file.create()

file.create('./illegal:directory:name/missing.txt')
# [1] FALSE

I get the same warning:

Warning message:
In file.create("./illegal:directory:name/missing.txt") :
  cannot create file './illegal:directory:name/missing.txt', reason 'Invalid argument'
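To make the suspicion concrete, here is a small base R sketch that flags path components containing characters Windows reserves in file names, and replaces them. Both helpers are hypothetical illustrations; this is not necessarily how box handles the path internally:

```r
# Characters that Windows reserves in file and directory names.
# The separators / and \ are handled by splitting the path into components.
windows_reserved <- c('<', '>', ':', '"', '|', '?', '*')

# Hypothetical helper: TRUE if any path component contains a reserved character.
has_reserved_chars <- function(path) {
  # Drop a leading drive specifier such as "C:" so its colon is not flagged.
  path <- sub('^[A-Za-z]:', '', path)
  components <- strsplit(path, '[/\\\\]')[[1]]
  any(vapply(
    windows_reserved,
    function(ch) any(grepl(ch, components, fixed = TRUE)),
    logical(1)
  ))
}

# Hypothetical helper: replace reserved characters with an underscore.
sanitize_component <- function(name) {
  gsub('[<>:"|?*]', '_', name)
}

bad  <- has_reserved_chars('doc/html/module:Resources/Modules/time.html')
good <- has_reserved_chars('C:/Users/greg/Resources/Modules/time.R')
safe <- sanitize_component('module:Resources')
```

Here bad is TRUE because of the colon in module:Resources, good is FALSE, and safe is "module_Resources".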

The culprit appears to be this line in help.R:

display_help(doc, paste0('module:', mod_name), help_type)
#                               ^
#                             Here

However, this seems far too simple a diagnosis. Frankly, I would be quite surprised to find such a portability issue within a package designed by a seasoned developer. I find it overwhelmingly more likely that I am simply out of my depth.

What am I missing?


Update 1

I tried it on my MacBook Air, running Mojave, and it actually worked! While I still got the first (rather odd) warning message on the console

Warning message:
In utils::packageDescription(package, fields = "Version") :
  no package 'PKG' was found

the interactive help window does display the intended documentation:

[screenshot: help window displaying the documentation]

Naturally, this does not exactly solve my problem—the scripts will be executed on a VM running Windows, just like my Lenovo and every other computer used at my company. However, it does support the hypothesis that this issue is specific to box on Windows.


Update 2

Konrad has kindly confirmed that this is indeed a bug, and he's working on a fix. Many thanks to Konrad for his clarification and responsiveness!

Answer

As noted, that’s a bug, now fixed.

But since we’re here, a word on usage:

# Set the working directory to the location of this very script.
setwd(this.path::this.dir())

This is generally not recommended. To quote Jenny Bryan:

If the first line of your R script is

setwd("C:\Users\jenny\path\that\only\I\have")

I will come into your office and SET YOUR COMPUTER ON FIRE.

‘box’ also doesn’t need this; instead, the idea is to configure a global module search path (equivalent to R’s package library, see .libPaths()), e.g. via the box.path option (this would usually go into the user’s .Rprofile configuration — not in the script itself!):

options(box.path = 'C:/User/Konrad/some/path')

Afterwards, modules that are installed in this search path will be found by box::use.

As noted, this should be a global setting. To use project-specific modules, you wouldn’t set a global search path; instead, you’d use relative imports:

box::use(./Resources/Modules/time)

This should work regardless of the working directory; it uses the calling script’s location instead. Consequently, this.path::this.dir() or similar hacks are never necessary with ‘box’. And to find data files, ‘box’ provides the box::file function which also works regardless of the current working directory.
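A short sketch of locating a data file with box::file. The names 'data' and 'clients.csv' are hypothetical, and the guard is only there so the snippet also runs where ‘box’ is not installed:

```r
# Assumes the 'box' package is installed; the guard keeps the sketch runnable
# without it. 'data' and 'clients.csv' are hypothetical names.
if (requireNamespace('box', quietly = TRUE)) {
  # Build a path relative to the current module or script,
  # independent of the working directory.
  csv_path <- box::file('data', 'clients.csv')
  print(csv_path)
  # The file could then be read as usual, e.g. read.csv(csv_path).
}
```

Because the path is built from the location of the calling code, no setwd() call is needed before reading the file.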

See also FAQ: how to organise globally installed modules?