Consider the following R Markdown document:
---
title: "Environments"
author: "Me"
date: "2023-01-13"
output: html_document
---
```{r setup}
library(glue)
library(purrr)
```
```{r vars}
a <- 1
x <- list("`a` has the value: {a}")
```
```{r works}
glue(x[[1L]])
```
```{r does-not-work, error = TRUE}
map_chr(x, glue)
```
When using RStudio's knit button, everything works like a charm. However, if I call `render` myself with my own environment, it fails:
library(rmarkdown)
ne <- new.env()
render("env.Rmd", envir = ne)
So apparently `glue` trips over the environments when used within `purrr::map`.

How would I call `render` with my own environment without generating this error? Ideally, I do not want to change the R Markdown document itself.
Update
Interestingly enough, if I wrap `glue` in my own function, things work smoothly again:
```
glue <- function(...) glue::glue(...)
map_chr(x, glue)
```
Update 2
The problem seems not to be related to knitr/rmarkdown, but is a general scoping issue which seems to have to do with the environments in which the involved functions are defined:
library(rlang)
library(purrr)
library(glue)
rm(list = ls())
e <- env(a = 1, x = "`a` has the value: {a}")
delayedAssign("res", map_chr(x, glue), e, e)
e$res
# Error:
# ℹ In index: 1.
# Caused by error:
# ! object 'a' not found
## as opposed to
a <- 1
x <- "`a` has the value: {a}"
delayedAssign("res", {
map_chr(x, glue)
})
res
# [1] "`a` has the value: 1"
This has nothing to do with either R Markdown or ‘glue’. It’s also not a bug, contrary to what I claimed previously. In fact, the issue can be reproduced by simply accessing a variable inside the environment `e`, e.g. via the `get` function, by evaluating `lapply("a", get)` inside `e`.

This is a consequence of R’s lexical scoping rules.
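To make the failure concrete, here is a minimal sketch of such a reproduction. It is an illustration under assumptions rather than the exact code from the original post: the environment `e` and the variable `a` mirror the names used in the question, the code assumes no variable named `a` exists in the global environment, and `tryCatch` is only there so that the error message can be inspected instead of aborting the script.

```r
# Hypothetical reproduction: `e` holds a variable `a` that is NOT visible
# from the global environment.
e <- new.env()
e$a <- 1

# Evaluate lapply("a", get) inside `e`. `get` is called from lapply's call
# frame, so the lookup never reaches `e` and the call raises an error.
msg <- tryCatch(
  local(lapply("a", get), envir = e),
  error = function(err) conditionMessage(err)
)

# `msg` is the error message (a character string), not a list of results.
msg
```

If the lookup had succeeded, `msg` would be a list containing `1`; instead it holds the message of the "object not found" error.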
`lapply` executes `FUN` (= `get`) inside its call frame.¹ Due to the way R scoping works,² `FUN` will look up variable names in its calling scope. This calling scope is the `lapply` call frame. Of course `a` does not exist in the call frame of `lapply` (by contrast, `X` and `FUN` exist, since they are parameter names of `lapply`).

If R does not find a name in the local scope, it continues searching “upwards”, in the parent environment of the current environment. The parent environment of a call frame is the environment in which the function was defined. In the case of `lapply`, this is `namespace:base`. `namespace:base` also does not define the name `a`, so the search continues upwards. Its parent environment is `.GlobalEnv`.³ And that is why `lapply("a", get)` works (purely by accident!) if we defined `a` inside the global environment.⁴ However, in our case where we defined `a` inside another environment, that environment is never searched, unless we `attach()` it to the search path (but of course that’s a bad idea).

The workaround is to invoke the function (either `glue` or `get`, or whatever needs to access local variables) inside an anonymous function. Strictly speaking we should always do this, not just when working on a different environment. In our example, that means passing `\(.) get(.)` to `lapply` instead of `get`.

This works because the anonymous function is defined inside the calling scope which, in this example, is `e`. So when `lapply` executes this function, `get` first searches the name `a` in the local scope of the anonymous function, doesn’t find `a`, and then walks up the chain of parent environments. And the first parent environment is the environment in which the anonymous function was defined: `e`.

Note, however, that we need to take care with the choice of our parameter name! Because the scope of the anonymous function is the first one that is searched, it takes precedence and can hide our intended variable. For example, `\(a) get(a)` would return the parameter `a` itself rather than the variable `a` defined in `e`.
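The workaround and the parameter-name pitfall can be sketched together as follows. This is an illustrative example, not necessarily the code from the original post: `e` and `a` are the names used in the question, and `ok` and `hidden` are hypothetical result names. The `\(x)` lambda syntax requires R 4.1 or later.

```r
e <- new.env()
e$a <- 1

# Workaround: the anonymous function is created while evaluating inside `e`,
# so its enclosing environment is `e`, and `get` finds `a` there.
ok <- local(lapply("a", \(.) get(.)), envir = e)
ok[[1]]
# [1] 1

# Pitfall: the parameter is itself named `a`, so get("a") finds the
# parameter (the string "a") before the search ever reaches `e`.
hidden <- local(lapply("a", \(a) get(a)), envir = e)
hidden[[1]]
# [1] "a"
```

Choosing a parameter name that cannot clash with the variables being looked up (here `.`) avoids the second outcome.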
¹ In fact `lapply` is implemented as an internal function in C, but for the sake of this discussion we can pretend that it is defined in R as a plain loop that calls `FUN(X[[i]], ...)` for each element of `X`. Also note that I am using `lapply` instead of `map_chr`, but analogous reasoning applies to `map_chr`.

² I would like to emphasise that R’s scoping rules make perfect sense and are internally consistent, even though they are inconvenient in this case. In fact, lexical scoping is generally superior to other scoping rules.
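For footnote 1, a pure-R stand-in for `lapply` might look like the sketch below. The name `my_lapply` is hypothetical and the body is a simplification for discussion, not the real implementation. One difference worth keeping in mind is that the real `lapply` is defined in `namespace:base`, whereas `my_lapply` is defined wherever this code is run.

```r
# `my_lapply` is a hypothetical stand-in, not R's actual implementation.
my_lapply <- function(X, FUN, ...) {
  FUN <- match.fun(FUN)
  result <- vector("list", length(X))
  for (i in seq_along(X)) {
    # FUN is called from my_lapply's call frame, which contains X, FUN,
    # result and i, but none of the caller's local variables.
    result[[i]] <- FUN(X[[i]], ...)
  }
  result
}

# Because `get` searches the frame it was called from first, it finds
# my_lapply's own parameter X:
seen <- my_lapply("X", get)
identical(seen[[1]], "X")
# [1] TRUE
```

This illustrates why `X` and `FUN` are visible to `get` while variables from the caller's environment are not.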
³ I’ve argued before that this is in fact a bug in R. At the very least it is a seriously questionable design decision which leads to errors and misunderstandings, and this question is a prime example. For this reason, my package ‘box’ defines module environments differently, specifically to avoid this behaviour.
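The chain of environments that footnote 3 refers to can be checked directly in an R session. The short sketch below uses only base R functions; the variable names `defined_in` and `base_parent_is_global` are illustrative.

```r
# lapply is defined in the base namespace ...
defined_in <- environmentName(environment(lapply))
defined_in
# [1] "base"

# ... and the parent environment of the base namespace is the global
# environment, which is why globals are found from inside lapply.
base_parent_is_global <- identical(parent.env(asNamespace("base")), globalenv())
base_parent_is_global
# [1] TRUE
```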
⁴ For example (and to illustrate that the original code really only worked purely by accident!), consider what happens if we rename the variable `a` to `sum` and call `lapply("sum", get)`: oops! `get` does not return the value of the global variable we defined but rather a function defined in `namespace:base`, because `namespace:base` comes before `.GlobalEnv` in the chain of parent environments that get searched. And the same is true in the case of `purrr::map_chr`.
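The `sum` experiment from footnote 4 could be run roughly as follows. This is a sketch rather than the original code, and `res` is a hypothetical name for the result.

```r
# A variable that shares its name with a function in namespace:base.
sum <- 1

# `get` is called from lapply's frame; the search reaches namespace:base
# before it reaches the environment where our `sum` lives.
res <- lapply("sum", get)

is.function(res[[1]])
# [1] TRUE
```

The result is the base function `sum`, not the number `1` assigned above.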