I'm trying to get familiar with using NSE in my code where warranted. Let's say I have pairs of columns and want to generate a new string variable for each pair indicating whether the values in that pair are the same.
library(tidyverse)
library(magrittr)
df <- tibble(one.x = c(1,2,3,4),
             one.y = c(2,2,4,3),
             two.x = c(5,6,7,8),
             two.y = c(6,7,7,9),
             # not used but also in df
             extra = c(5,5,5,5))
I'm trying to write code that would accomplish the same thing as the following code:
df.mod <- df %>%
  # is one.x the same as one.y?
  mutate(one.x_suffix = case_when(
    one.x == one.y ~ "same",
    TRUE ~ "different")) %>%
  # is two.x the same as two.y?
  mutate(two.x_suffix = case_when(
    two.x == two.y ~ "same",
    TRUE ~ "different"))
df.mod
#> # A tibble: 4 x 6
#> one.x one.y two.x two.y one.x_suffix two.x_suffix
#> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 1. 2. 5. 6. different different
#> 2 2. 2. 6. 7. same different
#> 3 3. 4. 7. 7. different same
#> 4 4. 3. 8. 9. different different
In my actual data I have an arbitrary number of such pairs (e.g. three.x and three.y, ...), so I want to write a more generalized procedure using mutate_at.
My strategy is to pass in the ".x" variables as the .vars and then gsub the "x" for "y" on one side of the equality test inside the case_when, like so:
df.mod <- df %>%
  mutate_at(vars(one.x, two.x),
            funs(suffix = case_when(
              . == !!sym(gsub("x", "y", deparse(substitute(.)))) ~ "same",
              TRUE ~ "different")))
#> Error in mutate_impl(.data, dots): Evaluation error: object 'value' not found.
This is when I get an exception. It looks like the gsub portion is working fine:
df.debug <- df %>%
  mutate_at(vars(one.x, two.x),
            funs(suffix = gsub("x", "y", deparse(substitute(.)))))
df.debug
#> # A tibble: 4 x 6
#> one.x one.y two.x two.y one.x_suffix two.x_suffix
#> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 1. 2. 5. 6. one.y two.y
#> 2 2. 2. 6. 7. one.y two.y
#> 3 3. 4. 7. 7. one.y two.y
#> 4 4. 3. 8. 9. one.y two.y
It's the !!sym() operation that's causing the exception here. What have I done wrong?
Created on 2018-11-07 by the reprex package (v0.2.1)
The problem is not in !!sym, as you can see in the following example:

The problem is in trying to unquote substitute(.) inside case_when:

The reason for this is operator precedence. From the help page for !!:

In the example above, the context for !!substitute(.) is the formula, which is itself inside case_when. This leads to the expression getting immediately replaced with value, which is defined inside case_when and which has no meaning inside your data frame.

You want to keep expressions next to their environment, which is what quosures are for. By replacing substitute with rlang::enquo, you capture the expression that gave rise to . along with its defining environment (your dataframe). To keep things tidy, let's move your gsub manipulation into a separate function:

You can now use the new x2y function directly in your code. With quosures, no unquoting is necessary because the expressions already carry their environments with them; you can simply evaluate them using rlang::eval_tidy:

EDIT to address the question in your comment: Mushing all your code into a single line is almost always A Bad Idea™, and I strongly advise against it. However, since this question is about NSE, I think it's important to understand why simply taking the content of x2y and pasting it inside case_when leads to problems.

enquo(), like substitute(), looks in the calling environment of the function and replaces the argument with the expression that was provided to that function. substitute() goes only one environment up (finding value inside case_when when you unquoted it), while enquo() keeps moving up as long as the functions in the calling stack correctly handle quasiquotation. (And most dplyr/tidyverse functions do.) So, when you call enquo(.x) inside x2y, it moves up the expressions provided to each function on the calling stack to eventually find one.x.

When you call enquo() inside mutate_at, it is now on the same level as one.x, so it too replaces the argument (one.x in this case) with the expression that defined it (the vector c(1,2,3,4) in this case). This is not what you want. Rather than moving up levels, you now want to stay on the same level as one.x. To do so, use rlang::quo() in place of rlang::enquo():
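As a minimal sketch of the ideas above, and not the answer's original code, the quosure handling can be tried outside of mutate_at. The body of x2y here is an assumption reconstructed from the description (capture with enquo, swap "x" for "y", put the new symbol back into the quosure), and it assumes the argument is a bare column name. The data frame is a cut-down copy of the question's first pair of columns.

```r
library(rlang)

# Toy data mirroring the question's first pair of columns.
df <- data.frame(one.x = c(1, 2, 3, 4),
                 one.y = c(2, 2, 4, 3))

# Hypothetical reconstruction of the x2y helper described above.
x2y <- function(.x) {
  # enquo() captures the caller's expression together with its environment
  q <- enquo(.x)
  # Turn the captured symbol into text, e.g. "one.x"
  # (as_string() assumes the argument is a bare column name)
  txt <- as_string(quo_get_expr(q))
  # Swap "x" for "y" and store the new symbol back in the same quosure
  quo_set_expr(q, sym(gsub("x", "y", txt)))
}

# eval_tidy() evaluates the quosure with df acting as the data mask,
# so no unquoting with !! is needed.
y_vals <- eval_tidy(x2y(one.x), data = df)
same_or_diff <- ifelse(df$one.x == y_vals, "same", "different")

# quo() captures the expression exactly as typed at this level,
# instead of looking up to the caller as enquo() does.
q_here <- quo(one.x)
q_y <- quo_set_expr(q_here,
                    sym(gsub("x", "y", as_string(quo_get_expr(q_here)))))
y_vals_quo <- eval_tidy(q_y, data = df)
```

Both routes end up evaluating the symbol one.y against df. The difference is only where the expression is captured: enquo() belongs inside a function that receives the column as an argument, while quo() is used where the column name is written directly.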