Goal
My goal is to define some functions for use within dplyr verbs that rely on pre-defined variables. I have several such functions that take a lot of arguments, and many of those arguments are always the same variable names.
My understanding: This is difficult (and perhaps impossible) because dplyr will lazily evaluate user-specified variables later on, but any default arguments are not in the function call and therefore invisible to dplyr.
Toy example
Consider the following example, where I use dplyr to calculate whether a variable has changed or not (rather meaningless in this case):
library(dplyr)
mtcars %>%
mutate(cyl_change = cyl != lag(cyl))
Now, lag also supports alternate ordering like so:
mtcars %>%
mutate(cyl_change = cyl != lag(cyl, order_by = gear))
But what if I'd like to create my own version of lag that always orders by gear?
Failed attempts
The naive approach is this:
lag2 <- function(x, n = 1L, order_by = gear) lag(x, n = n, order_by = order_by)
mtcars %>%
mutate(cyl_change = cyl != lag2(cyl))
But this obviously raises the error:
no object named ‘gear’ was found
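The failure comes from R's scoping rules rather than from anything specific to lag: a default argument is evaluated in the function's own environment, which never sees the data frame's columns, whereas an argument written out in the call is evaluated where the call happens, inside mutate. A minimal sketch of the contrast, assuming a fresh session with no global object named gear:

```r
library(dplyr)

# Naive version: the default `gear` is looked up from lag2's own environment.
lag2 <- function(x, n = 1L, order_by = gear) {
  dplyr::lag(x, n = n, order_by = order_by)
}

# Explicit argument: `gear` is evaluated inside mutate(), so the column is found.
explicit <- mtcars %>%
  mutate(cyl_change = cyl != lag2(cyl, order_by = gear))

# Default argument: `gear` is not visible from inside lag2, so the call errors.
default_outcome <- tryCatch(
  {
    mtcars %>% mutate(cyl_change = cyl != lag2(cyl))
    "worked"
  },
  error = function(e) "failed"
)
```

So the wrapper itself is fine; only the default value is resolved in the wrong place.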
More realistic options would be these, but they also don't work:
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = ~gear)
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = get(gear))
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = getAnywhere(gear))
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = lazyeval::lazy(gear))
Question
Is there a way to get lag2 to correctly find gear within the data.frame that dplyr is operating on?
- One should be able to call lag2 without having to provide gear.
- One should be able to use lag2 on datasets that are not called mtcars (but do have gear as one of their variables).
- Preferably gear would be a default argument to the function, so it can still be changed if required, but this is not crucial.
This solution is coming close:
Consider a slightly easier toy example:
We still use lag and its order_by argument, but don't do any further computation with it. Instead of sticking to the NSE mutate, we switch to the SE mutate_ and make lag2 build a function call as a character vector. This gives us an identical result to the above.
The original toy example can be achieved with:
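A minimal sketch of the string-building idea, not necessarily the code originally posted. Here lag2 takes the column name as a string and returns the text of a call. The mutate_ line is left as a comment because it only applies to dplyr versions that still ship mutate_; the runnable part uses rlang::parse_expr() with !! as a stand-in, which is a substitution made for this sketch and not part of the approach described above.

```r
library(dplyr)

# lag2 returns the text of a call instead of evaluating anything itself.
lag2 <- function(x, n = 1L, order_by = "gear") {
  paste0("dplyr::lag(", x, ", n = ", n, ", order_by = ", order_by, ")")
}

# Build the full expression for the original toy example as one string.
call_text <- paste("cyl !=", lag2("cyl"))

# In dplyr versions that still provide mutate_:
#   mutate_(mtcars, cyl_change = call_text)
# Stand-in (assumption): parse the string and splice it into mutate().
via_string <- mtcars %>%
  mutate(cyl_change = !!rlang::parse_expr(call_text))

# Reference result written out by hand.
reference <- mtcars %>%
  mutate(cyl_change = cyl != lag(cyl, order_by = gear))
```

Because gear only ever appears as text, it is resolved when mutate evaluates the parsed expression, which is why the column is found.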
Downsides:
- It relies on mutate_.
- It relies on paste.
- It is unclear where gear should come from. Assigning values to gear or carb in the global environment seems to be OK, but my guess is that unexpected bugs could occur in some cases. Using a formula instead of a character vector would be safer, but this requires the correct environment to be assigned for it to work, and that is still a big question mark for me.
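One possible way around the environment question, offered as a sketch and not a definitive answer: capture the order_by expression with substitute() and evaluate it in the frame that called lag2. This assumes that, when lag2 is called inside mutate, parent.frame() is the environment in which dplyr evaluates the expression, so the columns are visible from there. That may not hold in every dplyr version, and it breaks if lag2 is itself wrapped in another function.

```r
library(dplyr)

lag2 <- function(x, n = 1L, order_by = gear) {
  # substitute() returns the unevaluated expression: `gear` by default,
  # or whatever the caller wrote for order_by.
  order_expr <- substitute(order_by)
  # Assumption: the calling frame is where dplyr exposes the columns.
  order_val <- eval(order_expr, parent.frame())
  dplyr::lag(x, n = n, order_by = order_val)
}

# Default ordering by gear, without naming it in the call.
with_default <- mtcars %>%
  mutate(cyl_change = cyl != lag2(cyl))

# The default can still be overridden.
with_carb <- mtcars %>%
  mutate(cyl_change = cyl != lag2(cyl, order_by = carb))

# Hand-written references for comparison.
ref_gear <- mtcars %>%
  mutate(cyl_change = cyl != lag(cyl, order_by = gear))
ref_carb <- mtcars %>%
  mutate(cyl_change = cyl != lag(cyl, order_by = carb))
```

Nothing in lag2 refers to mtcars, so the same function should work on any data frame that has a gear column.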