Goal
My goal is to define some functions for use within `dplyr` verbs that rely on pre-defined variables. I have several such functions that take a bunch of arguments, and many of those arguments are always the same variable names.
My understanding: This is difficult (and perhaps impossible) because `dplyr` will lazily evaluate user-specified variables later on, but any default arguments are not in the function call and are therefore invisible to `dplyr`.
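To make that understanding concrete, here is a small base-R sketch. The function `f` is a hypothetical stand-in (not part of `dplyr`); it only shows that a default argument lives in the function's formals and never appears in the call as written by the user:

```r
# Hypothetical stand-in: returns the call exactly as the user wrote it,
# with argument names filled in by match.call().
f <- function(x, n = 1L, order_by = gear) match.call()

cl <- f(cyl)
print(cl)                    # the recorded call has no n or order_by

# The default is stored separately, as an unevaluated symbol in the formals.
default_order <- formals(f)$order_by
print(default_order)
```

Anything that inspects only the captured call therefore has no way of seeing `gear`.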
Toy example
Consider the following example, where I use `dplyr` to calculate whether a variable has changed or not (rather meaningless in this case):
library(dplyr)
mtcars %>%
mutate(cyl_change = cyl != lag(cyl))
Now, `lag` also supports alternate ordering like so:
mtcars %>%
mutate(cyl_change = cyl != lag(cyl, order_by = gear))
But what if I'd like to create my own version of `lag` that always orders by `gear`?
Failed attempts
The naive approach is this:
lag2 <- function(x, n = 1L, order_by = gear) lag(x, n = n, order_by = order_by)
mtcars %>%
mutate(cyl_change = cyl != lag2(cyl))
But this obviously raises the error:
no object named ‘gear’ was found
More realistic options would be these, but they also don't work:
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = ~gear)
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = get(gear))
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = getAnywhere(gear))
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = lazyeval::lazy(gear))
Question
Is there a way to get `lag2` to correctly find `gear` within the data.frame that `dplyr` is operating on?
- One should be able to call `lag2` without having to provide `gear`.
- One should be able to use `lag2` on datasets that are not called `mtcars` (but do have `gear` as one of their variables).
- Preferably `gear` would be a default argument to the function, so it can still be changed if required, but this is not crucial.
Here is the answer I eventually ended up using. It fundamentally relies on a function that explicitly injects any default argument values into the expressions of a lazy dots object.
The complete function (with comments) is at the end of this answer.
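As a rough illustration of the injection idea only, here is a minimal base-R sketch. It is not the function from this answer: the name `inject_defaults` is hypothetical, it works on a plain quoted expression rather than a lazy dots object, and it assumes calls without empty arguments (such as `x[1, ]`) or a literal `...`.

```r
# Hypothetical helper: walk an expression and write each closure's default
# arguments explicitly into the call, so they become visible to the caller.
inject_defaults <- function(expr, env = parent.frame()) {
  if (!is.call(expr)) return(expr)
  parts <- as.list(expr)
  # Handle nested calls first, e.g. the lag2(cyl) inside cyl != lag2(cyl).
  for (i in seq_along(parts)[-1]) {
    parts[i] <- list(inject_defaults(parts[[i]], env))
  }
  fn <- parts[[1]]
  if (is.symbol(fn) && exists(as.character(fn), envir = env, mode = "function")) {
    f <- get(as.character(fn), envir = env, mode = "function")
    if (!is.primitive(f)) {  # primitives such as `!=` have no formals
      matched <- as.list(match.call(f, as.call(parts)))
      fmls <- as.list(formals(f))
      for (nm in setdiff(names(fmls), c(names(matched), "..."))) {
        # Skip formals that have no default (stored as the empty symbol).
        if (!identical(fmls[[nm]], quote(expr = ))) {
          matched[nm] <- list(fmls[[nm]])
        }
      }
      parts <- matched
    }
  }
  as.call(parts)
}

# The function from the question; it is only inspected here, never called,
# so dplyr does not need to be loaded for this step.
lag2 <- function(x, n = 1L, order_by = gear) dplyr::lag(x, n = n, order_by = order_by)

out <- inject_defaults(quote(cyl != lag2(cyl)))
print(out)
```

Because `order_by = gear` is now written into the expression itself, evaluating it against a data frame (for example `eval(out, mtcars)`, assuming `dplyr` is installed) should look up `gear` in the data rather than in the function's own environment.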
Limitations:
- `seq.default` instead of `seq`. If the goal is injection of default values in your own functions, then this generally won't be much of a problem.
For example, one can use this function like this:
We can solve the toy problem from the question in several ways. Remember the new function and the ideal use case:
1. Use `mutate_` with `dots` directly:
2. Redefine `mutate` to include the addition of defaults.
3. Use S3 dispatch to do this as the default for any custom class:
Depending on the use case, options 2 and 3 are the best ways to accomplish this I think. Option 3 actually has the complete suggested use case, but does rely on an additional S3 class.
Function: