Cannot use non-standard evaluation in lm() function in R

I want to write a custom function mylm using the non-standard evaluation (NSE) tools in the rlang package. It should produce the same output as running lm(cyl ~ mpg, data = mtcars) directly.

I tried two different NSE approaches, but both throw an error. How can I get mylm to run successfully?

I know I can use the reformulate() function to construct a formula inside mylm, but I want to know whether I could use {{ }}, !!, or inject() to make mylm run successfully.

# use {{ }} for NSE
mylm <- function(y, x, data) {
  lm({{ y }}, {{ x }}, data = data)
}

mylm(cyl, mpg, data = mtcars)
#> Error in eval(expr, envir, enclos): object 'cyl' not found

# use inject() for NSE
mylm2 <- function(y, x, data) {
  rlang::inject(lm({{ y }}, {{ x }}, data = data))
}

mylm2(cyl, mpg, data = mtcars)
#> Error in xj[i]: invalid subscript type 'language'

Created on 2023-12-15 with reprex v2.0.2

There are 3 best solutions below

0
On BEST ANSWER

I've run into a similar issue and, for me, the best workaround was to use:

mylm <- function(y, x, data) {
  lm(data[[y]] ~ data[[x]], data = data)
}

mylm("cyl", "mpg", data = mtcars)
#> 
#> Call:
#> lm(formula = data[[y]] ~ data[[x]], data = data)
#> 
#> Coefficients:
#> (Intercept)    data[[x]]  
#>     11.2607      -0.2525

Created on 2023-12-15 with reprex v2.0.2

Would that work for your use-case? Or do you want/need to use NSE for some other reason?
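
If passing column names as strings is acceptable, a variant of the same idea is to build the formula with reformulate() (the function mentioned in the question), so that the coefficient is labelled mpg rather than data[[x]]. This is only a sketch; the name mylm_chr is made up for illustration.

```r
# Hypothetical helper: y and x are column names given as strings.
mylm_chr <- function(y, x, data) {
  # reformulate() builds the formula `y ~ x` from character input
  f <- reformulate(x, response = y)
  lm(f, data = data)
}

fit <- mylm_chr("cyl", "mpg", data = mtcars)
names(coef(fit))
```

The stored call still reads lm(formula = f, data = data), so this only tidies the coefficient names, not the call itself.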

0
On

Use substitute:

mylm <- function(y, x, data) {
    lm(substitute(y~x), data = data)
}

mylm(cyl,mpg, mtcars)

Call:
lm(formula = substitute(y ~ x), data = data)

Coefficients:
(Intercept)          mpg  
    11.2607      -0.2525  

Notice how the call object is distorted. If you try to update this model later, outside mylm, there will be an error; e.g. try running update(mylm(cyl, mpg, mtcars)) and compare with update(lm(cyl ~ mpg, mtcars)). As good practice, your function should also work in these kinds of situations.
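
The comparison can be sketched as follows. It assumes a fresh session in which no objects named y, x or data have been defined, and it wraps the failing call in try() so the script keeps running.

```r
# The substitute()-based wrapper from above, repeated so the block is self-contained.
mylm <- function(y, x, data) {
  lm(substitute(y ~ x), data = data)
}

fit_direct  <- lm(cyl ~ mpg, data = mtcars)
fit_wrapped <- mylm(cyl, mpg, mtcars)

# The direct fit stores lm(formula = cyl ~ mpg, data = mtcars),
# which update() can re-evaluate at top level.
refit <- update(fit_direct)

# The wrapped fit stores lm(formula = substitute(y ~ x), data = data).
# Outside mylm(), `data` no longer refers to the data frame, so the refit fails.
res <- try(update(fit_wrapped), silent = TRUE)
inherits(res, "try-error")
```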


mylm2 <- function(y, x, data){
    fm <- reformulate(as.character(substitute(x)), deparse1(substitute(y)))
    lm(fm, data)
}

mylm2(cyl,mpg, mtcars)

Call:
lm(formula = fm, data = data)

Coefficients:
(Intercept)          mpg  
    11.2607      -0.2525  

If you must use rlang, one way:

library(rlang)
mylm3 <- function(y, x, data){
    x1 <- quo_squash(enquo(x))
    y1 <- quo_squash(enquo(y))
    lm(inject(!!y1~!!x1), data)
}
mylm3(cyl,mpg, mtcars)

Call:
lm(formula = inject(!!y1 ~ !!x1), data = data)

Coefficients:
(Intercept)          mpg  
    11.2607      -0.2525  
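
Another possible rlang variant, closer to the inject() attempt in the question, is to build the formula object first and then splice it into the lm() call. This is a sketch rather than a definitive solution: the name mylm4 is made up here, and ensym() assumes that y and x are passed as bare column names.

```r
library(rlang)

mylm4 <- function(y, x, data) {
  # ensym() captures each argument as a symbol; new_formula() builds `y ~ x` from them
  f <- new_formula(ensym(y), ensym(x))
  # inject() evaluates lm() with the formula object spliced in place of !!f
  inject(lm(!!f, data = data))
}

fit <- mylm4(cyl, mpg, data = mtcars)
coef(fit)
```

The stored call then shows the actual formula, although the data argument is still recorded as data.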

0
On

If you want the call stored in the lm object to look exactly as you would expect, containing the actual variable names and the name of the passed data frame, then you should use do.call:

mylm <- function(y, x, data) {
  do.call('lm',
          list(formula = reformulate(deparse(substitute(x)), 
                                     deparse(substitute(y))), 
               data = substitute(data)))
}

mylm(cyl, mpg, data = mtcars)
#> 
#> Call:
#> lm(formula = cyl ~ mpg, data = mtcars)
#>
#> Coefficients:
#> (Intercept)          mpg  
#>     11.2607      -0.2525  
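
One practical benefit of the clean call, sketched here under the assumption that mtcars is visible from where update() is run, is that the model can be refitted later. The wt term is just an illustrative extra predictor.

```r
# The do.call()-based wrapper from above, repeated so the block is self-contained.
mylm <- function(y, x, data) {
  do.call("lm",
          list(formula = reformulate(deparse(substitute(x)),
                                     deparse(substitute(y))),
               data = substitute(data)))
}

fit <- mylm(cyl, mpg, data = mtcars)

# The stored call refers to mtcars by name, so update() can re-evaluate it.
refit <- update(fit, . ~ . + wt)
names(coef(refit))
```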

Or even just build the call with call() and evaluate it:

mylm <- function(y, x, data) {
  eval(call('lm', call('~', substitute(y), substitute(x)), substitute(data)))
}

which gives the same result.