Cannot use non-standard evaluation in the lm() function in R

I want to write a custom function mylm using non-standard evaluation (NSE) from the rlang package, which should produce the same output as running lm(cyl ~ mpg, data = mtcars) directly.

But both NSE approaches I tried raise an error. How can I get the mylm function to run successfully?

I know I can use the reformulate() function to construct a formula inside mylm, but I want to know whether I could use {{ }}, !!, or inject() to make mylm run successfully.

# use {{ }} for NSE
mylm <- function(y, x, data) {
  lm({{ y }}, {{ x }}, data = data)
}

mylm(cyl, mpg, data = mtcars)
#> Error in eval(expr, envir, enclos): object 'cyl' not found

# use inject() for NSE
mylm2 <- function(y, x, data) {
  rlang::inject(lm({{ y }}, {{ x }}, data = data))
}

mylm2(cyl, mpg, data = mtcars)
#> Error in xj[i]: invalid subscript type 'language'

Created on 2023-12-15 with reprex v2.0.2

There are 3 best solutions below

Answer by jared_mamrot (best answer)

I've run into a similar issue and, for me, the best workaround was to use:

mylm <- function(y, x, data) {
  lm(data[[y]] ~ data[[x]], data = data)
}

mylm("cyl", "mpg", data = mtcars)
#> 
#> Call:
#> lm(formula = data[[y]] ~ data[[x]], data = data)
#> 
#> Coefficients:
#> (Intercept)    data[[x]]  
#>     11.2607      -0.2525

Created on 2023-12-15 with reprex v2.0.2

Would that work for your use-case? Or do you want/need to use NSE for some other reason?
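
If passing the column names as strings is acceptable, a small variation keeps the real variable name in the coefficient table instead of data[[x]]. This is only a sketch: mylm_chr is a hypothetical name, and it assumes x and y are single column names present in data.

```r
# Hypothetical helper: takes column names as strings and builds the
# formula with base R's reformulate(termlabels, response).
mylm_chr <- function(y, x, data) {
  f <- reformulate(x, response = y)  # e.g. cyl ~ mpg
  lm(f, data = data)
}

fit <- mylm_chr("cyl", "mpg", data = mtcars)
names(coef(fit))  # the slope is labelled with the column name
```

The stored call will still read lm(formula = f, data = data), so this only tidies the coefficient labels, not the call itself.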

Answer by Onyambu

Use substitute:

mylm <- function(y, x, data) {
    lm(substitute(y~x), data = data)
}

mylm(cyl,mpg, mtcars)

Call:
lm(formula = substitute(y ~ x), data = data)

Coefficients:
(Intercept)          mpg  
    11.2607      -0.2525  

Notice how the call object is distorted: it stores substitute(y ~ x) and data rather than the actual formula and data frame. If you try to update the model outside the function, you get an error; e.g. try running update(mylm(cyl, mpg, mtcars)). As good practice, your function should work in these kinds of situations. Compare with update(lm(cyl ~ mpg, mtcars)).
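
A quick way to see the problem for yourself, sketched with a renamed copy of the function (mylm_sub is a hypothetical name) and run from a fresh session where no objects called x, y, or data are defined:

```r
# Same construction as the substitute() version, under a hypothetical name.
mylm_sub <- function(y, x, data) {
  lm(substitute(y ~ x), data = data)
}

fit <- mylm_sub(cyl, mpg, mtcars)

# update() re-evaluates the stored call from the calling environment,
# where `substitute(y ~ x)` and `data` no longer mean what they meant
# inside the function. tryCatch() captures the failure instead of stopping.
res <- tryCatch(update(fit), error = function(e) e)
inherits(res, "error")

# The plain lm() call stores the real formula and data name, so it refits.
ok <- update(lm(cyl ~ mpg, mtcars))
coef(ok)
```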


mylm2 <- function(y, x, data){
    fm <- reformulate(as.character(substitute(x)), deparse1(substitute(y)))
    lm(fm, data)
}

mylm2(cyl,mpg, mtcars)

Call:
lm(formula = fm, data = data)

Coefficients:
(Intercept)          mpg  
    11.2607      -0.2525  

If you must use rlang, one way:

library(rlang)
mylm3 <- function(y, x, data){
    x1 <- quo_squash(enquo(x))
    y1 <- quo_squash(enquo(y))
    lm(inject(!!y1~!!x1), data)
 }
mylm3(cyl,mpg, mtcars)

Call:
lm(formula = inject(!!y1 ~ !!x1), data = data)

Coefficients:
(Intercept)          mpg  
    11.2607      -0.2525  

Answer by Allan Cameron

If you want the lm call to look exactly as you would expect, containing the actual variable names and the name of the passed data frame, then you should use do.call:

mylm <- function(y, x, data) {
  do.call('lm',
          list(formula = reformulate(deparse(substitute(x)), 
                                     deparse(substitute(y))), 
               data = substitute(data)))
}

mylm(cyl, mpg, data = mtcars)
#> 
#> Call:
#> lm(formula = cyl ~ mpg, data = mtcars)
#>
#> Coefficients:
#> (Intercept)          mpg  
#>     11.2607      -0.2525  

Or even just call:

mylm <- function(y, x, data) {
  # call('~', lhs, rhs): the response (y) goes first, then the predictor (x)
  eval(call('lm', call('~', substitute(y), substitute(x)), substitute(data)))
}

which gives the same result.
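
As a rough check that a clean call matters in practice, the sketch below rebuilds the do.call version under a hypothetical name (mylm_docall) and refits it with update(). It assumes the data frame is visible from where update() is called, as mtcars is here.

```r
# Hypothetical name; same construction as the do.call version above.
mylm_docall <- function(y, x, data) {
  do.call("lm", list(
    formula = reformulate(deparse(substitute(x)), deparse(substitute(y))),
    data = substitute(data)  # keep the data frame's name, not its value
  ))
}

fit <- mylm_docall(cyl, mpg, data = mtcars)

# The stored call is lm(formula = cyl ~ mpg, data = mtcars), so update()
# can re-evaluate it and add another predictor.
refit <- update(fit, . ~ . + wt)
names(coef(refit))
```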