How to use possibly with lm?


Consider this simple example

library(dplyr)
library(broom)

dataframe <- tibble(id = c(1,2,3,4,5,6),
                    value = c(NA,NA,NA,NA,NA,NA))
dataframe

> dataframe
# A tibble: 6 x 2
     id value
  <dbl> <lgl>
1     1    NA
2     2    NA
3     3    NA
4     4    NA
5     5    NA
6     6    NA

I have a function that essentially uses lm to compute the mean of a column in my dataframe.

get_mean <- function(data, myvar){
  col_name <- as.character(substitute(myvar))
  fmla <- as.formula(paste(col_name, "~ 1"))
  tidy(lm(data = data, fmla, na.action = 'na.omit')) %>% pull(estimate)
}

Now

> get_mean(dataframe, id)
[1] 3.5

but because of the missing values,

get_mean(dataframe, value)

returns the dreaded

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases 

I would like the function to return NA, or any number I specify, when the all-NA case arises. I tried to use purrr::possibly, but without success:

get_mean <- function(data, myvar){
  col_name <- as.character(substitute(myvar))
  fmla <- as.formula(paste(col_name, "~ 1"))
  model <- purrr::possibly(lm(data = data, fmla, na.action = 'na.omit'), NA)
  if(!is.na(model)) {
  tidy(model) %>%  pull(estimate)
  }
}

get_mean(dataframe, id)

does not work

Error: Can't convert a list to function

What should I do? Thanks!!


There are 2 best solutions below

BEST ANSWER

You can wrap the whole function in possibly, so you get NA if the whole function fails anywhere.

library(purrr)  # provides possibly()

get_mean <- possibly(function(data, myvar) {
     col_name <- as.character(substitute(myvar))
     fmla <- as.formula(paste(col_name, "~ 1"))
     model <- lm(data = data, fmla, na.action = 'na.omit')
     tidy(model) %>%  pull(estimate)
}, otherwise = NA)

get_mean(dataframe, id)
[1] 3.5
get_mean(dataframe, value)
[1] NA

However, unlike tryCatch, this isn't focused on the lm part of the code. It wraps the whole function and will return NA if any error occurs for any reason. For example, if you forgot to load the broom package before running get_mean, you would get NA back even if the model fit fine, because R wouldn't be able to find tidy.

Setting the quiet argument of possibly to FALSE will allow the error message to print, which could help you diagnose errors like the one I outline above.
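
As a minimal sketch of the difference quiet makes: always_fails is a hypothetical stand-in for a function that errors (for instance because tidy cannot be found), and the two wrapper names are made up for illustration.

```r
library(purrr)

# Hypothetical helper that always errors, standing in for a missing library() call
always_fails <- function(x) stop("could not find function \"tidy\"")

# Default (quiet = TRUE): the error is swallowed silently
quiet_version <- possibly(always_fails, otherwise = NA)

# quiet = FALSE: the error message is printed before the fallback is returned
loud_version <- possibly(always_fails, otherwise = NA, quiet = FALSE)

quiet_version(1)  # returns NA, nothing printed
loud_version(1)   # prints the error message, then returns NA
```

Both calls return the same value; only the second tells you why the fallback was used.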

You can also use possibly as in the documentation examples for safely, making a possibly function specific for lm to use. Then you can use an if else statement to either tidy or return NA.

get_mean <- function(data, myvar) {
     col_name <- as.character(substitute(myvar))
     fmla <- as.formula(paste(col_name, "~ 1"))
     # Wrap only lm, so errors elsewhere are not silently swallowed
     poss_lm <- purrr::possibly(lm, otherwise = NA)
     model <- poss_lm(data = data, fmla, na.action = 'na.omit')
     # A successful fit has class "lm"; a failed one is the bare NA
     if (inherits(model, "lm")) {
          tidy(model) %>% pull(estimate)
     } else {
          model
     }
}
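
The question also asks for "any number specified by me" as the fallback. One possible sketch, not the only way to do it: get_mean_or and its default argument are hypothetical names, and coef() is swapped in for tidy() %>% pull(estimate) so the example only needs purrr.

```r
library(purrr)

# Sketch only: `default` is a hypothetical argument added for illustration
get_mean_or <- function(data, myvar, default = NA_real_) {
  col_name <- as.character(substitute(myvar))
  fmla <- as.formula(paste(col_name, "~ 1"))
  # NULL is an unambiguous marker for "the fit failed"
  poss_lm <- possibly(lm, otherwise = NULL)
  model <- poss_lm(fmla, data = data, na.action = na.omit)
  if (is.null(model)) {
    return(default)
  }
  # The intercept of an intercept-only model is the column mean
  unname(coef(model)[1])
}

df <- data.frame(id = c(1, 2, 3, 4, 5, 6), value = NA)
get_mean_or(df, id)                  # mean of id
get_mean_or(df, value)               # falls back to NA
get_mean_or(df, value, default = 0)  # falls back to 0
```

Using NULL as the internal failure marker keeps the check separate from the value handed back to the caller, so the default can be NA, 0, or anything else.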

Looks like a regular tryCatch() would do the trick:

get_mean <- function(data, myvar){
  col_name <- as.character(substitute(myvar))
  fmla <- as.formula(paste(col_name, "~ 1"))
  tryCatch(lm(data = data, fmla, na.action = 'na.omit') %>% tidy() %>% pull(estimate),
           error = function(e) NA)
}
get_mean(dataframe, id)
# [1] 3.5
get_mean(dataframe, value)
# [1] NA
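
If you would rather not hide errors from tidy() or pull(), the handler can be limited to the lm() call alone. A rough sketch of that variant, where get_mean_scoped is a hypothetical name:

```r
library(dplyr)
library(broom)

get_mean_scoped <- function(data, myvar) {
  col_name <- as.character(substitute(myvar))
  fmla <- as.formula(paste(col_name, "~ 1"))
  # Only the model fit is guarded; NULL marks a failed fit
  model <- tryCatch(lm(fmla, data = data, na.action = na.omit),
                    error = function(e) NULL)
  if (is.null(model)) {
    return(NA_real_)
  }
  # Errors raised here are not caught and surface as usual
  tidy(model) %>% pull(estimate)
}

df <- data.frame(id = c(1, 2, 3, 4, 5, 6), value = NA)
get_mean_scoped(df, id)
get_mean_scoped(df, value)
```

With this layout, an all-NA column still yields NA, while a problem outside the fit (such as broom not being loaded) stops with its own error message.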