R: Testing and subsetting elements of quosures


I'm trying to write a function that passes quosure arguments to an interior dplyr::select. However, I want to apply some conditions to the arguments after they have been supplied. In this case, because selecting columns that do not exist produces an error, I want the function to check whether the columns named by the caller exist in the data frame passed via the tib argument, and to drop any that do not, before I unquote the quosure inside select.

The problem is that once something is inside a quosure, I no longer know how to manipulate it. I could convert the names to strings, eliminate the extra names, and then convert the vector of strings back to symbols with syms. I think that would work in this case, because I am separately handing the interior select the data frame in which the names should be evaluated. But it basically strips off all the benefits of using a quosure and then artificially supplies them again, which seems roundabout and inelegant. I'd like to avoid a kludgy solution that works in this precise case but doesn't offer any useful principles for next time.
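For concreteness, the string round trip described above might look roughly like the sketch below. The helper name existing_cols is hypothetical (it is not part of strip()), and the sketch assumes the caller always supplies bare column names wrapped in c(...), so that all.vars() can recover them from the captured expression.

```r
library(dplyr)
library(rlang)

tst_tb <- tibble(A = 1:10, B = 11:20, C = 21:30, D = 31:40)

# Hypothetical helper: capture the column argument, pull out the bare
# names as strings, keep only those that exist in `tib`, and turn the
# survivors back into symbols.
existing_cols <- function(tib, cols) {
  cols <- enquo(cols)
  # all.vars() lists the variable names used in the captured expression,
  # e.g. c(C, D, E) gives "C" "D" "E"
  nms <- all.vars(quo_get_expr(cols))
  syms(intersect(nms, names(tib)))
}

keep <- existing_cols(tst_tb, c(C, D, E))  # E is dropped, C and D remain

# Splice the surviving symbols back into a negative selection
out <- select(tst_tb, -c(!!!keep))
names(out)
# "A" "B"
```

This only handles plain names; anything richer inside the quosure (helpers such as starts_with(), or ranges like A:C) would not survive the conversion to strings, which is part of why the approach feels like a workaround.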

library(tidyverse)
library(magrittr)  # needed for %T>%, which library(tidyverse) does not attach

tst_tb <- tibble(A = 1:10, B = 11:20, C=21:30, D = 31:40)

################################################################################
# strip() expects a non-anonymous dataframe, from which it removes the rows
# specified (as a logical vector) in remove_rows and the columns specified (as
# unquoted names) in remove_cols. It also prints a brief report; just the df
# name, length and width, and the remaining column names.
strip <- function(tib, remove_rows = FALSE, remove_cols = NULL){
  remove_rows <- enquo(remove_rows)
  remove_cols <- enquo(remove_cols)
  tib_name    <- deparse(substitute(tib))  
  out <- tib %>%
    filter(! (!! remove_rows))  %>%
    select(- !! remove_cols) %T>% (function(XX = .){
      print(paste0(tib_name,": Length = ", nrow(XX), "; Width = ", ncol(XX)))
      cat("\n")
      cat("     Names: ", names(XX))
    })
  out  
}

The next line will not work because E, named in the remove_cols argument, is not a column of the tibble. You should think of E not as one name out of four or five, but as 10 or 20 arguments out of several hundred.

out_tb <- strip(tib = tst_tb, remove_rows = (A < 3 | D > 36), remove_cols = c(C, D, E))

out_tb

Desired output:

# A tibble: 4 x 2
      A     B
  <int> <int>
1     3    13
2     4    14
3     5    15
4     6    16