I'm trying to write a function that passes some quosure arguments to an interior dplyr::select. However, I want to be able to apply some conditions to the arguments after they have been provided. In this particular case, because selecting columns that do not exist produces an error, I want the function to check whether the columns provided by the caller exist in the dataframe passed via the tib argument, and to remove any that do not, before I pass the quosure and unquoting operator to select.
The problem is that once something is inside a quosure, I don't know how to manipulate it any more. I could convert the names to strings, eliminate the extra names, and then convert the vector of strings back to symbols with syms. I think that would work in this case, because I am separately handing the interior select the dataframe in which I want them to be evaluated. But this strips off all the benefits of using a quosure and then artificially supplies them again, which seems roundabout and inelegant. I'd like to avoid a kludgy solution that works in this precise case but doesn't offer any useful principles for next time.
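For concreteness, the string round trip I have in mind would look roughly like the sketch below. It assumes remove_cols is always a plain c(...) of bare names, and drop_existing() is only a hypothetical name for illustration, not part of my real function.

```r
library(dplyr)
library(rlang)

tst_tb <- tibble(A = 1:10, B = 11:20, C = 21:30, D = 31:40)

# drop_existing() is a hypothetical helper, used only for this sketch.
drop_existing <- function(tib, remove_cols) {
  remove_cols <- enquo(remove_cols)
  # all.vars() pulls the bare names out of the captured expression as strings
  wanted <- all.vars(quo_get_expr(remove_cols))
  # keep only the names that are actually columns of tib
  present <- intersect(wanted, names(tib))
  if (length(present) == 0) {
    return(tib)
  }
  # turn the surviving strings back into symbols and splice them into select()
  select(tib, -c(!!!syms(present)))
}

names(drop_existing(tst_tb, c(C, D, E)))
# "A" "B"
```

This does drop C and D while ignoring E, but only because all.vars() happens to recover the names. Anything richer, such as starts_with("X") or a range like A:C, would not survive being flattened to strings, which is why I'm hoping for something more principled.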
library(tidyverse)
library(magrittr)  # needed for %T>%; library(tidyverse) only attaches %>%
tst_tb <- tibble(A = 1:10, B = 11:20, C = 21:30, D = 31:40)
################################################################################
# strip() expects a non-anonymous dataframe, from which it removes the rows
# specified (as a logical vector) in remove_rows and the columns specified (as
# unquoted names) in remove_cols. It also prints a brief report; just the df
# name, length and width, and the remaining column names.
strip <- function(tib, remove_rows = FALSE, remove_cols = NULL){
  remove_rows <- enquo(remove_rows)
  remove_cols <- enquo(remove_cols)
  tib_name <- deparse(substitute(tib))
  out <- tib %>%
    filter(! (!! remove_rows)) %>%
    select(- !! remove_cols) %T>% (function(XX = .){
      print(paste0(tib_name, ": Length = ", nrow(XX), "; Width = ", ncol(XX)))
      cat("\n")
      cat(" Names: ", names(XX))
    })
  out
}
The next line will not work because of E in the remove_cols argument. You should think of E not as one name out of four or five, but as 10 or 20 arguments out of several hundred.
out_tb <- strip(tib = tst_tb, remove_rows = (A < 3 | D > 36), remove_cols = c(C, D, E))
out_tb
Desired output:
# A tibble: 4 x 2
      A     B
  <int> <int>
1     3    13
2     4    14
3     5    15
4     6    16