library(dplyr)

df <- data.frame(
  id = rep(letters[1:3], 9),
  m1 = ceiling(rnorm(9, 10, 3)),
  m2 = ceiling(rnorm(9, 10, 6)),
  m3 = 0
)
head(df)
id m1 m2 m3
1 a 12 14 0
2 b 11 9 0
3 c 10 10 0
4 a 16 1 0
5 b 5 15 0
6 c 8 7 0
I have a data frame with metadata in the left-most columns and a raw data matrix attached to the right side. I'd like to remove the columns on the right side that sum to zero, without splitting the data frame into two separate objects, using dplyr::select_if. Selecting on the column names alone works:
df %>%
select_if(!(grepl("m",names(.)))) %>%
head()
id
1 a
2 b
3 c
4 a
5 b
6 c
When I attempt to add a second term to evaluate whether the raw data columns (indicated by "m" prefix) sum to zero, I get the following error message:
> df %>%
+ select_if(!(grepl("m",names(.))) || sum(.) > 0)
Error in `select_if()`:
! `.p` is invalid.
✖ `.p` should have the same size as the number of variables in the tibble.
ℹ `.p` is size 1.
ℹ The tibble has 4 columns, including the grouping variables.
Run `rlang::last_error()` to see where the error occurred.
Warning message:
In !(grepl("m", names(.))) || sum(.) > 0 :
'length(x) = 4 > 1' in coercion to 'logical(1)'
> rlang::last_error()
<error/rlang_error>
Error in `select_if()`:
! `.p` is invalid.
✖ `.p` should have the same size as the number of variables in the tibble.
ℹ `.p` is size 1.
ℹ The tibble has 4 columns, including the grouping variables.
I greatly appreciate any assistance with this!

As @akrun already pointed out in the comments, select_if() is superseded. We can select() all variables that don't start with "M", using !starts_with("M"), together with those that are numeric and whose sum is greater than zero, using where(~ is.numeric(.x) && sum(.x) > 0).

Here the double ampersand operator && is important. We first check whether a column is numeric, and only if it is does the control flow move on to check whether the sum is greater than zero. Without this short-circuiting, we would receive an error that we have provided a non-numeric variable to sum().

Created on 2023-01-17 with reprex v2.0.2
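The select() call described in this answer can be sketched as follows. This is a reconstruction from the description rather than the answer's original code: it assumes the two selections are combined with |, and the set.seed() call and the name result are additions for illustration only. Note that starts_with() ignores case by default, so "M" also matches m1, m2 and m3.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

df <- data.frame(
  id = rep(letters[1:3], 9),
  m1 = ceiling(rnorm(9, 10, 3)),
  m2 = ceiling(rnorm(9, 10, 6)),
  m3 = 0
)

# Keep every column that does not start with "m" (the metadata),
# plus every numeric column whose sum is greater than zero.
# && short-circuits, so sum() is never called on the character column id.
result <- df %>%
  select(!starts_with("M") | where(~ is.numeric(.x) && sum(.x) > 0))

names(result)
```

The all-zero column m3 is dropped, while id, m1 and m2 are kept in their original order. The attempt in the question fails for a different reason than the operator choice alone: select_if() expects either a predicate function or a logical vector with one entry per column, whereas || collapses the expression to a single value and sum(.) is applied to the whole data frame instead of to each column.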