Why does !! (bang-bang) combined with as.name() give a different output compared to !! or as.name() alone?

I use a dynamic variable (e.g. ID) to reference a column name that changes depending on which gene I am processing at the time. I then use case_when within mutate to create a new column whose values depend on that dynamic column.

I thought that !! (bang-bang) was what I needed to force eval of the content of the variable; however, I did not get the expected output in my new column. Only the !!as.name gave me the output I was expecting, and I do not fully understand why. Could someone explain why in this case using only !! isn't appropriate and what is happening in !!as.name?

Here is a simple reproducible example that I made up to demo what I am experiencing:

library(tidyverse)

ID <- "birth_year"

# Correct output
test <- starwars %>%
  mutate(FootballLeague = case_when(
    !!as.name(ID) < 10 ~ "U10",
    !!as.name(ID) >= 10 & !!as.name(ID) < 50 ~ "U50",
    !!as.name(ID) >= 50 & !!as.name(ID) < 100 ~ "U100",
    !!as.name(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))

# Incorrect output
test2 <- starwars %>%
  mutate(FootballLeague = case_when(
    !!(ID) < 10 ~ "U10",
    !!(ID) >= 10 & !!(ID) < 50 ~ "U50",
    !!(ID) >= 50 & !!(ID) < 100 ~ "U100",
    !!(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))

# Incorrect output
test3 <- starwars %>%
  mutate(FootballLeague = case_when(
    as.name(ID) < 10 ~ "U10",
    as.name(ID) >= 10 & as.name(ID) < 50 ~ "U50",
    as.name(ID) >= 50 & as.name(ID) < 100 ~ "U100",
    as.name(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))

identical(test, test2)
# FALSE

identical(test2, test3)
# TRUE

sessionInfo()
#R version 4.0.2 (2020-06-22)
#Platform: x86_64-centos7-linux-gnu (64-bit)
#Running under: CentOS Linux 7 (Core)

# tidyverse_1.3.0
# dplyr_1.0.2

Cheers!

There are 2 best solutions below

Best answer

You can wrap your expressions in the function quo() to see the result of the operation after applying the !! operator. For simplicity I will use a shorter expression for demonstration:

Preparations:

library(tidyverse)
ID <- "birth_year"

## Test without quasiquotation:
starwars %>% 
  filter(birth_year < 50)

Experiment 1:

quo(
  starwars %>% 
    filter(ID < 50)
)
## result: starwars %>% filter(ID < 50)

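We learn: nothing was substituted. The expression still contains the name ID, not birth_year. When filter() evaluates it, ID is not a column of starwars, so it is found outside the data, where it holds the string "birth_year" rather than the column. So we need a mechanism that puts the column name itself into the expression.

--> The !! operator provides the first half of this: it evaluates whatever follows it right away and inserts the result into the surrounding expression.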

Experiment 2:

quo(
  starwars %>% 
    filter(!!ID < 50)
) 
## result: starwars %>% filter("birth_year" < 50)

We learn: The !! operator has indeed worked: ID was replaced with its value. But the value of ID is the string "birth_year". Note the quotes in the result. Tidyverse functions refer to columns by bare names (symbols), not by strings. A quoted string inside the expression is just a character constant, so filter() compares the text "birth_year" with 50 and never looks at the column birth_year.
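To see what Experiment 2 actually injected, the captured expression can be taken apart. This is a small sketch that uses rlang's expr() instead of quo() (an assumption made for easier inspection; expr() captures the code without an environment):

```r
library(rlang)

ID <- "birth_year"

# Capture the condition on its own, with !! applied to ID.
injected <- expr(!!ID < 50)

# A call is list-like: [[1]] is the function `<`, [[2]] its left-hand side.
lhs <- injected[[2]]

is.character(lhs)  # TRUE: a plain string constant was injected
is.symbol(lhs)     # FALSE: it is not a column name

# Evaluating the condition gives a single value for the whole table,
# not one value per row.
length(eval(injected))  # 1
```

Because the condition is one constant value rather than a per-row comparison, every row is treated the same way.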

What does the function as.name() do?

This is a base R function that takes a string (or a variable containing a string) and returns a symbol (also called a name), which is the object R uses for an unquoted variable name. So if you call as.name(ID) in base R, the result is birth_year, this time without quotes, just like the tidyverse expects it. So let's try it:
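A quick base R check of what as.name() returns (a minimal sketch, no packages needed):

```r
ID <- "birth_year"

nm <- as.name(ID)

is.character(ID)                  # TRUE: ID holds a string
is.symbol(nm)                     # TRUE: as.name() returned a symbol
identical(nm, quote(birth_year))  # TRUE: same as the bare name birth_year
```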

Experiment 3:

quo(
  starwars %>% 
    filter(as.name(ID) < 50)
) 
## result: starwars %>% filter(as.name(ID) < 50)

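We learn: This did not work either. Without !!, nothing is substituted when the expression is captured, so the call as.name(ID) stays in the expression. When filter() later evaluates it, as.name(ID) returns a symbol object as an ordinary value. That symbol is not looked up as a column, so once again the column birth_year is never used.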
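The difference between leaving as.name(ID) in the expression and forcing it with !! can be inspected directly. Again a sketch with rlang's expr() (an assumption, used here only because it is easier to take apart than a quosure):

```r
library(rlang)

ID <- "birth_year"

unforced <- expr(as.name(ID) < 50)    # no !!: the call is kept as written
forced   <- expr(!!as.name(ID) < 50)  # !!: the call is run, its result injected

is.call(unforced[[2]])   # TRUE: still the unevaluated call as.name(ID)
is.symbol(forced[[2]])   # TRUE: now the symbol birth_year

deparse(forced)          # "birth_year < 50"
```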

--> We need to combine the two things to make it work:

  1. Use as.name() to convert the string to a variable name.
  2. Use !! to insert that variable name into the expression before filter() evaluates it.

Experiment 4:

quo(
  starwars %>% 
    filter(!!as.name(ID) < 50)
) 
## result: starwars %>% filter(birth_year < 50)

Now it works! :)

I have used filter() in my experiments, but it works exactly the same with mutate() and other dplyr verbs that use data masking.
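As a check on real data, the forced version can be compared with the hard-coded column name, for both filter() and mutate(). This is a sketch; sym() from rlang does not appear in the answer above and is included only as the rlang counterpart of as.name(), and the column name young is made up for illustration:

```r
library(dplyr)
library(rlang)

ID <- "birth_year"

# filter(): hard-coded column versus column name taken from ID
hard_coded  <- starwars %>% filter(birth_year < 50)
via_as_name <- starwars %>% filter(!!as.name(ID) < 50)
via_sym     <- starwars %>% filter(!!sym(ID) < 50)

identical(hard_coded, via_as_name)  # TRUE
identical(hard_coded, via_sym)      # TRUE

# mutate(): the same pattern creates a per-row logical column
with_flag <- starwars %>% mutate(young = !!as.name(ID) < 50)
identical(with_flag$young, starwars$birth_year < 50)  # TRUE
```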

Other answer

To make it easier, you can also use the .data pronoun, .data[[ID]], as suggested by @Lionel Henry in this comment. See also the rlang 0.4.0 release notes.

library(tidyverse)

ID <- "birth_year"

# Correct output
test <- starwars %>%
  mutate(FootballLeague = case_when(
    !!as.name(ID) < 10 ~ "U10",
    !!as.name(ID) >= 10 & !!as.name(ID) < 50 ~ "U50",
    !!as.name(ID) >= 50 & !!as.name(ID) < 100 ~ "U100",
    !!as.name(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))
test

Using .data

test2 <- starwars %>%
  mutate(FootballLeague = case_when(
    .data[[ID]]   < 10 ~ "U10",
    .data[[ID]]  >= 10 & .data[[ID]]  < 50 ~ "U50",
    .data[[ID]]  >= 50 & .data[[ID]]  < 100 ~ "U100",
    .data[[ID]]  >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))
test2
#> # A tibble: 87 x 15
#>    name               height  mass hair_color    skin_color  eye_color
#>    <chr>               <int> <dbl> <chr>         <chr>       <chr>    
#>  1 Luke Skywalker        172    77 blond         fair        blue     
#>  2 C-3PO                 167    75 <NA>          gold        yellow   
#>  3 R2-D2                  96    32 <NA>          white, blue red      
#>  4 Darth Vader           202   136 none          white       yellow   
#>  5 Leia Organa           150    49 brown         light       brown    
#>  6 Owen Lars             178   120 brown, grey   light       blue     
#>  7 Beru Whitesun lars    165    75 brown         light       blue     
#>  8 R5-D4                  97    32 <NA>          white, red  red      
#>  9 Biggs Darklighter     183    84 black         light       brown    
#> 10 Obi-Wan Kenobi        182    77 auburn, white fair        blue-gray
#> 11 Anakin Skywalker      188    84 blond         fair        blue     
#> 12 Wilhuff Tarkin        180    NA auburn, grey  fair        blue     
#> 13 Chewbacca             228   112 brown         unknown     blue     
#> 14 Han Solo              180    80 brown         fair        brown    
#> 15 Greedo                173    74 <NA>          green       black    
#>    birth_year sex    gender    homeworld species films     vehicles  starships
#>         <dbl> <chr>  <chr>     <chr>     <chr>   <list>    <list>    <list>   
#>  1       19   male   masculine Tatooine  Human   <chr [5]> <chr [2]> <chr [2]>
#>  2      112   none   masculine Tatooine  Droid   <chr [6]> <chr [0]> <chr [0]>
#>  3       33   none   masculine Naboo     Droid   <chr [7]> <chr [0]> <chr [0]>
#>  4       41.9 male   masculine Tatooine  Human   <chr [4]> <chr [0]> <chr [1]>
#>  5       19   female feminine  Alderaan  Human   <chr [5]> <chr [1]> <chr [0]>
#>  6       52   male   masculine Tatooine  Human   <chr [3]> <chr [0]> <chr [0]>
#>  7       47   female feminine  Tatooine  Human   <chr [3]> <chr [0]> <chr [0]>
#>  8       NA   none   masculine Tatooine  Droid   <chr [1]> <chr [0]> <chr [0]>
#>  9       24   male   masculine Tatooine  Human   <chr [1]> <chr [0]> <chr [1]>
#> 10       57   male   masculine Stewjon   Human   <chr [6]> <chr [1]> <chr [5]>
#> 11       41.9 male   masculine Tatooine  Human   <chr [3]> <chr [2]> <chr [3]>
#> 12       64   male   masculine Eriadu    Human   <chr [2]> <chr [0]> <chr [0]>
#> 13      200   male   masculine Kashyyyk  Wookiee <chr [5]> <chr [1]> <chr [2]>
#> 14       29   male   masculine Corellia  Human   <chr [4]> <chr [0]> <chr [2]>
#> 15       44   male   masculine Rodia     Rodian  <chr [1]> <chr [0]> <chr [0]>
#>    FootballLeague
#>    <chr>         
#>  1 U50           
#>  2 Senior        
#>  3 U50           
#>  4 U50           
#>  5 U50           
#>  6 U100          
#>  7 U50           
#>  8 Others        
#>  9 U50           
#> 10 U100          
#> 11 U50           
#> 12 U100          
#> 13 Senior        
#> 14 U50           
#> 15 U50           
#> # ... with 72 more rows

Check if they are the same

identical(test, test2)
#> [1] TRUE

Created on 2020-11-26 by the reprex package (v0.3.0)
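Since the column changes from gene to gene, the .data version can also be wrapped in a function that takes the column name as a string. This is only a sketch: add_league() and its arguments are hypothetical names, and the conditions are shortened by relying on case_when() using the first condition that matches:

```r
library(dplyr)

# Hypothetical helper: `col` is a column name given as a string.
add_league <- function(data, col) {
  data %>%
    mutate(FootballLeague = case_when(
      .data[[col]] < 10   ~ "U10",
      .data[[col]] < 50   ~ "U50",   # reached only if not < 10
      .data[[col]] < 100  ~ "U100",  # reached only if not < 50
      .data[[col]] >= 100 ~ "Senior",
      TRUE                ~ "Others" # catches NA values
    ))
}

by_birth_year <- add_league(starwars, "birth_year")
by_mass       <- add_league(starwars, "mass")

head(by_birth_year$FootballLeague, 3)  # "U50" "Senior" "U50"
```

No !! or as.name() is needed here, because .data[[col]] looks the column up by its string name.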