How do I unquote a character column name for fable::aggregate_key?


I am trying to use the aggregate_key function in the fable package to create a hierarchical time series in a shiny flexdashboard. The code below works fine as long as I can hard-code the column name 'value'.

library(tidyverse)
library(tsibble)
library(fable)
library(fpp2)

agg_key <- cbind.data.frame(visnights, year=1900:1975) %>% 
  pivot_longer(NSWMetro:OTHNoMet, names_to=c("State", "Region"), names_sep=c(3)) %>%
  as_tsibble(index=year, key=c(State, Region)) %>%
  aggregate_key(State / Region, value=sum(value))

The problem arises because I'm using a flexdashboard input to get the column name, so it comes in as the character string "value". I've tried the following to no avail.

#only repeating the last line in the pipe for brevity
aggregate_key(State / Region, value=sum(!!"value"))
aggregate_key(State / Region, value=sum(!!!"value"))
aggregate_key(State / Region, value=sum(as.name("value")))
aggregate_key(State / Region, value=sum(as_label("value")))

Please help me figure out how to pass a character string to this function.

Best answer

The aggregate_key() function has summarise() semantics, so anything that works with non-standard evaluation (NSE) and summarise() should also work here.

To convert a string to a symbol (the name of the column), you can use as.name("value"), as.symbol("value"), or rlang::sym("value"). Your attempts above are very close, and you have all the necessary ingredients for making it work.
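As a quick check of the three conversions named above, the sketch below builds a symbol from a string in each way and compares the results. The variable name col is a hypothetical stand-in for the dashboard input; it is not part of the original question.

```r
library(rlang)

# Hypothetical stand-in for the string supplied by the dashboard input
col <- "value"

# Three ways to turn the string into a symbol
s1 <- as.name(col)
s2 <- as.symbol(col)
s3 <- sym(col)

# All three yield the same symbol object
same_base  <- identical(s1, s2)
same_rlang <- identical(s1, s3)
is_sym     <- is.symbol(s1)
```

Which of the three you pick is a matter of taste. The remaining step is to unquote the symbol with !! inside the call.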

A working solution for this is using value = sum(!!as.name("value")):

library(tidyverse)
library(tsibble)
library(fable)
library(fpp2)

cbind.data.frame(visnights, year=1900:1975) %>% 
  pivot_longer(NSWMetro:OTHNoMet, names_to=c("State", "Region"), names_sep=c(3)) %>%
  as_tsibble(index=year, key=c(State, Region)) %>%
  aggregate_key(State / Region, value=sum(!!as.name("value")))
#> # A tsibble: 2,052 x 4 [1Y]
#> # Key:       State, Region [27]
#>     year State        Region       value
#>    <int> <chr>        <chr>        <dbl>
#>  1  1900 <aggregated> <aggregated>  83.4
#>  2  1901 <aggregated> <aggregated>  64.6
#>  3  1902 <aggregated> <aggregated>  71.3
#>  4  1903 <aggregated> <aggregated>  70.0
#>  5  1904 <aggregated> <aggregated>  86.4
#>  6  1905 <aggregated> <aggregated>  66.4
#>  7  1906 <aggregated> <aggregated>  71.4
#>  8  1907 <aggregated> <aggregated>  67.4
#>  9  1908 <aggregated> <aggregated>  84.6
#> 10  1909 <aggregated> <aggregated>  64.0
#> # … with 2,042 more rows

Created on 2020-09-23 by the reprex package (v0.3.0)

As you mention, when writing this code interactively you would use value = sum(value). To do the same programmatically from a string, first convert "value" to the symbol value with rlang::sym("value") (or one of the alternatives above), then unquote it with !! so that aggregate_key() evaluates sum() on the data column named value rather than on the symbol object itself.
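The same pattern can be wrapped in a function that takes the column name as a string. This is a minimal sketch using dplyr::summarise(), on the basis that aggregate_key() shares its semantics as noted above. The toy data and the helper name sum_by_state are hypothetical, chosen only for illustration.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data, not the visnights data from the question
toy <- tibble(
  State = c("NSW", "NSW", "VIC"),
  value = c(1, 2, 3)
)

# Hypothetical helper: `col` is a character string, e.g. a dashboard input
sum_by_state <- function(data, col) {
  data %>%
    group_by(State) %>%
    # sym() builds the symbol; !! unquotes it so sum() sees the column
    summarise(total = sum(!!sym(col)), .groups = "drop")
}

result <- sum_by_state(toy, "value")
```

In a dashboard, the string would come from the input widget instead of the literal "value". In plain dplyr, the .data[[col]] pronoun is an alternative to !!sym(col); whether it also works inside aggregate_key() is not confirmed here.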

Some further tips on programming with dplyr (and consequently, aggregate_key()) can be found here: https://dplyr.tidyverse.org/articles/programming.html