I am trying to build a GET request one row at a time from a dataframe where the possible parameters are a varied and large list, and the dataframe I'm passing to the function may not have the correctly named column headers.
SAMPLE STARTING DATA
library(tidyverse)
library(httr2)
dfInput <- structure(list(orig_zip = c("17502", "66616", "M1P2T7"),
                          orig_ctry = c("USA", "MEX", "CAN")),
                     row.names = c(NA, 3L), class = "data.frame")
> dfInput
orig_zip orig_ctry
1 17502 USA
2 66616 MEX
3 M1P2T7 CAN
Demonstration Function 1
This first function shows the desired output taken from each row of the dataframe, but here I have it hard-coded and not actually using ... as I would like. No flexibility in what parameters are chosen as this is written, but postalcode = df$orig_zip[i], country = df$orig_ctry[i] is the sort of code I need inside req_url_query():
f1 <- function(df, ...){
  req <- httr2::request("http://some/base/url")
  for (i in 1:nrow(df)){
    req %>%
      req_url_query(postalcode = df$orig_zip[i], country = df$orig_ctry[i]) %>%
      req_dry_run()
  }
}
> f1(dfInput, postalcode = orig_zip, country = orig_ctry)
GET /base/url?postalcode=17502&country=USA HTTP/1.1
***OUTPUT TRUNCATED***
GET /base/url?postalcode=66616&country=MEX HTTP/1.1
***OUTPUT TRUNCATED***
GET /base/url?postalcode=M1P2T7&country=CAN HTTP/1.1
***OUTPUT TRUNCATED***
Demonstration Function 2
This function makes use of ..., but I'm passing the actual values in the function call, and obviously this doesn't proceed through the dataframe.
f2 <- function(df, ...){
  req <- httr2::request("http://some/base/url")
  for (i in 1:nrow(df)){
    req %>%
      req_url_query(...) %>%
      req_dry_run()
  }
}
> f2(dfInput, postalcode = "17502", country = "USA")
GET /base/url?postalcode=17502&country=USA HTTP/1.1
***OUTPUT TRUNCATED***
GET /base/url?postalcode=17502&country=USA HTTP/1.1
***OUTPUT TRUNCATED***
GET /base/url?postalcode=17502&country=USA HTTP/1.1
***OUTPUT TRUNCATED***
Other Efforts
I made several efforts (not all shown) to play around with various rlang quoting functions until I got to what you see below, but I feel that this is getting probably too far afield from some more straightforward procedure.
f3 <- function(df, ...){
  args <- list2(enexprs(...))
  print(args)
  a1 <- lapply(args, \(x) paste0(".data$", x, "[i]"))
  print(a1)
  # req <- httr2::request("http://some/base/url")
  # for (i in 1:nrow(df)){
  #   req %>%
  #     req_url_query(...) %>%
  #     req_dry_run()
  # }
}
> f3(dfInput, postalcode = orig_zip, country = orig_ctry)
[[1]]
[[1]]$postalcode
orig_zip
[[1]]$country
orig_ctry
[[1]]
[1] ".data$orig_zip[i]" ".data$orig_ctry[i]"
I'm still working on this, but after burning a couple hours, I'm not getting it and could use some help. I appreciate any that you could offer. Thank you.
Objectives (in priority order):
- Write a package-quality function(s) that will accept a dataframe and an unspecified number of arguments that can generate GET requests one row at a time. I believe ... is necessary to do this, but am a padawanR, not a JediR, so I'm open-minded on that point.
- More broadly, I am trying to understand rlang, tidyeval, data-masking, quosures, quasiquotation, NSE, etc. I feel some of that is relevant here, but again... padawanR (with an 883 reputation after 9 years ;-) )
- More specifically to this problem as I laid out in my demonstration functions above, I have an odd situation that feels like it should be a somewhat common problem: I need to get list(postalcode = "17502", country = "USA") one row at a time from a dataframe with column names orig_zip and orig_ctry, but next time I use the function, I might need to pass list(address = "1 Infinite Loop", city = "Redmond", state = "TN") from a dataframe with columns orig_addr, orig_city, and orig_state. So, I have a flexible function parameter (postalcode) that has to point to a dataframe's column name (orig_zip) that has to point to one value in that column at a time ("17502"). How do you do that?
The trick is to use the dots to build an intermediate data frame with transmute() that only contains the columns of interest, and then inject its rows, one at a time, with !!!.

Edit: Instead of transmute(), you might prefer to use select(). With the former, you can create new columns with expressions. With the latter, you can use the full syntax of tidyselection to select existing columns.
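
A minimal sketch of that idea follows. The function name build_requests and the choice to return a list of request objects (rather than calling req_dry_run() inside the loop) are assumptions made for illustration, not part of the original answer. It relies on req_url_query() accepting dynamic dots, so a named list can be spliced in with !!!.

```r
library(dplyr)
library(httr2)

# Hypothetical helper name; the base URL is the placeholder from the question.
build_requests <- function(df, ...) {
  # Data-masked dots: `postalcode = orig_zip` yields a column named
  # `postalcode` that holds the values of `orig_zip`.
  params <- dplyr::transmute(df, ...)

  req <- httr2::request("http://some/base/url")

  lapply(seq_len(nrow(params)), function(i) {
    # A one-row data frame converted to a named list of length-1 values.
    row <- as.list(params[i, , drop = FALSE])
    # Splice the named list in as individual query parameters.
    httr2::req_url_query(req, !!!row)
  })
}

dfInput <- data.frame(
  orig_zip  = c("17502", "66616", "M1P2T7"),
  orig_ctry = c("USA", "MEX", "CAN")
)

reqs <- build_requests(dfInput, postalcode = orig_zip, country = orig_ctry)

# Inspect the generated URLs; httr2::req_dry_run(r) could be used here instead.
for (r in reqs) print(r$url)
```

Swapping dplyr::transmute(df, ...) for dplyr::select(df, ...) should give the tidyselect variant mentioned in the edit: select() also accepts new_name = old_name pairs, so the same call with postalcode = orig_zip, country = orig_ctry would still rename the columns to the query parameter names.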