This is my first time posting, so bear with me. I'm using purrr::pmap() to map a function over 3 columns of a tibble(), to create a 4th column.
library(tidyverse)
set.seed(123)
df <- tibble(a = as.character(1:3), b = sample(LETTERS, 3), c = sample(letters, 3))
df
For the sake of argument, using the below mutate() creates the expected outcome of the simplistic example, but my use case is more complex than str_c(...) - it's a trigger for a SQL query based on the content of columns a, b, c (plus some other data transforms).
df %>%
  mutate(str = str_c(a, b, c))
Using an unnamed list in pmap(.l) and ~str_c() generates the expected outcome as below:
df %>%
  mutate(str = pmap_chr(list(a, b, c),
                        ~str_c(..1, ..2, ..3)))
Naming the columns in the .l argument and assigning the function through function() also works as expected:
df %>%
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c),
                        function(list_a, list_b, list_c) str_c(list_a, list_b, list_c)))
But what I can't understand is why the following fails with an error saying that 'list_a' is not found:
df %>%
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c),
                        ~str_c(list_a, list_b, list_c)))
Obviously the issue is simple to resolve by using function() instead of ~, but for my understanding of R I'd like to know why ~ function assignment doesn't seem to respect the names assigned in .l.
We can do this without using pmap.
The idea of using named arguments is that they should match the arguments of the function. Here, str_c takes a variadic argument, i.e. any number of inputs. For that reason, it can be used by specifying the column names one by one, and that is a vectorized option. However, pmap loops over each row, which is not really needed for str_c. Just like with paste, we can use either do.call or invoke (from purrr) to pass variadic arguments. cur_data() returns the dataset, and a data.frame/tibble is a list with elements/columns of equal length.
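A minimal sketch of the do.call approach. It assumes dplyr >= 1.1.0, where pick(everything()) plays the role that cur_data() has in older versions, and it uses fixed hypothetical values for b and c instead of sample() so the result is predictable:

```r
library(dplyr)
library(stringr)

# Hypothetical fixed data (the question draws b and c with sample()).
df <- tibble(a = as.character(1:3),
             b = c("X", "Y", "Z"),
             c = c("p", "q", "r"))

# A tibble is a list of equal-length columns, so do.call() hands each column
# to str_c() as a separate argument. str_c() then runs once over whole
# columns instead of once per row.
res <- df %>%
  mutate(str = do.call(str_c, pick(everything())))

res$str
# "1Xp" "2Yq" "3Zr"
```

With an older dplyr, replacing pick(everything()) with cur_data() should behave the same. invoke(str_c, ...) is the purrr counterpart of do.call; as far as I know, recent purrr releases deprecate it, so do.call is the safer choice here.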
If we want to check what pmap passes to the function, concatenate all the elements and inspect the result: it is a named vector, so individual elements can be accessed by name if needed.
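As far as I understand, purrr converts a ~ formula into a function whose only formals are ..., .x, .y and . so the names given in .l arrive inside ... rather than as variables such as list_a. The sketch below collects ... into a named vector and indexes it by name. The data values and the name args are hypothetical, chosen only for illustration:

```r
library(dplyr)
library(purrr)
library(stringr)

# Hypothetical fixed data (the question draws b and c with sample()).
df <- tibble(a = as.character(1:3),
             b = c("X", "Y", "Z"),
             c = c("p", "q", "r"))

res2 <- df %>%
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c), ~ {
    # c(...) gathers the arguments for the current row into a named
    # character vector: c(list_a = ., list_b = ., list_c = .)
    args <- c(...)
    str_c(args[["list_a"]], args[["list_b"]], args[["list_c"]])
  }))

res2$str
# "1Xp" "2Yq" "3Zr"
```

This keeps the ~ shorthand while still using the names from .l. For anything beyond a quick check, function(list_a, list_b, list_c) is probably easier to read.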