Say you're using the tidyverse to nest() a select group of categorical variables:
library(tidyverse)
library(janitor)
nested_df <- mpg %>%
  select(manufacturer, class) %>%
  gather(variable, value) %>%
  group_by(variable) %>%
  nest()
nested_df
# A tibble: 2 x 2
  variable     data
  <chr>        <list>
1 manufacturer <tibble [234 x 1]>
2 class        <tibble [234 x 1]>
Now we can add a new column which contains the output from janitor::tabyl:
nested_df %>%
  mutate(
    table_output = map(data, ~ tabyl(.$value))
  )
# A tibble: 2 x 3
  variable     data               table_output
  <chr>        <list>             <list>
1 manufacturer <tibble [234 x 1]> <tabyl [15 x 3]>
2 class        <tibble [234 x 1]> <tabyl [7 x 3]>
Questions:
- How can we print or walk through the output to get both the variable name and the table_output?
- Is there a better approach (e.g. using split instead of group_by %>% nest)?
Something like printing the following...
Variable is: manufacturer
Tabyl Output:
.$value n percent
audi 18 0.07692308
chevrolet 19 0.08119658
dodge 37 0.15811966
ford 25 0.10683761
...more rows...
mercury 4 0.01709402
nissan 13 0.05555556
pontiac 5 0.02136752
subaru 14 0.05982906
toyota 34 0.14529915
volkswagen 27 0.11538462
Variable is: class
Tabyl Output:
.$value n percent
2seater 5 0.02136752
compact 47 0.20085470
midsize 41 0.17521368
minivan 11 0.04700855
pickup 33 0.14102564
subcompact 35 0.14957265
suv 62 0.26495726
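One way to get a printout like this is sketched below. It assumes nested_df is built exactly as in the question; the ungroup() step and the intermediate name tables are additions for illustration only.

```r
library(tidyverse)
library(janitor)

# Rebuild the nested data frame from the question
nested_df <- mpg %>%
  select(manufacturer, class) %>%
  gather(variable, value) %>%
  group_by(variable) %>%
  nest()

# Keep only the two columns we want to walk over.
# `tables` is a hypothetical name used for this sketch.
tables <- nested_df %>%
  mutate(table_output = map(data, ~ tabyl(.$value))) %>%
  ungroup() %>%
  select(variable, table_output)

# Walk over both columns in parallel, printing as a side effect:
# .x is the variable name, .y is the matching tabyl
pwalk(tables, ~ {
  cat("Variable is:", .x, "\n")
  cat("Tabyl Output:\n")
  print(.y)
  cat("\n")
})
```

The ungroup() call is there because recent tidyr versions keep the grouping after nest(); it is harmless if the data are already ungrouped.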
We can use pwalk, cat, and print. The input to pwalk is a data frame (a list of equal-length columns) containing only the variable and table_output columns. Like pmap, pwalk steps through the elements of both columns in parallel; inside the anonymous function they are referenced as .x and .y. Unlike pmap, pwalk does not collect the results of the code it runs (it returns its input invisibly), which is useful when we only want the side effects of the code execution.

To print strings, we use cat to avoid the [1] in front. To print the table output, we use print. "\n"s are added to pad blank lines for readability.

Output: