Transforming multivariate data in R to an aggregated table using dplyr and tidyr


I am aggregating and summarizing some multivariate data using dplyr and tidyr. How do I present the data in a table-like form like the one below?

Data set:

year, division, group, count
2016, utensils, forks, 10
2016, utensils, spoons, 5
2016, utensils, knives, 20
2015, utensils, spoons, 4
2015, utensils, knives, 15
2015, utensils, forks, 11
2016, tools, hammer, 10
2016, tools, wrench, 5
2016, tools, awe, 20
2015, tools, hammer, 4
2015, tools, wrench, 15
2015, tools, awe, 11
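
For reproducibility, the data set above can be loaded into a data frame as in the minimal sketch below. The name df is an assumption, chosen to match the name used in the answer's code; strip.white removes the spaces that follow the commas.

```r
# Build the example data frame from the comma-separated rows shown above.
# The object name `df` is an assumption made for illustration.
df <- read.csv(text = "year,division,group,count
2016,utensils,forks,10
2016,utensils,spoons,5
2016,utensils,knives,20
2015,utensils,spoons,4
2015,utensils,knives,15
2015,utensils,forks,11
2016,tools,hammer,10
2016,tools,wrench,5
2016,tools,awe,20
2015,tools,hammer,4
2015,tools,wrench,15
2015,tools,awe,11",
  strip.white = TRUE, stringsAsFactors = FALSE)

str(df)
```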

I would like to present the information like this:

          2016       2015
        Utensils  Utensils

Forks   count      count
Spoons  count      count
Knives  count      count

        2016      2015
        Tools    Tools

Hammer   count   count
Wrench   count   count 
Awe      count   count

Best answer:

This is essentially a reshape problem. First split the data frame by the division column, then use dcast to reshape each subset from long to wide format:

library(reshape2)
lapply(split(df, df$division), function(s) dcast(group ~ year + division, data = s, value.var = "count"))

#$tools
#   group 2015_tools 2016_tools
#1    awe         11         20
#2 hammer          4         10
#3 wrench         15          5

#$utensils
#   group 2015_utensils 2016_utensils
#1  forks            11            10
#2 knives            15            20
#3 spoons             4             5

Alternatively, since each sub data frame contains only one division, the division adds no information to the column names, so you can leave it out of the dcast formula:

lapply(split(df, df$division), function(s) dcast(group ~ year, data = s, value.var = "count"))

#$tools
#   group 2015 2016
#1    awe   11   20
#2 hammer    4   10
#3 wrench   15    5

#$utensils
#   group 2015 2016
#1  forks   11   10
#2 knives   15   20
#3 spoons    4    5
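
Since the question asks about tidyr, the same split-then-reshape idea can probably be written with tidyr's pivot_wider instead of dcast. The following is only a sketch under assumptions: the data frame df is rebuilt here so the example is self-contained, and the name tables is introduced purely for illustration.

```r
library(tidyr)

# Rebuild the example data so this sketch runs on its own.
# `df` and `tables` are illustrative names, not from the original post.
df <- data.frame(
  year     = rep(c(2016, 2015, 2016, 2015), each = 3),
  division = rep(c("utensils", "tools"), each = 6),
  group    = c("forks", "spoons", "knives", "spoons", "knives", "forks",
               "hammer", "wrench", "awe", "hammer", "wrench", "awe"),
  count    = c(10, 5, 20, 4, 15, 11, 10, 5, 20, 4, 15, 11),
  stringsAsFactors = FALSE
)

# Split by division, then spread the years into columns for each subset.
# The division column is dropped first because it is constant within a subset.
tables <- lapply(
  split(df, df$division),
  function(s) pivot_wider(s[, c("group", "year", "count")],
                          names_from = year, values_from = count)
)

tables$utensils
tables$tools
```

The result is a named list with one wide table per division, with one row per group and one column per year, just like the dcast version.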