I have two tables that have the following structures. Table 1, which I will call the Summary Table, is a list of category-values with a count:
| Category | Value | Count |
|---|---|---|
| Cat1 | Val1 | |
| Cat1 | Val2 | |
| Cat1 | Val3 | |
| Cat2 | Val1 | |
| Cat2 | Val2 | |
| Cat3 | Val1 | |
| Cat3 | Val2 |
summary <- data.frame(Category = c('Cat1', 'Cat1', 'Cat1', 'Cat2', 'Cat2', 'Cat3', 'Cat3'),
Value = c('Val1', 'Val2', 'Val3', 'Val1', 'Val2', 'Val1', 'Val2'),
Count = c(NA,NA,NA,NA,NA,NA,NA))
I want to populate this table with counts gathered from Table 2, which we will call Raw Data Table, which has the following structure:
| Entity | Cat1 | Cat2 | Cat3 |
|---|---|---|---|
| Ent1 | Val1 | Val1 | Val2 |
| Ent2 | Val1 | Val1 | Val2 |
| Ent3 | Val2 | Val2 | Val1 |
| Ent4 | Val2 | Val1 | Val2 |
| Ent5 | Val3 | Val1 | Val2 |
| Ent6 | Val3 | Val1 | Val1 |
| Ent7 | Val3 | Val2 | Val2 |
rawdata <- data.frame(Entity = c('Ent1', 'Ent2', 'Ent3', 'Ent4', 'Ent5', 'Ent6', 'Ent7'),
Cat1 = c('Val1', 'Val1', 'Val2', 'Val2', 'Val3', 'Val3', 'Val3'),
Cat2 = c('Val1', 'Val1', 'Val2', 'Val1', 'Val1', 'Val1', 'Val2'),
Cat3 = c('Val2', 'Val2', 'Val1', 'Val2', 'Val2', 'Val1', 'Val2'))
I want to populate the "Count" column from the summary table with the appropriate counts for each category & value pair. Programmatically, what I would want to do would be to have a counter, go through the Raw Data Table and just update the count for each value. I think this would be very inefficient in R. What I thought I would do is filter for the values but because column names are not evaluated as variables, I am at a loss of how to do this.
What I have tried (and I what I think I want something like is):
library(dplyr)
summary$Count <- nrow(rawdata %>% filter(get(summary$Category) == get(summary$Value)))
This isn't working, however. How do I get the filter to take values from another table?
We can reshape to 'long' format with
pivot_longerand usecountto get the frequency count-output
NOTE:
pivot_longerreshapes the data from the 'wide' format to 'long' format. By specifying thecols = -Entity, it is converting the rest of the columns to 'long' format with the column name as "Category" specified bynames_toand the corresponding values as "Value" (values_to)Or using
base Rwithtable