I am currently working with data in R. My dataset looks like this (except with about 3 million observations):
This is the current structure of the data:
And I have two objectives...
Objective 1 is to structure it so that it looks like this:
then my second objective is to go back to the original structure and make it look like this:
I have tried variations of aggregate with dcast (which is apparently being deprecated...?)
so, for example, I have tried this:
df2 <- dcast(df1, Store + Sales~ Year, value.var = "Sales")
or even
df %>%
group_by(Store, Year) %>%
summarise(across(starts_with('Sales'), sum))
And I get a diagonal of sales totals across years, but then I'm unable to summarize them so that it looks like
Store Year1 Year2
A $$ $$
B $$ $$
Since there is so much data, it looks like a bunch of stacked identity matrices ...except, instead of 1's there are sales values for the years (there are many many years, not just two).
I am looking for suggestions on how to proceed. One package I found was 'pivottabler' and I have not used it yet, I wanted to see if anyone had any better suggestions first.
:::Much appreciated:::