I have table in SAS Enterprise Guide like below:
ID | COL1 | VAL1 |
----|------|------|
111 | A | 10 |
111 | A | 5 |
111 | B | 10 |
222 | B | 20 |
333 | C | 25 |
... | ... | ... |
And I need to aggregate above table to know:
- sum of values from COL1 per ID
- sum of values from VAL1 per COL1 per ID
So, as a result I need something like below:
ID | COL1_A | COL1_B | COL1_C | COL1_A_VAL1_SUM | COL1_B_VAL1_SUM | COL1_C_VAL1_SUM
----|--------|--------|---------|-----------------|-----------------|------------------
111 | 2 | 1 | 0 | 15 | 10 | 0
222 | 0 | 1 | 0 | 0 | 20 | 0
333 | 0 | 0 | 1 | 0 | 0 | 25
for example because:
- COL1_A = 2 for ID 111, because ID=111 has 2 times "A" in COL1
- COL1_A_VAL1_SUM = 15 for ID 111, because ID=111 has 10+5=15 in VAL1 for "A" in COL1
How can I do that in SAS Enterpriuse Guide or in PROC SQL ?
First, we'll create the counts that we need by group with SQL:
This produces the following table:
Now we need to transpose this into the way we want it. We'll do this with two transpose steps: one for
count_col1
, and one forsum_val1
.proc transpose
has a few handy options to make this easy, namely theid
,prefix
, andsuffix
options.First, we'll consider our ID variable
col1
. This creates columns namedA
,B
, andC
. For example:The
prefix
andsuffix
options let us add a prefix and suffix to these names.This gives us two tables:
Now we just need to merge them together, then set all missing values to 0 by iterating over each numeric column and checking if it's missing.
Final table: