I have a table with multiple encounters. Each encounter can have multiple encounter groups, and there can be 1 or more activities within the encounter group.
| encounter_id | encounter_group_id | activity |
|---|---|---|
| 100 | 1000 | check in |
| 100 | 1000 | process |
| 100 | 1000 | check out |
| 100 | 1001 | check in |
| 100 | 1001 | check out |
| 100 | 1002 | check in |
| 100 | 1002 | transform |
| 100 | 1002 | process |
| 100 | 1002 | load |
| 100 | 1002 | check out |
| 100 | 1003 | check in |
| 100 | 1003 | terminate |
I'm trying to use the LAG function to get this result
| encounter_id | encounter_group_id | prev_group_id | activity |
|---|---|---|---|
| 100 | 1000 | NULL | check in |
| 100 | 1000 | NULL | process |
| 100 | 1000 | NULL | check out |
| 100 | 1001 | 1000 | check in |
| 100 | 1001 | 1000 | check out |
| 100 | 1002 | 1001 | check in |
| 100 | 1002 | 1002 | transform |
| 100 | 1002 | 1001 | process |
| 100 | 1002 | 1001 | load |
| 100 | 1002 | 1001 | check out |
| 100 | 1003 | 1002 | check in |
| 100 | 1003 | 1002 | terminate |
I'm using this statement, but in each case the lag only gets the encounter group from the previous row.
select encounter_id,
encouter_group_id,
LAG(encounter_group_id) OVER (ORDER BY encntr_group_id),
activity
from encounter_activity
The results set looks like this instead:
| encounter_id | encounter_group_id | prev_group_id | activity |
|---|---|---|---|
| 100 | 1000 | NULL | check in |
| 100 | 1000 | 1000 | process |
| 100 | 1000 | 1000 | check out |
| 100 | 1001 | 1000 | check in |
| 100 | 1001 | 1001 | check out |
| 100 | 1002 | 1001 | check in |
| 100 | 1002 | 1002 | transform |
| 100 | 1002 | 1002 | process |
| 100 | 1002 | 1002 | load |
| 100 | 1002 | 1002 | check out |
| 100 | 1003 | 1002 | check in |
| 100 | 1003 | 1003 | terminate |
Hopefully I'm just missing a minor tweak to be able to show the previous group correctly
Since LAG takes a second argument, you can calculate it in a subquery as the dense_rank() on the encounter_group_id partition: