While defining a variable in BigQuery, how to compare value in a row with the value obtained above (while variable creation is still in process)?
I have a table with ID, start and end dates. I want to create a MaxDate variable which will compare (from 2nd row onwards) if the start date of each row is <= to "lag" end date, and if the condition is true, then MaxDate should be the maximum of the EndDate and the "previous" EndDate immediately above.
| ID | StartDate | EndDate | MaxDate |
|---|---|---|---|
| A | 2019-10-25 | 2019-10-31 | 2019-10-31 |
| A | 2019-10-26 | 2019-10-26 | 2019-10-31 |
| A | 2019-10-28 | 2019-10-30 | 2019-10-31 |
| A | 2019-10-29 | 2019-10-29 | 2019-10-31 |
The below approach is failing at the second lag row:
WITH
S1 AS
(
SELECT ID, 1 AS COUNT
FROM S0
ORDER BY ID, StartDate, EndDate
)
,S2 AS
(
SELECT *,
CASE WHEN StartDate < LAG(EndDate) OVER (PARTITION BY ID ORDER BY StartDate, EndDate)
THEN
MAX(EndDate) OVER (PARTITION BY ID ORDER BY StartDate, EndDate
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ELSE EndDate
END AS EndDate2
FROM S1
)
,S3 AS
(SELECT *,
CASE WHEN StartDate > LAG(EndDate2) OVER (PARTITION BY ID ORDER BY StartDate, EndDate) + 1
THEN
MAX(COUNT) OVER (PARTITION BY ID ORDER BY StartDate, EndDate
ROWS 1 PRECEDING) + 1
ELSE 1
END AS Visit_Count
FROM S2
ORDER BY ID, StartDate, EndDate
)
(
SELECT ID, Visit_Count, MIN(StartDate) AS StartDate, MAX(EndDate) AS EndDate
FROM S3
GROUP BY ID, Visit_Count
ORDER BY ID, Visit_Count
)
;
Thanks!
Figured out: