I am trying to write a SQL query in Presto for a cumulative delivery forecast. However, there aren't always deliveries every quarter, so for a quarter with no deliveries, there is no row at all for that quarter's data. I would like to use the previous quarter's data when this happens.
I have tried creating a CTE with all the quarters I want populated and using some window functioning to fill in the missing data. I was trying to use a CASE statement for if null, then LAG(quarter - 1) but the CASE statement in general seemed not to work.
with forecasts as (
select quarter, type, location, sum(total)
from delivery_db
group by 1,2,3
)
quarters AS (
SELECT
*
FROM (
VALUES
('2024-Q1'),
('2024-Q2'),
('2024-Q3'),
('2024-Q4')
) v(quarter)
)
SELECT
q.quarter,
f.type,
f.location,
--CASE(WHEN f.quarter is not null then SUM(total) else LAG(f.quarter, 1))
SUM(total) OVER(
PARTITION BY
f.type
ORDER BY
q.quarter
) AS total
FROM quarters q
FULL JOIN forecasts f
on f.quarter = q.quarter
DeliveryDB looks like:
| Quarter | Type | Location | Total |
|---|---|---|---|
| 2024-Q1 | TypeA | TBD | 10 |
| 2024-Q1 | TypeA | TBD | 4 |
| 2024-Q4 | TypeA | TBD | 5 |
Actual
| Quarter | Type | Location | Total |
|---|---|---|---|
| 2024-Q1 | TypeA | TBD | 14 |
| 2024-Q2 | null | null | null |
| 2024-Q3 | null | null | null |
| 2024-Q4 | TypeA | TBD | 19 |
Desired
| Quarter | Type | Location | Total |
|---|---|---|---|
| 2024-Q1 | TypeA | TBD | 14 |
| 2024-Q2 | TypeA | TBD | 14 |
| 2024-Q3 | TypeA | TBD | 14 |
| 2024-Q4 | TypeA | TBD | 19 |
See example.
Output
With test data
Demo