How to get adjacent value in an OVER() window

129 Views Asked by At

I have the following data and query to get the season with the MAX(wins) up to the current season:

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins
) SELECT team, season, wins
    ,MAX(wins) OVER (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) max_wins_thus_far
FROM results;

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, 6
DET, 2010, 6, 6
DET, 2011, 10, 10
DET, 2012, 4, 10
DET, 2013, 7, 10
DET, 2014, 11, 11

Here we can see, for example, that for DET, 2011 has the max wins of 10, and so the "max_wins" column is 10 from 2011 until 2014, when it takes on the greater value of 11. However, I'd like to pull in the season with the highest win total up through that point. For example, here is how the results would look:

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011 <-- 2011 has the most wins for DET
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014 <-- now 2014 is the season with the most wins...

How could this be done in an analytic function? The best I've been able to is to build an object with the data, but not sure where to go from there:

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, {"2010": 6}
DET, 2010, 6, {"2010": 6}
DET, 2011, 10, {"2010": 6, "2011": 10}
DET, 2012, 4, {"2010": 6, "2011": 10, "2012": 4}
DET, 2013, 7, {"2010": 6, "2011": 10, "2012": 4, "2013": 7}
DET, 2014, 11, {"2010": 6, "2011": 10, "2012": 4, "2013": 7, "2014": 11}
3

There are 3 best solutions below

4
On BEST ANSWER

You can use a second level of window functions. Just grab the most recent season where the wins is the maximum wins:

SELECT r.*,
       MAX(CASE WHEN wins = max_wins_thus_far THEN season END) OVER (PARTITION BY team ORDER BY season) as max_season
FROM (SELECT team, season, wins,
             MAX(wins) OVER (PARTITION BY team ORDER BY season) as max_wins_thus_far
      FROM results
     ) r;

Here is a db<>fiddle.

1
On

We can use some gaps-and-islands technique: the idea is to build groups of "adjacent" records with a window sum that increments every time a win is met that is greater than all preceding values. We can then use a window min() to recover the corresponding season (basically, that is the start of each island).

select team, season, wins, 
    greatest(wins, max_wins_1) max_wins_thus_far,
    min(season) over(partition by team, grp order by season) as season_with_max_wins_thus_far
from (
    select r.*,
        sum(case when wins > max_wins_1 then 1 else 0 end) 
            over(partition by team order by season) as grp
    from (
        select r.*,
            max(wins) over (
                partition by team 
                order by season 
                rows between unbounded preceding and 1 preceding
            ) as max_wins_1
        from results r
    ) r
) r

An alternative is a correalted subquery:

select team, season, wins, 
    max(wins) over(partition by team order by season) as max_wins_thus_far,
    (
        select r1.season
        from results r1 
        where r1.team = r.team and r1.season <= r.season
        order by r1.wins desc, r1.season
        limit 1
    ) as season_with_max_wins_thus_far
from results r

Demo on DB Fiddlde - both queries yield:

team | season | wins | max_wins_thus_far | season_with_max_wins_thus_far
:--- | -----: | ---: | ----------------: | ----------------------------:
DAL  |   2010 |    6 |                 6 |                          2010
DET  |   2010 |    6 |                 6 |                          2010
DET  |   2011 |   10 |                10 |                          2011
DET  |   2012 |    4 |                10 |                          2011
DET  |   2013 |    7 |                10 |                          2011
DET  |   2014 |   11 |                11 |                          2014
1
On

This is certainly the hack-iest approach possible, but given that both the season and wins are numeric, we can add them together and get the max of them (something like 2024 by adding 2010 and 14 together) and then retrieve the season by subtracting the max_wins up to that point. Here's an example:

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins
) 
SELECT team, season, wins,
    max(wins) OVER through_current AS max_wins_thus_far
   ,max(wins + season) OVER through_current - max(wins) OVER through_current AS season_with_max_wins_thus_far
FROM results
WINDOW through_current AS (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014

Another approach would be to do a correlated subquery filtering by season <= current_season and team = team. For example:

) SELECT *,
    (SELECT season FROM results AS r_inner
     WHERE r_inner.season <= results.season AND r_inner.team = results.team
     ORDER BY WINS DESC LIMIT 1) best_season
 FROM results;