SQL to calculate cumulative sum that resets based on previous value in a column in Hive

Question

2.5k Views Asked by yabtzey At 29 July 2025 at 12:09

I am trying to compute a cumulative value like this:

KEY1    Date_    VAL1    CUMU_VAL2
K1      D1      1       0
K1      D2      1       1
K1      D3      0       2
K1      D4      1       0
K1      D5      1       1

The idea is to keep incrementing CUMU_VAL2 by 1 from one row to the next, but to reset it to 0 whenever the previous row's VAL1 is zero. In Excel, with VAL1 in column C and CUMU_VAL2 in column D, the formula for cell D3 would be:

  D3 = IF(C2>0, D2+1, 0)

I believe I should be able to do something like this, but how do I add a CASE WHEN so that the sum resets when the previous value is zero?

SELECT
    a1.*,
    SUM(a1.VAL1) OVER (PARTITION BY a1.KEY1 ORDER BY a1.Date_) AS CUMU_VAL2
FROM source_table a1
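
To pin down the expected behaviour independently of any SQL dialect, the Excel recurrence can be written as a plain loop. This is only a sketch of the rule as stated in the question: the function name `cumulative_with_reset` is hypothetical, and starting the first row at 0 is an assumption taken from the sample table.

```python
def cumulative_with_reset(values):
    """Apply the rule D3 = IF(C2>0, D2+1, 0) to a list of VAL1 values."""
    result = []
    prev_val = None  # VAL1 of the previous row (None for the first row)
    prev_cum = 0     # CUMU_VAL2 of the previous row
    for val in values:
        if prev_val is not None and prev_val > 0:
            cum = prev_cum + 1
        else:
            # First row (assumed to start at 0), or the previous VAL1 was not positive.
            cum = 0
        result.append(cum)
        prev_val, prev_cum = val, cum
    return result


# VAL1 column from the sample table, in Date_ order.
print(cumulative_with_reset([1, 1, 0, 1, 1]))  # [0, 1, 2, 0, 1]
```

Any SQL solution can be compared against this output for the five sample rows.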

Answers

Gordon Linoff On 24 September 2018 at 21:38

You can assign a group -- which is the sum of 0s after a given row. Then use count():

select t.KEY1, t.Date_, t.VAL1,
       count(*) over (partition by key1, grp, (case when val1 = 0 then 0 else 1 end)
                      order by date_
                     ) as cume_val1
from (select t.*,
             sum(case when t.val1 = 0 then 1 else 0 end)
                 over (partition by key1
                       order by date_
                       rows between 1 following and unbounded following
                      ) as grp
      from source_table t
     ) t;

If val1 only takes on the values 0 and 1, then use row_number() instead of count().

MatBailie (Accepted Answer)

My amendment to @GordonLinoff's answer as the OP didn't quite understand what I meant.

SELECT
  t.KEY1, t.Date_, t.VAL1,
  ROW_NUMBER() OVER (PARTITION BY key1, grp
                         ORDER BY Date_
                    )
                    - 1
                         AS CUMU_VAL2
FROM
(
  SELECT
    *,
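    COALESCE(
      SUM(
        CASE WHEN val1 = 0 THEN 1 ELSE 0 END
      )
      OVER (
        PARTITION BY key1
            ORDER BY date_
        -- count zeros strictly before this row, so the reset lands on the row after a zero
        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
      ),
      0  -- the first row has an empty frame, so SUM is NULL there
    )
      AS grp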
  FROM
    source_table
)
  t;
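
One way to sanity-check a grouping query of this kind is to run it against the sample rows from the question. The sketch below does that with Python's built-in `sqlite3` module, using an in-memory SQLite database as a stand-in for Hive (an assumption: the dialects differ, and window functions need SQLite 3.25 or later). The variant shown counts zeros strictly before the current row and wraps the sum in `COALESCE`; the helper name `run_query` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (KEY1, Date_, VAL1).
ROWS = [
    ("K1", "D1", 1),
    ("K1", "D2", 1),
    ("K1", "D3", 0),
    ("K1", "D4", 1),
    ("K1", "D5", 1),
]

QUERY = """
SELECT t.KEY1, t.Date_, t.VAL1,
       ROW_NUMBER() OVER (PARTITION BY t.KEY1, t.grp ORDER BY t.Date_) - 1 AS CUMU_VAL2
FROM (
  SELECT s.*,
         -- zeros seen strictly before this row; COALESCE covers the first row,
         -- whose frame is empty
         COALESCE(
           SUM(CASE WHEN s.VAL1 = 0 THEN 1 ELSE 0 END) OVER (
             PARTITION BY s.KEY1
             ORDER BY s.Date_
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ),
           0
         ) AS grp
  FROM source_table s
) t
ORDER BY t.KEY1, t.Date_
"""


def run_query(rows):
    """Load rows into an in-memory table and return the query result as a list of tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE source_table (KEY1 TEXT, Date_ TEXT, VAL1 INTEGER)")
        conn.executemany("INSERT INTO source_table VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query(ROWS):
        print(row)
```

For the five sample rows the last column comes out as 0, 1, 2, 0, 1, which is the CUMU_VAL2 column the question asks for. Behaviour on Hive itself should still be confirmed on a real cluster.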
