Potential Use of DENSE_RANK() in SQL

64 Views Asked by At

In SQL I'm looking for a potential function which I can use to apply a row number each time a certain criteria is met.

Each time the Unique ID is the same (i.e. same person) and the Dep_Group is WES then give that row the same number if they follow on from one another. What I'm trying to do is eventually squish those rows into one, hopefully by adding in a GROUP BY New_Col clause to the end. So in essence Line 1 & 2 become one line, also Line 5 & 6.

|Line| Uniq_ID | Start_DT      | End_DT        | Sequence | Department | Dep_Group | New_Col|
| 1  |   1     | 2023-01-02    | 2023-01-03    |    1     |     WES    |    WES    |     1  |
| 2  |   1     | 2023-01-02    | 2023-01-03    |    2     |     WES    |    WES    |     1  |
| 3  |   1     | 2023-01-02    | 2023-01-03    |    3     |     DEM    |    NULL   |     2  |
| 4  |   2     | 2023-01-02    | 2023-01-03    |    1     |     WES    |    WES    |     3  |
| 5  |   3     | 2023-01-02    | 2023-01-03    |    1     |     WES    |    WES    |     4  |
| 6  |   3     | 2023-01-02    | 2023-01-03    |    2     |     WES    |    WES    |     4  |
| 7  |   4     | 2023-01-02    | 2023-01-03    |    1     |     DEM    |    NULL   |     5  |

Example of the code I have so far which is not quite working as I wanted with DENSE_RANK():

SELECT Line
       ,Uniq_ID
       ,Start_DT
       ,End_DT
       ,Sequence
       ,Department
       ,Dep_Group
       ,DENSE_RANK() OVER (PARTITION BY Uniq_ID, Dep_Group ORDER BY Sequence) AS New_Col

FROM TblA
2

There are 2 best solutions below

1
On BEST ANSWER

It is fairly straight forward: use running sum which increments if not (previous group = wes and current group = wes):

with cte1 as (
  select *
       , case when lag(dep_group) over (partition by uniq_id order by sequence) = 'WES' and dep_group = 'WES' then 0 else 1 end as incr
  from t
), cte2 as (
  select *
       , sum(incr) over (partition by uniq_id order by sequence) as grp_num
   from cte1
)
select *
from cte2

This gives you the following result:

line uniq_id start_dt end_dt sequence department dep_group incr grp_num
1 1 2023-01-02 2023-01-03 1 WES WES 1 1
2 1 2023-01-02 2023-01-03 2 WES WES 0 1
3 1 2023-01-02 2023-01-03 3 DEM null 1 2
4 2 2023-01-02 2023-01-03 1 WES WES 1 1
5 3 2023-01-02 2023-01-03 1 WES WES 1 1
6 3 2023-01-02 2023-01-03 2 WES WES 0 1
7 4 2023-01-02 2023-01-03 1 DEM null 1 1

The grp_num column can be used to group rows along with uniq_id.

1
On
SELECT Line,
       Uniq_ID,
       Start_DT,
       End_DT,
       Sequence,
       Department,
       Dep_Group,
       SUM(Flag) OVER (PARTITION BY Uniq_ID ORDER BY Sequence) AS New_Col
FROM (
  SELECT *,
         IIF(LAG(Dep_Group) OVER (PARTITION BY Uniq_ID ORDER BY Sequence) = 'WES' AND Dep_Group = 'WES', 0, 1) AS Flag
  FROM TblA
) AS subquery;