Missing dates on Min() while using Over and Partition by in T-SQL

167 Views Asked by At

I am using OVER, and Partition by to get the mindate and max date of dataset.

|ResdetId | bookingdate | Amount | AmountExcl |
-----------------------------------------------
|120106   | 2018-02-04  |  75.00 |  70.7547   |
|120106   | 2018-02-05  |  75.00 |  70.7547   |
|120106   | 2018-02-06  |  90.00 |  84.9057   |
|120106   | 2018-02-08  |  75.00 |  70.7547   |
|120106   | 2018-02-09  |  75.00 |  70.7547   |

I am using this query

select distinct ResDetId, Amount, AmountExcl, 
    min(Bookingdate) OVER(Partition by ResDetId, Amount, AmountExcl) as Mindate,
    max(Bookingdate) OVER(Partition by ResDetId, Amount, AmountExcl) as MaxDate
 from @Cumulatedbookingdetails

And I am getting this result

|ResdetId | Amount | AmountExcl | MinDate    | MaxDate     |
------------------------------------------------------------
|120106   | 75.00  |  70.7547   | 2018-02-04 |  2018-02-09 |
|120106   | 90.00  |  84.9057   | 2018-02-06 |  2018-02-06 |

As we see date 2018-02-07 record is missing from the data set. So, I need result like this

|ResdetId | Amount | AmountExcl | MinDate    | MaxDate     |
------------------------------------------------------------
|120106   | 75.00  |  70.7547   | 2018-02-04 |  2018-02-05 |
|120106   | 75.00  |  70.7547   | 2018-02-08 |  2018-02-09 |
|120106   | 90.00  |  84.9057   | 2018-02-06 |  2018-02-06 |
4

There are 4 best solutions below

0
On BEST ANSWER

One way to approach an "Islands and Gaps" problem, such as this, is to use a recursive CTE to build up the islands. We make the non-recursive portion (above the union) find the row which marks the start of each island, and the recursive part grows each island one match at a time.

The final results of the CTE unfortunately contain all of the intermediate rows used in building the islands, so you need a final GROUP by to select the final island out:

declare @t table (ResdetId int, bookingdate date, Amount decimal(9,3), AmountExcl decimal (9,3))
insert into @t(ResdetId,bookingdate,Amount,AmountExcl) values
(120106,'20180204',75.00,70.7547),
(120106,'20180205',75.00,70.7547),
(120106,'20180206',90.00,84.9057),
(120106,'20180208',75.00,70.7547),
(120106,'20180209',75.00,70.7547)

;With Islands as (
    select ResdetId, Amount, AmountExcl,bookingdate as MinDate,bookingDate as MaxDate
    from @t t
    where not exists (select * from @t t2
        where t2.ResdetId = t.ResdetId
        and t2.Amount = t.Amount
        and t2.AmountExcl = t.AmountExcl
        and t2.bookingdate = DATEADD(day,-1,t.BookingDate))
    union all
    select i.ResdetId, i.Amount,i.AmountExcl,i.MinDate,t.bookingDate
    from Islands i
        inner join
        @t t
        on t.ResdetId = i.ResdetId
        and t.Amount = i.Amount
        and t.AmountExcl = i.AmountExcl
        and t.bookingdate = DATEADD(day,1,i.MaxDate)
)
select
    ResdetId, Amount, AmountExcl,MinDate,MAX(MaxDate) as MaxDate
from
    Islands
group by ResdetId, Amount, AmountExcl,MinDate

Results:

ResdetId    Amount    AmountExcl   MinDate    MaxDate
----------- --------- ------------ ---------- ----------
120106      75.000    70.755       2018-02-04 2018-02-05
120106      75.000    70.755       2018-02-08 2018-02-09
120106      90.000    84.906       2018-02-06 2018-02-06
0
On

You didn't see 2018-02-07 because the bookingdate is not in your Partition so

|ResdetId | Amount | AmountExcl 
--------------------------------
|120106   | 75.00  |  70.7547   
|120106   | 90.00  |  84.9057   

are unique by your Partition. So it is like a Key. You need another attribute to differentiate the same data:

|ResdetId | Amount | AmountExcl 
--------------------------------
|120106   | 75.00  |  70.7547 
1
On

Try this, it uses row numbers differnce technique:

declare @tbl table(ResdetId int, bookingdate date, Amount float, AmountExcl float);
insert into @tbl values
(120106   , '2018-02-04'  ,  75.00 ,  70.7547   ),
(120106   , '2018-02-05'  ,  75.00 ,  70.7547   ),
(120106   , '2018-02-06'  ,  90.00 ,  84.9057   ),
(120106   , '2018-02-08'  ,  75.00 ,  70.7547   ),
(120106   , '2018-02-09'  ,  75.00 ,  70.7547   );

select MIN(bookingDate), MAX(bookingDate), Amount, AmountExcl
from (
    select *,
           ROW_NUMBER() over (order by bookingDate) -
           ROW_NUMBER() over (partition by amount, AmountExcl order by bookingDate) rn
    from @tbl
) a group by Amount, AmountExcl, rn
2
On

This would be much easier to do with GROUP BY. OVER and DISTINCT are much "harder" ways to do the same query:

WITH VTE AS(
    SELECT ResdetId,
           CONVERT(date,bookingdate) AS bookingdate,
           Amount,
           AmountExcl
    FROM (VALUES (120106,'20180204',75.00,70.7547),
                 (120106,'20180205',75.00,70.7547),
                 (120106,'20180206',90.00,84.9057),
                 (120106,'20180208',75.00,70.7547),
                 (120106,'20180209',75.00,70.7547)) V(ResdetId,bookingdate,Amount,AmountExcl))
SELECT ResdetId,Amount,AmountExcl,
       MIN(bookingdate) AS MinBookingDate,
       MAX(bookingdate) AS MaxBookingDate
FROM VTE
GROUP BY ResdetId,Amount,AmountExcl;

As noted my Sami, I had read the results the wrong way round, this is a Gaps and Island question:

WITH VTE AS(
    SELECT ResdetId,
           CONVERT(date,bookingdate) AS bookingdate,
           Amount,
           AmountExcl
    FROM (VALUES (120106,'20180204',75.00,70.7547),
                 (120106,'20180205',75.00,70.7547),
                 (120106,'20180206',90.00,84.9057),
                 (120106,'20180208',75.00,70.7547),
                 (120106,'20180209',75.00,70.7547)) V(ResdetId,bookingdate,Amount,AmountExcl)),
Grps AS(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ResdetId ORDER BY V.bookingdate) - 
           ROW_NUMBER() OVER (PARTITION BY ResdetId, Amount ORDER BY V.bookingdate) AS Grp
    FROM VTE V)
SELECT ResdetId,
       Amount,
       AmountExcl,
       MIN(bookingdate) AS MinBookingDate,
       MAX(bookingdate) AS MaxBookingDate
FROM Grps
GROUP BY ResdetId,
         Amount,
         AmountExcl,
         Grp
ORDER BY ResdetId,
         Amount,
         MinBookingDate;