SQL Server - Counting number of times an attribute in a dataset changes (non-concurrently)


I have a query that returns either a 1 or 0 based on whether or not an event occurred on a given date. This is ordered by date. Basically, a simple result set is:

Date         |  Type
---------------------
2010-09-27      1
2010-10-11      1
2010-11-29      0
2010-12-06      0
2010-12-13      1
2010-12-15      0
2010-12-17      0
2011-01-03      1
2011-01-04      0

What I would now like to do is count the number of separate, non-adjacent groups of 0s, i.e. how many distinct runs of consecutive 0s appear.

In the above instance, the answer should be 3 (1 group of 2, then another group of 2, then finally 1 to end with).

Hopefully, the above example illustrates what I am trying to get at. I have been searching for a while, but am finding it difficult to succinctly describe what I am looking for, and hence haven't found anything of relevance.

Thanks in advance,

Josh

3 Answers

Best answer

This is a variant of the "islands" problem. My first answer uses Itzik Ben-Gan's double ROW_NUMBER trick to identify contiguous groups of data efficiently. The combination of (Type, Grp) identifies each individual island in the data.

You can read more about the different approaches to tackling this problem here.

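;WITH T AS (
    SELECT  *,
            -- Position among all rows minus position among rows of the same Type:
            -- the difference stays constant within each contiguous run ("island").
            ROW_NUMBER() OVER(ORDER BY Date) -
            ROW_NUMBER() OVER(PARTITION BY Type ORDER BY Date)  AS Grp
    FROM    YourTable
)
-- Each distinct Grp among the Type = 0 rows is one island of 0s.
SELECT  COUNT(DISTINCT Grp)
FROM    T
WHERE   Type = 0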
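
To see why the difference of the two row numbers is constant within each island, here is a small Python sketch (not T-SQL, and not part of the original answer) that repeats the calculation on the sample data from the question. The names `rows` and `island_groups` are made up for illustration.

```python
from collections import defaultdict

# Sample data from the question: (Date, Type).
rows = [
    ("2010-09-27", 1), ("2010-10-11", 1), ("2010-11-29", 0),
    ("2010-12-06", 0), ("2010-12-13", 1), ("2010-12-15", 0),
    ("2010-12-17", 0), ("2011-01-03", 1), ("2011-01-04", 0),
]

def island_groups(rows):
    """Return (date, type, grp) tuples, mimicking the two ROW_NUMBER() calls."""
    per_type = defaultdict(int)  # running row number within each Type partition
    out = []
    # ISO-formatted date strings sort chronologically, like ORDER BY Date.
    for rn_all, (date, typ) in enumerate(sorted(rows), start=1):
        per_type[typ] += 1
        out.append((date, typ, rn_all - per_type[typ]))
    return out

labelled = island_groups(rows)
# Equivalent of COUNT(DISTINCT Grp) ... WHERE Type = 0
zero_islands = {grp for _, typ, grp in labelled if typ == 0}
print(len(zero_islands))  # prints 3
```

The Grp values come out as 0, 0, 2, 2, 2, 3, 3, 4, 4. The value 2 appears for both a run of 0s and the single 1 that follows it, which is why Grp only identifies an island in combination with Type.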

My second answer requires only a single pass through the data. It relies on a variable being updated row by row, in ORDER BY order, inside a SELECT, and that behavior is not guaranteed. It is, however, the same principle as a technique that many people use to concatenate strings without problems.

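DECLARE @Count int = 0

-- The magnitude of @Count is the number of groups of 0s seen so far;
-- it is positive while inside a group of 0s and negative (or zero) otherwise.
SELECT @Count = CASE WHEN Type = 0 AND @Count <= 0 THEN -@Count + 1  -- a new group starts
                     WHEN Type = 1 AND @Count > 0  THEN -@Count      -- the current group ends
                     ELSE @Count END
FROM YourTable
ORDER BY Date

SELECT ABS(@Count)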
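
The sign trick in the CASE expression can be easier to follow as an ordinary loop. The following is a Python sketch of the same state update, assuming the rows arrive in Date order; `count_zero_runs` is a hypothetical name, not something from the answer.

```python
def count_zero_runs(types):
    """Count runs of consecutive 0s using the same signed counter as the CASE expression."""
    count = 0
    for t in types:
        if t == 0 and count <= 0:
            count = -count + 1   # a new run of 0s starts: bump the total, make it positive
        elif t == 1 and count > 0:
            count = -count       # the run ended: keep the total, flip it negative
        # otherwise the counter is left unchanged
    return abs(count)

# Type column from the question, in Date order.
types = [1, 1, 0, 0, 1, 0, 0, 1, 0]
print(count_zero_runs(types))  # prints 3
```

On the sample data the counter takes the values 0, 0, 1, 1, -1, 2, 2, -2, 3, so the absolute value at the end is the number of groups.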

Answer

You could give each row a number in a CTE. Then you can join the table to itself to find the previous row. Knowing the previous row, you can count the number of times the previous row is 1 and the current row is 0. For example:

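; with NumberedRows as
        (
        select  row_number() over (order by date) as rn
        ,       type
        from    YourTable
        )
-- A group of 0s starts wherever the current row is 0 and the previous row is 1;
-- the first row has no predecessor, so IsNull treats it as if preceded by a 1.
select  sum(case when cur.type = 0 and IsNull(prev.type,1) = 1 then 1 else 0 end)
from    NumberedRows cur
left join    
        NumberedRows prev
on      cur.rn = prev.rn + 1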
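
As a cross-check of the join logic, here is a minimal Python sketch that counts the rows where a group of 0s begins. It assumes the values are already in date order, and `count_run_starts` is an illustrative name only.

```python
def count_run_starts(types):
    """Count positions where the current value is 0 and the previous value is 1."""
    prev = 1  # mirrors IsNull(prev.type, 1): the first row counts as preceded by a 1
    starts = 0
    for cur in types:
        if cur == 0 and prev == 1:
            starts += 1
        prev = cur
    return starts

# Type column from the question, in date order.
types = [1, 1, 0, 0, 1, 0, 0, 1, 0]
print(count_run_starts(types))  # prints 3
```

Defaulting the missing predecessor to 1 matters when the data begins with a 0: that leading group would otherwise go uncounted.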

Answer

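Have a look at this example, using SQL Server 2005+. The query returns the last row of each group of 0s, so the number of rows returned is the number of groups (3 for this data).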

DECLARE @Table TABLE(
        Date DATETIME,
        Type INT
)

INSERT INTO @Table SELECT '2010-09-27',1
INSERT INTO @Table SELECT '2010-10-11',1
INSERT INTO @Table SELECT '2010-11-29',0
INSERT INTO @Table SELECT '2010-12-06',0
INSERT INTO @Table SELECT '2010-12-13',1
INSERT INTO @Table SELECT '2010-12-15',0
INSERT INTO @Table SELECT '2010-12-17',0
INSERT INTO @Table SELECT '2011-01-03',1
INSERT INTO @Table SELECT '2011-01-04',0

;WITH Vals AS (
    SELECT  *,
            ROW_NUMBER() OVER(ORDER BY Date) ROWID
    FROM    @Table
)
SELECT  v.*
FROM    Vals v LEFT JOIN
        Vals vNext ON   v.ROWID + 1 = vNext.ROWID
WHERE   v.Type = 0
AND     (vNext.Type = 1 OR vNext.Type IS NULL)