Count all rows while not counting any row after a negative value

Question

Count all rows while not counting any row after a negative value

138 Views Asked by Vega At 18 August 2025 at 16:10

I have a table t with:

PLACE	LOCATION	TS	ID	AMOUNT	GOING_IN	GOING_OUT
1	10	2020-10-01	1	100	10	0
1	10	2020-10-02	1	110	5	-50
1	10	2020-10-03	1	75	0	-100
1	10	2020-10-04	1	-25	30	0
1	10	2020-10-05	1	5	0	0
1	10	2020-10-06	1	5	38	-300
1	10	2020-10-07	1	-257	0	0
1	10	2020-10-01	2	1	10	0
1	10	2020-10-02	2	11	0	-12
1	10	2020-10-03	2	-1	0	-100
1	10	2020-10-04	2	-101	0	0
2	20	2020-11-15	1	18	20	0
2	20	2020-11-16	1	38	0	0
2	20	2020-11-15	3	-9	20	-31
2	20	2020-11-16	3	-20	0	0

So due to SAP legacy stuff some logistic data is mangled which may lead to negative inventory.

To check how severe the error is I need to count for each PLACE, LOCATION, ID

the number of rows that have a positive AMOUNT AND which do not have a negative AMOUNT before
the number of rows that have a negative AMOUNT AND any positive AMOUNT that has a negative AMOUNT anywhere before

As you can see in my table there are (for PLACE=1, LOCATION=10, ID=1) 3 rows with a positive AMOUNT without any negative AMOUNT before. But then there is a negative AMOUNT and some positive AMOUNTS afterwards --> those 4 rows should not be counted for COUNT_CORRECT but should count for COUNT_WRONG.

So in this example table my query should return:

PLACE	LOCATION	TOTAL	COUNT_CORRECT	COUNT_WRONG	RATIO
1	10	11	5	6	0.55
2	20	4	2	2	0.5

My code so far:

CREATE OR REPLACE TABLE ANALYTICS.t (
    PLACE INT NOT NULL
  , LOCATION INT NOT NULL
  , TS DATE NOT NULL
  , ID INT NOT NULL
  , AMOUNT INT NOT NULL
  , GOING_IN INT NOT NULL
  , GOING_OUT INT NOT NULL
, PRIMARY KEY(PLACE, LOCATION, ID, TS)
);

INSERT INTO ANALYTICS.t
    (PLACE, LOCATION, TS, ID, AMOUNT, GOING_IN, GOING_OUT)
VALUES
    (1, 10, '2020-10-01', 1, 100, 10, 0)
  , (1, 10, '2020-10-02', 1, 110, 5, -50)
  , (1, 10, '2020-10-03', 1, 75, 0, -100)
  , (1, 10, '2020-10-04', 1, -25, 30, 0)
  , (1, 10, '2020-10-05', 1, 5, 0, 0)
  , (1, 10, '2020-10-06', 1, 5, 38, 300)
  , (1, 10, '2020-10-07', 1, -257, 0, 0)
  , (1, 10, '2020-10-04', 2, 1, 10, 0)
  , (1, 10, '2020-10-05', 2, 11, 0, -12)
  , (1, 10, '2020-10-06', 2, -1, 0, -100)
  , (1, 10, '2020-10-07', 2, -101, 0, 0)
  , (2, 20, '2020-11-15', 1, 18, 12, 0)
  , (2, 20, '2020-11-16', 1, 30, 0, 0)
  , (2, 20, '2020-11-15', 3, -9, 20, -31)
  , (2, 20, '2020-11-16', 3, -20, 0, 0)
;

Then

SELECT PLACE
     , LOCATION
     , SUM(CASE WHEN AMOUNT >= 0 THEN 1 ELSE 0 END) AS 'COUNT_CORRECT'
     , SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) AS 'COUNT_WRONG'
     , ROUND((SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) / COUNT(AMOUNT)) * 100, 2) AS 'ratio'
FROM t
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
;

But I don't know how I can filter for "AND which do not have a negative AMOUNT before" and counting by PLACE, LOCATION, ID as an intermediate step. Any help appreciated.

Original Q&A

There are 2 best solutions below

**Aditya Ray** · Answer 1

I have edited the query. Please let me know if this works.

ALL_ENTRIES query has all the row numbers for the table t partitioned by place,location and ID and ordered by timestamp.

TABLE1 is used to compute the first negative entry. This is done by joining with ALL_ENTRIES and selecting the minimum row number where amount < 0.

TABLE2 is used to compute the last correct entry. Basically ALL_ENTRIES is joined with TABLE1 with the condition that the row numbers should be lesser than the row number in TABLE1. This will give us the row number corresponding to the last correct entry.

TABLE1 and TABLE2 are joined with ALL_ENTRIES to calculate the max row number, which gives the total entries.

In the final select statement I have used case when statement to account for IDs where there are no negative amount values. In those scenarios all the entries should be correct. Hence, the max row number is considered for those cases.

WITH ALL_ENTRIES AS (
          SELECT 
               PLACE,
               LOCATION,
               ID,
               TIMESTAMP,
               AMOUNT,
               ROW_NUMBER() OVER(PARTITION BY PLACE,LOCATION,ID ORDER BY TIMESTAMP) AS 'ROW_NUM'
         FROM t) 
        

SELECT 
    PLACE, 
    LOCATION, 
    ID, 
    TOTAL, 
    COUNT_CORRECT, 
    TOTAL - COUNT_CORRECT AS COUNT_WRONG, 
    COUNT_CORRECT / TOTAL AS RATIO
FROM

    (SELECT 
         ae.PLACE, 
         ae.LOCATION, 
         ae.ID, 
         MAX(ae.ROW_NUM) as TOTAL, 
         MAX (CASE WHEN table2.LAST_CORRECT_ENTRY IS NULL THEN ae.ROW_NUM ELSE table2.LAST_CORRECT_ENTRY END) AS COUNT_CORRECT,

     FROM 
     ALL_ENTRIES ae 
                                
     LEFT JOIN

           (SELECT 
                 ae.PLACE, 
                 ae.LOCATION, 
                 ae.ID, 
                 MAX(ae.ROW_NUM) as LAST_CORRECT_ENTRY 
            FROM 
            ALL_ENTRIES ae
            INNER JOIN 
                   ( SELECT 
                          t.PLACE, 
                          t.LOCATION, 
                          t.ID, MIN(ae.ROW_NUM) as FIRST_NEGATIVE_ENTRY
                     FROM t t
                     INNER JOIN 
                     ALL_ENTRIES ae ON t.PLACE = ae.PLACE
                                   AND t.LOCATION = ae.LOCATION
                                   AND t.ID = ae.ID
                                   AND t.TIMESTAMP = ae.TIMESTAMP
                                   AND t.AMOUNT = ae.AMOUNT
                                   AND ae.AMOUNT < 0 
                     GROUP BY t.PLACE, t.LOCATION           
                    )  table1

            ON ae.PLACE = table1.PLACE
                    AND ae.LOCATION = table1.LOCATION
                    AND ae.ID = table1.ID
                    AND ae.ROW_NUM < table1.FIRST_NEGATIVE_ENTRY
            GROUP BY ae.PLACE, ae.LOCATION, ae.ID
            ) table2

       ON ae.PLACE = table2.PLACE
                    AND ae.LOCATION = table2.LOCATION
                    AND ae.ID = table2.ID
                    
       GROUP BY ae.PLACE, ae.LOCATION, ae.ID
       )

**sirain** · Answer 2

I'm not sure if I understand your question correctly, but the following gives you the number of rows before the first negative amount per (place, location) partition.

The subselect computes the row numbers of all rows with a negative amount. Then we can select the minimum of this as the first row with a negative amount.

SELECT
    place,
    location,
    COUNT(*) - NVL(MIN(pos) - 1, COUNT(*)) AS COUNT_WRONG,
    COUNT(*) - local.COUNT_WRONG           AS COUNT_CORRECT,
    ROUND(local.COUNT_WRONG / COUNT(*),2)  AS RATIO
FROM
    (   SELECT
            amount,
            place,
            location,
            CASE
                WHEN amount < 0
                THEN ROW_NUMBER() over (
                                    PARTITION BY
                                        place,
                                        location
                                    ORDER BY
                                        "TIMESTAMP")
                ELSE NULL
            END pos -- Row numbers of rows with negative amount, else NULL
        FROM
            t)
GROUP BY
    place,
    location;

Count all rows while not counting any row after a negative value

There are 2 best solutions below

Related Questions in SQL

Related Questions in EXASOL

Trending Questions

Popular # Hahtags

Popular Questions