Can't combine 'NOT IN' with CTE in a 'WHERE' clause

131 Views Asked by At

INPUT: this is the Order_Tbl table with each row equivalent to a record of a transaction:

ORDER_DAY ORDER_ID PRODUCT_ID QUANTITY PRICE
2015-05-01 ODR1 PROD1 5 5
2015-05-01 ODR2 PROD2 2 10
2015-05-01 ODR3 PROD3 10 25
2015-05-01 ODR4 PROD1 20 5
2015-05-02 ODR5 PROD3 5 25
2015-05-02 ODR6 PROD4 6 20
2015-05-02 ODR7 PROD1 2 5
2015-05-02 ODR8 PROD5 1 50
2015-05-02 ODR9 PROD6 2 50
2015-05-02 ODR10 PROD2 4 10

EXPECTED OUTPUT: the task is to write a T-SQL query to get products that was ordered on 02-May-2015 but not on any other days before that:

ORDER_DAY ORDER_ID PRODUCT_ID QUANTITY PRICE
2015-05-02 ODR6 PROD4 6 20
2015-05-02 ODR8 PROD5 1 50
2015-05-02 ODR9 PROD6 2 50

I did try 2 solutions with a same approach: using a (completely identical) subquery to get a list of distinct products that were ordered before 02-May-2015, and then somehow putting it after the NOT IN operator inside the WHERE clause of the main query.

Solution 1: the subquery was passed in as an CTE. It throws a syntax error ...

WITH CTE AS
(
    SELECT DISTINCT PRODUCT_ID
    FROM Order_Tbl
    WHERE ORDER_DAY < '2015-05-02'
)
SELECT *
FROM Order_Tbl
WHERE ORDER_DAY = '2015-05-02'
  AND PRODUCT_ID NOT IN CTE

Solution 2: the subquery was embedded into the WHERE clause of the main query. This worked!

SELECT *
FROM Order_Tbl
WHERE ORDER_DAY = '2015-05-02'
  AND PRODUCT_ID NOT IN (SELECT DISTINCT PRODUCT_ID
                         FROM Order_Tbl
                         WHERE ORDER_DAY < '2015-05-02')

What was the nuance that made SQL behave and return different results? I would appreciate it if you guys could give me a clear explanation as well as some useful notes for further SQL implementations.

2

There are 2 best solutions below

0
GMB On BEST ANSWER

The task is to write a T-SQL query to get products that was ordered on 02-May-2015 but not on any other days before that

It has been explained by Thom A in the comments that NOT IN does not accept the first syntax you tried, and that, even with the right syntax, NOT EXISTS is in general preferable to NOT IN:

select *
from order_tbl o
where order_day = '2015-05-02'
  and not exists (
    select 1 
    from order_tbl o1 
    where o1.product_id = o.product_id and o1.order_day < o.order_day 
)

The query ensures that there is no row in the table for the same product and an erlier order date. For performance, consider an index on order_tbl(product_id, order_day).

But overall, it is probably simpler and more efficient to use window functions ; the subquery can just be replaced with a window min():

select *
from (
    select o.*, min(order_day) over(partition by product_id) min_order_day
    from order_tbl o
) o
where order_day = '2015-05-02' and min_order_day = order_day
0
abolfazl  sadeghi On

In fact, CTE A CTE allows you to define a temporary named result set that available temporarily, you should treat it like a table and query and write your code in this way


select *
from Order_Tbl
where ORDER_DAY = '2015-05-02'
    and PRODUCT_ID not in (
        select distinct PRODUCT_ID
        from Order_Tbl
        where ORDER_DAY < '2015-05-02'
        )

Cte