Windowed Functions and Query Optimizer

55 Views Asked by At

I have a table with the following structure.

|anId|     aDate|aNumber|
-------------------------
|   1|2018-01-20|    100|
|   1|2019-01-01|   -100|
|   1|2019-02-01|     10|
|   2|2019-01-02|     40|

I have a query to return, on a specific date, whether or not previous (inclusive) aNumbers sum is > 0 for each .

select
  anId,
  aDate,
  1 as aStatus
from (
  select
    anId,
    aDate,
    sum(aNumber) OVER (
      PARTITION BY anId
      ORDER BY aDate
      ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING
    ) as aSum
  from
    myTable
)
where
  aSum > 0
;

So this query would return

|anId|     aDate|aStatus|
-------------------------
|   1|2018-01-20|      1|
|   2|2019-01-02|      1|
|   1|2019-02-01|      1|

Now I've turned the query into a view myView. I'd like to query this view for date ranges. I may query the table daily/monthly/yearly whatever, but I want to be able to export the query results from one date range, and then export/append the results for the next date range.

select
  anId,
  aDate,
  aStatus
from
  myView
where
  aDate between (2018-01-01) and (2018-12-31)
;

Would return

|anId|     aDate|aStatus|
-------------------------
|   1|2018-01-20|      1|

And the next year

select
  anId,
  aDate,
  aStatus
from
  myView
where
  aDate between (2019-01-01) and (2019-12-31)
;

Should return

|anId|     aDate|aStatus|
-------------------------
|   2|2019-01-02|      1|
|   1|2019-02-01|      1|

Allowing me to stitch together the results to get the original, unfiltered, view records.


Ok, now that the stage is set, my concern with this approach is that when I filter the date from the view, it will impact the windowed function.

When I filter on 2019, will the windowed sum still include the 2018 aNumbers? Will my date range filter be applied in the inner select, prior to the sum?

1

There are 1 best solutions below

0
On BEST ANSWER

After creating this question, I realized it should be simple enough to test it.

CREATE TABLE [dbo].[myTable](
    [anId] [char](36) NOT NULL,
    [aDate] [datetime2](7) NULL,
    [aNumber] [int] NULL
) ON [PRIMARY]
GO

insert into myTable(anId,aDate,aNumber) values ('1','2018-01-20',100);
insert into myTable(anId,aDate,aNumber) values ('1','2019-01-01',-100);
insert into myTable(anId,aDate,aNumber) values ('1','2019-02-01',10);
insert into myTable(anId,aDate,aNumber) values ('2','2019-01-20',40);

Using a sub-select instead of creating the actual view

select
    *
from (
    select
      anId,
      aDate,
      1 as aStatus
    from (
      select
        anId,
        aDate,
        sum(aNumber) OVER (
          PARTITION BY anId
          ORDER BY aDate
          ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING
        ) as aSum
      from
        myTable
    ) a
    where
      a.aSum > 0
) b
where
    b.aDate < '2019-01-01'
;

Returns:

|anId|     aDate|aStatus|
-------------------------
|   1|2018-01-20|      1|

And

select
    *
from (
    select
      anId,
      aDate,
      1 as aStatus
    from (
      select
        anId,
        aDate,
        sum(aNumber) OVER (
          PARTITION BY anId
          ORDER BY aDate
          ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING
        ) as aSum
      from
        myTable
    ) a
    where
      a.aSum > 0
) b
where
    b.aDate >= '2019-01-01'
;

Returns:

|anId|     aDate|aStatus|
-------------------------
|   2|2019-01-02|      1|
|   1|2019-02-01|      1|

This confirms that the date filter doesn't impact the sums. However it leaves me with some concern that the subquery is sub-optimal, as it may be running sums across significantly more data than necessary. ie. when I want 2018 data, is it still calculating sums for 2019 data?