Stored procedure to find Peaks Separated by Seven Days

Question

Stored procedure to find Peaks Separated by Seven Days

78 Views Asked by Haya Alhuraib At 24 January 2024 at 17:13

What could be the SQL Server stored procedure to select the highest three peaks separated by seven days.

I have tried the below code:

CREATE PROCEDURE SelectPeaksAndInsertIntoPeakFact 
AS
BEGIN
    -- Initialize starting date
    DECLARE @currentDate DATE = (SELECT MIN(dateTime) FROM Peak);   

    -- Initialize peak type
    DECLARE @peakType VARCHAR(5) = 'Peak1'; 

    WHILE @currentDate <= (SELECT MAX(dateTime) FROM Peak)
    BEGIN
        INSERT INTO Peak_fact (DataDate, peak, peakType)
            SELECT
                p.dateTime AS DataDate,
                MAX(p.SA_data) OVER (ORDER BY p.dateTime ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS peak,
                @peakType AS peakType
            FROM 
                Peak p
            WHERE 
                p.dateTime BETWEEN @currentDate AND DATEADD(DAY, 6, @currentDate);

        SET @currentDate = DATEADD(DAY, 7, @currentDate); -- Skip 7 days 
        SET @peakType = CASE @peakType WHEN 'Peak1' THEN 'Peak2' WHEN 'Peak2' THEN 'Peak3' ELSE 'Peak1' END; -- Cycle peak types
    END;
END;

But the result is incorrect, because for the Table 1

Value	Date	Notes
12	1/1/2023	Start 1
15	1/2/2023
13	1/3/2023
11	1/4/2023
10	1/5/2023
5	1/6/2023
31	1/7/2023	End 1 / Peak
9	1/8/2023
31	1/9/2023
21	1/10/2023
22	1/11/2023
34	1/12/2023
8	1/13/2023
5	1/14/2023
33	1/15/2023	Start 2 / Peak
22	1/16/2023
9	1/17/2023
18	1/18/2023
19	1/19/2023
22	1/20/2023
29	1/21/2023	End 2
10	1/22/2023
9	1/23/2023	Start 3
8	1/24/2023
29	1/25/2023
33	1/26/2023
55	1/27/2023
81	1/28/2023	Peak
3	1/29/2023	End 3
88	1/30/2023
44	1/31/2023
10	2/1/2023
20	2/2/2023
44	2/3/2023
83	2/4/2023
69	2/5/2023	Start 4
99	2/6/2023	Peak
20	2/7/2023
40	2/8/2023
10	2/9/2023
32	2/10/2023
33	2/11/2023	End 4
83	2/12/2023
63	2/13/2023
92	2/14/2023	Start 5 / Peak
65	2/15/2023
42	2/16/2023
84	2/17/2023
21	2/18/2023
23	2/19/2023
12	2/20/2023	End 5
92	2/21/2023
65	2/22/2023	Start 6
74	2/23/2023
83	2/24/2023	Peak
10	2/25/2023

The output should be what is in Table 2:

date	Value	Peak
1/7/2023	31	peak1
1/15/2023	33	peak2
1/28/2023	81	peak3
2/6/2023	99	peak1
2/14/2023	92	peak2
2/24/2023	83	peak3

Original Q&A

There are 1 best solutions below

**T N** · Answer 1 · 2024-01-25T21:27:16.730000

I have two approaches that you can consider. One fixes up your current stored procedure to produce your posted desired results. The second uses a more traditional definition of a local maximum.

Your original stored procedure had several issues that I have fixed up below.

The original logic did not have a WHERE condition to exclude rows that were not peak rows. This caused all dates in the selected range to be inserted into the results, instead of just the selected peak. To fix this, the MAX() window function logic was moved to a subselect and a WHERE s.SA_data = peak condition was added to the outer. I also added a TOP 1 and ORDER BY to pick just the first peak date if there are multiple peak value matches.
The @CurrentDate logic was adding 7 to the current value instead of calculating a new start based on skipping 7 days following each selected peak. Fixing this requires capturing the selected peak date and using it for both the results insert and the updated @CurrentDate calculation - SET @currentDate = DATEADD(DAY, 8, @Datadate).
The (SELECT MAX(dateTime) FROM Peak) ending value calculation for the WHILE loop was moved to a variable to avoid repeated/redundant queries. This doesn't affect the results, but improves performance slightly.
The MAX() window function logic is already contained in a query that limits the rows to the desired 7-day range, so a separate ROWS BETWEEN ... qualifier is unnecessary and has been dropped. The remaining empty OVER() qualifier then just refers to the entire 7-day range. This doesn't affect the results, but simplifies the code.

The updated stored procedure would then be:

CREATE PROCEDURE SelectPeaksAndInsertIntoPeakFact 
AS
BEGIN
    DECLARE @currentDate DATE = (SELECT MIN(dateTime) FROM Peak); -- Initialize starting date
    DECLARE @lastDate DATE = (SELECT MAX(dateTime) FROM Peak);
    DECLARE @peakType VARCHAR(5) = 'Peak1'; -- Initialize peak type

    WHILE @currentDate <= @lastDate
    BEGIN
        DECLARE @Datadate DATE;
        DECLARE @peak INT;
      
        SELECT TOP 1
            @Datadate = s.dateTime,
            @peak = s.peak
        FROM (
            SELECT
                p.*,
                MAX(p.SA_data) OVER() AS peak
            FROM Peak p
            WHERE p.dateTime BETWEEN @currentDate AND DATEADD(DAY, 6, @currentDate)
        ) s
        WHERE s.SA_data = peak
        ORDER BY s.dateTime;

        INSERT INTO Peak_fact (DataDate, peak, peakType)
        VALUES (@Datadate, @peak, @peakType);

        SET @currentDate = DATEADD(DAY, 8, @Datadate); -- Skip 7 days 
        SET @peakType = CASE @peakType WHEN 'Peak1' THEN 'Peak2' WHEN 'Peak2' THEN 'Peak3' ELSE 'Peak1' END; -- Cycle peak types
    END;
END

Results:

DataDate	peak	peakType
2023-01-07	31	Peak1
2023-01-15	33	Peak2
2023-01-28	81	Peak3
2023-02-06	99	Peak1
2023-02-14	92	Peak2
2023-02-24	83	Peak3

As an alternative more traditional local peak calculation, you can define a peak as a point where the current value equals or exceeds all other values within a sliding +/- 7-day window (or chose your window size). In the case of level plateaus having one or more adjacent or closely-spaced equal peak values, the earliest can be chosen.

SELECT
    s.dateTime AS DataDate,
    s.SA_data AS peak,
    CONCAT(
        'Peak',
        (((ROW_NUMBER() OVER(ORDER BY s.dateTime)) - 1) % 3 + 1)
        ) AS peakType
FROM (
    SELECT
        *,
        MAX(p.SA_data) OVER(
           ORDER BY p.dateTime ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING
           ) AS Max7Before,
        MAX(p.SA_data) OVER(
           ORDER BY p.dateTime ROWS BETWEEN 1 FOLLOWING AND 7 FOLLOWING
           ) AS Max7After
    FROM Peak p
) s
WHERE s.SA_data > s.Max7Before
AND s.SA_data >= s.Max7After
ORDER BY s.dateTime;

The above logic will not select a peak from among the middle of a (mostly) continuously ascending or descending series of values. Only when the values form a true local maximum (within a +/- 7 day window) will a peak value be selected.

Results:

DataDate	peak	peakType
2023-01-12	34	Peak1
2023-02-06	99	Peak2
2023-02-14	92	Peak3

Notes:

2023-01-07 (31) was not selected because a larger value (34) exists for 2023-01-12 (+5 days).
2023-01-12 (34) was selected because that is a true local maximum.
2023-01-15 (33) was not selected because a larger value (34) exists for 2023-01-12 (-3 days).
2023-01-28 (81) was not selected because a larger value (88) exists for 2023-01-30 (+2 days).
2023-01-30 (88) was not selected because a larger value (99) exists for 2023-02-06 (+7 days).
2023-02-06 (99) was selected because that is a true local maximum.
2023-02-14 (92) was selected because that is a true local maximum.
2023-02-21 (92) was not selected bacause it is equal to the 2023-02-14 value (-7 days).
2023-02-24 (83) was not selected because a larger value (92) exists for 2023-02-21 (-3 days).

Adjusting the window size to a range other than +/- 7 days may increase or decrease the number of detected local maximums.

See this db<>fiddle for a demo of both approaches.

Stored procedure to find Peaks Separated by Seven Days

There are 1 best solutions below

Related Questions in SQL

Related Questions in SQL-SERVER

Related Questions in STORED-PROCEDURES

Related Questions in SQLCODE

Trending Questions

Popular # Hahtags

Popular Questions