How to use a conditional to evaluate a paired variable in case statement

77 Views Asked by At

I'm working with a dataset (in SSMS) that looks something like this:

CaseID State EventTimeStamp Cleared
1 OrderReceived 2024-03-20 0
1 Completed 2024-03-21 0
1 Completed 2024-03-22 1
2 OrderReceived 2024-03-24 0
2 Completed 2024-03-25 0
2 Completed 2024-03-26 1
2 Completed 2024-03-27 0
3 OrderReceived 2024-03-28 0
3 OrderReceived 2024-03-29 1
3 Completed 2024-03-31 0

EDIT FOR CONTEXT: The actual dataset has ~40 States, but I'd like to track about 25. Some states REQUIRE other states to be marked done, but others don't and some can be in progress simultaneously. Any state can have an event Clear = 1, but it would require a Clear = 0 event first i.e. an event that didn't happen can't be cleared. Events for any state will happen in 0, 1, 0,... Cleared order every single time.

For the purpose of this dummy dataset the MAX(EventTimeStamp) for an OrderReceived could be Cleared = 1, and the MAX(EventTimeStamp) for the Completed event for the same CaseID could be Cleared = 0. There is also potential for a case to have MAX(EventTimeStamp) for OrderReceived to be NULL and MAX(EventTimeStamp) for Completed to have a value. Additionally, it is possible for MAX(EventTimeStamp) for OrderReceived to be AFTER MAX(EventTimeStamp) for Completed. In that case the desired result would simply look like

CaseID OrderReceived Completed
X 2024-03-20 2024-03-18

In that case the last EventTimeStamp would display for each. The engineer who made the database made didn't hardcode a lot of restrictions because a lot of changes are still being made to the workflow.

CREATE TABLE CaseTracking 
(
    [CaseID]    bigint,
    [State]     VARCHAR(512),
    [EventTimeStamp]    date,
    [Cleared]   bit
);

INSERT INTO CaseTracking ([CaseID], [State], EventTimeStamp, Cleared) VALUES
    ('1', 'OrderReceived', '2024-03-20', '0'),
    ('1', 'Completed', '2024-03-21', '0'),
    ('1', 'Completed', '2024-03-22', '1'),
    ('2', 'OrderReceived', '2024-03-24', '0'),
    ('2', 'Completed', '2024-03-25', '0'),
    ('2', 'Completed', '2024-03-26', '1'),
    ('2', 'Completed', '2024-03-27', '0'),
    ('3', 'OrderReceived', '2024-03-28', '0'),
    ('3', 'OrderReceived', '2024-03-29', '1'),
    ('3', 'Completed', '2024-03-31', '0');

SELECT * FROM CaseTracking

I'd like to take a look at how long it takes between certain states and so I organized the table by column State using "group by". I was hoping to make a table that evaluated the MAX(EventTimeStamp) for each state, but if Cleared = 1 then the value would return NULL instead. The desired result would look like this. For this database, states can be cleared and it's recorded the same as any other event, but with Cleared = 1. If the MAX(TimeEventStamp) has a Cleared = 1 value that means it still needs to be done which is why I'd like a NULL return. The value for CaseID = 2 Completed is 2024-3-27 since I'd want the value associated with the MAX(EventTimeStamp). If Cleared happens in 0, 1, 0 order I'd like the timestamp of the last 0. If Cleared happens in 0, 1, 0, 1 order I'd like a NULL return

CaseID OrderReceived Completed
1 2024-03-20 NULL
2 2024-03-24 2024-03-27
3 NULL 2024-03-31

I've been playing around in https://dbfiddle.uk/JG38HeOH

I tried an AND conditional within the CASE clause, but ran into an issue where instead of returning the NULL I want for CaseID = 1, I instead get the MAX(TimeEventStamp) for any time the conditions are met

SELECT CaseID 
  ,MAX(CASE WHEN State = 'OrderReceived' AND Cleared = 0 THEN EventTimeStamp 
            ELSE NULL END) AS OrderReceived
  ,MAX(CASE WHEN State = 'Completed' AND Cleared = 0 THEN EventTimeStamp 
            ELSE NULL END) AS Completed

FROM CaseTracking
GROUP BY CaseID

Results look like this

CaseID OrderReceived Completed
1 2024-03-20 2024-03-21
2 2024-03-24 2024-03-27
3

There are 3 best solutions below

1
MatBailie On BEST ANSWER

This gets the latest row for each state of each case, then pivots the rows to one row per case.

WITH
  sorted AS
(
  SELECT
    *,
    ROW_NUMBER()
      OVER (
        PARTITION BY CaseID, State
            ORDER BY EventTimeStamp DESC
      )
        AS row_id
  FROM
    CaseTracking
)
SELECT
  CaseID,
  MAX(CASE WHEN State = 'OrderReceived' THEN EventTimeStamp END) AS OrderReceived,
  MAX(CASE WHEN State = 'Completed'     THEN EventTimeStamp END) AS Completed
FROM
  sorted
WHERE
  row_id = 1
  AND
  Cleared = 0
GROUP BY
  CaseID

CaseID OrderReceived Completed
1 2024-03-20 null
2 2024-03-24 2024-03-27

fiddle

1
Bohemian On

You can't do it in one pass over the table; you must self join for each type of State.

Use a CTE to find the last EventTimeStamp for each State and apply that to each join.

WITH last AS (
  SELECT
    CaseID,
    State,
    MAX(EventTimeStamp) as EventTimeStamp
  FROM CaseTracking
  GROUP BY CaseID, State
)
SELECT
  i.CaseID,
  CASE WHEN c1.Cleared = 0 THEN c1.EventTimeStamp END AS OrderReceived,
  CASE WHEN c2.Cleared = 0 THEN c2.EventTimeStamp END AS Completed
FROM last i
LEFT JOIN last o ON o.CaseID = i.CaseID
  AND o.State = 'OrderReceived'
LEFT JOIN CaseTracking c1 ON c1.CaseID = i.CaseID
  AND c1.State = o.State
  AND c1.EventTimeStamp = o.EventTimeStamp
LEFT JOIN last c ON c.CaseID = i.CaseID
  AND c.State = 'Completed'
LEFT JOIN CaseTracking c2 ON c2.CaseID = c1.CaseID
  AND c2.State = c.State
  AND c2.EventTimeStamp = c.EventTimeStamp
WHERE i.State = 'OrderReceived'

See live demo, showing logic working with OrderReceived too.

This query can be extended to work with as many other column as needed by copying the pattern.

11
Hogan On

NOTE: There is some discussion about there be multiple created events. If this is true then the following answer needs to be updated.

We can take the dataset and get the data for the max and the min like this

SELECT CaseID, Min(EventTimeStamp) as MinTS, Max(EventTimeStamp) as MaxTS
FROM CaseTracking 
GROUP BY CaseID

Now we need to join back to get the data for the max item

SELECT grouped.CaseID, grouped.MinTS as OrderRecieved,
       CASE WHEN last.Cleared = 1 THEN NULL ELSE last.EventTimeStamp END AS Completed
FROM (
  SELECT CaseID, Min(EventTimeStamp) as MinTS, Max(EventTimeStamp) as MaxTS
  FROM CaseTracking 
  GROUP BY CaseID
) AS grouped
LEFT JOIN CaseTracking AS last ON grouped.CaseID = last.CaseID AND grouped.MaxTS = last.EventTimeStamp

https://dbfiddle.uk/KtXt7c6T