Using LAG to Find Previous Value in Oracle

6k Views Asked by At

I'm trying to use the LAG function in Oracle to find the previous registration value for donors.

To see my data, I started with this query to find all registrations for the particular donor:

select registration_id, registration_date from registration r where r.person_id=52503290 order by r.registration_date desc;

Registrations

Then I used the LAG function to return the previous value along with the most recent value:

select registration_id as reg_id, registration_date as reg_date, lag(registration_date,1) over (order by registration_date) as prev_reg_date from registration where person_id=52503290 order by registration_date desc;

And the results are as expected:

LAG Function

So I thought I should be good to place the LAG function within the main query to get the previous value but for some reason, the previous value returns NULL or no value at all.

SELECT P.Person_Id AS Person_ID, R.Registration_Date AS Drive_Date, LAG(R.Registration_Date,1) OVER (ORDER BY R.REGISTRATION_DATE) AS Previous_Drive_Date, P.Abo AS Blood_Type, DT.Description AS Donation_Type FROM Person P JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT' LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999 LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID WHERE TRUNC(R.Registration_Date) = TRUNC(SYSDATE)-1 AND R.Person_Id=52503290 ORDER BY R.Registration_Date DESC;

Here is the result set:

Main Query

Any suggestions on what I am missing here? Or why this query isn't returning the values expected?

Based on @Alex Poole's suggestions, I changed the query to look like:

SELECT * FROM (
SELECT
       P.Person_Id AS Person_ID,
       R.Registration_Date AS Drive_Date,
       LAG(R.Registration_Date,1) OVER (partition by p.person_id ORDER BY r.registration_date) AS Previous_Drive_Date,
       P.Abo AS Blood_Type,
       DT.Description AS Donation_Type
FROM
       Person P
       JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
       LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
       LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
       LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
       LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
--WHERE R.Person_Id=52503290
)
WHERE TRUNC(Drive_Date) = TRUNC(SYSDATE)-1
ORDER BY Drive_Date DESC;

It takes about 85 seconds to pull back the first 30 rows:

Query Results

My original query (before the LAG function was added) took about 2 seconds to pull back approximately 2100 records. But it was nothing but a SELECT with a couple of JOINS and one item in the WHERE clause.

Looking at the record counts, Person has almost 5.5 million records and Registration has 9.1 million records.

1

There are 1 best solutions below

4
On BEST ANSWER

The lag is only applied within the rows that match the where clause filter, so you would only see the previous value if that was also yesterday.

You can apply the lag in a subquery, and then filter in an outer query:

SELECT * FROM (
    SELECT
           P.Person_Id AS Person_ID,
           R.Registration_Date AS Drive_Date,
           LAG(R.Registration_Date,1) OVER (ORDER BY R.REGISTRATION_DATE) AS Previous_Drive_Date,
           P.Abo AS Blood_Type,
           DT.Description AS Donation_Type
    FROM
           Person P
           JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
           LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
           LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
           LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
           LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
    WHERE R.Person_Id=52503290
)
WHERE TRUNC(Drive_Date) = TRUNC(SYSDATE)-1
ORDER BY Drive_Date DESC;