How to get previous record by date in case of no match in a join condition

63 Views Asked by At

I have two tables:

dates:

date
2024-01-01
2024-01-02
2024-01-03
2024-01-04
2024-01-05

actual:

business_date id
2024-01-01 1
2024-01-02 2
2024-01-04 4
2024-01-05 5

I want to join these tables on the condition date = business_date, but in the case where there's no match on dates it would get the record of the previous date. So resulting table would look like:

business_date id date
2024-01-01 1 2024-01-01
2024-01-02 2 2024-01-02
2024-01-02 2 2024-01-03
2024-01-04 4 2024-01-04
2024-01-05 5 2024-01-05

How would I achieve this?

2

There are 2 best solutions below

2
MatBailie On BEST ANSWER
SELECT
  *
FROM
  dates
CROSS JOIN LATERAL
(
  SELECT
    *
  FROM
    actual
  WHERE
    actual.business_date <= dates.date
  ORDER BY
    actual.business_date DESC
  LIMIT
    1
)
  AS actual

Or...

WITH
  actual_ranged AS
(
  SELECT
    *,
    LEAD(business_date, 1, 'infinity')
      OVER (
        ORDER BY business_date
      )
        AS next_business_date
  FROM
    actual
)
SELECT
  *
FROM
  actual_ranged AS a
LEFT JOIN
  dates         AS d
    ON  d.date >= a.business_date
    AND d.date <  a.next_business_date

Demo : https://dbfiddle.uk/tzjjhZJu

0
Zegarek On

An example of full outer join with coalesce() and lag() window function: demo

  1. It takes all dates from both tables, with the id if there's a matching business_date.
  2. Using coalesce(), if it finds that the date doesn't have an id, it switches to lag(id)over w1, which looks back at the previous date and takes it from that.
  3. Does the same thing for business_date
  4. Note that it treats your criteria strictly: if the previous date is also empty, you'll get a null.
with cte as (select business_date, coalesce(a.date,b.business_date) date, b.id 
             from dates a full outer join business_dates b 
                     on a.date=b.business_date)
select coalesce(business_date,lag(business_date)over w1) as business_date, 
       coalesce(id,lag(id)over w1) as id, 
       date
from cte
window w1 as (order by date)
order by 1;
business_date id date
2024-01-01 1 2024-01-01
2024-01-02 2 2024-01-02
2024-01-02 2 2024-01-03
2024-01-04 4 2024-01-04
2024-01-05 5 2024-01-05

If that's a gaps-and-islands problem and you'd like lag() to skip over nulls until it finds the most recent id - making that not just the previous date but rather the most recent with an id - that'd be a skip nulls clause that's currently missing from Postgres. Luckily, you can emulate it:

with cte as (select business_date as bd,coalesce(a.date,b.business_date) d,b.id
             from dates a full outer join business_dates b
                     on a.date=b.business_date)
select (array_agg(bd)filter(where bd is not null)over w1)[1] as business_date,
       (array_agg(id)filter(where id is not null)over w1)[1] as id,
       d as date
from cte
window w1 as (order by d
              desc rows between current row and unbounded following)
order by date;

Aggregate functions can be used as window functions, and those have a filter clause you can use to skip null values. The window specification tells it to order them so that the most recent non-null ends up first in the array, which you then take with arr[1]. If you do it often, you can declare a nicer-looking aggregate function like that.