I have a large dataset where we keep track of all revenue collected by client for each month. So each client gets a row for each month they were charged, resulting in a lot of rows. I am trying to find a way to pull the most recent record for each client.
Here is what i have:
| merchant name | Merchant id | revenue date | revenue amount |
|---|---|---|---|
| fish | 1234 | 2022-03-01 | 200 |
| fish | 1234 | 2022-04-01 | 200 |
| fish | 1234 | 2022-05-01 | 200 |
| fish | 1234 | 2022-06-01 | 200 |
| dog | 5678 | 2022-01-01 | 200 |
| dog | 5678 | 2022-02-01 | 200 |
| dog | 5678 | 2022-03-01 | 200 |
| dog | 5678 | 2022-04-01 | 200 |
| cat | 1011 | 2022-10-01 | 200 |
| cat | 1011 | 2022-11-01 | 200 |
My desired result is:
| merchant name | Merchant id | revenue date | revenue amount |
|---|---|---|---|
| fish | 1234 | 2022-06-01 | 200 |
| dog | 5678 | 2022-04-01 | 200 |
| cat | 1011 | 2022-11-01 | 200 |
I have tried this:
Select distinct
merchant_name,
merchant_id,
revenue_date,
revenue_amount
from table
where revenue_date=(select max(revenue_date) from table)
but that is only returning rows that match the maximum date listed (2022-11-01).
Any help is much appreciated!
The general solution takes the form:
Or... in PostgreSQL you can just do: