I am new to Postgres, I used to work with MySQL.
My problem is simple, it is also never happens in MySQL. t1
is my main table and t2
holds the history of updates that occurred to a field in t1
called status. So whenever an entry in t1
changes it's status field, a new record is added to t2
with the current timestamp and the new status value.
Now I want to to get the latest status and timestamp of a specific entry in t1
, for example that with id 1271. I can get the latest timestamp with the following query.
SELECT
t1.id,
t1.message,
MAX(t2.creation_timestamp)
FROM table_1 t1
LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id
WHERE t1.id = 1271
GROUP BY t1.id,t1.message
But when I try to also get the status of that specific row with the MAX timestamp.
SELECT
t1.id,
t1.message,
t2.status,
MAX(t2.creation_timestamp)
FROM table_1 t1
LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id
WHERE t1.id = 1271
GROUP BY t1.id,t1.message
I get the following error:
ERROR: column "t2.status" must appear in the GROUP BY clause or be used in an aggregate function
How can I get the status of the record with the MAX timestamp ?
Your query is rejected by Postgres because it's invalid SQL. The problem exists in MySQL as well, you were just lucky so far because MySQL chooses to return random values rather then rejecting the invalid group by (you might want to read this or this for details on the MySQL implementation)
The most efficient solution in Postgres is to use
distinct on ()
However, if
id
is defined as the primary key oftable_1
and there is a proper foreign key relationship between the two tables, Postgres will accept the partial group by as it knows that theid
is unique.SQLFiddle example: http://sqlfiddle.com/#!15/7cfc8/1