Using PostgreSQL 16.2, I have two tables:
task_pool table
create table task_pool (
task_id serial primary key
, order_id integer
, user_id integer
, expire_time timestamp
, status integer default 0
);
Sample data:
| task_id | order_id | user_id | expire_time | status |
|---|---|---|---|---|
| 50 | 80 | 1 | 2024-03-00 11:52:44 | -1(expired) |
| 51 | 81 | 1 | 2024-03-00 16:00:44 | -1 (expired) |
| 52 | 80 | 2 | 2024-04-19 12:31:22 | 0 (active) |
| 53 | 81 | 4 | 2024-03-00 16:52:44 | -1(expired) |
| 54 | 81 | 6 | 2024-03-00 16:51:44 | -1(expired) |
reactivate_task table this table is actually a temporary table using WITH AS in a longer query, but for simplicity here is code to create the table:
create table reactivate_task(
, order_id integer
, successfully_added
, total_amount_requested
, amount_to_reactivate integer
);
Sample data:
| order_id | successfully_added | total_amount_requested | amount_to_reactivate |
|---|---|---|---|
| 80 | 0 | 3 | 3 |
| 81 | 0 | 3 | 2 |
What I'm trying to do is, for each row in reactivate_task, update the x oldest expired tasks with the related order_id, and x being that row's amount_to_reactivate
So in the example tables above, looking at reactivate_task table, the 3 oldest expired tasks with order_id 80 should be "reactivated". And the 2 oldest expired_tasks with order_id 81 should be "reactivated".
There are the intended results in table task_pool:
| task_id | order_id | user_id | expire_time | status |
|---|---|---|---|---|
| 50 | 80 | 1 | 2024-03-21 12:00:00 | 0(active) --status set to 0 and expire_time to current_timestamp |
| 51 | 81 | 1 | 2024-03-21 12:00:00 | 0(active) ----status set to 0 and expire_time to current_timestamp |
| 52 | 80 | 2 | 2024-04-19 12:31:22 | 0 (active) --NOT updated (even though amount_to_reactive was 3 for order_id 80, there was only ONE expired task (task_id 50) that was expired. |
| 53 | 81 | 4 | 2024-03-19 16:52:44 | -1(expired) --NOT updated (amount_to_reactive was only 2 for order 81, so only the 2 rows with oldest expire_time were updated (task_id 50 & 54) |
| 54 | 81 | 6 | 2024-03-21 12:00:00 | 0 (active) ----status set to 0 and expire_time to current_timestamp |
This is an example query for ONE row:
UPDATE task_pool
SET status=0,
expire_time = current_timestamp + (2 * interval '1 day')
WHERE task_id IN (
SELECT task_id
FROM task_pool
WHERE order_id = *--order_id from CURRENT ROW OF REACTIVATE_TASK--*
AND status=-1
ORDER BY expire_time
LIMIT *--amount_to_reactivate from current row of reactivate_task--*
But with this query I have to iterate over each row in reactivate_task outside of Postgres (I'm using node and node-postgres, using a for loop on the results.rows[])
Is there a way to do this in Postgres in one query, instead of iterating over each row?
fiddle
Basics for
UPDATEsyntax in the manual.Identify the maximum number of candidate rows in a
LATERALsubquery. See:Additional relations in the
FROMclause to an update cannot join laterally to the target table, so you have to use a second instance of the same table in this special situation. Related:CROSS JOINeliminates rows fromreactivate_taskwhere no qualifying rows to activate can be found.If
amount_to_reactivateis null (which it probably should never be), there is effectively no limit.If
(order_id, expire_time)oftask_poolis notUNIQUE, ties will be broken arbitrarily, unless you add more expressions to theORDER BYto make it deterministic.If there can be concurrent, competing writes on the table
task_poolyou may have to add a locking clause. But not in the same subquery, you'd add another subquery level above the one with theLIMIT. Related: