Postgres autovacuum stops working after relfrozenxid age hits autovacuum_freeze_max_age

Question

Postgres autovacuum stops working after relfrozenxid age hits autovacuum_freeze_max_age

139 Views Asked by vjgorla At 03 October 2023 at 04:46

We use postgres 11 in Azure cloud as a temporary store. Two tables in particular receive large number of inserts and deletes (hundreds of millions) in lots of small transactions with heavy lock contention during our batch processing. The tables will eventually be empty when all the data is processed. Because of the amount of traffic going through these two tables we have tuned autovacuum to be aggressive on them autovacuum_vacuum_cost_limit=2000, autovacuum_vacuum_cost_delay=1, autovacuum_vacuum_scale_factor=0.02, autovacuum_vacuum_threshold=10000

Autovacuum generally keeps up with the load and keeps the table size down. We also sometimes truncate or run vacuum full between loads. But once in a while we are noticing that the autovacuum process stops running completely. Around the same time we start seeing xid wraparound warnings in the logs. When this happens relfrozenxid age for some pg_catalog tables is over 200 million but not for the tables we are using.

Autovacuum doesn't come back even if we leave the database in a quiet state for a few hours. pg_stat_progress_vacuum shows no activity. pg_stat_activity does not show any long running transactions. There are no abandoned replication slots or prepared transactions that could block autovacuum. We can't vacuum tables in pg_catalog because it is owned by azure super user. We also don't have access to see if autovacuum threads are running at the os level.

Restarting postgres gets the autovacuum going again but it is not a viable option for us in the long run.

What could be going wrong here?

Original Q&A

There are 1 best solutions below

**Laurenz Albe** · Answer 1 · 2023-10-03T06:22:57.660000

When your tables stop being vacuumed, you should check how many autovacuum workers you see in pg_stat_activity. If there are as many as autovacuum_max_workers, the explanation is that there is no room for your tables being VACUUMed. Anti-wraparound autovacuum may well cause that, since those runs are normally longer and more resource-intense than normal autovacuum runs.

You can either speed up the anti-wraparound autovacuum workers by setting the parameters appropriately, or you can increase autovacuum_max_workers to avoid starving your important tables. Both options will make autovacuum use more of your system's resources.

Postgres autovacuum stops working after relfrozenxid age hits autovacuum_freeze_max_age

There are 1 best solutions below

Related Questions in POSTGRESQL

Related Questions in AUTOVACUUM

Trending Questions

Popular # Hahtags

Popular Questions