I am using pg_restore
to restore the Postgresql database backed up using pg_dump
. (both the backup and restore is happening on PostgreSQL 12). The backup is taken using the custom format option --format=custom
.
The restore for 10 GB backup is taking almost 30 mins. The time to restore increases significantly as the size of the backup increases. So I tried out the --jobs
parameter for improving the restore times.
As per the documentation, concurrent connections will be used to restore the database objects. I verified the output of restore and I could verify that there were parallel threads started equal to the value of the --jobs
parameter. However the time to restore has not improved with any value of the --jobs
parameter.
I am aware that the performance depends on the hardware infrastructure. But the machine has 16 vcpus and 32GB RAM.
I have also tried tuning Postgres as mentioned in the blog with following configurations but still no improvements in restore times.
work_mem = 32MB shared_buffers = 4GB maintenance_work_mem = 2GB full_page_writes = off autovacuum = off wal_buffers = -1
Is there anything that I have missed? How do I get an improvement in the restore time?
There are several things that could be the problem:
pg_restore
parallelizes by running severalCOPY
andCREATE INDEX
commands concurrently.Now if your database has one large table with a single large index, parallelization won't help you.
Perhaps your I/O system is at its limit. Then running processes in parallel won't improve the performance.
If you have lots of large objects, that is known to slow down processing, and I am not sure if parallelization helps or not.
Don't ever set
full_page_writes
orautovacuum
tooff
unless you set them back toon
after restore and are ready to dump the database cluster in the event of a crash. I doubt that the performance gain is worth it, particularly forfull_page_writes
.One parameter that you forgot is
max_wal_size
. If you raise that, it will help write performance.Apart from that, you have to find out where the bottleneck is before you can fix it.
How about using a different backup method like
pg_basebackup
that is normally faster?