I'm trying to speed up the process of dumping and restoring databases. At the moment, we do pg_dump --no-privileges --no-owner -w source_db | psql target_db
(the DB is created before this). I read that the custom format is preferable, and I expected a binary format to be faster than a textual representation.
First test
hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3 -i --export-markdown report.md \
'pg_dump --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --no-privileges --no-owner -w source_db | psql target_db'
Command | Mean [s] | Min [s] | Max [s] | Relative | Resources |
---|---|---|---|---|---|
pg_dump --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
693.527 ± 5.741 | 688.635 | 699.848 | 1.03 ± 0.03 | User: 165.459 s, System: 18.331 s |
pg_dump --no-privileges --no-owner -w source_db | psql target_db |
673.223 ± 8.535 | 663.433 | 679.100 | 1.00 | User: 12.348 s, System: 25.697 s |
→ psql is 20 s faster
Compression
I tested with different levels of compression because I read that compression is slow.
hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3 -i --export-markdown report.md \
'pg_dump --compress=0 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --compress=1 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --compress=5 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --no-privileges --no-owner -w source_db | psql target_db'
Command | Mean [s] | Min [s] | Max [s] | Relative | Resources |
---|---|---|---|---|---|
pg_dump --compress=0 --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
702.692 ± 6.316 | 698.376 | 709.942 | 1.05 ± 0.03 | User: 18.090 s, System: 28.934 s |
pg_dump --compress=1 --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
672.341 ± 15.539 | 655.448 | 686.025 | 1.00 | User: 89.223 s, System: 21.904 s |
pg_dump --compress=5 --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
693.527 ± 5.741 | 688.635 | 699.848 | 1.03 ± 0.03 | User: 165.459 s, System: 18.331 s |
pg_dump --no-privileges --no-owner -w source_db | psql target_db |
673.223 ± 8.535 | 663.433 | 679.100 | 1.00 ± 0.03 | User: 12.348 s, System: 25.697 s |
→ same performance in the best case
Dump / restore separated
As asked by @Bergi and @Nick I measured dumping and restoring separately.
Only dumping
hyperfine --runs=3 -i --export-markdown report.md \
'pg_dump --compress=0 --no-privileges --format=custom source_db > ./1' \
'pg_dump --compress=1 --no-privileges --format=custom source_db > ./2' \
'pg_dump --no-privileges --no-owner -w source_db > ./3'
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
pg_dump --compress=0 --no-privileges --format=custom source_db > ./1 |
77.648 ± 8.720 | 71.724 | 87.660 | 1.09 ± 0.12 |
pg_dump --compress=1 --no-privileges --format=custom source_db > ./2 |
104.646 ± 0.911 | 103.804 | 105.614 | 1.47 ± 0.02 |
pg_dump --no-privileges --no-owner -w source_db > ./3 |
71.412 ± 1.037 | 70.321 | 72.385 | 1.00 |
→ psql is 7s s faster
Only restoring
hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3 -i --export-markdown report.md \
'cat ./1 | pg_restore --dbname=target_db' \
'cat ./2 | pg_restore --dbname=target_db' \
'cat ./3 | psql target_db'
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
cat ./1 | pg_restore --dbname=target_db |
658.715 ± 24.712 | 637.183 | 685.697 | 1.11 ± 0.04 |
cat ./2 | pg_restore --dbname=target_db |
591.001 ± 8.025 | 582.334 | 598.172 | 1.00 |
cat ./3 | psql target_db |
691.756 ± 15.320 | 680.285 | 709.154 | 1.17 ± 0.03 |
→ restoring is 10x slower than dumping, but is slower when separated from dumping.
Is there anything I can do to make this faster without writing the dump to disk (= using --format=directory
which would enable parallelization)?
You answered the question yourself: use
That is, don't compress the dump. If plain format dump is slightly faster for you, use that.
Since the restore is slow for you, tune the target database for that job:
increase
max_wal_size
a lotincrease
maintenance_work_mem
a lotset
wal_level = minimal
and use--single-transaction
withpsql
orpg_restore
, so that PostgreSQL can skip writing WAL