Postgresql bottleneck neither CPU, network nor I/O

1.1k Views Asked by At

We are testing our application for performance, which is using Postgresql 13 as a database. It is very insert and update heavy and we cannot get more than 65k TPS on the database. But none of the most common bottlenecks do apply (CPU/network/disk IO).

We have also run multiple combinations with pgbench but also those cannot go past 65k. E.g.

pgbench -i -s 50 -U postgres -h <DB_HOST> -d <DB_NAME>
pgbench -c 64 -j 32 -t 100000 -h <DB_HOST> -U postgres <DB_NAME>

starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 64
number of threads: 32
number of transactions per client: 100000
number of transactions actually processed: 6400000/6400000
latency average = 0.976 ms
tps = 65584.664360 (including connections establishing)
tps = 65594.330678 (excluding connections establishing)

(Using more clients resulted in fewer TPS)

The database server has the following specs:

  • 64 Cores (Intel Xeon Gold 6130 (Skylake, 2.10GHz, 2 CPUs/node, 16 cores/CPU))
  • 192 GiB RAM (12 * 16GiB DIMM DDR4 Synchronous Registered (Buffered) 2666 MHz (0.4 ns))
  • 2 * SSD SATA Samsung MZ7KM240HMHQ0D3 (one is used for the WAL and the other for the data)
  • 10 Gbps network link
  • OS: Debian 11

And we are using the following configuration:

shared_buffers=65551953kB
effective_cache_size=147491895kB
huge_pages=on
min_wal_size=4GB
max_wal_size=16GB
wal_buffers=1GB
work_mem=2GB
maintenance_work_mem=4GB

checkpoint_completion_target=0.9
checkpoint_timeout = 15min
random_page_cost=1.1
bgwriter_flush_after = 2MB
effective_io_concurrency = 200

# Disabled just for performance experiments
fsync = off
synchronous_commit = off
full_page_writes = on

max_worker_processes=64
max_parallel_workers=64
max_parallel_workers_per_gather=10
max_parallel_maintenance_workers=12

When we reach the 65k TPS we have the following resource usage on the server:

Our application:

  • CPU: 25%
  • I/O Utilization: 30% resp. 60% (WAL)
  • Network: in 200Mbps out 125Mbps

pgbench:

  • CPU: 75%
  • I/O Utilization: 2% resp. 5% (WAL)
  • Network: in/out ~500Mbps

We have also temporarily placed the WAL and data into RAM to verify that the disks are not the issue, but this had no impact either.

Does anyone have a guess where our bottleneck might be?

(We have thought about sharding already but would like to use the single server as much as possible)

Edit:

vmstat 1 -S M while running pgbench:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0  92009     25  18316    0    0     3   318  405   22 26 11 63  0  0
 0  0      0  92009     25  18316    0    0     0     0  261  320  0  0 100  0  0
 0  0      0  92009     25  18316    0    0     0 46616  216  257  0  0 100  0  0
45  0      0  92260     25  17996    0    0     0     0 509124 781029 35 13 52  0  0
44  1      0  92252     25  17998    0    0     0 82192 743920 1140285 51 20 30  0  0
50  0      0  92247     25  18003    0    0     0 65564 740366 1152212 52 18 30  0  0
41  0      0  92243     25  18005    0    0     0     0 736052 1138409 50 20 30  0  0
35  0      0  92240     25  18009    0    0     0     0 734265 1138483 50 20 30  0  0
36  0      0  92238     25  18012    0    0     0   264 741315 1152409 52 19 29  0  0
43  0      0  92236     25  18016    0    0     0    36 740374 1147759 51 20 29  0  0
46  0      0  92231     25  18020    0    0     0 163856 737491 1166498 53 19 28  0  0
47  0      0  92228     25  18023    0    0     0     0 741582 1171814 52 19 28  0  0
43  0      0  92223     25  18027    0    0     0 61584 741537 1168119 53 19 28  0  0
46  0      0  92220     25  18030    0    0     0    36 739695 1167456 53 19 28  0  0
43  0      0  92216     25  18034    0    0     0   228 741992 1150333 52 20 29  0  0
38  0      0  92214     25  18036    0    0     0 147464 740589 1166289 52 19 28  0  0
41  0      0  92209     25  18042    0    0     0     0 737148 1162946 52 19 28  0  0
44  0      0  92207     25  18044    0    0     0  2480 741757 1173128 53 19 28  0  0
39  0      0  92205     25  18049    0    0     0    48 740404 1170644 53 19 28  0  0
52  0      0  92201     25  18051    0    0     0   292 739032 1159037 52 19 28  0  0
42  1      0  92198     25  18054    0    0     0 20072 740101 1165594 52 20 28  0  0
51  0      0  92194     25  18059    0    0     0 45464 738055 1165382 53 19 28  0  0
41  0      0  92190     25  18062    0    0     0     0 742838 1172377 53 19 28  0  0
45  0      0  92185     25  18067    0    0     0    36 740704 1174534 53 19 28  0  0
50  0      0  92182     25  18069    0    0     0    92 741691 1150716 52 19 28  0  0
42  0      0  92177     25  18073    0    0     0    28 740220 1168488 53 18 28  0  0
44  0      0  92174     25  18077    0    0     0     0 738818 1164769 53 19 28  0  0
46  0      0  92172     25  18080    0    0     0     0 740720 1169902 53 19 28  0  0
46  0      0  92166     25  18083    0    0     0 90404 592524 945810 42 15 43  0  0
45  0      0  92161     25  18087    0    0     0  3884 746310 1159898 52 19 28  0  0
49  0      0  92157     25  18090    0    0     0    20 747242 1177750 53 19 28  0  0
36  0      0  92152     25  18094    0    0     0     0 744477 1173832 53 19 28  0  0
46  0      0  92149     25  18098    0    0     0     0 746194 1172700 53 19 28  0  0
39  0      0  92147     26  18101    0    0     0 49768 745651 1177462 54 18 28  0  0
43  0      0  92143     26  18105    0    0     0   212 744968 1161110 53 19 28  0  0
43  0      0  92138     26  18109    0    0     0     0 743223 1176960 54 19 28  0  0
43  0      0  92135     26  18112    0    0     0 81920 745168 1173574 53 19 28  0  0
48  0      0  92132     26  18116    0    0     0     0 743174 1169255 53 19 28  0  0
48  0      0  92129     26  18120    0    0     0    68 592295 933445 42 15 43  0  0
41  0      0  92124     26  18123    0    0     0    76 740354 1162221 52 19 28  0  0
49  0      0  92120     26  18125    0    0     0     0 738456 1158291 53 19 28  0  0
39  0      0  92117     26  18129    0    0     0 147536 740735 1162479 52 20 28  0  0
49  0      0  92113     26  18133    0    0     0     0 737209 1165532 53 20 28  0  0
49  0      0  92111     26  18137    0    0     0    40 741185 1168133 54 19 28  0  0
45  0      0  92110     26  18140    0    0     0  4000 740693 1141945 52 20 28  0  0
42  0      0  92105     26  18144    0    0     0     0 741857 1168830 53 19 28  0  0
43  0      0  92102     26  18147    0    0     0     8 742546 1168867 54 18 28  0  0
43  0      0  92101     26  18150    0    0     0 147456 741941 1166646 53 19 28  0  0
41  1      0  92097     26  18154    0    0     0 64192 740052 1169040 53 19 28  0  0
48  0      0  92094     26  18158    0    0     0 27484 737224 1139511 52 20 28  0  0
47  0      0  92087     26  18162    0    0     0     0 740821 1165037 53 19 28  0  0
54  0      0  92059     26  18164    0    0     0     4 737109 1155098 53 19 27  0  0
38  0      0  92051     26  18170    0    0     0 147456 701847 1075069 55 20 25  0  0
35  0      0  92064     26  18174    0    0     0    44 723153 1125736 54 19 27  0  0
48  0      0  92056     26  18179    0    0     0 53008 734590 1134838 52 19 29  0  0
46  0      0  92053     26  18183    0    0     0     0 741595 1166891 53 19 28  0  0
46  0      0  92049     26  18186    0    0     0     0 740196 1170838 54 19 27  0  0
31  1      0  92045     26  18191    0    0     0 98304 741800 1170076 54 18 28  0  0
44  0      0  92043     26  18194    0    0     0 49188 733352 1173652 53 20 27  0  0
43  0      0  92028     26  18198    0    0     0   116 733522 1151497 53 20 27  0  0
44  0      0  92037     26  18201    0    0     0     0 730364 1137665 53 19 27  0  0
37  0      0  92035     26  18204    0    0     0     0 742348 1164945 53 19 28  0  0
44  0      0  92031     26  18208    0    0     0     0 739273 1165044 53 19 28  0  0
52  0      0  92028     26  18211    0    0     0 147888 739274 1164496 53 19 28  0  0
50  0      0  92024     26  18215    0    0     0   144 739684 1145210 53 19 28  0  0
50  0      0  92020     26  18219    0    0     0     0 742847 1167779 54 18 28  0  0
38  0      0  92016     26  18223    0    0     0     0 738079 1166580 53 19 28  0  0
36  0      0  92013     26  18226    0    0     0     0 742687 1171101 54 18 27  0  0
48  0      0  92009     26  18229    0    0     0 147500 741536 1166846 53 19 28  0  0
40  0      0  92006     26  18233    0    0     0 94600 740746 1147102 52 20 28  0  0
45  0      0  92001     26  18238    0    0     0     0 741119 1163851 53 19 28  0  0
48  0      0  91999     26  18241    0    0     0     0 740995 1167197 53 19 28  0  0
35  0      0  91996     26  18244    0    0     0     0 742235 1165666 53 19 28  0  0
44  1      0  91993     26  18248    0    0     0 49192 741392 1164506 53 19 28  0  0
43  1      0  91990     26  18251    0    0     0 124876 743695 1144639 52 19 29  0  0
48  0      0  91987     26  18255    0    0     0 24864 737759 1159383 52 20 28  0  0
44  0      0  91983     26  18258    0    0     0     0 740224 1164983 53 19 28  0  0
43  0      0  91980     26  18262    0    0     0     0 741742 1168140 54 19 27  0  0
18  0      0  91976     26  18267    0    0     0    36 737449 1162293 53 19 28  0  0
49  0      0  91973     26  18269    0    0     0 147576 741462 1148048 52 20 28  0  0
43  0      0  91969     26  18274    0    0     0     0 742408 1168332 54 19 27  0  0
43  0      0  91966     26  18277    0    0     0     0 738803 1164992 53 19 28  0  0
39  0      0  91963     26  18280    0    0     0     4 737891 1159372 52 19 28  0  0
43  0      0  91962     26  18283    0    0     0    40 741888 1166835 53 19 28  0  0
48  0      0  91958     26  18287    0    0     0 164144 738677 1145900 52 20 28  0  0
46  0      0  91955     26  18291    0    0     0     0 740956 1165789 53 19 28  0  0
44  0      0  91952     26  18295    0    0     0     0 741055 1166460 53 19 28  0  0
44  0      0  91948     26  18299    0    0     0     8 739414 1165698 53 19 28  0  0
46  0      0  91945     26  18301    0    0     0    48 743218 1165277 53 19 28  0  0
36  0      0  91941     26  18305    0    0     0   208 736320 1134425 51 20 29  0  0
47  0      0  91936     26  18309    0    0     0 239096 739799 1159730 52 19 28  0  0
45  0      0  91932     26  18312    0    0     0     0 742477 1167618 53 20 28  0  0
45  0      0  91928     26  18316    0    0     0     0 736442 1159690 52 20 28  0  0
47  0      0  91926     26  18319    0    0     0    76 737145 1157620 52 20 28  0  0
48  0      0  91921     26  18323    0    0     0    64 739999 1146323 52 19 29  0  0
50  0      0  91918     26  18326    0    0     0 197176 739590 1159797 52 19 28  0  0
50  0      0  91915     26  18330    0    0     0     0 740533 1166111 53 19 28  0  0
52  0      0  91911     26  18334    0    0     0     0 739776 1161328 52 20 28  0  0
42  0      0  91907     26  18338    0    0     0    60 590783 929545 41 16 43  0  0
39  0      0  91904     26  18341    0    0     0  4248 744434 1161062 52 19 29  0  0
41  1      0  91900     26  18345    0    0     0 114688 741817 1163511 53 19 28  0  0
14  0      0  91928     26  18349    0    0     0 32768 598242 996868 43 15 42  0  0
 0  0      0  91951     26  18349    0    0     0     0 41914 84357  3  1 96  0  0
 0  0      0  91952     26  18349    0    0     0    36  174  204  0  0 100  0  0
 0  0      0  91954     26  18349    0    0     0   276 7897 13403  0  0 99  0  0
 0  0      0  91954     26  18349    0    0     0     0 1911 3678  0  0 100  0  0
 0  0      0  91954     26  18349    0    0     0 147456  330  351  0  0 100  0  0

Edit:

The same pgbench benchmark as above gave about 68k when it was run on the same system as the database and select only 540k (507k from a remote system).

1

There are 1 best solutions below

1
On

Is CPU:75% user or user + system? Anyway, 75% of OS CPU with hyper-threading means all cores 100% busy.

You use simple protocol tpc-b which is stressing network and context switches. See https://franckpachot.medium.com/do-you-know-what-you-are-measuring-with-pgbench-d8692a33e3d6