Analyzing performance of an insert in postgres


I have a statement which is inserting a bunch of rows (doesn't really matter what or where for this question) into a Postgres DB and it isn't as fast as I'd like. I can run an explain query to see what it's doing and I get something like this:

 Insert on dpdb.datapoints  (cost=0.00..6917.76 rows=44184 width=1786) (actual time=15558.623..15558.623 rows=0 loops=1)
   Buffers: shared hit=34670391 read=98370 dirtied=48658 written=39875
   I/O Timings: read=704.525 write=242.915
   ->  Seq Scan on public.fred  (cost=0.00..6917.76 rows=44184 width=1786) (actual time=0.018..197.853 rows=44184 loops=1)
         Output: nextval('datapoints_id_seq'::regclass), fred.company_id, fred.tag, ... lots more columns ...
         Buffers: shared hit=44186 read=6253 dirtied=1
         I/O Timings: read=29.176
 Planning time: 0.110 ms
 Trigger RI_ConstraintTrigger_c_14845718 for constraint datapoints_tag_source_fkey: time=236.677 calls=44184
 Trigger RI_ConstraintTrigger_c_14845723 for constraint datapoints_sheet_type_fkey: time=536.367 calls=44184
 Trigger RI_ConstraintTrigger_c_14845728 for constraint datapoints_subcontext_fkey: time=178.200 calls=44184
 Trigger RI_ConstraintTrigger_c_14845733 for constraint datapoints_source_type_fkey: time=467.619 calls=44184
 Trigger RI_ConstraintTrigger_c_14845738 for constraint datapoints_doc_type_fkey: time=302.256 calls=44184
 Trigger RI_ConstraintTrigger_c_14845743 for constraint datapoints_comment_type_fkey: time=88.740 calls=44184
 Trigger RI_ConstraintTrigger_c_14845748 for constraint datapoints_preferred_dpid_fkey: time=33.313 calls=44184
 Execution time: 17432.381 ms
(16 rows)

This is great in that I see the cost of the various triggers and the select part of the query, but if I add up all those parts it comes to about 2s (roughly 10% of the total time). It doesn't tell me much about the actual insert other than that it took about 17 seconds. I'm guessing that a fair part of the cost is in updating indexes, but I don't know this for sure. How can I get a breakdown of which indexes are being updated and the relative cost of each? Is this even possible?

(I originally asked this on dba.stackexchange.com but didn't get any answers - it falls halfway between a DBA and a dev question, so I figured I'd ask here.)


Those are foreign key constraint triggers. For each inserted row, every foreign key performs a SELECT against the referenced table to verify that the referenced row exists - hopefully via an index on the referenced column, assuming one is present.
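As a sketch of what each `RI_ConstraintTrigger` is doing internally, the per-row check is roughly equivalent to a query like the following (table and column names here are illustrative, since the referenced tables aren't shown in the question):

```sql
-- Approximate form of the internal referential-integrity check Postgres
-- runs for each inserted row, for each foreign key. The row lock prevents
-- the referenced row from being deleted or re-keyed concurrently.
SELECT 1
FROM ONLY public.tag_sources x        -- hypothetical referenced table
WHERE x.id = $1                        -- $1 = the inserted row's FK value
FOR KEY SHARE OF x;
```

With 7 foreign keys and 44,184 rows, that is over 300,000 of these lookups for a single INSERT statement.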

The time discrepancy is because the per-trigger time only shows the time spent executing the trigger function and excludes I/O wait. In other words, it's not showing you wall-clock time, but (approximately) CPU time.

You can use pg_stat_statements to try to look into the queries against those tables and find the foreign key ones, but at the end of the day this looks pretty normal for working with foreign keys.
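A hedged sketch of how you might surface those checks with `pg_stat_statements` (the extension must be in `shared_preload_libraries`; note the timing column is `total_time` on Postgres 12 and earlier but `total_exec_time` on 13+):

```sql
-- One-time setup (requires pg_stat_statements preloaded at server start):
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- The internal FK checks use FOR KEY SHARE, which makes them easy to spot:
SELECT query, calls, total_time      -- total_exec_time on PG 13+
FROM pg_stat_statements
WHERE query ILIKE '%FOR KEY SHARE%'
ORDER BY total_time DESC
LIMIT 10;
```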

If you need these inserts to execute faster, the most direct option is to remove the foreign key constraints - at the cost, of course, of losing the integrity checks they provide.
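For a bulk load, a common middle ground is to drop a constraint before the insert and re-add it afterwards: the re-add validates all rows in one pass, which is typically cheaper than per-row trigger checks. A sketch using one of the constraints from the plan above (the column and referenced table are assumptions, since they aren't shown in the question):

```sql
-- Drop the FK before the bulk insert:
ALTER TABLE dpdb.datapoints DROP CONSTRAINT datapoints_tag_source_fkey;

-- ... perform the bulk INSERT ...

-- Re-create it; NOT VALID takes the lock briefly without scanning,
-- then VALIDATE CONSTRAINT checks existing rows without blocking writes:
ALTER TABLE dpdb.datapoints
  ADD CONSTRAINT datapoints_tag_source_fkey
  FOREIGN KEY (tag_source) REFERENCES public.tag_sources (id)  -- hypothetical names
  NOT VALID;
ALTER TABLE dpdb.datapoints VALIDATE CONSTRAINT datapoints_tag_source_fkey;
```

The trade-off is that the table is briefly unprotected, and the re-validation will fail if the load introduced any orphaned rows.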