I've been tasked with profiling a postgresql database. The first requirement is to see how fast records can be added, with all possible external bottlenecks removed, in order to find our theoretical limit.
At first I created a csv file with sample data and read it in with the COPY function. Now, all records are added via a function update_or_add()
. Is it possible to use COPY along with update_or_add()
or is there a better solution that I haven't considered?
Rather than "for each row call update_or_add() on the row", your preferred approach should be to generalize "update_or_add" to work on all rows in a set.
COPY
the data from the external source into aTEMPORARY
orUNLOGGED
table. Then write a query that merges the data from the copied table into your main table, likeupdate_or_add()
does but for all rows at once. If concurrent modification of the main table is possible this may require locking the main table withLOCK TABLE main_table IN EXCLUSIVE MODE
so that other transactions can onlySELECT
from it while the merge is going on.Without your schema or knowing what
update_or_add
does it's hard to say more. At a guess it's an upsert/merge function, in which case you can do its job massively more efficiently by locking the main table and doing a whole-set merge.