The following quote is from here: https://pganalyze.com/blog/5mins-postgres-15-merge-vs-insert-on-conflict
The comment that he essentially made is that the downside of MERGE's handling of concurrency is that when you concurrently INSERT, so at the same time as you're executing the MERGE statement, there is another INSERT going on, then MERGE might not notice that. MERGE would go into its INSERT logic, and then it would get a unique violation.
I recall that, back when he initially designed the INSERT ON CONFLICT feature, that because of this constraint of how MVCC works in Postgres, you cannot implement MERGE with its general syntax and options to work in the same reliable way. If you want the generality of MERGE, you have to accept the fact that you might get unique constraint violations, when there are concurrent inserts, versus with INSERT ON CONFLICT, the way it's designed with its speculative insertions, guarantees that you either get an INSERT or an UPDATE and that is true even if there are concurrent inserts. You might want to choose INSERT ON CONFLICT if you need the guarantee.
Pretend then that we're doing a really huge merge
that will take a long time, and in the process of the merge, there happens this concurrent insert described in the quote above that causes unique constraint violation during the insert logic. What exactly happens in this case? Does it lead to the whole merge
query failing and throwing a unique constraint exception?
Or, perhaps, if the unique constraint violation does occur for a row or rows in the final insert logic, only those rows that have the unique constraint violation are just silently ignored, and the rest of the insert proceeds as normal?
I would reckon that postgres' merge
functionality doesn't support this (the latter - silently ignoring unique constraint violations), but I do think this would be a valuable addition to the merge
functionality, because there will definitely be users (me currently) who wouldn't want their entire merge insert (or update) to fail just because one or a couple of rows had a unique constraint violation.
I did run some tests using the following code, and I discovered, as expected, that if a unique violation exception occurs during the insert of the
merge
command, this unique violation exception bubbles up and tanks the wholemerge
query.That means, for example, if you're doing a long running batch
merge
operation, themerge
might perform 99% of the work, but then if even the last insert row causes a unique violation exception, the wholemerge
query gets rolled back and all the work lost with it.In many foreseeable uses cases for
merge
I would expect this to be very disappointing and far less than ideal. In my opinion, it would be much better formerge
to offer the ability to simply ignore the rows that cause unique violation exceptions instead of tanking the whole query.I am of the opinion that this additional functionality (to ignore rows the cause unique violation exception during the insert query) should be added to both the sql standard and to postgres.
Code used for testing
In one sql process run the following:
In a second sql process, run the following immediately after running the
merge
command in the first sql process: