I am trying to do an Insert/Select but I am getting a duplicate key error.
INSERT INTO dbo.DESTINATION_TABLE
(
    DocNumber,
    TenantId,
    UserId,
    OtherField
)
SELECT
    po.DocNumber,
    po.TenantId,
    po.CreatedById,
    po.OtherField
FROM dbo.SOURCE_TABLE po
WHERE
    po.DeletedById IS NULL AND
    NOT EXISTS(
        SELECT * FROM dbo.DESTINATION_TABLE poa
        WHERE
            poa.DocNumber = po.DocNumber AND
            poa.TenantId = po.TenantId
    )
DESTINATION_TABLE has a composite primary key of DocNumber and TenantId. DESTINATION_TABLE is empty at the time of running.
SOURCE_TABLE has a primary key of 'SourceTableId'.
But I keep getting an error:
Violation of PRIMARY KEY constraint 'PK_dbo.DESTINATION_TABLE'. Cannot insert duplicate key in object 'dbo.DESTINATION_TABLE'. The duplicate key value is (DOC-99, some-tenant).
I have also tried:
MERGE INTO DESTINATION_TABLE poa
USING (
    SELECT
        x.DocNumber,
        x.TenantId,
        x.CreatedById,
        x.OtherField
    FROM dbo.SOURCE_TABLE x
    WHERE x.DeletedById IS NULL
) po
ON (poa.TenantId = po.TenantId AND poa.DocNumber = po.DocNumber)
WHEN NOT MATCHED THEN INSERT (DocNumber, TenantId, UserId, OtherField)
    VALUES(po.DocNumber, po.TenantId, po.CreatedById, po.OtherField);
But I get the exact same result.
How is this happening? Is it because it is checking for 'NOT EXISTS' before running the insert? How do I fix this?
It seems that the flaw here is your understanding of how SQL works. SQL is a set-based language, so it works with the data in sets. For the above, this means that you define the data you want to INSERT with your SELECT, and then all the rows that you define are INSERTed. For your EXISTS, this means that it only checks against rows that exist in the table prior to any of the rows being inserted; it does not insert each row one at a time and validate the EXISTS prior to each row.

Let's take a very basic example with a single-column table:
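A minimal sketch of such a table, assuming the hypothetical name dbo.YourTable and a single int column ID as the primary key (both names are illustrative, not taken from your schema):

```sql
-- Hypothetical demo table: one column, which is also the primary key.
CREATE TABLE dbo.YourTable (ID int NOT NULL PRIMARY KEY);
```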
Now, let's say we want to insert the following dataset:
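As an illustration, assume the dataset is the four values below. The only detail that matters is that the value 2 appears twice; the other values are arbitrary:

```sql
-- Illustrative dataset: four rows, with the value 2 appearing twice.
SELECT V.ID
FROM (VALUES (1), (2), (2), (3)) V (ID);
```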
If you ran the following query, it would generate the same error you had:
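A sketch of such a query, assuming a hypothetical table dbo.YourTable whose only column, ID, is its primary key, and an illustrative dataset in which the value 2 appears twice. The guard at the top only creates the table when it is not already there, so the batch can be run on its own:

```sql
-- Create the hypothetical demo table if it does not exist yet.
IF OBJECT_ID(N'dbo.YourTable', N'U') IS NULL
    CREATE TABLE dbo.YourTable (ID int NOT NULL PRIMARY KEY);

-- The NOT EXISTS only sees rows that were in the table before this statement
-- started. With an empty table, all four rows qualify, including both rows
-- with ID = 2.
INSERT INTO dbo.YourTable (ID)
SELECT V.ID
FROM (VALUES (1), (2), (2), (3)) V (ID)
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.YourTable YT
                  WHERE YT.ID = V.ID);
-- Fails with a PRIMARY KEY violation; the duplicate key value is (2).
```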
This is because there are two rows with the value 2 for ID that are trying to be INSERTed. Both rows are trying to be INSERTed because there are no rows in the table with the value 2 at the time the INSERT occurs. If you want, you can validate which rows the statement tried to INSERT by commenting out the INSERT clause and running the SELECT on its own.

For your scenario, one method would be to use a CTE with ROW_NUMBER to limit the results to a single row for each primary key value.
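A sketch of that approach against your tables. The ORDER BY inside ROW_NUMBER decides which of the duplicate source rows is kept; ordering by SourceTableId (lowest wins) is an assumption here, so replace it with whatever rule fits your data:

```sql
WITH CTE AS (
    SELECT po.DocNumber,
           po.TenantId,
           po.CreatedById,
           po.OtherField,
           -- Number the rows within each (DocNumber, TenantId) pair.
           -- Assumption: the row with the lowest SourceTableId is the one to keep.
           ROW_NUMBER() OVER (PARTITION BY po.DocNumber, po.TenantId
                              ORDER BY po.SourceTableId) AS RN
    FROM dbo.SOURCE_TABLE po
    WHERE po.DeletedById IS NULL
)
INSERT INTO dbo.DESTINATION_TABLE (DocNumber, TenantId, UserId, OtherField)
SELECT CTE.DocNumber,
       CTE.TenantId,
       CTE.CreatedById,
       CTE.OtherField
FROM CTE
WHERE CTE.RN = 1 -- keep exactly one row per key
  AND NOT EXISTS (SELECT 1
                  FROM dbo.DESTINATION_TABLE poa
                  WHERE poa.DocNumber = CTE.DocNumber
                    AND poa.TenantId = CTE.TenantId);
```

With this, each (DocNumber, TenantId) pair reaches the INSERT at most once, so duplicates within SOURCE_TABLE itself can no longer violate the primary key.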