How to keep the 1st record of duplicate records

31 Views Asked by At

I know there are many questions over this and many solutions, most of them specific to each individual's case. However, in my case I cannot seem to get just the sets of records that are to be deleted, to leave only the 1st ID or the lowest-value ID.

This is a sample of my table:

enter image description here

ID          COL_A                               COL_B   what I need to do
24556552    KD PT PERSONAL TRAINING, AND 58 MOU 5691212 
24556924    KD PT PERSONAL TRAINING, AND 58 MOU 5691212 <-- remove
24556869    KHETSHI DHARAMSHI & CO LTD,         5690443 
24556951    KHETSHI DHARAMSHI & CO LTD,         5690443 <-- remove
24528435    KINNY TINT LDA                      5689791 
24528436    KINNY TINT LDA                      5689791 <-- remove
24528437    KINNY TINT LDA                      5689791 <-- remove
24528438    KINNY TINT LDA                      5689791 <-- remove
24528439    KINNY TINT LDA                      5689791 <-- remove

Is there a way to link this table back to itself, so I can only get the first occurrence of it, keep it and delete the rest? I am not familiar with PARTITION OVER etc.

Thanks

1

There are 1 best solutions below

3
Tim Biegeleisen On BEST ANSWER

If you just want to view your data sans the duplicates, use ROW_NUMBER():

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY COL_A, COL_B ORDER BY ID) rn
    FROM yourTable
)

SELECT ID, COL_A, COL_B
FROM cte
WHERE rn = 1;

If you really need to delete the duplicate records, then use exists logic:

DELETE
FROM yourTable t1
WHERE EXISTS (
    SELECT 1
    FROM yourTable t2
    WHERE t2.COL_A = t1.COL_A AND
          t2.COL_B = t1.COL_B AND
          t2.ID < t1.ID
);