find duplicated rows and update them

47 Views Asked by At

I have a lot of duplicated rows in my database i want edit all of them, so i write a query like that to find all duplicated rows

SELECT a.journal_id
     , a.article_id
     , a.submission_file_id
     , a.editor_file_id
  FROM articles_migration a
  JOIN
     ( SELECT submission_file_id 
         FROM articles_migration 
        GROUP 
           BY submission_file_id 
       HAVING COUNT(article_id) >1
     )temp 
    ON a.submission_file_id = temp.submission_file_id 

this query return 1170 record that are duplicated! but now i want remove which records that have same journal_id for this purpose

 journal_id     article_id  submission_file_id  editor_file_id  
16                845            1848           2402
16                6409           1848           2402
54                5317           1848           2402

in first step, i should remove one of the rows that journal_id has the same value for example row 1 or 2 and in step 2 i should have result like this:

 journal_id     article_id  submission_file_id  editor_file_id  
16                845            1848           2402
54                5317           1848           2402

so i want update one of the row with this formula: submission_file_id value => (value * 10,000)

and finally i should have result like this

 journal_id     article_id  submission_file_id  editor_file_id  
16                845            1848           2402
54                5317           18480000       2402

thanks in advance

1

There are 1 best solutions below

0
On

this is right answer

DELETE a FROM articles_migration as a INNER JOIN articles_migration as b ON a.submission_file_id = b.submission_file_id and a.journal_id = b.journal_id WHERE a.article_id < b.article_id

above code remove duplicate rows and now i should update on of them

update articles_migration  as t1 inner join articles_migration as t2 on (t1.submission_file_id = t2.submission_file_id) and t1.article_id <> t2.article_id set t1.submission_file_id=(t1.submission_file_id * 1000);