I have a big table in SQL Server that contains 136,651,894 rows. One month ago, 7,668,309 new rows were inserted with an error in one field. At the time of the insert I created a copy of the table as a precaution, but nobody saw the error and I dropped the copy one week after the load process.
To delete the rows I created a procedure that selects the maximum value per myId from the original table (the error was that extra zeros were appended to the value field, so the maximum value is the wrong record and has to be deleted). Then, in a backup table, I look up those records by myId and value and delete them.
For example, one row is:

ID   myId   Value         Other fields...
2    2345   25948238400   Other values...
The procedure is:
CREATE PROCEDURE [dbo].[P_DELETE_ROWS]
AS
BEGIN
    DECLARE @myId  VARCHAR(22)
    DECLARE @value VARCHAR(20)

    -- The max [value] per myId is the record with the extra zeros,
    -- i.e. the one that has to be deleted from the backup table
    DECLARE c_max CURSOR FOR
        SELECT myId, MAX([value]) AS maxValue
        FROM t1Original
        GROUP BY myId
        ORDER BY maxValue DESC

    OPEN c_max
    FETCH NEXT FROM c_max INTO @myId, @value

    WHILE @@FETCH_STATUS = 0
    BEGIN
        PRINT 'Deleting ' + @myId + ' and value ' + @value
        DELETE FROM [t1OriginalCopy] WHERE myId = @myId AND [value] = @value

        FETCH NEXT FROM c_max INTO @myId, @value
    END

    CLOSE c_max
    DEALLOCATE c_max
END
The problem is that it takes too much time; it has now been running for over a day...
How can I improve the performance of this process?
Use DENSE_RANK instead of a CURSOR. Try this:
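Something along these lines should work (a sketch using the question's t1Original and t1OriginalCopy tables; it assumes, as the question states, that the maximum [value] per myId is the bad record):

WITH ranked AS (
    -- Rank values per myId, highest first: rnk = 1 is the record
    -- with the extra zeros that has to be removed
    SELECT myId,
           [value],
           DENSE_RANK() OVER (PARTITION BY myId ORDER BY [value] DESC) AS rnk
    FROM t1Original
)
DELETE c
FROM [t1OriginalCopy] AS c
INNER JOIN ranked AS r
        ON r.myId    = c.myId
       AND r.[value] = c.[value]
WHERE r.rnk = 1;

A single set-based DELETE like this lets the optimizer pick a join strategy instead of running one lookup and one delete per myId, which is where the cursor loses all its time.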
If the table (t1OriginalCopy) will be used in parallel by some other operation, then you may have to split the delete into batches to avoid holding locks for the whole run; a sketch follows.
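One possible shape for that batched delete (a sketch: the #bad_rows temp table and the 10,000 batch size are illustrative choices, not part of the original suggestion; tune the batch size for your system):

-- Materialise the bad keys once so the 136-million-row table
-- is not re-scanned and re-ranked on every batch
SELECT myId, MAX([value]) AS [value]
INTO #bad_rows
FROM t1Original
GROUP BY myId;

DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    -- Delete at most 10,000 matching rows per iteration so each
    -- transaction stays short and locks are released between batches
    DELETE TOP (10000) c
    FROM [t1OriginalCopy] AS c
    INNER JOIN #bad_rows AS b
            ON b.myId    = c.myId
           AND b.[value] = c.[value];

    SET @rows = @@ROWCOUNT;  -- 0 once no matching rows remain
END

DROP TABLE #bad_rows;

Keeping each batch in its own short transaction lets concurrent readers and writers of t1OriginalCopy get in between iterations instead of blocking until the whole delete finishes.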