Replacing with a copy of a table with subset of data, instead of deleting from table - best method

279 Views Asked by At

We have a table with 800M rows and 93 columns that needs historical data removed. The data to be removed makes up > 95%. So far, attempts to delete in batches have failed, estimating at least one week to complete. The existing indexes are the main cause of the slowness. Removing the indexes and rebuilding after deletion has also proved too slow in testing.

It was decided that we should create a new table containing only the data we need, before renaming the old table as backup and renaming new to original.

The problem is that there are at least a hundred keys, constraints and indexes.

We are considering generating a script from Object Browser, renaming all the object names using find/replace, creating the identically structured table, populating with data, renaming original as backup, then renaming the new table and all the objects to those contained by the original table.

This is a critical table on a live database with only a few overnight hours to complete the process. So many manual renamings make us nervous as there cannot be any problems when the client begins trading.

What is the most reliable method? Is there an automated process to undertake this procedure? Any recommendations appreciated.

2

There are 2 best solutions below

3
On

I would do the following during your downtime (i.e. the database is not used by anyone):

  1. Script out the current table, let's call it [X], and rename it to [Y].
  2. Run the script in step 1 and create a new table, still as [X]
  3. Insert into this [X] the records you need from [Y], it should be <5% of your 800M rows. (it is still ~40M rows because 800M X 5% = 40M). Of course, to quicken this step, you may consider switching recovery model to SIMPLE and/or disable triggers/NC indexes etc. Once the work is done, restore back to original settings.
  4. Once this is done, set up a regular job that will maintain the new [X] table by moving un-needed records to [Y] every day.

But the final solution would be to upgrade your sql server 2008 to SQL Server 2016+, in SQL Server 2016 SP1, even standard edition has the partition capability and that partition method will make your life much easier (another topic though)

0
On

Being nervous of the huge number of renames required and the critical nature of the table, I have been able to test the below script to rename all constraints, indexes, triggers, defaults etc. so will create a copy script, rename original, then run copy script:

select o.name as [Name],  o1.type as [Type], 'EXEC sp_rename N''' + o1.name + ''', '''  + o1.name + N'_OLD'''
from sys.objects o
inner join sys.objects o1 on o1.parent_object_id = o.object_id
where o.name = 'Product_history'
union
select o.name as [Name], 'I' as [Type], 'EXEC sp_rename N''PRODUCT_HISTORY.' + i.name + ''', '''  + i.name + N'_OLD'''
from sys.objects o
inner join sys.indexes i on i.object_id = o.object_id
where o.name = 'Product_history' and i.name is not null  AND i.is_primary_key = 0
order by [Type], [Name]


select o.name as [Name],  o1.type as [Type], 'EXEC sp_rename N''' + o1.name + ''', '''  + LEFT(o1.name,LEN(o1.name)-4) + N''''
from sys.objects o
inner join sys.objects o1 on o1.parent_object_id = o.object_id
where o.name = 'Product_history'
union
select o.name as [Name], 'I' as [Type], 'EXEC sp_rename N''PRODUCT_HISTORY.' + i.name + ''', '''  + LEFT(i.name,LEN(i.name)-4) + N''''
from sys.objects o
inner join sys.indexes i on i.object_id = o.object_id
where o.name = 'Product_history' and i.name is not null  AND i.is_primary_key = 0
order by [Type], [Name]