I have read a bunch of information that processing a table RBAR is terribly inefficient, and while I know that was not what SQL was optimized for, I'm interested to see if there is a way that I can accomplish this task and avoid the loop or a set of nested CASE statements (since I can only nest 10 at a time).
Problem: I have a master inventory file (mif table) that contains all of our part numbers in it with a flag to denote if the part has been superseded/replaced by another part. There is another table that contains what that part was replaced with. The problem comes in that level of the supersession is a self-contained unit, if you will. For example, if part 123 was replaced with 456 but 456 was replaced with 789, there is nothing so far that tells us directly that part 123 was ultimately replaced with part 789.
I'm trying to create a table that tells records the initial number and the end-of-chain part number that ultimately replaced part 123.
I'm sorry this is long, but here are two simplified versions of the queries that both work (a LOOP and a CASE statement) but both have limitations. Is there a way to do this both effectively and will allow me to expand past 10 CASE statements?
Two tables being used: mif and dc_dim:
mif structure:
mfr_code | part_number | superseded
ABC | 123 | Y
DEF | 456 | Y
GHI | 789 | N
dc_dim structure:
MFR_CODE | PART_NUM | REPLACED | REPLACED_MFR_CODE | REPLACED_PART_NUM
ABC | 123 | Y | DEF | 456
DEF | 456 | Y | GHI | 789
GHI | 789 | N | (null) | (null)
First, using the LOOP:
/***Create dbo.MifSuperseded since such a large amount of records.
Dropped at end.
This table contains the next link for each part number that has been superseded.***/
SELECT id = IDENTITY(int,1,1)
,a.mfr_code, a.part_number
,b.REPLACED, b.REPLACED_MFR_CODE, b.REPLACED_PART_NUM
INTO dbo.MifSuperseded
FROM dbo.mif a
LEFT OUTER JOIN dc_dim b
ON a.mfr_code = b.MFR_CODE AND a.part_number = b.PART_NUM
LEFT OUTER JOIN mif c
ON b.REPLACED_MFR_CODE = c.mfr_code AND b.REPLACED_PART_NUM = c.PART_NUM
WHERE a.superseded = N'Y';
--Declare variables
DECLARE @id int = (SELECT MIN(id) FROM dbo.MifSuperseded);
DECLARE @Replaced nvarchar(3);
DECLARE @NumOfSupers int;
DECLARE @MfgCode nvarchar(5);
DECLARE @PartNum nvarchar(30);
DECLARE @ReplacedMfgCode nvarchar(5);
DECLARE @ReplacedPartNum nvarchar(30);
--Outside loop that goes row-by-row
WHILE @id IS NOT NULL
BEGIN
SET @ReplacedMfgCode = (SELECT REPLACED_MFR_CODE FROM dbo.MifSuperseded WHERE id = @id);
SET @ReplacedPartNum = (SELECT REPLACED_PART_NUM FROM dbo.MifSuperseded WHERE id = @id);
SET @Replaced = (SELECT REPLACED FROM dbo.MifSuperseded WHERE mfr_code = @MfgCode AND part_number = @PartNum);
SET @NumOfSupers = 1;
--Inside loop that discovers end-of-chain replacement
WHILE @Replaced = 'Y'
BEGIN
SET @MfgCode = @ReplacedMfgCode;
SET @PartNum = @ReplacedPartNum;
SET @ReplacedMfgCode = (SELECT REPLACED_MFR_CODE
FROM dbo.MifSuperseded
WHERE mfr_code = @MfgCode
AND part_number = @PartNum);
SET @ReplacedPartNum = (SELECT REPLACED_PART_NUM
FROM dbo.MifSuperseded
WHERE mfr_code = @MfgCode
AND part_number = @PartNum);
SET @Replaced = (SELECT REPLACED
FROM dbo.MifSuperseded
WHERE mfr_code = @ReplacedMfgCode
AND part_number = @ReplacedPartNum);
SET @MfgCode = @ReplacedMfgCode;
SET @PartNum = @ReplacedPartNum;
SET @NumOfSupers = @NumOfSupers + 1;
END
--Inserts final superseded part number into table
INSERT INTO dbo.SUPERSESSION_EOC
SELECT mfr_code
,part_number
,@NumOfSupers AS NUMBER_OF_SUPERSESSIONS
,@MfgCode AS REPLACED_MFR_CODE
,@PartNum AS REPLACED_PART_NUM
FROM dbo.MifSuperseded
WHERE id = @id
SELECT @id = MIN(id) FROM dbo.MifSuperseded WHERE id > @id;
END;
DROP TABLE dbo.MifSuperseded;
Now the use of the CASE statements:
/***Create dbo.MifSuperseded since such a large amount of records.
Dropped at end.
This table contains the next link for each part number that has been superseded.***/
SELECT id = IDENTITY(int,1,1)
,a.mfr_code, a.part_number
,b.REPLACED, b.REPLACED_MFR_CODE, b.REPLACED_PART_NUM
INTO dbo.MifSuperseded
FROM dbo.mif a
LEFT OUTER JOIN dc_dim b
ON a.mfr_code = b.MFR_CODE AND a.part_number = b.PART_NUM
LEFT OUTER JOIN mif c
ON b.REPLACED_MFR_CODE = c.mfr_code AND b.REPLACED_PART_NUM = c.PART_NUM
WHERE a.superseded = N'Y';
/***Select Into dbo.mif_supersession_final_replacement.***/
INSERT INTO dbo.SUPERSESSION_EOC
SELECT mif_1.mfr_code
,mif_1.part_number
,(CASE WHEN mif_10.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_9.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_8.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_7.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_6.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_5.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_4.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_3.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_2.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_1.REPLACED_MFR_CODE IS NULL THEN ''
ELSE 1 END
ELSE 2 END
ELSE 3 END
ELSE 4 END
ELSE 5 END
ELSE 6 END
ELSE 7 END
ELSE 8 END
ELSE 9 END
ELSE 10 END) AS NUMBER_OF_SUPERSESSIONS
,(CASE WHEN mif_10.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_9.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_8.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_7.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_6.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_5.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_4.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_3.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_2.REPLACED_MFR_CODE IS NULL THEN
CASE WHEN mif_1.REPLACED_MFR_CODE IS NULL THEN ''
ELSE mif_1.REPLACED_MFR_CODE END
ELSE mif_2.REPLACED_MFR_CODE END
ELSE mif_3.REPLACED_MFR_CODE END
ELSE mif_4.REPLACED_MFR_CODE END
ELSE mif_5.REPLACED_MFR_CODE END
ELSE mif_6.REPLACED_MFR_CODE END
ELSE mif_7.REPLACED_MFR_CODE END
ELSE mif_8.REPLACED_MFR_CODE END
ELSE mif_9.REPLACED_MFR_CODE END
ELSE mif_10.REPLACED_MFR_CODE END) AS REPLACED_MFR_CODE
,(CASE WHEN mif_10.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_9.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_8.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_7.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_6.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_5.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_4.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_3.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_2.REPLACED_PART_NUM IS NULL THEN
CASE WHEN mif_1.REPLACED_PART_NUM IS NULL THEN ''
ELSE mif_1.REPLACED_PART_NUM END
ELSE mif_2.REPLACED_PART_NUM END
ELSE mif_3.REPLACED_PART_NUM END
ELSE mif_4.REPLACED_PART_NUM END
ELSE mif_5.REPLACED_PART_NUM END
ELSE mif_6.REPLACED_PART_NUM END
ELSE mif_7.REPLACED_PART_NUM END
ELSE mif_8.REPLACED_PART_NUM END
ELSE mif_9.REPLACED_PART_NUM END
ELSE mif_10.REPLACED_PART_NUM END) AS REPLACED_PART_NUM
FROM dbo.MifSuperseded AS mif_1
LEFT OUTER JOIN dbo.MifSuperseded AS mif_2
ON mif_1.REPLACED_MFR_CODE = mif_2.mfr_code AND mif_1.REPLACED_PART_NUM = mif_2.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_3
ON mif_2.REPLACED_MFR_CODE = mif_3.mfr_code AND mif_2.REPLACED_PART_NUM = mif_3.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_4
ON mif_3.REPLACED_MFR_CODE = mif_4.mfr_code AND mif_3.REPLACED_PART_NUM = mif_4.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_5
ON mif_4.REPLACED_MFR_CODE = mif_5.mfr_code AND mif_4.REPLACED_PART_NUM = mif_5.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_6
ON mif_5.REPLACED_MFR_CODE = mif_6.mfr_code AND mif_5.REPLACED_PART_NUM = mif_6.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_7
ON mif_6.REPLACED_MFR_CODE = mif_7.mfr_code AND mif_6.REPLACED_PART_NUM = mif_7.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_8
ON mif_7.REPLACED_MFR_CODE = mif_8.mfr_code AND mif_7.REPLACED_PART_NUM = mif_8.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_9
ON mif_8.REPLACED_MFR_CODE = mif_9.mfr_code AND mif_8.REPLACED_PART_NUM = mif_9.part_number
LEFT OUTER JOIN dbo.MifSuperseded AS mif_10
ON mif_9.REPLACED_MFR_CODE = mif_10.mfr_code AND mif_9.REPLACED_PART_NUM = mif_10.part_number
DROP TABLE dbo.MifSuperseded;
Both of these will correcly insert a record for part ABC 123 as follows in the SUPERSESSION_EOC table:
mfr_code | part_number | NUMBER_OF_SUPERSESSIONS | REPLACED_MFR_CODE | REPLACED_PART_NUM
ABC | 123 | 2 | GHI | 789
The insert query using the CASE statements took just over 2 minutes to insert 358,000 records. The insert using the loop ran for 20 minutes and had only inserted 9,000 records when I stopped the query. Unless I nest nested CASE statements to get above 10 statements at a time, the insert using CASE statements is limited to 10 supersessions, while the loop insert query is unusable as is.
Any thoughts on how I could go about making this process expandable and effecient? Any help at all will be greatly appreciated. And again, I'm sorry it's so long, but wanted to give you all that I was working with.
EDIT: I just tried using a recursive CTE with an OUTER APPLY, which worked, but still was very slow:
/***Create dbo.MifSuperseded since such a large amount of records.
Dropped at end.
This table contains the next link for each part number that has been superseded.***/
SELECT id = IDENTITY(int,1,1)
,a.mfr_code, a.part_number
,b.REPLACED, b.REPLACED_MFR_CODE, b.REPLACED_PART_NUM
INTO dbo.MifSuperseded
FROM dbo.mif a
LEFT OUTER JOIN dc_dim b
ON a.mfr_code = b.MFR_CODE AND a.part_number = b.PART_NUM
LEFT OUTER JOIN mif c
ON b.REPLACED_MFR_CODE = c.mfr_code AND b.REPLACED_PART_NUM = c.PART_NUM
WHERE a.superseded = N'Y';
WITH cteMifSupersededRecursive AS
(
SELECT 1 AS NUMBER_OF_SUPERSESSIONS
,mfr_code, part_number
,REPLACED_MFR_CODE, REPLACED_PART_NUM
FROM dbo.MifSuperseded
UNION ALL
SELECT NUMBER_OF_SUPERSESSIONS + 1, cur.mfr_code, cur.part_number
,nxt.REPLACED_MFR_CODE, nxt.REPLACED_PART_NUM
FROM cteMifSupersededRecursive cur
INNER JOIN dbo.MifSuperseded nxt
ON cur.REPLACED_MFR_CODE = nxt.mfr_code AND cur.REPLACED_PART_NUM = nxt.part_number
)
SELECT a.mfr_code, a.part_number
,b.NUMBER_OF_SUPERSESSIONS ,b.REPLACED_MFR_CODE, b.REPLACED_PART_NUM
INTO dbo.SUPERSESSION_EOC
FROM dbo.MifSuperseded a
CROSS APPLY
(SELECT TOP 1 REPLACED_MFR_CODE, REPLACED_PART_NUM, NUMBER_OF_SUPERSESSIONS
FROM cteMifSupersededRecursive
WHERE mfr_code = a.mfr_code AND part_number = a.part_number
ORDER BY NUMBER_OF_SUPERSESSIONS DESC
) b
DROP TABLE dbo.MifSuperseded;