SSIS Lookup Alternatives

2.5k Views Asked by At

I have been using the LookUp Transformation in a SSIS package. The Input table has about 75 million rows, while the lookup table has about 70 million rows.

I need to capture both the matching and non-matching outputs, so that they can be processed as necessary.

I am running into a roadblock with the error "The attempt to add a row to data flow task buffer failed"

Inner join is a good alternative, however i don't think it can give two outputs i.e. matching and non-matching (which is why I did not use it in the first place. If that is not the case, please enlighten me!)

Partial Cache and No Cache are alternatives, however they are slower than the hour hand on a clock!

Can you please suggest any alternatives to the LookUp transformation that are able to give both matched and unmatched outputs?

Thanks in advance!

[Source_Table]
SELECT prop_code        --[varchar](6) NULL
    ,conf_nbr           --[varchar](20) NULL
    ,arrival_date       --[date] NULL
    ,system_source      --[varchar](5) NULL
    ,net_revenue        --[float] NULL
    ,net_room_nights    --[int] NULL
    ,srp_code           --[varchar](10) NULL
    ,corp_client_id     --[varchar](10) NULL
    ,rac_code           --[varchar](10) NULL
    ,ta_client_id       --[varchar](10) NULL
FROM Account_360_Stage_Prd_Reservations_CRS
ORDER BY prop_code
    ,conf_nbr

[Reference_Table]
SELECT DISTINCT property_code                                                   --[varchar](6) NOT NULL
    ,CAST(host_confirmation_number AS VARCHAR) AS 'host_confirmation_number'    --[bigint] NULL
FROM Account_360_Stage_Guest
ORDER BY property_code
    ,host_confirmation_number

[LookUp]
prop_code = property_code
conf_nbr  = host_confirmation_number
2

There are 2 best solutions below

1
On BEST ANSWER

You could use a LEFT OUTER JOIN as this will give you both 'matched' and 'unmatched' . Then redirect all the NULLs from joined table as 'unmatched'

4
On

I would probably use an inner join, but with a case statement to identify the two conditions. Then use a conditional split task to split the data based on the output of the case statement