I have two data sources: 1) ATG_Data (Data Source 1) and 2) Text Record. After joining the two sources, the output is not what I expect.
For example, there are two records (present in both sources). Both records have these three properties, but they definitely have other properties as well.
            Item Id    Vendor Id    Ranking (P_CommPtp)
Record 1    703595     2560         10
Record 2    703595     5638         11
But the final output after joining (left join) is:
            Item Id    Vendor Id    Ranking (P_CommPtp)
Record 1    703595     2560         10
Record 2    703595     5638         11
Record 3    703595     2560         10 11
Record 4    703595     5638         10 11
Two more records are getting created, with the ranking merged.
In the pipeline, we are caching the data based on the following index.
ATG Data  - 1) Item Number
            2) Vendor Id
Text File - 1) Item Number
We are using a left join.
I am not able to understand why two more records are getting created. We are indexing at the SKU level, and these three properties don't signify uniqueness of the records. Can you please help me with this?
In the Record Assembler screen in the pipeline diagram (where you configure the join type, etc.), I would experiment with the two check boxes "Multi-sub records" and "Remove Duplicate property values". I think the first one might help with this.
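To show how a merged ranking like "10 11" could arise, here is a minimal plain-Python sketch. It is not Endeca code and I have not verified that Forge behaves this way. It assumes the join effectively matches on Item Number alone (the only key the Text File cache is indexed on), so each left record matches every right record with the same item and collects all of their ranking values. The record layout and the `left_join` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical records modeled on the example in the question.
left = [
    {"item_id": "703595", "vendor_id": "2560", "ranking": ["10"]},
    {"item_id": "703595", "vendor_id": "5638", "ranking": ["11"]},
]
right = [
    {"item_id": "703595", "vendor_id": "2560", "ranking": ["10"]},
    {"item_id": "703595", "vendor_id": "5638", "ranking": ["11"]},
]


def left_join(left, right, key_fields, dedupe_values=True):
    """Left join that merges the multi-valued 'ranking' property.

    This is an illustrative model only, not how Forge is implemented.
    """
    # Index the right-hand source by the chosen join key.
    index = defaultdict(list)
    for rec in right:
        index[tuple(rec[f] for f in key_fields)].append(rec)

    out = []
    for l in left:
        matches = index.get(tuple(l[f] for f in key_fields), [])
        # Copy the left record so the inputs are not mutated.
        merged = {k: (list(v) if isinstance(v, list) else v) for k, v in l.items()}
        # Every matching right record contributes its ranking values.
        for m in matches:
            for value in m["ranking"]:
                if not (dedupe_values and value in merged["ranking"]):
                    merged["ranking"].append(value)
        out.append(merged)
    return out


by_item = left_join(left, right, ["item_id"])
by_item_and_vendor = left_join(left, right, ["item_id", "vendor_id"])
```

In this sketch, joining on the item alone leaves both ranking values on each output record, while joining on item plus vendor keeps one ranking per record. If something similar is happening in your pipeline, it would be worth checking whether the join key in the Record Assembler matches the keys the two caches are indexed on.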