One to Many merge row level

1.2k Views Asked by At

I have run into a data problem that I sense many have encountered. I Currently have a data set which contains transaction information. Based on the Transaction_Number I will find how long each person involved in transaction spent on their part.

The final output looks like this

 Transaction_Number        Created On             Created_By      Time_Diff

  27327920232            2014-12-08 03:29:08      jake            NaN
  27323232336            2017-04-28 18:39:03       James           0
  27323232336            2017-04-28 18:26:23      Tony            760.0

Now the problem that I am facing is that I am missing one column. Which I solved by querying from the DB and pulled a field named "Product_Number". I am joining the two tables on Transaction_Number. Here is where the problem arises the second data frame name D has the Transaction_Number listed only once matching the Product_Number. The master data frame has multiple occurrences of the Transactions. Therefore when I run the following code my result doesn't make sense. Outputting only 4 matches based on 12K+ possibilities.

         X                       D                        D
   Transaction_Number     Transaction_Number          Product_Number
     27327920232            27327920232                Cw-xs-re-89
     27327920232            72732332323                pw-ts-sf-89
     27327920232            32338033383                Cw-ns-tn-86

    results = pd.merge(X, D, how='inner', on= "Transaction_Number")

What can I do to find the correct matchings. I have also attempted a left join, any type of tips will help. - Thanks

1

There are 1 best solutions below

0
On

Are you looking for outer merge?

results = pd.merge(X, D, how='outer', on= "Transaction_Number")