I have two large data frames: gpr1 (94,991 rows, 12 variables) and eff1 (946 rows, 21 variables). I need to join them where VesselID matches and where POSITION_DATETIME from gpr1 falls between TripStartDateTime and TripEndDateTime from eff1. I am struggling to work out how to do this, as I usually join only on an identifying key with no other conditions.

I tried to add my script in all the correct ways (indenting the code by 4 spaces, using the code toolbar button, and the CTRL+K keyboard shortcut), but I can't for the life of me get the question to accept my code, so I will include it in a comment. In summary, the script I used produces the following warning:

Warning: Detected an unexpected many-to-many relationship between x and y.

I end up with that warning and a data frame that has 20,075,981 rows and 32 variables.

Any tips on adding a script to Stack questions would also be great! I was also unable to attach a reproducible data set without Stack treating it as badly formatted script, but I am happy to forward it on.
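For what it's worth, 32 columns is what a join on VesselID alone would give (12 + 21 - 1), which would pair every position with every trip for that vessel and could explain the row count. Below is a minimal sketch of a join that also applies the date-range condition, assuming the script is R with dplyr 1.1.0 or later (the warning text suggests dplyr, but the script itself is not shown). The toy data and the TripID column are made up for illustration; only VesselID, POSITION_DATETIME, TripStartDateTime and TripEndDateTime come from the question.

```r
library(dplyr)

# Toy stand-ins for gpr1 and eff1 (hypothetical values, not the real data)
gpr1 <- tibble(
  VesselID = c(1, 1, 2, 2),
  POSITION_DATETIME = as.POSIXct(
    c("2023-01-01 10:00:00", "2023-01-05 12:00:00",
      "2023-01-02 08:00:00", "2023-02-01 00:00:00"),
    tz = "UTC"
  )
)

eff1 <- tibble(
  VesselID = c(1, 1, 2),
  TripID = c("A", "B", "C"),  # hypothetical column, for illustration only
  TripStartDateTime = as.POSIXct(
    c("2023-01-01 00:00:00", "2023-01-04 00:00:00", "2023-01-01 00:00:00"),
    tz = "UTC"
  ),
  TripEndDateTime = as.POSIXct(
    c("2023-01-02 00:00:00", "2023-01-06 00:00:00", "2023-01-03 00:00:00"),
    tz = "UTC"
  )
)

# Match on VesselID AND require the position time to lie within the trip window
# (between() inside join_by() is inclusive at both ends by default)
joined <- inner_join(
  gpr1, eff1,
  by = join_by(
    VesselID,
    between(POSITION_DATETIME, TripStartDateTime, TripEndDateTime)
  )
)

print(joined)
```

In this toy data the last position (vessel 2 on 2023-02-01) falls outside every trip, so inner_join() drops it; left_join() would keep it with NA in the trip columns. If trips for the same vessel overlap in time, a position can still match more than one trip, so it is worth checking the row count after the join.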
Join data frames with multiple conditions
24 Views Asked by Angela Russell At