I'm having issues with semi_join from dplyr. I would like to do a semi join of dfA against dfB. Both dfA and dfB contain duplicate values of x. I want to pull back every row of dfA whose x has at least one match in dfB, including duplicate rows in dfA.
dfA          dfB          >>  dfC
x y z        x g              x y z
1 r 5        1 lkm            1 r 5
1 b 4        1 pok            1 b 4
2 4 e        2 jij            2 4 e
3 5 r        2 pop            3 5 r
3 9 g        3 hhg            3 9 g
4 3 0        5 trt
What I would like to get is the dfC output above: because there is AT LEAST 1 match on x in dfB, all rows of dfA with that x are pulled back.
semi_join(dfA, dfB, by = "x")
dfC
x y z
1 r 5
2 4 e
3 5 r
inner_join(dfA, dfB, by = "x")
x y z g
1 r 5 lkm
1 r 5 pok
1 b 4 lkm
1 b 4 pok
2 4 e jij
2 4 e pop
3 5 r hhg
3 9 g hhg
Neither of these gives me the right result. Any help would be great! Thanks in advance.
Not sure why you need a join: just use %in%.
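A minimal sketch of the %in% approach, with the data frames rebuilt by hand from the tables in the question. Storing y and z as character is an assumption on my part, since those columns mix letters and digits; the original column types are not shown.

```r
# Data reconstructed from the question's tables (column types are assumed).
dfA <- data.frame(
  x = c(1, 1, 2, 3, 3, 4),
  y = c("r", "b", "4", "5", "9", "3"),
  z = c("5", "4", "e", "r", "g", "0"),
  stringsAsFactors = FALSE
)
dfB <- data.frame(
  x = c(1, 1, 2, 2, 3, 5),
  g = c("lkm", "pok", "jij", "pop", "hhg", "trt"),
  stringsAsFactors = FALSE
)

# Keep every row of dfA whose x appears at least once in dfB$x.
# Duplicate x values in dfA are all kept; duplicates in dfB do not multiply rows.
dfC <- dfA[dfA$x %in% dfB$x, ]
print(dfC)
```

With dplyr loaded, filter(dfA, x %in% dfB$x) should give the same rows. Separately, the dplyr documentation describes semi_join() as returning all rows of x that have a match in y, so the deduplicated output shown in the question may be specific to the version used there; I have not checked that.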