Imagine I have a dataframe like this, where each cell holds a list of elements as a single comma-separated string:
import pandas as pd

data = {'Col1': ["apple, banana, orange", "dog, cat", "python, java, c++"],
        'Col2': ["banana, lemon, blueberry", "bird, cat", "R, fortran"]
}
df = pd.DataFrame(data)
df
How can I create a Col3 containing the intersection of the elements in Col1 and Col2?
Expected output:
data = {'Col1': ["apple, banana, orange", "dog, cat", "python, java, c++"],
'Col2': ["banana, lemon, blueberry", "bird, cat", "R, fortran"],
'Col3': ["banana", "cat", pd.NA]
}
df = pd.DataFrame(data)
df
Using a list comprehension and set intersection:
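A minimal sketch of this approach, assuming the items in each cell are always separated by a comma followed by a single space. The `sorted` call is my addition so that rows with several shared items come out in a stable order; it is not required for the intersection itself.

```python
import pandas as pd

df = pd.DataFrame({
    'Col1': ["apple, banana, orange", "dog, cat", "python, java, c++"],
    'Col2': ["banana, lemon, blueberry", "bird, cat", "R, fortran"],
})

# For each row: split both strings on ', ', intersect the two sets,
# then join the shared items back into one comma-separated string.
df['Col3'] = [
    ', '.join(sorted(set(a.split(', ')) & set(b.split(', '))))
    for a, b in zip(df['Col1'], df['Col2'])
]
print(df)
```

With the sample data, Col3 holds "banana" and "cat" for the first two rows. The third row has no common element, so it gets an empty string rather than a missing value.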
If you want NA values for empty intersections:
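One way to do this, sketched under the same ', ' separator assumption: compute the intersection first and fall back to `pd.NA` when it is empty. The helper name `intersect_or_na` is hypothetical and only used here for readability.

```python
import pandas as pd

df = pd.DataFrame({
    'Col1': ["apple, banana, orange", "dog, cat", "python, java, c++"],
    'Col2': ["banana, lemon, blueberry", "bird, cat", "R, fortran"],
})

def intersect_or_na(a, b, sep=', '):
    """Hypothetical helper: joined intersection, or pd.NA if nothing is shared."""
    common = set(a.split(sep)) & set(b.split(sep))
    # An empty set is falsy, so empty intersections become pd.NA.
    return sep.join(sorted(common)) if common else pd.NA

df['Col3'] = [intersect_or_na(a, b) for a, b in zip(df['Col1'], df['Col2'])]
print(df)
```

Here the third row of Col3 is a missing value, so `df['Col3'].isna()` can be used to find the rows without a common element.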