Here is a minimal example:
import pandas as pd
import numpy as np
np.random.seed(0)
idx = pd.MultiIndex.from_product([[1,2,3], ['a', 'b', 'c'], [6, 7]])
df = pd.DataFrame(np.random.randn(18), index=idx)
selection = [(1, 'a'), (2, 'b')]
I would like to select all the rows of df whose index starts with any of the items in selection. That is, I would like to get the sub-DataFrame of df with the indices:
(1, 'a', 6), (1, 'a', 7), (2, 'b', 6), (2, 'b', 7)
What is the most straightforward/pythonian/pandasian way of doing this? What I found:
sel = [i[:2] in selection for i in df.index]  # `i` avoids shadowing the built-in id()
df.loc[sel]
You could use boolean indexing with isin on the index, restricted to the levels you want to match.
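A minimal sketch of this approach, assuming the trailing level is removed with droplevel so that each row is keyed by its first two levels and can be compared against the tuples in selection (the names mask and result are only illustrative):

```python
import numpy as np
import pandas as pd

np.random.seed(0)
idx = pd.MultiIndex.from_product([[1, 2, 3], ['a', 'b', 'c'], [6, 7]])
df = pd.DataFrame(np.random.randn(18), index=idx)
selection = [(1, 'a'), (2, 'b')]

# Drop the third level (position 2) so the index holds 2-tuples,
# then test each of them for membership in `selection`.
mask = df.index.droplevel(2).isin(selection)

# Boolean indexing keeps only the rows where the mask is True.
result = df[mask]
print(result.index.tolist())
# [(1, 'a', 6), (1, 'a', 7), (2, 'b', 6), (2, 'b', 7)]
```

This selects the same rows as the list comprehension in the question, but the membership test runs on the index as a whole instead of looping over it in Python.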
If you want to select on other levels, drop the unused leading levels before calling isin.
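As a sketch of that variant, assuming you want to match on the last two levels: drop level 0 and compare against 2-tuples of (second level, third level). The list other below is a hypothetical selection made up for illustration, not something from the question.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
idx = pd.MultiIndex.from_product([[1, 2, 3], ['a', 'b', 'c'], [6, 7]])
df = pd.DataFrame(np.random.randn(18), index=idx)

# Hypothetical selection on the last two levels.
other = [('a', 6), ('c', 7)]

# Drop the leading level (position 0) so the index holds
# (second level, third level) tuples, then test membership.
mask = df.index.droplevel(0).isin(other)
result = df[mask]
print(result.index.tolist())
# [(1, 'a', 6), (1, 'c', 7), (2, 'a', 6), (2, 'c', 7), (3, 'a', 6), (3, 'c', 7)]
```

droplevel also accepts a list of levels, so the same pattern should work for any subset of levels as long as the tuples you pass to isin follow the order of the levels that remain.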