Boolean indexing with MultiIndex df (pandas)

Question

Boolean indexing with MultiIndex df (pandas)

1.3k Views Asked by dan_g At 19 December 2016 at 16:17

I have a MultiIndex dataframe that I'm trying to index in to based on value ranges in my columns and the outermost index levels. So, using the example below, e.g. I'm trying to select the values from v2 that are index l2 where v1 > 12

I can achieve this using multiple index statements, such as: df[df.v1>12].loc['l2', 'v2'], but this seems less than ideal. Is there a way to condense this in to a single statement?

I've been trying to figure out how to use pd.IndexSlice, but can't seem to wrap my head around what the examples in the MultiIndex section of the docs are doing.

df = pd.concat([pd.DataFrame({'v1': range(10, 15), 'v2':range(5, 0, -1)}) 
                for i in range(2)], keys=['l1', 'l2'])

      v1  v2
l1 0  10   5
   1  11   4
   2  12   3
   3  13   2
   4  14   1
l2 0  10   5
   1  11   4
   2  12   3
   3  13   2
   4  14   1

Original Q&A

There are 1 best solutions below

**jezrael** · Answer 1 · 2016-12-19T16:23:36.423000

You can use slicers for selecting and then modified boolean indexing with loc for select column v2:

idx = pd.IndexSlice
df1 = df.loc[idx['l2', :], :]
print (df1)
      v1  v2
l2 0  10   5
   1  11   4
   2  12   3
   3  13   2
   4  14   1

print (df1.loc[df1.v1 > 12, 'v2'])
l2  3    2
    4    1
Name: v2, dtype: int32

Another solution with xs:

df1 = df.xs('l2')
print (df1)
   v1  v2
0  10   5
1  11   4
2  12   3
3  13   2
4  14   1

print (df1.loc[df1.v1 > 12, 'v2'])
3    2
4    1
Name: v2, dtype: int32

df1 = df.xs('l2', drop_level=False)
print (df1)
      v1  v2
l2 0  10   5
   1  11   4
   2  12   3
   3  13   2
   4  14   1

print (df1.loc[df1.v1 > 12, 'v2'])
l2  3    2
    4    1
Name: v2, dtype: int32

Solution with selecting first level of index by get_level_values, last if need remove first level use droplevel or reset_index:

df1 = df.loc[(df.v1 > 12) & (df.index.get_level_values(0) == 'l2'), 'v2']
df1.index = df1.index.droplevel(0)
#df1 = df1.reset_index(level=0, drop=True)
print (df1)
3    2
4    1
Name: v2, dtype: int32

Example with IndexSlice:

Select all values in first level and in second from 1 to 3 (thanks piRSquared):

idx = pd.IndexSlice
print (df.loc[idx[:, 1:3], :])
      v1  v2
l1 1  11   4
   2  12   3
   3  13   2
l2 1  11   4
   2  12   3
   3  13   2

Boolean indexing with MultiIndex df (pandas)

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in MULTI-INDEX

Trending Questions

Popular # Hahtags

Popular Questions