Pandas groupby ignores column created with apply function


Executing the code below gives a strange result: the column inserted inside the function passed to apply is ignored when the DataFrame is stitched back together into the final result. Why is that, and how can I get the result I want?

# initialization and dataframe generation
import numpy as np
import pandas as pd
from IPython.display import display

index = pd.MultiIndex.from_tuples(list(zip(*[['one', 'one', 'two', 'two'],
                                             ['foo', 'bar', 'foo', 'bar']])))
df = pd.DataFrame(np.arange(12).reshape((3,4)), columns=index)
# actual code starts here
def new_zero(df):
    df.loc[:,(df.columns[0][0],'zero')] = 0      # MultiIndex column label necessary
    display(df) 
    return df

dd = df.groupby(level=0, axis=1).apply(new_zero)
dd


If I transpose the frame so that the groups run along the rows, it works (though level 0 of the index is duplicated in the result):

def new_zero(df):
    df.loc[(df.index[0][0],'zero'),:] = 0
    display(df)
    return df

dd = df.T.groupby(level=0, axis=0).apply(new_zero)
dd

There is 1 best solution below

I'm not sure what is up with your code, but I would take a different approach:

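  1. Loop through the level-0 column names (one and two).
  2. For each one, create a new multi-index column with df[col, 'zero'] = 0.
  3. Finally, use .sort_index() with axis=1 to bring each group's columns back together. The sort is alphabetical within a group (bar, foo, zero), so select the columns explicitly if you want foo, bar, zero: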

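import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(list(zip(*[['one', 'one', 'two', 'two'],
                                             ['foo', 'bar', 'foo', 'bar']])))
df = pd.DataFrame(np.arange(12).reshape((3,4)), columns=index)
# unique level-0 labels; set order is arbitrary, but the sort below fixes that
cols = list(set([col[0] for col in df.columns]))
for col in cols:
    df[col, 'zero'] = 0    # adds ('one', 'zero') and ('two', 'zero')
# regroup the columns by level 0 (alphabetical within each group)
df = df.sort_index(axis=1, level=[0, 1])
# pick the exact column order wanted within each group
df = df[[('one',  'foo'),
         ('one',  'bar'),
         ('one', 'zero'),
         ('two',  'foo'),
         ('two',  'bar'),
         ('two', 'zero')]]
df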
Out[1]: 
  one          two         
  foo bar zero foo bar zero
0   0   1    0   2   3    0
1   4   5    0   6   7    0
2   8   9    0  10  11    0
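
If you would rather not hard-code the final column order, the same idea can be sketched without the sort: take each level-0 group as its own sub-frame, append the zero column to a copy of it, and concatenate the pieces side by side. This is only a minimal sketch and not part of the approach above; the names pieces, block and result are made up for illustration. It also avoids groupby(..., axis=1), which recent pandas versions have deprecated.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('one', 'foo'), ('one', 'bar'),
                                   ('two', 'foo'), ('two', 'bar')])
df = pd.DataFrame(np.arange(12).reshape((3, 4)), columns=index)

level0 = df.columns.get_level_values(0)

pieces = []  # hypothetical name: one sub-frame per level-0 label
for key in level0.unique():
    # boolean mask keeps both column levels and the original column order
    block = df.loc[:, level0 == key].copy()
    block[(key, 'zero')] = 0          # new column goes last within its group
    pieces.append(block)

# stitch the groups back together along the column axis
result = pd.concat(pieces, axis=1)
print(result)
```

Because each new column is appended to its own group before concatenation, the columns come out as foo, bar, zero within one and two without any explicit reordering.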