Executing the code below gives a strange result. Essentially, the column inserted through apply gets ignored when the dataframe is stitched back together to give the final result. Why is that, and how can I get the result I want?
# initialization and dataframe generation
import numpy as np
import pandas as pd
from IPython.display import display
index = pd.MultiIndex.from_tuples(list(zip(*[['one', 'one', 'two', 'two'],
                                             ['foo', 'bar', 'foo', 'bar']])))
df = pd.DataFrame(np.arange(12).reshape((3,4)), columns=index)
# actual code starts here
def new_zero(df):
    df.loc[:,(df.columns[0][0],'zero')] = 0 # MultiIndex column label necessary
    display(df)
    return df
dd = df.groupby(level=0, axis=1).apply(new_zero)
dd
If I swap rows and columns, it works (though level 0 of the index is duplicated):
def new_zero(df):
    df.loc[(df.index[0][0],'zero'),:] = 0
    display(df)
    return df
dd = df.T.groupby(level=0, axis=0).apply(new_zero)
dd
I'm not sure what is up with your code, but I would take a different approach: iterate over the 'one' and 'two' column names, set df[col, 'zero'] = 0 for each, then call .sort_index() and pass axis=1.
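The code that originally accompanied this answer is not available here, so the following is only a minimal sketch of the approach described, not the answerer's exact code. It assumes the same df as in the question and reads the top-level names from the columns with get_level_values(0).unique(), which is my choice for illustration rather than something the answer specifies.

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
index = pd.MultiIndex.from_tuples(
    [('one', 'foo'), ('one', 'bar'), ('two', 'foo'), ('two', 'bar')]
)
df = pd.DataFrame(np.arange(12).reshape((3, 4)), columns=index)

# Add a 'zero' sub-column under each top-level label ('one', 'two').
# The labels are collected once, before any column is added.
for col in df.columns.get_level_values(0).unique():
    df[col, 'zero'] = 0

# New columns are appended at the end; sorting along axis=1 moves each
# 'zero' column next to the other columns of its group.
df = df.sort_index(axis=1)
```

Note that sorting reorders the existing sub-columns too (bar, foo, zero within each group), so this fits only if the original column order does not need to be preserved.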