I have the following DataFrame:
import pandas as pd
data = {
"cat": ['A'] * 3 + ['B'] * 2,
"val1": [10, 9, 12, 20, 25],
"val2": [1, 3, 2, 6, 7],
"idx": [0, 1, 5, 1, 2]
}
df = pd.DataFrame(data)
df.set_index('idx', inplace=True)
which gives
cat val1 val2
idx
0 A 10 1
1 A 9 3
5 A 12 2
1 B 20 6
2 B 25 7
I want to transform it into a DataFrame with a column MultiIndex
A B
val1 val2 val1 val2
idx
0 10 1 NA NA
1 9 3 20 6
2 NA NA 25 7
5 12 2 NA NA
Is there a way to do this without manually splitting and horizontally concatenating the table?
You can try with
df.pivot,swaplevel, andsort_index. Adddf.rename_axisto get rid of the added column name (i.e. 'cat').If
idxis not yet your index from the start, you can skipdf.set_index('idx', inplace=True)and simply use:df.pivot(index='idx', columns='cat').