Transform a dataframe from many rows to one row, with values as variables (pandas)


I have a dataframe with many rows per ID (df_source below).

How can I transform it into a dataframe with a single row per ID (df_target below)?

import pandas as pd 

# source dataframe
df_source = pd.DataFrame({
    'ID': ['A01', 'A01'],
    'Code': ['101', '102'],
    'amount for code': [10000, 20000],
    'count for code': [4, 3]
})

# target dataframe
df_target = pd.DataFrame({
    'ID': ['A01'],
    'Code101': [1],
    'Code102': [1],
    'Code103': [0],
    'amount for code101': [10000],
    'count for code101': [4],
    'amount for code102': [20000],
    'count for code102': [3],
    'amount for code103': [None],
    'count for code103': [None],
    'count for code': [None],
    'sum of amount': [30000],
    'sum of count': [7]
})

I tried pd.get_dummies, but it only indicates whether a code is present or not; it does not carry over the amount and count values.

How can I reshape the dataframe to get the target dataset?


There are 2 best solutions below


You can iterate through the rows of your existing dataframe and populate a new dataframe (df2) using .at or .loc. df2 is indexed by ID, which is now unique.

import pandas as pd

df = pd.DataFrame({
    'ID': ['A01', 'A01'],
    'Code': ['101', '102'],
    'amount for code': [10000, 20000],
    'count for code': [4, 3]
})

df2 = pd.DataFrame()
for idx, row in df.iterrows():
    for col in df.columns:
        # Every column except the keys becomes '<column><Code>' in df2
        if col != 'ID' and col != 'Code':
            df2.at[row['ID'], col + row['Code']] = row[col]

You can use pivot_table:

df_result = df.pivot_table(index='ID', columns='Code', values=['amount for code', 'count for code'])

This returns a dataframe with a multi-level column index, for example ('amount for code', '101'). You can then add other calculated columns, such as the sum of amount and the sum of count.
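As a rough sketch of those follow-up steps, the pivoted frame can be flattened and extended with the flag and total columns from the question. The all_codes list is an assumption (the question implies code 103 exists but has no rows for A01), and aggfunc='sum' is chosen here in place of the default mean; adjust both to your data.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['A01', 'A01'],
    'Code': ['101', '102'],
    'amount for code': [10000, 20000],
    'count for code': [4, 3],
})

# Assumption: the complete list of codes is known up front.
all_codes = ['101', '102', '103']
value_cols = ['amount for code', 'count for code']

# One row per ID; columns are a MultiIndex of (value name, Code).
wide = df.pivot_table(index='ID', columns='Code',
                      values=value_cols, aggfunc='sum')

# Add codes that never occur in the data as all-NaN columns.
full_columns = pd.MultiIndex.from_product([value_cols, all_codes],
                                          names=[None, 'Code'])
wide = wide.reindex(columns=full_columns)

# Per-ID totals across codes; sum() skips NaN by default.
sum_amount = wide['amount for code'].sum(axis=1)
sum_count = wide['count for code'].sum(axis=1)

# 1/0 flags marking which codes occur for each ID.
flags = wide['count for code'].notna().astype(int).add_prefix('Code')

# Flatten the MultiIndex into names such as 'amount for code101'.
flat = wide.copy()
flat.columns = [f'{value}{code}' for value, code in wide.columns]

df_result = pd.concat([flags, flat], axis=1)
df_result['sum of amount'] = sum_amount
df_result['sum of count'] = sum_count
df_result = df_result.reset_index()
```

The column order differs slightly from df_target; reorder with df_result[[...]] if the exact layout matters.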