I have a df as follows:
df_in
G1 G2 TPE QC
A S1 td 2
A S1 ts 4
A S2 td 6
A S2 ts 3
B S1 td 20
B S1 ts 40
B S2 td 60
B S2 ts 30
C S1 td 90
D S2 ts 7
So the output should be grouped by Columns G1 & G2 and for each such group, do a row wise ratio for the column QC like (ts/td) where values are td and ts for the column TPE and rename the new variable in column TPE as ratio. It should also contain the original rows as it is. Also it should be noted that for the TPE column some groups may not have both ts and td values. In such cases there will be no ratio or the ratio should be kept as blank.
So the output should be this:
df_out
G1 G2 TPE QC
A S1 td 2
A S1 ts 4
A S2 td 6
A S2 ts 3
B S1 td 20
B S1 ts 40
B S2 td 60
B S2 ts 30
C S1 td 90
D S2 ts 7
A S1 ratio 2
A S2 ratio 0.5
B S1 ratio 2
B S2 ratio 0.5
C S1 ratio
D S2 ratio
I tried the following, but its omitting the blank values for group C & D with blank ratios:
def calculate_ratio(group):
td_row = group[group['TPE'] == 'td']
ts_row = group[group['TPE'] == 'ts']
if not td_row.empty and not ts_row.empty:
ratio = ts_row['QC'].values[0] / td_row['QC'].values[0]
return pd.DataFrame({'G1': [group['G1'].iloc[0]],
'G2': [group['G2'].iloc[0]],
'TPE': ['ratio'],
'QC': [ratio]})
return pd.DataFrame()
grouped = df_in.groupby(['G1', 'G2']).apply(calculate_ratio).reset_index(drop=True)
df_out = pd.concat([df_in, grouped], ignore_index=True)
Any help will be immensely appreciated.
Code
out:
Intermedate
tmp: