I have a dataframe df which i need to groupby multiple column based on a condition.
Input
Employee Name | Subjects | Billable | Hours | Date |
---|---|---|---|---|
Anu | Java | Yes | 8 | 01-03-2021 |
Anu | Python | Yes | 9 | 02-03-2021 |
Anu | SQL | No | 6 | 03-03-2021 |
Anu | React | Yes | 5 | 03-03-2021 |
Anu | .Net | No | 8 | 04-03-2021 |
Bala | SQL | No | 5 | 01-03-2021 |
Bala | Python | Yes | 4 | 01-03-2021 |
Bala | Java | Yes | 2 | 02-03-2021 |
Bala | .Net | No | 8 | 03-03-2021 |
Bala | React | Yes | 7 | 04-03-2021 |
Columns should be add in output file: Utilization, Billable Utilization, Non-Billable Utilization
- Calculation for Utilization columns : Utilization = df[Hours]/9*100
-> If billable column value as "YES"" - create column as Billable utilization
- Calculation for Billable Utilization in column: Billable Utilization = df[Hours]/9 * 100
-> If billable column value as "NO"" - create column as Non-Billable utilization
- Calculation for Non-Billing Utilization in column: Non-Billable Utilization = df[Hours]/9*100
Code
available =9
utilization= (df['Hours']/available*100).round(2)
df = df.assign(Utilization = utilization)
df1 = df.groupby(['Date','Employee Name'])['Utilization'].sum()
df['Billable'] = np.where(df['Billable'] == 'Yes', 'Billable Utilization','Non Billable Utilization')
df2 = (df.groupby(['Date','Employee Name']).agg({'Hours':sum}).div(available).mul(100)).round(2)
My output
You can "pivot"
Hours
intoBillable
andNon-Billable
columns usingpivot_table()
and then convert to utilization: