I have this table:
| Date | User ID | Calls | Hours |
|---|---|---|---|
| 3-9-2022 | 2234 | 10 | 01:06:19 |
| 3-9-2022 | 2234 | 4 | 00:32:54 |
I want to group it into this (resetting the index as well):
| Date | User ID | Calls | Hours |
|---|---|---|---|
| 3-9-2022 | 2234 | 14 | 01:39:13 |
I am stuck on aggregating the hours. I read that you first have to convert them to timedelta, but I am not sure how to put that function into the groupby (perhaps with a lambda?). In addition, Calls should be aggregated with a normal sum. Here is my current code:
df.groupby(['Date','UserID'])['Calls','Hours'].apply(lambda x : x.astype(int).sum()).reset_index()
Data type:
Date - Datetime
UserID - Int
Calls - Int
Hours - Datetime (Time)
Any help appreciated!
ANSWER
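import pandas as pd
# astype(str) lets the conversion work whether Hours holds strings or datetime.time objects
df['Hours'] = pd.to_timedelta(df['Hours'].astype(str))
df_group = df.groupby(['Date', 'User ID']).agg({'Calls': 'sum', 'Hours': 'sum'}).reset_index()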
This converts the Hours column to timedeltas and aggregates both columns with sum. The summed timedelta can then be converted back to HH:MM:SS format.
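A minimal sketch of the full round trip, using the two rows from the question. It assumes Hours holds datetime.time objects (as the question states) and keeps Date as a plain string for simplicity. `format_hms` is a hypothetical helper written for this example, not a pandas function.

```python
from datetime import time

import pandas as pd

# Sample data copied from the question's table.
# Assumption: Date is kept as a string here; grouping works the same for datetimes.
df = pd.DataFrame({
    "Date": ["3-9-2022", "3-9-2022"],
    "User ID": [2234, 2234],
    "Calls": [10, 4],
    "Hours": [time(1, 6, 19), time(0, 32, 54)],
})

# Step 1: convert the time values to timedeltas.
# Going through str first handles datetime.time objects as well as "HH:MM:SS" strings.
df["Hours"] = pd.to_timedelta(df["Hours"].astype(str))

# Step 2: group and sum both columns; as_index=False keeps the group keys as columns.
out = df.groupby(["Date", "User ID"], as_index=False).agg({"Calls": "sum", "Hours": "sum"})


def format_hms(td):
    """Format a Timedelta as HH:MM:SS (hypothetical helper; hours may exceed 24)."""
    total = int(td.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Step 3: convert the summed timedeltas back to HH:MM:SS strings.
out["Hours"] = out["Hours"].apply(format_hms)
print(out)
```

Formatting from the total number of seconds, rather than slicing the timedelta's string form, keeps the output correct when a user's total goes past 24 hours, where the string form would otherwise start with "1 days".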