I've a dataframe like this,
Name COST Timestamp
0 c 8 2023-09-20 15:14:46
1 a 8 2023-09-20 15:14:48
2 c 9 2023-09-20 15:14:55
3 b 10 2023-09-20 15:15:00
4 c 4 2023-09-20 15:15:02
5 a 9 2023-09-20 15:15:04
6 b 3 2023-09-20 15:15:12
7 a 3 2023-09-20 15:15:17
8 c 6 2023-09-20 15:15:20
9 c 6 2023-09-20 15:15:29
What I want is to create a new dataframe. It will look for the sequence of a,b,c (order doesn't matter here) and sum_cost is sum of cost of a,b,c and it's time_stamp will be the last time_stamp of a,b,c whichever you get. Incase of multiple entries of a same name say like b,c,c,b,b,a then take the last c then then the last b and a.
An detailed example:
output should be from the given dataframe will be constructed like this, for first one it will take 'a' of index 1, 'c' of index 2 (as this came after 'c' of index 0) and 'b' of index 3, the sum_cost will be 27 for this group and timestamp will be 2023-09-20 00:14:26 which is of 'b' of index 3 as it came last for this group. And next group it will take 'c' of index 4, 'a' of index 5 and 'b' of index 6 and sum_cost will be 16 and timestamp will be 2023-09-20 00:06:51 which is of 'b' of index 6 as it came last of this sequence.
The output will look like this,
sum_cost Timestamp
0 27 2023-09-20 15:15:00
1 16 2023-09-20 15:15:12
Please help me with this. Thank you!
You need to build a custom grouper, for that a loop is required, here using a custom function:
Output:
Intermediate with group: