I have click-stream data. Below, I have provided sample data for one user:
user_id page time duration
1 A 12:15 5
1 B 12:21 3
1 C 12:25 22
1 D 12:48 5
1 B 12:54 2
1 A 12:57 5
What I want to do, per user, is this: if the duration on a page is more than 20, the pages that follow it should be identified as a different session, and the session number should be displayed in a separate column. For example, for user #1:
user_id page time duration session
1 A 12:15 5 1
1 B 12:21 3 1
1 C 12:25 22 1
1 D 12:48 5 2
1 B 12:54 2 2
1 A 12:57 5 2
The same should be done for all users, creating sessions if the duration on a page is more than 20, and then naming them incrementally starting from 1. I honestly could not find any example to start from. I appreciate any guidance.
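As a starting point, the rule as stated can be sketched directly in plain Python. This is only a sketch under a few assumptions: the rows are tuples already sorted by user and time, the threshold is 20 (the question also mentions 22), numbering restarts at 1 for each user, and the name `assign_sessions` is made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (user_id, page, time, duration)
rows = [
    (1, "A", "12:15", 5),
    (1, "B", "12:21", 3),
    (1, "C", "12:25", 22),
    (1, "D", "12:48", 5),
    (1, "B", "12:54", 2),
    (1, "A", "12:57", 5),
]

def assign_sessions(rows, threshold=20):
    """Append a session number to each row (hypothetical helper).

    Assumes rows are sorted by user_id and then by time.
    """
    out = []
    for _, user_rows in groupby(rows, key=itemgetter(0)):
        session = 1  # numbering restarts for every user
        for row in user_rows:
            out.append(row + (session,))
            # A long stay closes the current session;
            # the next page of this user opens a new one.
            if row[3] > threshold:
                session += 1
    return out

for row in assign_sessions(rows):
    print(*row)
```

The long page itself stays in the session it closes, which is what the expected table shows for page C.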
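We can calculate the cumulative sum of duration for each user, integer-divide it by 22, and add 1. The quotient increases on the row that crosses the threshold, so to match the expected output it has to be shifted down by one row, with the first row set to 1. For the sample data, the output would be the same session column as in the question: 1, 1, 1, 2, 2, 2.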
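A minimal sketch of this cumulative-sum idea in plain Python follows. The answer's own code is not shown here, so the function name `sessions_by_cumsum`, the tuple layout, and the one-row shift are assumptions made for illustration.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Sample rows from the question: (user_id, page, time, duration)
rows = [
    (1, "A", "12:15", 5),
    (1, "B", "12:21", 3),
    (1, "C", "12:25", 22),
    (1, "D", "12:48", 5),
    (1, "B", "12:54", 2),
    (1, "A", "12:57", 5),
]

def sessions_by_cumsum(rows, divisor=22):
    """Append a session number derived from the running total of duration.

    Hypothetical helper; assumes rows are sorted by user_id and then by time.
    """
    out = []
    for _, user_rows in groupby(rows, key=itemgetter(0)):
        user_rows = list(user_rows)
        # Running total of duration within this user.
        running = accumulate(r[3] for r in user_rows)
        buckets = [total // divisor + 1 for total in running]
        # Shift down one row so the page that crosses the threshold
        # stays in the session it closes; the first row is session 1.
        shifted = [1] + buckets[:-1]
        out.extend(r + (s,) for r, s in zip(user_rows, shifted))
    return out

for row in sessions_by_cumsum(rows):
    print(*row)
```

This matches the expected table for the sample, but it is not the same rule as "duration on a single page is more than 20": several short pages whose durations add up past the divisor also start a new session. Four pages of 10 each, for example, end with the last page in session 2.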