How to get rows within specific sequence of value?

1k Views Asked by At

I need to get sessions where I one sequence of values in specific order.

Now I have this query which returns sessions for each user from raw data

select user_id, page, happened_at 
from db as u1 
where  exists 
  (select 1 from db as u2 
   where u1.user_id = u2.user_id 
   and u2.happened_at < (u1.happened_at + interval '1 hour') )
ORDER BY user_id, happened_at
LIMIT 100

I need to get result set in format which is result of subquery on group of sessions I get by request above.

Subquery conditions is get a session where user opened the page in specific order. E.g. page value can be - start, work, buy, landing, - the user should go trough each page in this order during the session. He\she still can meet any other pages between each of them, but have to go through all this pages in this order during the session.

How is it possible to get the output which meet such condition? How can I get understanding the page value changes in specific order during the session?

Dataset on first query run:

    user_id     page    happened_at     
3,230 start 2017-03-01 15:10
3,230 work 2017-03-01 15:16
3,230 start 2017-03-01 15:16 
3,230 preview  2017-03-01 17:12
3,230 work 2017-03-01 17:12
3,230 buying 2017-03-01 17:13
3,230 landing 2017-03-01 17:51
3,230 smt else 2017-03-01 17:52
3,230 any page 2017-03-01 17:56
3,230 lanidng 2017-03-01 18:03

Output (what I am looking now)

    user_id     page    happened_at     
3,230 start 2017-03-01 15:16 
3,230 preview  2017-03-01 17:12
3,230 work 2017-03-01 17:12
3,230 buying 2017-03-01 17:13
3,230 landing 2017-03-01 17:51

Final result

user_id session_start session_end
3,230   2017-03-01 15:16 2017-03-01 18:03
1

There are 1 best solutions below

3
On

use window function for 1st output

select user_id,page,happened_at from 
(
select user_id,page,happened_at, row_number() over(partition by user_id,page order by happened_at desc) rn from table
) t where rn=1

then take max() and min() of this query output

    with t1 as 

     (
    select user_id,page,happened_at from 
        (
        select user_id,page,happened_at, row_number() over(partition by user_id,page order by happened_at desc) rn from table
        ) t where rn=1
    ) select user_id,min(happened_at) session_start,
max(happened_at) as session_end from t1 group by user_id