SQL - Count of Sessions and transactions between session

437 Views Asked by At

I am trying to write a HiveQL (or even ANSI SQL) query that can answer the following:

I have a database with events (transactions), each event is stored with a user_id and a trans_time (transaction time). A user_id can have unlimited events, even possible to have multiple events with the same trans_time.

If I establish an activity window of time (i.e. 10 seconds), how many windows had how many events within that window? I'm thinking the result would be something like this (obviously without the words, just numbers)...
100 windows had only 1 event
50 windows had 2 events
.
.
1 window that had 30 events

To describe the first result...There were 100 instances in the data set where only 1 event took place within the 10 second window.

Is there a way to pull another metric out of the same data that says, 50 user_ids had only 1 event
25 user_ids had 2 events
.
.
1 user_id had 30 events

To describe the first result...There were 50 user_ids that had only 1 event within the 10 second window.

I hope this is not too nebulous. As always, thank you!!

1

There are 1 best solutions below

0
On BEST ANSWER

if your windows are static, for example :
window 1 - 12:00:00 - 12:00:09
window 2 - 12:00:10 - 12:00:19 etc.

You can do this :

-- this counts number of windows per with same number of events
select events_in_window,count(*) windows 
from (
    -- this counts events in each window
    select window ,count(*) events_in_window 
    from (
        -- this rounds all the transactions to windows
        select user_id,from_unixtime(floor(unix_timestamp(trans_time)/10)*10) window
        from table) a 
    group by window) b 
group by events_in_window 

Note the "10" in the trans_time "rounding". you can change it to any size of windows you need.