I have data on when, for how long, and on which channel people listen to the radio. I need to create a variable called session that groups all entries that occur while the radio is on. Because the data may contain some errors, I would like to treat two entries as part of the same session if less than five minutes passes between the end of one channel period and the start of the next. Hopefully a brief example will clarify.
obs  Entry_date  Entry_time  duration(in secs)  channel
1    01/01/12    23:25:21    6000               2
2    01/03/12    01:05:64    300                5
3    01/05/12    12:12:35    456                5
4    01/05/12    16:45:21    657                8
I want to create the variable session so that the result looks like this:
obs  Entry_date  Entry_time  duration(in secs)  channel  session
1    01/01/12    23:25:21    6000               2        1
2    01/03/12    01:05:64    300                5        1
3    01/05/12    12:12:35    456                5        2
4    01/05/12    16:45:21    657                8        3
To define a session I need to use entry_time (and the date, if a period runs from 11pm into the next morning), so that if entry_time + duration + 5 minutes < entry_time of the next channel, then the session changes. This has been killing me: simple arrays won't do the trick, or at least my attempt using arrays has not worked. Thanks in advance.
Aside from the comments I made in the OP, here's how I would do it using a SAS data step. I've changed the date and time values for row 2 to what I suspect they should be (in order to get the same result as in the OP). This avoids having to perform a self join, which is likely to be slow on a large dataset.
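I've used the DIF and LAG functions, so care needs to be taken if you're adding extra code (particularly IF statements). Both functions keep their own queue of previous values, and that queue is only updated when the function actually executes, so calling them inside a conditional branch can return a value from an earlier row than the one you expect.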
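A minimal sketch of this kind of data step, not necessarily the exact code. It makes several assumptions: the input dataset is called have, row 2 is taken to be 01/02/12 01:05:54 (a guess, chosen only for illustration), and the names want, start_dt, end_dt and prev_end are made up for the example. It uses only LAG (not DIF), and calls it unconditionally on every row so the queue stays in step with the data.

```sas
/* Example data. The row 2 date and time are an assumed correction. */
data have;
    input obs entry_date :mmddyy8. entry_time :time8. duration channel;
    format entry_date mmddyy8. entry_time time8.;
    datalines;
1 01/01/12 23:25:21 6000 2
2 01/02/12 01:05:54 300 5
3 01/05/12 12:12:35 456 5
4 01/05/12 16:45:21 657 8
;
run;

/* LAG looks at the previous row, so the rows must be in time order. */
proc sort data=have;
    by entry_date entry_time;
run;

data want;
    set have;

    /* Combine date and time into one datetime value, so periods that
       run past midnight are handled without special cases. */
    start_dt = dhms(entry_date, 0, 0, entry_time);
    end_dt   = start_dt + duration;          /* duration is in seconds */

    /* Call LAG on every row, outside any IF, so its queue is always
       the end time of the immediately preceding row. */
    prev_end = lag(end_dt);

    if _n_ = 1 then session = 1;
    /* More than 5 minutes (300 seconds) after the previous period ended:
       start a new session. The sum statement also retains session. */
    else if start_dt - prev_end > 300 then session + 1;

    format start_dt end_dt datetime20.;
    drop prev_end;
run;

proc print data=want noobs;
    var obs entry_date entry_time duration channel session;
run;
```

With these values the sessions come out as 1, 1, 2, 3, matching the table in the question. Note that each row is only compared with the end of the row immediately before it, so if channel periods can overlap you may want to retain the latest end time seen so far instead.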