I have the following dataframe.
       padel start_time  end_time  duration
38  Padel 10   08:00:00  09:00:00        60
40  Padel 10   10:00:00  11:30:00        90
42  Padel 10   10:30:00  12:00:00        90
44  Padel 10   11:00:00  12:30:00        90
46  Padel 10   11:30:00  13:00:00        90
49  Padel 10   16:00:00  17:30:00        90
51  Padel 10   16:30:00  18:00:00        90
53  Padel 10   17:00:00  18:30:00        90
55  Padel 10   17:30:00  19:00:00        90
57  Padel 10   18:00:00  19:30:00        90
59  Padel 10   18:30:00  20:00:00        90
61  Padel 10   19:00:00  20:30:00        90
63  Padel 10   19:30:00  21:00:00        90
65  Padel 10   20:00:00  21:30:00        90
67  Padel 10   20:30:00  22:00:00        90
I want to merge the overlapping time spans so that only the longest continuous spans remain. The output I want should look like this:
       padel start_time  end_time  duration
38  Padel 10   08:00:00  09:00:00        60
40  Padel 10   10:00:00  13:00:00       180
49  Padel 10   16:00:00  22:00:00       360
I don't care about the duration column; I can compute that myself. But how do I merge the time spans that overlap? Thanks.
1. Use `shift()` to create groups: flag a row if its `start_time` is greater than the `end_time` of the row above, i.e. it does not overlap the previous row and so starts a new span. `shift()` produces `NaN` in the first row because there is no row above it, so use `fillna` with '24:00:00' to give the first comparison a defined value; no time of day can be greater than 24:00:00.
2. This returns a boolean series of `True` and `False` (i.e. `1` and `0`, respectively), so you just take the cumulative sum with `cumsum`. Every row in the same run of overlapping spans gets the same number.
3. This creates the `grp` object, which we can include in `groupby`.

Full code with input dataframe:
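A minimal sketch of the three steps, and not necessarily the answer's original listing. It assumes the times are zero-padded 'HH:MM:SS' strings (so string comparison matches chronological order) and that the rows are already sorted by `start_time`. The helper column names `grp` and `row` are introduced here for illustration only.

```python
import pandas as pd

# Input dataframe from the question.
df = pd.DataFrame(
    {
        "padel": ["Padel 10"] * 15,
        "start_time": ["08:00:00", "10:00:00", "10:30:00", "11:00:00", "11:30:00",
                       "16:00:00", "16:30:00", "17:00:00", "17:30:00", "18:00:00",
                       "18:30:00", "19:00:00", "19:30:00", "20:00:00", "20:30:00"],
        "end_time": ["09:00:00", "11:30:00", "12:00:00", "12:30:00", "13:00:00",
                     "17:30:00", "18:00:00", "18:30:00", "19:00:00", "19:30:00",
                     "20:00:00", "20:30:00", "21:00:00", "21:30:00", "22:00:00"],
        "duration": [60] + [90] * 14,
    },
    index=[38, 40, 42, 44, 46, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67],
)

# Step 1: end_time of the row above; the first row has none, so fill with '24:00:00'.
prev_end = df["end_time"].shift().fillna("24:00:00")
# True where a row starts after the previous row has ended (a new span begins).
new_span = df["start_time"] > prev_end

# Step 2: the cumulative sum of the flags labels each run of overlapping rows.
grp = new_span.cumsum()

# Step 3: group on the labels and keep the first start and the latest end.
merged = (
    df.assign(grp=grp, row=df.index)
      .groupby(["padel", "grp"])
      .agg(row=("row", "first"),
           start_time=("start_time", "first"),
           end_time=("end_time", "max"))
      .reset_index()
      .set_index("row")
      .drop(columns="grp")
)

# Recompute the duration in minutes from the merged start and end times.
span = pd.to_timedelta(merged["end_time"]) - pd.to_timedelta(merged["start_time"])
merged["duration"] = span.dt.total_seconds().div(60).astype(int)

print(merged)
```

Comparing each row only with the row directly above is enough for this data because the end times never decrease. If a long booking could fully contain later, shorter ones, comparing against the running maximum of the earlier end times would likely be the safer variant.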