I need to merge overlapping intervals in my CSV file. Here is the sample data I have:
| Tasks | Start | End |
|---|---|---|
| Read | 10:00 | 12:00 |
| Read | 11:10 | 13:00 |
| Read | 13:50 | 14:00 |
| Write | 14:00 | 14:30 |
| Write | 11:00 | 15:00 |
The expected output should look like this:
Read - [10:00,13:00], [13:50,14:00]
Write - [14:00,15:00]
I found a similar post that does the same thing with pandas, but pandas is not allowed here:
Efficient merge overlapping intervals in same pandas dataframe with start and finish columns
I am trying to use either a plain file reader or the CSV reader to parse the file, get the rows in sorted order, and compare the indices for overlapping intervals, but I am having trouble parsing the file:
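import csv

# Collect every [start, end] pair per task instead of overwriting the previous one
intervals = {}
with open('sample.csv', mode='r', newline='') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        name = row["Tasks"]
        start_time = row["Start"]
        end_time = row["End"]
        intervals.setdefault(name, []).append([start_time, end_time])
print(intervals)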
Any help will be appreciated.
Assuming the times are sorted, you can compute custom groups and aggregate:
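A minimal standard-library sketch of that idea, since pandas is ruled out in the question. It assumes the CSV has the `Tasks`, `Start` and `End` columns shown above and that times are in `HH:MM` form. The helper names (`to_timedelta`, `merge_by_task`, `fmt`) and the inline `SAMPLE` string are made up for illustration. Unlike the assumption above, this version sorts each task's rows itself, then starts a new group whenever a start time falls after the end of the current group:

```python
import csv
import io
from datetime import timedelta


def to_timedelta(hhmm):
    """Convert an 'HH:MM' string to a timedelta since midnight."""
    hours, minutes = hhmm.strip().split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))


def merge_by_task(rows):
    """Group rows by task, then merge overlapping (start, end) pairs."""
    by_task = {}
    for row in rows:
        span = (to_timedelta(row["Start"]), to_timedelta(row["End"]))
        by_task.setdefault(row["Tasks"], []).append(span)

    merged = {}
    for task, spans in by_task.items():
        groups = []
        for start, end in sorted(spans):
            if groups and start <= groups[-1][1]:
                # Overlaps (or touches) the current group: extend its end.
                groups[-1][1] = max(groups[-1][1], end)
            else:
                # Gap before this interval: start a new group.
                groups.append([start, end])
        merged[task] = groups
    return merged


def fmt(delta):
    """Format a timedelta back to 'HH:MM'."""
    total_minutes = int(delta.total_seconds()) // 60
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


# Hypothetical inline data standing in for sample.csv.
SAMPLE = """Tasks,Start,End
Read,10:00,12:00
Read,11:10,13:00
Read,13:50,14:00
Write,14:00,14:30
Write,11:00,15:00
"""

result = merge_by_task(csv.DictReader(io.StringIO(SAMPLE)))
for task, groups in result.items():
    print(task, "-", ", ".join(f"[{fmt(s)},{fmt(e)}]" for s, e in groups))
```

To read from disk instead, pass `csv.DictReader(open('sample.csv', newline=''))` to `merge_by_task`. Because the rows are sorted first, the two unsorted `Write` rows merge to `[11:00,15:00]` rather than the `[14:00,15:00]` shown in the question.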
output:
NB. I used timedelta objects for the comparisons; you could also compare the strings directly with:
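A small sketch of the string-based variant, assuming every time is a zero-padded 24-hour `HH:MM` string, so that lexicographic order matches chronological order. The function name `merge_spans` is made up for illustration:

```python
def merge_spans(spans):
    """Merge overlapping (start, end) pairs of zero-padded 'HH:MM' strings."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Plain string comparison works because of the zero padding.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


read_spans = [("10:00", "12:00"), ("11:10", "13:00"), ("13:50", "14:00")]
print(merge_spans(read_spans))
# → [['10:00', '13:00'], ['13:50', '14:00']]
```

This breaks if the padding is missing: as strings, `"9:30"` sorts after `"10:00"`, so unpadded times need to be normalised or converted to timedelta first.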