This is a sample of the data:
| Occasion | Date Range |
|---|---|
| EVENT 1 | 2 / 1 / 1445 هـ - 17 / 6 / 1445 هـ - 20 / 7 / 2023 - 30 / 12 / 2023 م |
| EVENT 2 | 13 \ 1 \ 1445 هـ - 16 \ 5 \ 1445 هـ - 31 \ 7 \ 2023 م - 30 \ 11 \ 2023 م |
| EVENT 3 | 1445/4/11-1445/3/30 هـ - 15-2023/10/26 م |
As you can see, the patterns differ depending on how long the event lasts. An event that lasts a few months looks like the first two examples, while an event that lasts only a few days looks like the last example. These are the patterns I found manually, so if there is a way to detect the other patterns first, before separating the dates, that would be the way to go. Any suggestions?
I tried the following to extract the different patterns in the Date Range column:

    import pandas as pd
    import re

    # Load the Excel file
    file_path = 'Local Festivals.xlsx'
    df = pd.read_excel(file_path)
    date_range_column = 'Date Range'

    # Extract unique date patterns from the column
    unique_date_patterns = set()
    for date_range in df[date_range_column]:
        # Use a regular expression to extract date patterns
        date_pattern = re.search(r'(\d+ \/ \d+ \/ \d+ \s?[هـم]\s?-?[^0-9]*\s?\d+ \/ \d+ \/ \d+ \s?[مهـ]\s?)', date_range)
        if date_pattern:
            unique_date_patterns.add(date_pattern.group(0))

    # Print the unique date patterns and how many were found
    for i, pattern in enumerate(unique_date_patterns, start=1):
        print(f"Pattern {i}: {pattern}")
    print(f"Total unique date patterns found: {len(unique_date_patterns)}")

But it did not match all the date ranges in the data.
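
One possible way to survey the patterns before writing any regex is to reduce each value to a structural "shape" and count the distinct shapes. The sketch below is only an illustration under assumptions: `shape` is a hypothetical helper name, the three strings are the sample rows from the table above, and treating every 4-digit run as a year is a guess about the data rather than something verified against the full file.

```python
import re
from collections import Counter


def shape(text: str) -> str:
    """Reduce a date-range string to a structural signature (hypothetical helper)."""
    compact = re.sub(r"\s+", "", text)  # ignore spacing differences
    # Assumption: a 4-digit run is a year (Y); any other digit run is a day or month (D)
    return re.sub(r"\d+", lambda m: "Y" if len(m.group()) == 4 else "D", compact)


# The three sample rows from the table above
samples = [
    "2 / 1 / 1445 هـ - 17 / 6 / 1445 هـ - 20 / 7 / 2023 - 30 / 12 / 2023 م",
    r"13 \ 1 \ 1445 هـ - 16 \ 5 \ 1445 هـ - 31 \ 7 \ 2023 م - 30 \ 11 \ 2023 م",
    "1445/4/11-1445/3/30 هـ - 15-2023/10/26 م",
]

# Count how many rows share each signature, most common first
counts = Counter(shape(s) for s in samples)
for signature, n in counts.most_common():
    print(n, signature)
```

On the real file, `samples` could presumably be replaced with `df['Date Range'].dropna().astype(str)`. Each distinct signature then corresponds to one layout that needs its own parsing rule, and separators (`/` versus `\`) and the calendar markers stay visible in the signature so that those differences show up as separate patterns.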