Existing library/algorithm for episodic frequency detection and prediction in a time series?

I'm working with podcast RSS feeds in Python. Are there any existing libraries or algorithms that can detect and predict a periodic release schedule from a series of timestamps?

For example, if five items in an RSS feed had the following timestamps:

Fri, 20 Nov 2020 02:16:14 +0000
Fri, 13 Nov 2020 17:51:58 +0000
Fri, 6 Nov 2020 03:08:04 +0000
Fri, 30 Oct 2020 19:09:29 +0000
Fri, 23 Oct 2020 01:23:10 +0000

is there an algorithm to determine "Weekly on Fridays"? Or if they were:

Tue, 24 Nov 2020 10:00:00 -0000
Fri, 20 Nov 2020 09:00:00 -0000
Tue, 17 Nov 2020 10:00:00 -0000
Fri, 13 Nov 2020 10:00:00 -0000
Tue, 10 Nov 2020 10:00:00 -0000

to determine "Twice a week, next episode Friday the 27th"? I believe Pocket Casts has a feature like this, but it remains proprietary.

Best answer

For simple cases you can use pd.infer_freq like this:

import pandas as pd

date_range = [
    "Fri, 20 Nov 2020",
    "Fri, 13 Nov 2020",
    "Fri, 6 Nov 2020",
    "Fri, 30 Oct 2020",
    "Fri, 23 Oct 2020"]

date_range_2 = [
    "Tue, 24 Nov 2020",
    "Fri, 20 Nov 2020",
    "Tue, 17 Nov 2020",
    "Fri, 13 Nov 2020",
    "Tue, 10 Nov 2020"]

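def get_frequency(date_range):
    # Parse the date strings into a DatetimeIndex, then let pandas
    # look for a single regular frequency (returns None if there is none).
    index = pd.to_datetime(date_range, format="%a, %d %b %Y")
    return pd.infer_freq(index)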

print(f"First Time Series: {get_frequency(date_range)}")
print(f"Second Time Series: {get_frequency(date_range_2)}")

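The first series is detected as weekly on Fridays (the leading -1 reflects that the dates are listed newest first), while the second returns None because its gaps alternate between three and four days: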

First Time Series: -1W-FRI
Second Time Series: None
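
For schedules like the second one, where pd.infer_freq finds no single frequency, a rough fallback is to count which weekdays the episodes land on and step forward from the newest episode to the next such weekday. The following is only a sketch under stated assumptions, not an established algorithm: describe_schedule and its min_share threshold are hypothetical names, the feed is assumed to follow a weekly pattern, and weekdays are taken in each timestamp's own UTC offset.

```python
from collections import Counter
from datetime import timedelta
from email.utils import parsedate_to_datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def describe_schedule(timestamps, min_share=0.25):
    """Guess release weekdays and the next release date.

    timestamps: RFC 2822 date strings, as found in RSS pubDate fields.
    min_share:  hypothetical threshold; a weekday counts as a release day
                only if at least this fraction of episodes fell on it.
    Returns (list of weekday names, next date), or ([], None) if no
    weekday is frequent enough.
    """
    dates = sorted(parsedate_to_datetime(t).date() for t in timestamps)
    counts = Counter(d.weekday() for d in dates)
    weekdays = sorted(day for day, n in counts.items()
                      if n / len(dates) >= min_share)
    if not weekdays:
        return [], None

    # Walk forward one day at a time from the newest episode until we
    # reach one of the release weekdays (at most seven steps).
    next_date = dates[-1] + timedelta(days=1)
    while next_date.weekday() not in weekdays:
        next_date += timedelta(days=1)
    return [DAY_NAMES[d] for d in weekdays], next_date


feed = [
    "Tue, 24 Nov 2020 10:00:00 -0000",
    "Fri, 20 Nov 2020 09:00:00 -0000",
    "Tue, 17 Nov 2020 10:00:00 -0000",
    "Fri, 13 Nov 2020 10:00:00 -0000",
    "Tue, 10 Nov 2020 10:00:00 -0000",
]
days, next_date = describe_schedule(feed)
print(days, next_date)  # ['Tuesday', 'Friday'] 2020-11-27
```

This does not handle biweekly or monthly feeds, or skipped weeks; for those you would also need to look at the gaps between consecutive episodes.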