How to prepare my data to avoid being unable to infer frequency

The following code from pandas/tseries/frequencies.py is causing my code to fall over:

if not self.is_monotonic or not self.index._is_unique:
    return None

delta = self.deltas[0]
ppd = periods_per_day(self._creso)
if delta and _is_multiple(delta, ppd):
    return self._infer_daily_rule()

# Business hourly, maybe. 17: one day / 65: one weekend
if self.hour_deltas in ([1, 17], [1, 65], [1, 17, 65]):
    return "BH"

# Possibly intraday frequency.  Here we use the
# original .asi8 values as the modified values
# will not work around DST transitions.  See #8772
if not self.is_unique_asi8:
    return None

The first check passes fine, because self.index._is_unique is true for my index. The second check is where it stops: self.is_unique_asi8 is false, so the function returns None.
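
A minimal sketch of the behaviour, using made-up timestamps rather than my real trades. It assumes the public pd.infer_freq function goes through the same inference code as the excerpt above, and that is_unique_asi8 is about the spacing between consecutive timestamps being a single value, not about the timestamps themselves being distinct:

```python
import pandas as pd

# Evenly spaced timestamps: every gap is exactly one minute,
# so a frequency can be inferred.
regular = pd.date_range("2024-01-01", periods=5, freq="min", tz="UTC")
regular_freq = pd.infer_freq(regular)

# Irregularly spaced timestamps (illustrative values, similar in spirit
# to raw trade times): unique and sorted, but the gaps all differ.
irregular = pd.DatetimeIndex(
    [
        "2024-01-01 00:00:00.10",
        "2024-01-01 00:00:00.35",
        "2024-01-01 00:00:02.00",
        "2024-01-01 00:00:07.90",
    ],
    tz="UTC",
)
irregular_freq = pd.infer_freq(irregular)

print(regular_freq)    # a one-minute alias (spelling depends on pandas version)
print(irregular_freq)  # None
```

If that reading is right, making each timestamp unique does not help on its own, because the gaps between timestamps would still be irregular.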

I have looked at this issue and the corresponding PR but

My code, in its current form, looks like this:

import pandas as pd

db = Database()
df, last_trade_time = db.fetch_trades()

# Convert the time column to a datetime object with the unit of seconds
df['time'] = pd.to_datetime(df['time'], unit='s')

# Localize the timestamps to UTC
df['time'] = df['time'].dt.tz_localize('UTC')

# Ensure uniqueness by adding the index as nanoseconds
df['time'] = df['time'] + pd.to_timedelta(df.index, unit='ns')

# Set DataFrame index
df.set_index('time', inplace=True)

dataset = PandasDataset(df, target="price")

These times are epoch timestamps in seconds, with fractional (sub-second) precision (from Kraken).

How can I prepare my data? Only a month or so of Python experience here...
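
One direction that may be worth trying, sketched under assumptions: aggregate the irregular trades into fixed-width bars so that the index has a regular spacing. The sample DataFrame below is a hypothetical stand-in for what db.fetch_trades() returns, and the one-minute bar width, the choice of the last price per bar, and the forward-fill for empty bars are all illustrative choices, not requirements:

```python
import pandas as pd

# Hypothetical stand-in for the trades: epoch seconds with
# fractional parts, plus a price column.
df = pd.DataFrame(
    {
        "time": [1700000000.12, 1700000000.87, 1700000061.40, 1700000185.05],
        "price": [100.0, 100.5, 101.0, 100.8],
    }
)

# Parse epoch seconds directly as timezone-aware UTC timestamps.
df["time"] = pd.to_datetime(df["time"], unit="s", utc=True)
df = df.set_index("time").sort_index()

# Aggregate irregular trades into one-minute bars (last price per bar),
# then forward-fill bars in which no trade occurred.
bars = df["price"].resample("1min").last().ffill().to_frame()

print(bars)
print(pd.infer_freq(bars.index))  # a one-minute alias instead of None
```

With this approach there is no need to add nanoseconds to force uniqueness, since each bar has exactly one timestamp. Whether last price, mean price, or a volume-weighted price is the right aggregate depends on what the model is meant to predict.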

I have asked this question in another form here
