Talib indicators not working on .resample()'d dataframes; returns all NaN's

To reproduce, download my 15 second SPY stock OHLCV dataset from here and place it in the same folder as this script. You need TA-Lib and pandas installed. I am running Python 3.7.9 on Windows 10.

TA-Lib indicators don't seem to work after a DataFrame is resampled to a higher timeframe, even though the DataFrame remains numeric. I'm trying to resample this 15 second dataset to a 5 minute timeframe. I add a moving average both before and after the resampling: before works fine, but after returns all NaNs. Any idea why? (Will be back online tomorrow to review answers)

from pandas import read_csv, to_numeric, to_datetime
import talib

# Import the 15 second dataframe
df = read_csv("./SPY_15_Second.csv")
df['time'] = to_datetime(df['time'], format="%Y-%m-%d %H:%M:%S")
df = df.set_index('time')

# Apply an SMA now and it works
df['SMA1'] = talib.SMA(df['close'], timeperiod=15)
print("First DF")
print(df)

# Resample it to a 5 minute dataset
ohlc_dict = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum',
}

df = df.resample('5min', closed='left', label='left').apply(ohlc_dict).apply(to_numeric)
print("Upsampled DF")
print(df)

# Try to apply a second simple moving average after resampling, see nans
df['SMA2'] = talib.SMA(df['close'], timeperiod=15)
print("Problem DF")
print(df)

# Why entire column nan? Attempted to apply to_numeric. Something
# to do with the resample function.
There are 1 best solutions below

It works if you call the .ohlc() method on the resampler and do not reassign the result to df. In other words, change this:

df = df.resample('5min', closed='left', label='left').apply(ohlc_dict).apply(to_numeric)

...to this:

df.resample('5min', closed='left', label='left').ohlc()
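
A possible explanation, offered here as an assumption rather than something confirmed in the thread: resampling across overnight and weekend gaps creates empty 5 minute bins whose close is NaN, and TA-Lib's rolling calculations generally do not recover once a NaN enters the input. Note also that calling .ohlc() without assigning the result leaves df at its original 15 second resolution. The sketch below uses synthetic data (the two sessions, the prices, and the variable names are made up for illustration) and only pandas, so it runs without TA-Lib. It shows the empty bins and one way to drop them before computing an indicator.

```python
import numpy as np
import pandas as pd

# Synthetic 15 second bars for two one-hour sessions separated by an
# overnight gap (hypothetical stand-in for SPY_15_Second.csv).
day1 = pd.date_range("2021-01-04 09:30:00", periods=240, freq="15s")
day2 = pd.date_range("2021-01-05 09:30:00", periods=240, freq="15s")
index = day1.append(day2)
close = np.linspace(370.0, 372.0, len(index))
bars = pd.DataFrame(
    {
        "open": close,
        "high": close + 0.05,
        "low": close - 0.05,
        "close": close,
        "volume": 100,
    },
    index=index,
)

ohlc_dict = {
    "open": "first",
    "high": "max",
    "low": "min",
    "close": "last",
    "volume": "sum",
}

# Aggregate to 5 minute bars. Bins with no source rows (the overnight gap)
# come back with NaN prices and a volume of 0.
resampled = bars.resample("5min", closed="left", label="left").agg(ohlc_dict)
empty_bins = int(resampled["close"].isna().sum())

# Drop the empty bins so the indicator input contains no NaN values.
cleaned = resampled.dropna(subset=["close"])

# pandas rolling mean used here as a stand-in for talib.SMA(..., timeperiod=15);
# it is the same simple average, but its NaN handling differs from TA-Lib's.
sma = cleaned["close"].rolling(15).mean()

print("empty bins:", empty_bins)
print("bars kept:", len(cleaned))
print(sma.tail())
```

With the real dataset, the equivalent step would be to call .dropna(subset=['close']) on the resampled frame before passing df['close'] to talib.SMA. Whether that fully resolves the all-NaN column has not been verified against the original file.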