Here is the original pandas DataFrame function that resamples and interpolates; the cuDF version further down uses nearly identical syntax:
import pandas as pd

def resample_and_interpolate(df, datetime_column):
    # Convert datetime column to datetime format
    df[datetime_column] = pd.to_datetime(df[datetime_column])
    df = df.set_index(datetime_column)
    # Resample to 1-minute intervals and interpolate
    df_resampled = df.resample('1T').interpolate(method='linear')
    # Resample to 1-hour intervals, then forward fill and backward fill
    resampled_df = df_resampled.resample('1H').first().ffill().bfill()
    # Reset index to get datetime back as a column
    resampled_df = resampled_df.reset_index()
    return resampled_df
And here is the cuDF version:
import cudf

def g_resample_and_interpolate(gdf, datetime_column):
    # Convert datetime column to datetime format
    gdf[datetime_column] = cudf.to_datetime(gdf[datetime_column])
    gdf = gdf.set_index(datetime_column)
    # Resample to 1-minute intervals and interpolate
    gdf_resampled = gdf.resample('1T').interpolate(method='linear')
    # Resample to 1-hour intervals, then forward fill and backward fill
    resampled_gdf = gdf_resampled.resample('1H').first().ffill().bfill()
    # Reset index to get datetime back as a column
    resampled_gdf = resampled_gdf.reset_index()
    return resampled_gdf
The pandas function works fine, but the cuDF (GPU DataFrame) version fails.
Usage:
resample_and_interpolate(gdf_dict[20000019].to_pandas(), 'charttime')
g_resample_and_interpolate(gdf_dict[20000019], 'charttime')
The cuDF call raises this error:
KeyError: 'interpolate'
6 # Resample to 1-minute intervals and interpolate
----> 7 df_resampled = df.resample('1T').interpolate(method='linear')
AttributeError: DataFrameResampler object has no attribute interpolate
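Add asfreq() between resample('1T') and interpolate() to resolve that. The error says the resampler object itself has no interpolate attribute; asfreq() turns it back into a regular DataFrame on the 1-minute grid, and that DataFrame can be interpolated.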
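As a rough illustration of what the extra asfreq() step does, here is a minimal sketch in pandas, so it runs without a GPU. The sample data and the variable names (direct, via_asfreq) are made up for the example, and it uses the '1min' alias in place of '1T' because newer pandas versions deprecate 'T'. It only shows that the two-step form gives the same result as calling interpolate() directly on the resampler in pandas; that the same chain works on the cuDF resampler is assumed from the answer above, not verified here.

```python
import pandas as pd

# Hypothetical sample data: two readings ten minutes apart.
df = pd.DataFrame(
    {
        "charttime": ["2024-01-01 00:00:00", "2024-01-01 00:10:00"],
        "value": [0.0, 10.0],
    }
)
df["charttime"] = pd.to_datetime(df["charttime"])
df = df.set_index("charttime")

# Direct call on the resampler (this is the form that works in pandas).
direct = df.resample("1min").interpolate(method="linear")

# Two-step form: asfreq() returns a regular DataFrame on the 1-minute
# grid with NaN in the newly created rows, and interpolate() fills them.
via_asfreq = df.resample("1min").asfreq().interpolate(method="linear")

print(via_asfreq.head())
```

In g_resample_and_interpolate, the corresponding change would be to write gdf.resample('1T').asfreq().interpolate(method='linear') on the line that currently fails.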