Why does DataFrame.interpolate with spline create unexpected waves?

I'm trying to use dataframe.interpolate to fill missing data. Here is my test:

from itertools import product

import numpy as np
import pandas as pd

df=pd.DataFrame.from_dict({
    1.5      :[np.nan    ,91.219     ,np.nan     ,np.nan     ,102.102    ,np.nan     ,np.nan     ], 
    2.0      :[np.nan    ,np.nan     ,np.nan     ,np.nan     ,103.711    ,np.nan     ,103.031    ], 
    2.5      :[np.nan    ,98.25      ,np.nan     ,100.406    ,104.695    ,np.nan     ,104.938    ], 
    3.0      :[np.nan    ,101.578    ,np.nan     ,102.969    ,104.875    ,np.nan     ,105.242    ], 
    3.5      :[np.nan    ,103.859    ,87.93      ,104.531    ,104.906    ,np.nan     ,105.32     ], 
    4.0      :[np.nan    ,105.156    ,94.469     ,105.656    ,105.844    ,89.68      ,106.523    ], 
    4.5      :[94.266    ,106.039    ,96.82      ,106.75     ,103.156    ,93.703     ,107.938    ], 
    5.0      :[97.336    ,107.953    ,98.602     ,107.906    ,104.25     ,96.547     ,109.703    ], 
    5.5      :[99.664    ,110.438    ,100.203    ,108.906    ,100.375    ,98.844     ,110.188    ], 
    6.0      :[101.344   ,112.703    ,101.492    ,108.688    ,102.906    ,100.68     ,110.5      ], 
    6.5      :[102.313   ,112.078    ,102.266    ,108.813    ,104.5      ,101.875    ,104    ], 
    7.0      :[102.656   ,114.469    ,102.242    ,108.813    ,np.nan     ,102.625    ,109    ], 
    7.5      :[103.25    ,np.nan     ,102.594    ,108.813    ,np.nan     ,103.234    ,109    ], 
    }, orient='index')
df.plot(title='original')
for int_method,int_order in list(product(['spline'],range(1,4)))+[
    (x,3) for x in ['nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'barycentric', 'polynomial',
              'krogh', 'piecewise_polynomial', 'pchip', 'akima', 'cubicspline','from_derivatives','linear',
              ]
]:
    spl=df.interpolate(limit_direction='both',method=int_method,order=int_order)
    spl.plot(title=f'{int_method},{int_order}')

It seems that only spline gives me the extrapolation I need. However, it also seems to add some unexpected fluctuations:

[Figure: unexpected fluctuation from spline]

Can someone help me understand what happened, and perhaps give some advice on how to improve it? (I know "improve" is a vague word here; I can't find a clear definition for it myself.) Thanks!

There are 1 best solutions below

This happens because pandas.DataFrame.interpolate calls scipy.interpolate internally, and scipy.interpolate sorts the X values (in the graph they are on the Y-axis) before it interpolates. That is clearly not what is intended here, because it mixes the momentum on the left side with the momentum on the right side.
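
One way to see what the spline does is to reproduce a single column with scipy directly. The sketch below is not taken from the answer above. It assumes that method='spline' ends up in scipy.interpolate.UnivariateSpline with k set to the order, and it relies on the documented scipy behaviour that UnivariateSpline is a smoothing spline unless s=0 is passed. The data is the last column of the question's frame with its one NaN dropped; the names smooth and exact are only illustrative.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Last column of the question's frame, without the NaN at index 1.5.
x = np.arange(2.0, 8.0, 0.5)  # 2.0, 2.5, ..., 7.5
y = np.array([103.031, 104.938, 105.242, 105.32, 106.523, 107.938,
              109.703, 110.188, 110.5, 104.0, 109.0, 109.0])

# Default s=None: a smoothing spline that need not pass through the data.
smooth = UnivariateSpline(x, y, k=3)

# s=0: an interpolating spline that passes through every data point.
exact = UnivariateSpline(x, y, k=3, s=0)

# Largest deviation from the known points for each variant.
print("max residual, default s:", np.abs(smooth(x) - y).max())
print("max residual, s=0:      ", np.abs(exact(x) - y).max())

# Both variants extrapolate outside the data range (default ext=0),
# here at the missing index 1.5.
print("value at 1.5, default s:", float(smooth(1.5)))
print("value at 1.5, s=0:      ", float(exact(1.5)))
```

If the default-s residual is clearly non-zero, part of the wave may come from the smoothing fit rather than from the data. DataFrame.interpolate documents that extra keyword arguments are passed on to the interpolating function, so df.interpolate(method='spline', order=3, s=0, limit_direction='both') may be worth trying, though I have not verified that it removes the fluctuation for this data.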