I want to use interpolation to replace some missing values in dataframe. Here is the code to replicate my sample data.

 import numpy as np
 import pandas as pd

 np.random.seed(12345)
 # simulate a yield-curve df with dimension 250 (time) x 10 (maturities)
 df = pd.DataFrame(np.random.randn(250, 10), columns=np.arange(1, 11), index=pd.date_range('2015-01-01', periods=250, freq='B'))
 # set 20% data to NaN
 sel = np.random.choice([True, False], size=(250, 10), p=[0.2, 0.8])
 df.values[sel] = np.nan
 print(df)
                1       2       3    ...        8       9       10
2015-01-01 -0.2047  0.4789     NaN   ...    0.2817  0.7690     NaN
2015-01-02  1.0072     NaN  0.2750   ...   -0.3718  1.6690 -0.4386
2015-01-05 -0.5397  0.4770     NaN   ...       NaN  0.0009  1.3438
2015-01-06 -0.7135 -0.8312 -2.3702   ...    0.1198 -1.0635  0.3329
2015-01-07 -2.3594 -0.1995 -1.5420   ...   -0.7539  0.3313     NaN
...            ...     ...     ...   ...       ...     ...     ...
2015-12-10 -1.2566 -0.3499  0.8317   ...   -1.3944 -0.8628  1.3491
2015-12-11  0.6566 -0.8118 -0.3346   ...    0.7839  0.0165     NaN
2015-12-14 -0.1559 -2.8151     NaN   ...   -0.3437 -1.4282  0.9694
2015-12-15 -1.0264     NaN -2.2169   ...    1.2570  1.6857  0.7227
2015-12-16  0.9407  0.5943  0.1862   ...   -0.5998     NaN  0.9468

[250 rows x 10 columns]

Interpolation column-wise based on 'time method' works.

 # column-wise time series interpolation works nicely
 df1 = df.copy()
 df1.interpolate(method='time', axis=0, limit=10, inplace=True)
 print(df1)
                1       2       3    ...        8       9       10
2015-01-01 -0.2047  0.4789     NaN   ...    0.2817  0.7690     NaN
2015-01-02  1.0072  0.4785  0.2750   ...   -0.3718  1.6690 -0.4386
2015-01-05 -0.5397  0.4770 -1.7089   ...   -0.0031  0.0009  1.3438
2015-01-06 -0.7135 -0.8312 -2.3702   ...    0.1198 -1.0635  0.3329
2015-01-07 -2.3594 -0.1995 -1.5420   ...   -0.7539  0.3313 -0.1842
...            ...     ...     ...   ...       ...     ...     ...
2015-12-10 -1.2566 -0.3499  0.8317   ...   -1.3944 -0.8628  1.3491
2015-12-11  0.6566 -0.8118 -0.3346   ...    0.7839  0.0165  1.2542
2015-12-14 -0.1559 -2.8151 -1.7463   ...   -0.3437 -1.4282  0.9694
2015-12-15 -1.0264 -1.1104 -2.2169   ...    1.2570  1.6857  0.7227
2015-12-16  0.9407  0.5943  0.1862   ...   -0.5998  1.6857  0.9468

[250 rows x 10 columns]

But interpolation row-by-row using cubic spline doesn't work at all?

 # row-wise cross-maturity interpolation (cubic spline) doesn't work at all?
 df2 = df.copy()
 df2.interpolate(method='cubic', axis=1, inplace=True)
 print(df2)
                1       2       3    ...        8       9       10
2015-01-01 -0.2047  0.4789     NaN   ...    0.2817  0.7690     NaN
2015-01-02  1.0072     NaN  0.2750   ...   -0.3718  1.6690 -0.4386
2015-01-05 -0.5397  0.4770     NaN   ...       NaN  0.0009  1.3438
2015-01-06 -0.7135 -0.8312 -2.3702   ...    0.1198 -1.0635  0.3329
2015-01-07 -2.3594 -0.1995 -1.5420   ...   -0.7539  0.3313     NaN
...            ...     ...     ...   ...       ...     ...     ...
2015-12-10 -1.2566 -0.3499  0.8317   ...   -1.3944 -0.8628  1.3491
2015-12-11  0.6566 -0.8118 -0.3346   ...    0.7839  0.0165     NaN
2015-12-14 -0.1559 -2.8151     NaN   ...   -0.3437 -1.4282  0.9694
2015-12-15 -1.0264     NaN -2.2169   ...    1.2570  1.6857  0.7227
2015-12-16  0.9407  0.5943  0.1862   ...   -0.5998     NaN  0.9468

[250 rows x 10 columns]

Comparing df and df2 shows that they are equal, so no changes have been made by interpolation. I am not sure where I get wrong. Any help is highly appreciated.

 import pandas.util.testing as pdt
 pdt.assert_frame_equal(df, df2)

As JohnE suggests, it seems that the inplace=True causes the problem for row-by-row interpolation. After dropping it, it now works.

print(df2.interpolate(method='linear', axis=1))
                1       2       3    ...        8       9       10
2015-01-01 -0.2047  0.4789 -0.0384   ...    0.2817  0.7690  0.7690
2015-01-02  1.0072  0.6411  0.2750   ...   -0.3718  1.6690 -0.4386
2015-01-05 -0.5397  0.4770 -0.2721   ...    0.1518  0.0009  1.3438
2015-01-06 -0.7135 -0.8312 -2.3702   ...    0.1198 -1.0635  0.3329
2015-01-07 -2.3594 -0.1995 -1.5420   ...   -0.7539  0.3313  0.3313
...            ...     ...     ...   ...       ...     ...     ...
2015-12-10 -1.2566 -0.3499  0.8317   ...   -1.3944 -0.8628  1.3491
2015-12-11  0.6566 -0.8118 -0.3346   ...    0.7839  0.0165  0.0165
2015-12-14 -0.1559 -2.8151 -0.1460   ...   -0.3437 -1.4282  0.9694
2015-12-15 -1.0264 -1.6216 -2.2169   ...    1.2570  1.6857  0.7227
2015-12-16  0.9407  0.5943  0.1862   ...   -0.5998  0.1735  0.9468

[250 rows x 10 columns]
0

There are 0 best solutions below