Fitting a spline to data with duplicate x values using scipy (Python)


I am trying to fit a spline to my data, which has 5700 samples with duplicate x values (x is the horizontal axis), using the 'interp1d' function from the 'scipy' package in Python. I tried a linear spline (k=1), a quadratic spline (k=2), and a cubic spline (k=3), and I was shocked by how strangely the spline behaves on my data. The spline at 'k=1' makes some sense but overfits badly, and the quadratic and cubic splines perform even worse. In my first trial I used polynomial fitting, and the results were encouraging. I expected spline fitting to give better results than polynomial fitting. This is the result with splines. Where am I going wrong?

There is 1 best solution below

On BEST ANSWER

If every data point in a data set has a unique X value, the effective weight of each point is 1.0. If, however, a single point in that data set is duplicated once, that point has an effective weight of 2.0.

If all data points in a data set are duplicated once, each point then has an effective weight of 2.0. Because every point has the same weight, the relative weighting is unchanged.
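The effective-weight idea can be checked numerically. The sketch below is only an illustration with made-up numbers, and it uses a straight-line least-squares fit via `np.polyfit` rather than a spline. Since `np.polyfit` applies its weights `w` to the residuals before squaring, a point duplicated once corresponds to a weight of `sqrt(2)`.

```python
import numpy as np

# Made-up sample data (not from the question).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.2, 1.9, 3.5])

# Fit 1: the point at index 2 appears twice in the data.
x_dup = np.append(x, x[2])
y_dup = np.append(y, y[2])
coef_dup = np.polyfit(x_dup, y_dup, deg=1)

# Fit 2: each point appears once, but index 2 carries more weight.
# np.polyfit minimizes sum((w * residual)**2), so a duplicated point
# is equivalent to w = sqrt(2) for that point.
w = np.array([1.0, 1.0, np.sqrt(2.0), 1.0])
coef_w = np.polyfit(x, y, deg=1, w=w)

print(coef_dup)
print(coef_w)
```

Both fits should produce the same coefficients up to floating-point error.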

If some points in the data set have unique X values and others share an X value, one approach is to average the Y values of the "duplicate" points so that each X value appears once and each point again has an effective weight of 1.0. This can sometimes work in the specific case I describe.
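A minimal sketch of the averaging approach, assuming NumPy and SciPy are available. The helper name `average_duplicates` and the sample values are hypothetical, not taken from the question. After averaging, the x values are unique and sorted, so they can be passed to `interp1d`.

```python
import numpy as np
from scipy.interpolate import interp1d

def average_duplicates(x, y):
    """Collapse repeated x values by averaging their y values.

    Hypothetical helper: returns sorted unique x, the mean y for each
    unique x, and how many samples were averaged at each x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_unique, inverse, counts = np.unique(
        x, return_inverse=True, return_counts=True
    )
    # Sum the y values that share an x, then divide by the group size.
    y_sum = np.bincount(inverse.ravel(), weights=y)
    return x_unique, y_sum / counts, counts

# Made-up data with duplicate x values.
x = [0, 0, 1, 2, 2, 2, 3, 4]
y = [1, 3, 2, 0, 1, 2, 5, 4]

x_u, y_u, counts = average_duplicates(x, y)

# Cubic interpolation through the averaged points.
f = interp1d(x_u, y_u, kind="cubic")
print(x_u, y_u, counts)
print(f(2.0))
```

Note that `interp1d` interpolates, so the curve passes through every averaged point. For noisy data, a smoothing spline such as `scipy.interpolate.UnivariateSpline` may be a better fit for the purpose, and the `counts` array could serve as a starting point for weights if the duplicated points should keep their extra influence.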