Fitting an N-dimensional parabola in Python


I have a set of N samples (N ~ 10,000 to 100,000):

(y_i, P_i)

They sample an unknown function:

y = f(P)

In my case, P is a set of n_p parameters with n_p typically around 10.

My aim is to use these samples to find the second-order polynomial that best approximates my unknown function. The typical result would be the (n_p+1) + (n_p+1)*n_p/2 coefficients of the best-fit polynomial (10 coefficients for n_p = 3, 66 for n_p = 10).

However, I would like to obtain the solution in the following form:

f(P) = (P-mu)*(C^-1)*(P-mu)^T + K

with mu a vector of dimension n_p, C an (n_p x n_p) symmetric matrix, and K a constant. These mu, C and K are what I'd like to obtain.
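
Note that, assuming C is invertible, this form is just a re-parametrization of a general second-order polynomial. Writing A = C^-1 and expanding:

f(P) = P*A*(P^T) + b*(P^T) + c,  with  b = -2*mu*A  and  c = mu*A*(mu^T) + K

so any generic quadratic fit that returns A, b and c also determines mu = -(1/2)*b*(A^-1) and K = c - mu*A*(mu^T).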

Is there a standard implementation somewhere in the Python ecosystem and/or an efficient way to do this?

Thanks in advance!

Edit: Here is a typical setup, where I create fake samples (P and Y); using only those samples, I would like to recover the original mu, C and K:

import numpy as np

n_p = 3
N = 15

# Generate the "mu" we'll try to recover
mu = np.random.rand(n_p)

# Generate the "C" we'll try to recover
tmp = np.random.rand(n_p,n_p)
C = np.dot(tmp, tmp.T)

# Generate the "K" we'll try to recover
K = np.random.rand()

# Generate the samples
P = np.random.rand(n_p, N)
C_inv = np.linalg.inv(C)  # invert C once instead of once per sample
Y = np.array([np.dot(np.dot(P[:, i] - mu, C_inv), P[:, i] - mu) + K for i in range(N)])
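
As a quick sanity check, N must be at least the number of free coefficients of the polynomial (10 for n_p = 3, by the formula above) for the recovery to be uniquely determined:

# Sanity check: N must be at least the number of free coefficients,
# (n_p + 1) + (n_p + 1)*n_p/2, which is 10 for n_p = 3.
n_coeff = (n_p + 1) + (n_p + 1) * n_p // 2
assert N >= n_coeff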

1 Answer

Is something like this what you're looking for? I'm not 100% sure it's accurate for your use case (I don't know what additional constraints you can impose when you're fitting a second-order polynomial), but it should find a U, an A (what you called C), and a K that minimize the least-squares error.

import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt

samples = 100
num_params = 20

# Random data to fit. Note that with num_params = 20 the model below has
# 20 + 20**2 + 1 = 421 free parameters, more than the 100 samples, so it
# can interpolate the data essentially exactly.
y = np.random.rand(samples)
p = np.random.rand(samples, num_params)


def my_func(params):
    # Unpack the flat parameter vector: the first num_params entries are u
    # (your mu), the next num_params**2 entries are the matrix a (your C),
    # and the last entry is k (your K).
    u = params[0:num_params]
    u = np.expand_dims(u, axis=1)
    a = params[num_params:-1]
    k = params[-1]
    a = a.reshape(num_params, num_params)
    a_inv = np.linalg.inv(a)
    # Evaluate (p_i - u) a^-1 (p_i - u)^T for all samples at once;
    # the einsum picks out the diagonal of the matrix product.
    shifted_p = p - np.transpose(u)
    mult_with_a_inv = np.dot(shifted_p, a_inv)
    mat_mult_vec = np.einsum('ij,ji->i', mult_with_a_inv, np.transpose(shifted_p))
    # Sum of squared residuals between the data and the model.
    return_val = y - k - mat_mult_vec
    return np.sum(return_val**2)


guess = np.random.rand(num_params+num_params**2+1)
new_params = optimize.fmin_cg(my_func, guess)
new_u = new_params[0:num_params]
new_a = new_params[num_params:-1]
new_a = new_a.reshape(num_params, num_params)
new_k = new_params[-1]

original_y, = plt.plot(np.arange(samples), y)
# Evaluate the fitted model on the same samples and plot it for comparison.
shifted = p - np.transpose(new_u)
fitted = np.einsum('ij,ji->i', np.dot(shifted, np.linalg.inv(new_a)), np.transpose(shifted)) + new_k
new_y, = plt.plot(np.arange(samples), fitted)
plt.legend([original_y, new_y], ['Original Y', 'Newly Fitted Y'])
plt.show()

There's probably a more mathematically sound approach to this, but hopefully this at least helps.
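
One such direct route is plain linear least squares on the expanded polynomial: the quadratic form expands into a general second-order polynomial (see the note in the question), so np.linalg.lstsq can fit all the generic coefficients in one shot, and mu, C and K then follow in closed form. Below is a minimal sketch along those lines, assuming the fitted quadratic part comes out invertible; the helper name fit_quadratic_form is just for illustration.

import numpy as np

def fit_quadratic_form(P, Y):
    # Fit Y ~ (P - mu) C^-1 (P - mu)^T + K by linear least squares.
    # P has shape (n_p, N), Y has shape (N,); returns (mu, C, K).
    n_p, N = P.shape
    # Design matrix: one column per generic coefficient
    # (quadratic terms with i <= j, then linear terms, then the constant).
    cols = [P[i] * P[j] for i in range(n_p) for j in range(i, n_p)]
    cols += [P[i] for i in range(n_p)]
    cols += [np.ones(N)]
    M = np.stack(cols, axis=1)
    coeffs, *_ = np.linalg.lstsq(M, Y, rcond=None)
    # Rebuild the symmetric matrix A = C^-1; each off-diagonal pair was
    # fitted as a single coefficient A_ij + A_ji, so split it in two.
    A = np.zeros((n_p, n_p))
    idx = 0
    for i in range(n_p):
        for j in range(i, n_p):
            A[i, j] = A[j, i] = coeffs[idx] / (1 if i == j else 2)
            idx += 1
    b = coeffs[idx:idx + n_p]  # linear part, equal to -2*A*mu
    c = coeffs[-1]             # constant part, equal to mu*A*mu^T + K
    C = np.linalg.inv(A)       # assumes the fitted A is invertible
    mu = -0.5 * np.dot(C, b)
    K = c - np.dot(mu, np.dot(A, mu))
    return mu, C, K

On the noise-free setup from the question, fit_quadratic_form(P, Y) should recover the original mu, C and K up to floating-point error, and since it is a single least-squares solve it stays cheap even for N ~ 100,000 samples.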

Edit: Just noticed I messed up the sign of k. I also needed to make the number of parameters higher. This seems to work super well though. I'm pretty much exactly recovering the original noise.

Also adding sample output:

[Sample output: plot of the original Y and the newly fitted Y, which overlap almost exactly]