Hyperparameter comparison between GPy and sklearn


I have the following code:

import numpy as np
import GPy
from sklearn.preprocessing import PolynomialFeatures

vals_x = [-7, -5.9, -5.7, -5.4, -4.8, -4.5, -4.3, -2.5, -2, -0.5, 0.3, 0.4, 0.5, 2.5, 2.7, 4.5, 4.6, 5, 5.8, 6]
poly_x = PolynomialFeatures(1).fit_transform(np.array(vals_x).reshape(-1,1))
vals_y = [-1.7, 0, 0.1, 0.4, -0.8, -1.4, -1.3, 0.4, 1.3, 1.8, -0.2, -0.7, -1, -2.8, -2.3, -1.5, -1.4, -1.8, -1, -0.9]

test_x = np.linspace(-7, 7, 100).reshape(-1, 1)
poly_test_x = PolynomialFeatures(1).fit_transform(test_x)

gpr = GPy.models.GPRegression(poly_x, np.array(vals_y).reshape(-1, 1), normalizer=True)

gpr.optimize_restarts(num_restarts=5, verbose=False)

print(gpr.kern)
print(gpr.Gaussian_noise)

predictions = gpr.predict(poly_test_x)  # GPy returns a (mean, variance) tuple

Which gives me this output:

rbf.         |               value  |  constraints  |  priors
variance     |  1.6785094155692202  |      +ve      |        
lengthscale  |  1.0562527101500396  |      +ve      |        
Gaussian_noise.  |                 value  |  constraints  |  priors
variance         |  0.020212266824723164  |      +ve      |        

Now I want to translate that to sklearn as closely as possible. In the code above, there are three hyperparameters that are optimized during training: the length scale, the function (kernel) variance, and the Gaussian noise variance.
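For reference, the same three values can also be read back directly from the fitted GPy model. This is just a small sketch, assuming GPy's parameter objects can be indexed like arrays so that [0] gives the scalar:

print(gpr.kern.variance[0])             # function (signal) variance of the RBF kernel
print(gpr.kern.lengthscale[0])          # RBF length scale
print(gpr.Gaussian_noise.variance[0])   # Gaussian noise variance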

If I now want to do the same in sklearn, I would replace gpr with

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

kernel = RBF() + WhiteKernel()
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=5).fit(poly_x, np.array(vals_y).reshape(-1, 1))

Here, the length scale is embedded in the RBF kernel and the Gaussian noise is embedded in the WhiteKernel, and both are optimized during training. However, I am missing the function variance. I know I can express it as x * RBF() + WhiteKernel(), but then it would be fixed and not optimized during training.
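To make that last point concrete, this is the construction I mean, with an arbitrary placeholder value for x (just a sketch, not a working solution):

x = 2.0  # arbitrary placeholder for the scaling factor in front of the RBF term
kernel = x * RBF() + WhiteKernel()
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0,
                               n_restarts_optimizer=5).fit(poly_x, np.array(vals_y).reshape(-1, 1))
print(gpr.kernel_)  # the fitted kernel with the hyperparameters found during optimization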

Does anyone know if I am missing something, or if sklearn just doesn't have this parameter?
