Covariance deviation when using method L-BFGS-B of scipy.optimize.minimize


When estimating the covariance matrix with scipy.optimize.minimize for a simple linear least-squares (LLS) problem, I get a significant deviation when using the L-BFGS-B method. After a lot of research, I have put together a detailed MCVE to highlight the issue.

MCVE

The handy curve_fit method

Let's say we want to fit the following model to a trial dataset:

import numpy as np
from scipy import optimize

# Quadratic model in the single feature x[:, 0], linear in the parameters (a, b, c)
def model(x, a, b, c):
    return a * x[:, 0]**2 + b * x[:, 0] + c

# Synthetic dataset: exact model plus Gaussian noise of known standard deviation s
np.random.seed(12345)
X = np.linspace(-1, 1, 30).reshape(-1, 1)
y = model(X, 3, 2, 1)
s = 0.1 * np.ones(y.size)
n = s * np.random.normal(size=y.size)
yn = y + n

popt, pcov = optimize.curve_fit(model, X, yn, sigma=s, absolute_sigma=True)
popt, pcov

# (array([3.00876769, 2.00038216, 1.03068484]),
#  array([[ 3.29272571e-03, -1.11752553e-11, -1.17327007e-03],
#         [-1.11752553e-11,  9.35483883e-04,  2.98424026e-12],
#         [-1.17327007e-03,  2.98424026e-12,  7.51395072e-04]]))

The curve_fit method directly provides the covariance matrix, which we take as the reference result. Now we want to reproduce this covariance matrix using different optimization methods.
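For later comparisons, the 1-sigma parameter uncertainties follow from the diagonal of this covariance matrix (a quick reference check added here; the name perr is just for illustration):

# 1-sigma uncertainties on (a, b, c), taken from the diagonal of pcov
perr = np.sqrt(np.diag(pcov))
# ~ array([0.0574, 0.0306, 0.0274])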

Check with baseline method least_squares

Using the least_squares function (which curve_fit uses under the hood) together with the recipe scipy uses to robustly estimate the covariance leads to consistent results:

# Weighted residuals (y - model) / sigma for the least-squares solver
def residuals_factory(x, y, sigma=1.):
    def wrapped(beta):
        return (y - model(x, *beta)) / sigma
    return wrapped

residuals = residuals_factory(X, yn, sigma=s)

sol0 = optimize.least_squares(residuals, x0=[1, 1, 1])
#     message: `xtol` termination condition is satisfied.
#     success: True
#      status: 3
#         fun: [-5.954e-01  9.965e-02 ... -3.855e-01  9.455e-01]
#           x: [ 3.009e+00  2.000e+00  1.031e+00]
#        cost: 16.317663438370303
#         jac: [[-1.000e+01  1.000e+01 -1.000e+01]
#               [-8.668e+00  9.310e+00 -1.000e+01]
#               ...
#               [-8.668e+00 -9.310e+00 -1.000e+01]
#               [-1.000e+01 -1.000e+01 -1.000e+01]]
#        grad: [ 1.070e-07 -1.632e-06  1.722e-07]
#  optimality: 1.632403261453419e-06
# active_mask: [ 0.000e+00  0.000e+00  0.000e+00]
#        nfev: 4
#        njev: 3

# Covariance via the pseudo-inverse of J^T J, the same recipe curve_fit uses:
# cov = V diag(1/s^2) V^T, discarding negligible singular values.
# The singular values are stored in `sv` to avoid clobbering the sigma array `s`.
U, sv, Vh = np.linalg.svd(sol0.jac, full_matrices=False)
tol = np.finfo(float).eps * sv[0] * max(sol0.jac.shape)
w = sv > tol
cov = (Vh[w].T / sv[w]**2) @ Vh[w]

# array([[ 3.29272571e-03,  5.77282993e-12, -1.17327008e-03],
#        [ 5.77282993e-12,  9.35483875e-04,  1.14915683e-12],
#        [-1.17327008e-03,  1.14915683e-12,  7.51395089e-04]])

In this context we can reproduce the covariance.
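For this well-conditioned fit, the SVD-based pseudo-inverse above is equivalent to inverting the normal-equations matrix directly; a minimal sketch of that shortcut (not part of the original snippets, the name cov_direct is just for illustration):

# Direct inversion of J^T J; matches `cov` and `pcov` when no singular value is discarded
cov_direct = np.linalg.inv(sol0.jac.T @ sol0.jac)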

Check with minimize method

Using minimize, we can perform the same operation by minimizing half the chi-squared loss instead of the residuals; at the minimum, the inverse Hessian of this loss is precisely the parameter covariance matrix (the model is linear in its parameters, so the Hessian equals J^T W J).

# Half chi-squared loss: 0.5 * sum(((y - model) / sigma)^2)
def loss_factory(x, y, sigma=1.):
    def wrapped(beta):
        return 0.5 * np.sum(np.power((y - model(x, *beta)) / sigma, 2))
    return wrapped

loss = loss_factory(X, yn, sigma=s)

sol1 = optimize.minimize(loss, x0=[1, 1, 1])
#  message: Optimization terminated successfully.
#  success: True
#   status: 0
#      fun: 16.31766343837042
#        x: [ 3.009e+00  2.000e+00  1.031e+00]
#      nit: 5
#      jac: [ 1.907e-06  1.192e-06 -4.768e-07]
# hess_inv: [[ 3.293e-03  1.104e-11 -1.173e-03]
#            [ 1.104e-11  9.355e-04 -1.734e-12]
#            [-1.173e-03 -1.734e-12  7.514e-04]]
#     nfev: 32
#     njev: 8

Again we are able to recover the covariance matrix (see hess_inv); the same holds when explicitly passing method="BFGS", which is the default used here.
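To make the agreement explicit, here is a small side-by-side check of the BFGS inverse Hessian against the curve_fit reference (added for illustration):

# Diagonals of the BFGS inverse Hessian and of the curve_fit covariance
print(np.diag(sol1.hess_inv))  # ~ [3.293e-03, 9.355e-04, 7.514e-04]
print(np.diag(pcov))           # ~ [3.293e-03, 9.355e-04, 7.514e-04]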

Issue with L-BFGS-B method

But when we use the L-BFGS-B method, the returned inverse Hessian does not agree with the previously computed covariance matrices:

sol2 = optimize.minimize(loss, x0=[1, 1, 1], method="L-BFGS-B")
#  message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
#  success: True
#   status: 0
#      fun: 16.317663438454282
#        x: [ 3.009e+00  2.000e+00  1.031e+00]
#      nit: 8
#      jac: [-5.720e-05  2.917e-04 -4.182e-04]
#     nfev: 36
#     njev: 9
# hess_inv: <3x3 LbfgsInvHessProduct with dtype=float64>

sol2.hess_inv.todense()
# array([[ 0.03719202,  0.02371079, -0.02612526],
#        [ 0.02371079,  0.02215422, -0.01929008],
#        [-0.02612526, -0.01929008,  0.01984616]])
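The discrepancy is easy to quantify: the 1-sigma uncertainties implied by this matrix are several times larger than the reference ones (a quick comparison, added here for illustration):

# Implied 1-sigma uncertainties: L-BFGS-B hess_inv vs. curve_fit covariance
print(np.sqrt(np.diag(sol2.hess_inv.todense())))  # ~ [0.193, 0.149, 0.141]
print(np.sqrt(np.diag(pcov)))                     # ~ [0.057, 0.031, 0.027]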

I wonder why the inverse Hessian is so different in this configuration, and how I can adapt this code to get the covariance matrix as in the three previous snippets.

Related questions

I found several related questions during my research, but none of them answers the current one.

Question

Why does minimize with method='L-BFGS-B' return such a different inverse Hessian, and how should I fix it in order to get a proper covariance matrix?

Accepted answer

As @NickODell pointed out, the L-BFGS-B method only maintains a limited-memory approximation of the inverse Hessian.

There is an m switch on the underlying fmin_l_bfgs_b function, but it is not exposed directly in the minimize signature, so we cannot easily control how precise this approximation is.

As a workaround, the Hessian matrix can be computed explicitly after minimizing, for instance with autograd:

from autograd import hessian

# Exact Hessian of the loss at the L-BFGS-B solution, inverted to get the covariance.
# Note: for autograd to differentiate `loss`, it must be built from autograd.numpy
# operations (use `import autograd.numpy as np` when defining loss_factory).
H = hessian(loss)(sol2.x)
C = np.linalg.inv(H)

# array([[ 3.29272573e-03,  1.28301871e-19, -1.17327009e-03],
#        [ 2.44025127e-19,  9.35483871e-04, -6.92261149e-20],
#        [-1.17327009e-03, -3.24227332e-20,  7.51395089e-04]])

This is consistent with the other results.

Alternatively, the Hessian can be obtained numerically with numdifftools:

import numdifftools

# Numerical Hessian of the loss at the optimum; works with the plain-numpy loss
H = numdifftools.Hessian(loss)(sol2.x)
C = np.linalg.inv(H)
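As a final sanity check (an addition to the answer above), the covariance obtained this way can be compared with the curve_fit reference:

# Should print True for the values shown above
print(np.allclose(C, pcov, atol=1e-8))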