scipy.minimize doesn't work. Loss doesn't decrease


I'm working in Python, optimizing some model parameters against neural data.

I used scipy.optimize.minimize with the BFGS method.

The problem is that the loss is large and barely decreases. It looks like this:

21395.738099992504 21395.738127943157 21395.738115877684 21395.738162622994 21395.738115161068 21395.738162367332 21395.738110567938 21395.73813130535 21395.738097769616 21395.738104114924 21395.73813444559

Here's my code.

import numpy as np
from scipy.optimize import minimize


def fit_lnp(X, y, opts):
    """
    Fits an LNP model to the data.

    Args:
        X (numpy.ndarray): Input data matrix.
        y (numpy.ndarray): Output data vector.
        opts (dict): Options for the optimizer.

    Returns:
        tuple: Tuple containing the optimized parameters and predicted values.
    """
    # small random initialization, one parameter per feature (row of X)
    init = 1e-3 * np.random.randn(X.shape[0], 1).flatten()
    result = minimize(
        lambda param: do_lnp(param, X.T, y.T), init, method="BFGS", options=opts
    )
    param = result.x
    # predicted firing rate under the exponential nonlinearity
    yHat = np.exp(np.dot(X.T, param)).T
    return param, yHat


def do_lnp(param, X, y):
    """
    Computes the objective function, its gradient, and the Hessian for the LNP model.

    Args:
        param (numpy.ndarray): Parameter array.
        X (numpy.ndarray): Input data matrix.
        y (numpy.ndarray): Output data vector.

    Returns:
        tuple: Tuple containing the objective function value, its gradient, and the Hessian matrix.
    """
    lamda = 1e-2
    # compute the firing rate
    u = np.dot(X, param.reshape(-1, 1))
    rate = np.exp(u)

    # start computing the Hessian; rate already has shape (n, 1), so it
    # broadcasts against X directly (equivalent to MATLAB's bsxfun(@times, rate, X))
    rX = rate * X
    hessian_glm = np.dot(rX.T, X)

    # regularization term
    reg_val = (lamda / 2) * np.sum(np.exp(param) ** 2)

    # compute f, the gradient, and the Hessian
    # (note: df below omits the regularizer's derivative, lamda * np.exp(param) ** 2)
    f = np.sum(rate - y * u) + reg_val
    print(f)
    df = np.dot(X.T, (rate - y)).flatten()
    hessian = hessian_glm
    return f

Should I change the method passed to minimize? How can I figure out a way to improve this optimization?

I need some help. Thank you.

1 Answer

Roger V.:

As written, the function do_lnp calculates the value of the objective, its gradient, and the Hessian... but returns only the objective value:

return f

should be replaced by

return f, df

and one needs to specify jac=True in the call to minimize so that the optimizer uses the returned analytic gradient. Note that minimize does not accept hess=True: a Hessian must be supplied as a separate callable through the hess argument, and it is only used by Hessian-aware methods such as Newton-CG or trust-constr; BFGS ignores the Hessian entirely.
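For concreteness, here is a minimal sketch of that fix, reusing the imports and variables from the question. The regularizer's derivative, lamda * np.exp(param) ** 2, is added to the gradient so that f and df stay consistent (the question's code omits it):

    def do_lnp(param, X, y):
        lamda = 1e-2
        u = np.dot(X, param.reshape(-1, 1))
        rate = np.exp(u)
        # objective: negative LNP log-likelihood plus the regularizer
        f = np.sum(rate - y * u) + (lamda / 2) * np.sum(np.exp(param) ** 2)
        # gradient, flattened to match the shape of param, with the
        # regularizer's derivative included
        df = np.dot(X.T, (rate - y)).flatten() + lamda * np.exp(param) ** 2
        return f, df

    result = minimize(
        lambda param: do_lnp(param, X.T, y.T),
        init,
        method="BFGS",
        jac=True,  # the objective returns (f, grad)
        options=opts,
    )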

Alternatively, the Jacobian (gradient) and Hessian can be passed as separate functions via the jac and hess arguments.
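A sketch of that alternative with a Hessian-aware method (the helper names lnp_f, lnp_grad, and lnp_hess are illustrative, and the regularizer is omitted for brevity):

    def lnp_f(param, X, y):
        u = X @ param
        return np.sum(np.exp(u) - y * u)

    def lnp_grad(param, X, y):
        u = X @ param
        return X.T @ (np.exp(u) - y)

    def lnp_hess(param, X, y):
        u = X @ param
        return X.T @ (np.exp(u)[:, None] * X)

    result = minimize(
        lnp_f,
        init,
        args=(X.T, y.ravel()),  # extra arguments are forwarded to jac and hess too
        method="Newton-CG",     # Newton-CG uses the supplied Hessian; BFGS would not
        jac=lnp_grad,
        hess=lnp_hess,
    )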

See the scipy.optimize.minimize documentation for the full description.

Some general tips:

  • Note that you can pass additional function arguments through the args parameter of minimize; there is no need to use a lambda function (see the sketch after this list).
  • It is good to first explore the target function and understand its behavior near the minimum, at least in some simple cases.
  • You can try "Nelder-Mead" as the optimization method; it is much slower, but very robust.
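A short sketch combining the first and third tips, assuming do_lnp still returns only the scalar loss:

    # pass the data through args instead of capturing it in a lambda
    result = minimize(
        do_lnp,
        init,
        args=(X.T, y.T),
        method="Nelder-Mead",  # gradient-free: slow but robust
        options=opts,
    )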