scipy.optimize failure with a "vectorized" implementation

I have a 1-D optimization problem coded in two ways: one using a for loop and another using NumPy arrays. The for-loop version works fine, but the NumPy one fails. Actually it is a bit more complicated: the NumPy version can work with different starting points (!!), or if I choose another optimization algorithm such as CG.

The two versions (functions and gradients) give the same results, and the returned types are also the same as far as I can tell.

Here is my example, what am I missing?

import numpy as np
from scipy.optimize import minimize

# local params
v1 = np.array([1., 1.])
v2 = np.array([1., 2.])
# local functions
def f1(x):
    s = 0
    for i in range(len(v1)):
        s += (v1[i]*x-v2[i])**2
    return 0.5*s/len(v1)
def df1(x):
    g = 0
    for i in range(len(v1)):
        g += v1[i]*(v1[i]*x-v2[i])
    return g/len(v1)
def f2(x):
    return 0.5*np.sum((v1*x-v2)**2)/len(v1)
def df2(x):
    return np.sum(v1*(v1*x-v2))/len(v1)

x0 = 10. # x0 = 2 works
# tests...
assert np.abs(f1(x0)-f2(x0)) < 1.e-6 and np.abs(df1(x0)-df2(x0)) < 1.e-6 \
        and np.abs((f1(x0+1.e-6)-f1(x0))/(1.e-6)-df1(x0)) < 1.e-4

# BFGS for f1: OK
o = minimize(f1, x0, method='BFGS', jac=df1)
if not o.success:
    print('FAILURE', o)
else:
    print('SUCCESS min = %f reached at %f' % (f1(o.x[0]), o.x[0]))
# BFGS for f2: failure
o = minimize(f2, x0, method='BFGS', jac=df2)
if not o.success:
    print('FAILURE', o)
else:
    print('SUCCESS min = %f reached at %f' % (f2(o.x[0]), o.x[0]))

The error I get is

A1 = I - sk[:, numpy.newaxis] * yk[numpy.newaxis, :] * rhok
IndexError: invalid index to scalar variable.

but it doesn't really help me, since the same code can work with some other starting values.

I am using a fresh Python install (Python 3.5.2, SciPy 0.18.1 and NumPy 1.11.3).

There is 1 best solution below

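The solver expects the gradient returned by df2 (the jac argument) to have the same shape as its input x. Even though you passed in a scalar x0, minimize converts it into a single-element ndarray. df1 accumulates onto that array, so its result keeps the shape (1,). df2 uses np.sum, which collapses the result to a scalar, and BFGS then fails when it tries to index a quantity derived from that gradient (the yk[numpy.newaxis, :] in your traceback).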
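The shape mismatch can be checked directly by calling both gradients with a 1-element array, which is what the solver passes in. This is only a small sketch that reuses v1, v2, df1 and df2 from the question; the names shape_loop and shape_sum are made up for the illustration.

```python
import numpy as np

v1 = np.array([1., 1.])
v2 = np.array([1., 2.])

def df1(x):
    g = 0
    for i in range(len(v1)):
        g += v1[i]*(v1[i]*x - v2[i])
    return g/len(v1)

def df2(x):
    return np.sum(v1*(v1*x - v2))/len(v1)

# The solver works with a 1-element array, not a plain float.
x = np.atleast_1d(10.)

shape_loop = np.shape(df1(x))  # the loop version keeps the shape of x
shape_sum = np.shape(df2(x))   # np.sum collapses the result to a scalar
print(shape_loop, shape_sum)
```

Both functions return the same value, so a check like the assert in the question passes; only the shapes differ.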

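Wrap the scalar result of df2 in a one-element array, for example np.array([...]) or np.atleast_1d(...), and your code should work. Note that np.array applied to a bare scalar gives a 0-d array, so the brackets matter.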
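A minimal sketch of that fix, assuming the same v1, v2 and f2 as in the question. The name df2_fixed is hypothetical, and np.atleast_1d is just one way to get a 1-element array.

```python
import numpy as np
from scipy.optimize import minimize

v1 = np.array([1., 1.])
v2 = np.array([1., 2.])

def f2(x):
    return 0.5*np.sum((v1*x - v2)**2)/len(v1)

def df2_fixed(x):
    # Hypothetical name: same gradient as df2, but returned with
    # shape (1,) so that it matches the shape of x.
    return np.atleast_1d(np.sum(v1*(v1*x - v2))/len(v1))

x0 = 10.
o = minimize(f2, x0, method='BFGS', jac=df2_fixed)
if not o.success:
    print('FAILURE', o)
else:
    print('SUCCESS min = %f reached at %f' % (f2(o.x[0]), o.x[0]))
```

The gradient simplifies to x - 1.5 for these v1 and v2, so the minimizer should end up near x = 1.5.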