SciPy differential evolution fails if objective function has too many arguments

1.3k Views Asked by At

Does the SciPy implementation of the differential evolution algorithm have a maximum number of variables? My code works on a toy version of the problem with 8 variables, but when I try to optimize the actual problem with 4000 variables a value of infinity is consistently returned for the objective function.

code (see GitHub repo for input files)

import numpy as np
from scipy.optimize import differential_evolution as de
from scipy.optimize import NonlinearConstraint as nlc

def kf(x, w, freq):
    kc = x>0
    kw = ~np.any(w[~kc,:], axis=0)
    return -freq[kw].sum()

def cons_fun(x):
    return np.sum(x>0)

def optimize(w, freq):
    cons = nlc(cons_fun, -np.inf, 1000)
    bnds = [np.array([-1,1]),]*w.shape[0]
    res = de(kf, args=(w, freq), maxiter=1000, bounds=bnds, popsize=2, polish=False,
             constraints=cons, disp=True, workers=-1, updating='deferred')
    output = res.x>0
    np.save('output.npy', output)

if __name__ == '__main__':
    # try optimizing toy version of problem
    small_w = np.load('small_w.npy')
    small_freq = np.load('small_freq.npy')
    optimize(small_w, small_freq)
    
    # try optimizing actual problem
    w = np.load('w.npy')
    freq = np.load('freq.npy')
    optimize(w, freq)

program output for the actual problem

differential_evolution step 1: f(x)= inf
differential_evolution step 2: f(x)= inf
differential_evolution step 3: f(x)= inf

...and so on for hundreds of steps

more information about the optimization problem

I'm trying to determine a set of 1000 Chinese characters that maximizes your ability to write common words. The array w is a sparse boolean matrix with shape 4000 (number of potential characters) by 30000 (number of words). An element of w is true if the character corresponding to that row occurs in the word corresponding to that column. The array freq is a vector of length 30000 that contains word frequency values.

The objective function kf takes 4000-element array x as its argument. The array x contains values between -1 and 1. The trial set of characters is determined by the positive elements in x. A nonlinear constraint restricts the number of positive elements in x to 1000.

1

There are 1 best solutions below

0
On

There is no limit to the number of variables that can be used in differential_evolution.

For a constrained minimization with differential_evolution the objective function is only evaluated if the constraints are feasible. This is so that computational time is not wasted on trial solutions.

A trial solution is accepted if:

        * it satisfies all constraints and provides a lower or equal objective
          function value, while both the compared solutions are feasible
        - or -
        * it is feasible while the original solution is infeasible,
        - or -
        * it is infeasible, but provides a lower or equal constraint violation
          for all constraint functions.

Have you investigated your constraints function to check that it is possible to create a feasible solution within the bounds?