Pass arguments to objective function in scipy differential evolution with multiple workers


For an optimization problem I am using differential evolution from SciPy's optimization toolbox. I'd like to use several CPUs to speed up the process, and I also need to pass several additional arguments to the objective function. These are not just scalars but datasets that the optimization needs in order to evaluate the models.

When I try to pass the arguments to the objective function directly in the usual way, Python complains that the objective function is not picklable. When I put my data into a dictionary and pass that to the objective function, Python fails with:

  File "/usr/lib64/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647

How can I pass nontrivial data to the objective function of differential evolution when using multiple workers? I have yet to find a way.

For example, something like this does not work:

from scipy.optimize import differential_evolution

# `rosenbrock` (the objective) and `train_data` are defined elsewhere.
par02 = {'a':2,'b':3, "data":train_data}

# Define optimization bounds.
bounds = [(0, 10), (0, 10)]

# Attempt to optimize in series.
# series_result = differential_evolution(rosenbrock, bounds, args=(par02,))
# print(series_result.x)

# Attempt to optimize in parallel.
parallel_result = differential_evolution(rosenbrock, bounds, args=(par02,),
                                         updating='deferred', workers=-1)

Does anyone have an idea? Or do I really have to load the data from disk every time the objective function is called? I believe that would slow down the optimization considerably.


There is 1 answer below


It always helps to provide a minimal working example (MWE). For parallel processing to work, both the objective function and its args need to be picklable.
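A quick way to test this before starting any workers is to pickle the objects yourself. The sketch below is only an illustration: the helper name `pickled_size` and the sample dictionary are made up, not part of SciPy. It also reports the payload size, because the traceback in the question shows the message length being packed as a signed 32-bit integer, so on that Python 3.6 setup a pickled payload above 2147483647 bytes cannot be sent to a worker.

```python
import pickle

# Largest length that fits the signed 32-bit header seen in the traceback.
INT32_MAX = 2**31 - 1


def pickled_size(obj):
    """Return the pickled size of obj in bytes, or None if it cannot be pickled."""
    try:
        return len(pickle.dumps(obj))
    except (pickle.PicklingError, AttributeError, TypeError):
        return None


# Hypothetical stand-in for the real arguments.
small_args = {"a": 2, "b": 3, "data": list(range(1000))}

size = pickled_size(small_args)
print(size is not None and size <= INT32_MAX)  # True: small enough to send
print(pickled_size(lambda x: x))               # None: a lambda cannot be pickled
```

Note that a successful `pickle.dumps` is necessary but not sufficient. Functions are pickled by reference (module name plus function name), so the receiving process still has to be able to find them.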

The following illustrates the problem:

-------stuff.py--------
from scipy.optimize import rosen

def obj(x, *args):
    return rosen(x)

-------in the CLI-------
import numpy as np
import pickle
from scipy.optimize import differential_evolution, rosen
from stuff import obj

train_data = "123"
par02 = {'a':2,'b':3, "data":train_data}

bounds = [(0, 10), (0, 10)]

# This will work because `obj` lives in an importable module
# (stuff.py), so the worker processes can import it too.
differential_evolution(obj, bounds, args=(par02,),
                       updating='deferred', workers=-1)


def some_other_func(x, *args):
    pass

wont_work = {'a':2,'b':3, "data":some_other_func}

# Look at the output here: `some_other_func` is referenced with
# respect to __main__.
print(pickle.dumps(wont_work))

# The following line will hang because `some_other_func` is defined
# in __main__ and cannot be imported by the worker processes; to get
# it to work the function has to reside in an importable file.
parallel_result = differential_evolution(obj, bounds, args=(wont_work,),
                                         updating='deferred', workers=-1)

Basically, you can't use classes or functions that are defined in the main context (i.e. the CLI); they have to live in an importable module so that the worker processes can load them as well.

https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming
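As for the large datasets, one possible approach (a sketch under assumptions, not something the answer above prescribes) is to keep the data out of `args` altogether. Each worker loads it once in a pool initializer, and the pool's `map` method is passed as `workers`, which SciPy documents as accepting a map-like callable. The names `init_worker`, `objective` and `optimize_in_parallel`, and the synthetic straight-line data, are hypothetical placeholders for your own loader and model.

```python
import multiprocessing as mp

# Per-process storage for the data set; filled once by init_worker().
_DATA = None


def init_worker(n_points):
    """Run once per process: build (or load) the data and keep it in a global."""
    global _DATA
    # Stand-in for an expensive load, e.g. reading a file from disk.
    _DATA = [(0.1 * i, 2.0 * (0.1 * i) + 3.0) for i in range(n_points)]


def objective(x):
    """Sum of squared residuals of the line y = a*t + b against _DATA."""
    a, b = x
    return sum((a * t + b - y) ** 2 for t, y in _DATA)


def optimize_in_parallel(n_points=50, processes=2):
    # Imported here so the helpers above stay usable without SciPy.
    from scipy.optimize import differential_evolution

    bounds = [(0, 10), (0, 10)]
    # The final polishing step (polish=True by default) evaluates the
    # objective in the parent process, so the parent needs the data too.
    init_worker(n_points)
    with mp.Pool(processes, initializer=init_worker,
                 initargs=(n_points,)) as pool:
        # Only the parameter vectors travel to the workers on each call.
        return differential_evolution(objective, bounds,
                                      updating='deferred', workers=pool.map)
```

In a script, call `optimize_in_parallel()` from under an `if __name__ == "__main__":` guard, and keep `init_worker` and `objective` at module level so the workers can find them. The data is then loaded once per worker process rather than once per call of the objective function.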