I need to pass a variable to the setup() method of a dispy node so I can tell the node which data set to load from a config file. Otherwise I have to write a specific script for each data set, which is going to be painful.
def setup(): # executed on each node before jobs are scheduled
    # read data in file to global variable
    global data
    data = open('file.dat').read()
    return 0
...
if __name__ == '__main__':
    import dispy
    cluster = dispy.JobCluster(compute, depends=['file.dat'], setup=setup, cleanup=cleanup)
So I want to pass the string "file.dat" to setup so that each node loads the data only once (as it's large).
Let me see if I understand the problem. You want to pass an argument to setup, but the actual call of setup occurs somewhere inside JobCluster, and that call does not know it should pass an argument. Is that correct?
The solution is to use functools.partial from the standard library: give JobCluster the object returned by partial(setup, "file.dat") instead of setup itself. The object returned by partial, when called with no arguments, calls setup with one positional argument ("file.dat"). You have to rewrite setup to accept this argument.