I'm trying to use Python's multiprocessing module to speed up some computations. The first step is acquiring a number of models from the BioModels database. There is an API for this called BioServices, which can be installed with pip install bioservices. I've managed to do this serially, but it takes time and would benefit from parallelization.
import time
import multiprocessing
import bioservices

bio = bioservices.BioModels()  # initialize the class for downloading models
m = bio.getAllCuratedModelsId()  # assign model IDs to m (a Python list)
def f(ID):
    dct = {}
    name = bio.getModelNameById(ID)  # retrieve the model name for the result dict key
    print 'running {}'.format(name)  # print some information so you can see the program working
    dct[name] = bio.getModelSBMLById(ID)  # get the model and assign as value in dct
    time.sleep(0.5)  # pause briefly to avoid bombarding the service and being cut off
    return dct
model_dct = {}
start = time.time()
P = multiprocessing.Pool(8)
for i in m:
    model_dct.update(P.map(f, i))  # parallelize
print '{} seconds'.format(time.time() - start)
At present this just initializes the bio class and then crashes (or at least does nothing). Could anybody suggest how to fix my code?
Thanks
Pool.map is meant to apply a function to all the items of an iterable, so you shouldn't call it once per item inside a loop, but rather just once over the whole list:

results = P.map(f, m)

This will give you a list of dicts, one for each ID.