'Pool not running' error when using pathos multiprocessing in a loop

When I try to do some parallel computing in a loop:

from pathos.multiprocessing import ProcessingPool as Pool
def add(x,y):
    return x+y
for i in range(10):
    pool = Pool(4)
    x = [0,1,2,3]
    y = [4,5,6,7]
    pool.map(add, x, y)
    pool.close()
    pool.join()

it raises an error:

ValueError: Pool not running

The only workaround I have found is to call restart() on the pool:

for i in range(10):
    pool = Pool(4)
    pool.restart()
    x = [0,1,2,3]
    y = [4,5,6,7]
    pool.map(add, x, y)
    pool.close()
    pool.join()

I expected to be able to assign a new pool in each iteration, but that does not seem to work. What is happening?


I'm the pathos author. In pathos, a Pool is a singleton, so once you close the pool, instantiating a new Pool just gives you back the same closed pool. There are, however, two methods that help here: restart and clear. I think you are probably looking for clear. Call clear() after join, and the next instantiation will give you a fresh pool, as you expect.

Python 3.7.16 (default, Dec  7 2022, 05:04:27) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathos.pools as pp
>>> pool = pp.ProcessPool()
>>> pool.map(lambda x:x*x, range(4))
[0, 1, 4, 9]
>>> pool.close()
>>> pool.join()
>>> pool.clear()
>>> 
>>> pool = pp.ProcessPool()
>>> pool.map(lambda x:x*x, range(4))
[0, 1, 4, 9]
>>> pool.close()
>>> pool.join()
>>> pool.clear()
>>>
>>> # actually, this works too (after the clear)...
>>> pool.map(lambda x:x*x, range(4))
[0, 1, 4, 9]
>>>