Parallel Python: restriction on the number of processes per core


I'm working with the following code (this is just a part of the full code) in Parallel Python on a computer with two cores:

import pp
from scheduling import *
from numpy import *

def sched_pp_process(B, T, D, BL, blocks, number_block, number_core):
    ppservers = ()
    job_server = pp.Server(number_core, ppservers)
    jobs = [(i, job_server.submit(local_sched,
                                  (B, T, D, blocks[i][0], blocks[i][1], i),
                                  (), ("MineLink", "time", "sys", "gurobipy")))
            for i in range(number_block)]
    for i, job in jobs:
        result = job()          # blocks until job i finishes
        if result != ():
            BL.append(result)

def block_est_list(B, T, BL, blocks, number_block, case):
    if case == 1:
        for i in range(number_block):
            blocks.append((random.randint(0, B + 1), random.randint(1, T + 1)))
    elif case == 2:
        for i in range(number_block):
            blocks.append((random.randint(0, B + 1), random.randint(T / 2 + 1, T + 1)))

B = 4004
D = 2 
T = 4
number_block = 100

blocks = []
BL = []

block_est_list(B,T,BL,blocks,number_block,1)

sched_pp_process(B,T,D,BL,blocks,number_block,2)

The local_sched function is too big to include here, but what it does is solve an optimization problem with gurobipy. When I specify 2 cores (number_core = 2) I'm only able to run 12 processes per core, so I can run just 24 of the 100 processes; after that, Python stops working, even though the Windows Task Manager says Python is still running. When this happens I have to kill the process with the Windows Task Manager to be able to use the command prompt again. If I specify 3 cores, then I'm able to run 36 of the 100 processes, and so on. As far as I know this shouldn't happen. Does anyone know why this happens?

Answer by Mike McKerns:

I don't think it's a restriction on the number of processes per core; more likely you are hitting some form of restriction due to memory (or otherwise) on each core. Depending on how your code is set up, pp launches a local ppserver per job or per core, and also launches a python instance for each job. If too many of those instances are alive at once, you start running into swap (your processes take turns at who gets to use the processor and who sits idle), and that is slow. The built-in load balancer will try to off-load jobs onto a different core… but if you only have 2, it just bogs down.
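
As a quick check, you can ask pp itself how many workers it started and how the jobs were spread across them. This is a diagnostic sketch only, reusing the job_server setup from the question; get_ncpus and print_stats are methods on pp's Server object:

import pp

job_server = pp.Server(2, ())        # 2 local workers, no remote ppservers
print(job_server.get_ncpus())        # number of worker processes pp will use
# ... submit the jobs exactly as in the question ...
job_server.print_stats()             # per-node job counts and total wall time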

pp puts the jobs on a queue, pulls them off, and starts running them on the ppserver nodes. If you overload a ppserver, it will bog down. The solution is to use more cores, or configure your ppservers so you limit the number of jobs running on each core. It would seem your limit is about 12.
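
One way to impose such a limit with the code from the question is to submit the jobs in batches and let each batch drain before submitting the next, so only a handful of jobs are pending at any time. This is a sketch only: batch_size is an illustrative cap I'm introducing (not a pp parameter), and local_sched and "MineLink" are taken from the question.

def sched_pp_batched(B, T, D, BL, blocks, number_block, number_core, batch_size=8):
    job_server = pp.Server(number_core, ())
    for start in range(0, number_block, batch_size):
        batch = range(start, min(start + batch_size, number_block))
        jobs = [job_server.submit(local_sched,
                                  (B, T, D, blocks[i][0], blocks[i][1], i),
                                  (), ("MineLink", "time", "sys", "gurobipy"))
                for i in batch]
        job_server.wait()            # block until every job in this batch finishes
        for job in jobs:
            result = job()
            if result != ():
                BL.append(result)
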

By the way… You might want to also check out ppft (a fork of pp) which has better capabilities for transferring code objects across processors, runs on python 2 or 3, and is pip installable. There's also pathos.pp, which provides a higher-level pipe and map interface on top of pp… and tries to minimize the overhead of spawning ppservers. If you are doing optimization where you are leveraging pp to run jobs, then you might be interested in mystic, which can leverage pathos to run optimization jobs in parallel. Some of the codes have recent releases, while others have releases that are a few years stale. They are all in the process of having new releases cut. However, the public trunk: https://github.com/uqfoundation is always pretty stable. Yes, I'm the author of the above codes, so this is also a bit of a shameless plug… but it sounds like they might be useful to you.