I am trying to learn parallel programming with Python 3 and am having trouble with all the toy examples. In particular, I take code from a textbook, course, or YouTube video, try to run it, and it runs very slowly. I have never actually seen a fast working example for beginners. Everything that runs at all is slow, much slower than ordinary serial code with loops. Could anyone help with this issue?
I work on Windows 10 in Jupyter, on an Intel Core i5-8300H at 2.3 GHz with 4 physical cores and 8 threads.
I modified the code from here, but I have the same issue with code from other places.
My code:
import numpy as np
import time
import multiprocessing as mp
import additional
# Prepare data
sz = 10000000
np.random.seed(100)  # seed the global RNG; a bare RandomState(100) object would be discarded unused
arr = np.random.randint(0, 10, size=[sz, 5])
data = arr.tolist()
data[:5]
# Step 1: Init multiprocessing.Pool()
N = mp.cpu_count()
print("number of processors: ", N)
pool = mp.Pool(N)
start = time.perf_counter()
# Step 2: `pool.apply` the `howmany_within_range()`
results = [pool.apply(additional.howmany_within_range, args=(row, 4, 8)) for row in data]
finish = time.perf_counter()
print(f'Finished in {round(finish-start, 3)} second(s)')
# Step 3: Don't forget to close
pool.close()
print(results[:10])
#Serial code, loops
results = []
start = time.perf_counter()
for row in data:
    results.append(additional.howmany_within_range(row, minimum=4, maximum=8))
finish = time.perf_counter()
print(f'Finished in {round(finish-start, 3)} second(s)')
print(results[:10])
additional.py
def howmany_within_range(row, minimum, maximum):
    """Returns how many numbers lie within `maximum` and `minimum` in a given `row`"""
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count = count + 1
    return count
I ran it with 10^7 elements:
Parallel code:
number of processors: 8
Finished in 1563.35 second(s)
Serial code:
Finished in 5.375 second(s)
What is `howmany_within_range`? Is it a quick operation or a slow one? Remember that multiprocessing has an overhead. Your main program has to package up the arguments, send them to another process, wait for the results, and then unpackage them. If the cost of packaging and unpackaging is more than the cost of the work itself, then multiprocessing won't gain you anything.
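One way to reduce that overhead is to send the rows in a few large batches instead of one row per task. Note also that `pool.apply` blocks until its result comes back, so the list comprehension in the question hands out one row at a time and never keeps more than one worker busy. The following is only a rough sketch of the batching idea, not a tuned solution: `count_chunk`, `make_chunks`, and `parallel_count` are hypothetical helper names that do not appear in the question, and `howmany_within_range` is repeated here just to keep the example self-contained.

```python
import multiprocessing as mp


def howmany_within_range(row, minimum, maximum):
    """Count the values in `row` that lie in [minimum, maximum]."""
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count += 1
    return count


def count_chunk(task):
    """Handle a whole batch of rows in one task (hypothetical helper)."""
    rows, minimum, maximum = task
    return [howmany_within_range(row, minimum, maximum) for row in rows]


def make_chunks(data, n_chunks):
    """Split `data` into at most `n_chunks` contiguous slices (hypothetical helper)."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_count(data, minimum, maximum, n_workers):
    """Send one large task per worker instead of one tiny task per row."""
    tasks = [(chunk, minimum, maximum) for chunk in make_chunks(data, n_workers)]
    with mp.Pool(n_workers) as pool:
        # pool.map preserves task order, so results line up with the input rows.
        per_chunk = pool.map(count_chunk, tasks)
    # Flatten the per-chunk lists back into one count per row.
    return [count for chunk in per_chunk for count in chunk]
```

With the setup from the question, the call would look like `results = parallel_count(data, 4, 8, n_workers=N)`. On Windows with Jupyter, these functions would most likely need to live in an importable module such as `additional.py`, the same way `howmany_within_range` does in the question. Even with batching, the rows still have to be packaged and sent to the workers, so for a function this cheap the serial loop may well remain faster.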