In Python 2.7 I am trying to distribute the computation of a two-dimensional array across all of the cores.
For that I have two arrays at global ( module ) scope, one to read from and one to write to.
import itertools as it
import multiprocessing as mp
import numpy as np
temp_env = 20
c = 0.25
a = 0.02
arr = np.ones((100,100))
x = arr.shape[0]
y = arr.shape[1]
new_arr = np.zeros((x,y))
def calc_inside(idx):
    # explicit stencil update of one interior cell of new_arr from arr
    new_arr[idx[0],idx[1]] = ( arr[idx[0], idx[1] ]
                             + c * ( arr[idx[0]+1,idx[1] ]
                                   + arr[idx[0]-1,idx[1] ]
                                   + arr[idx[0], idx[1]+1]
                                   + arr[idx[0], idx[1]-1]
                                   - arr[idx[0], idx[1] ]*4
                                   )
                             - 2 * a
                                 * ( arr[idx[0], idx[1] ]
                                   - temp_env
                                   )
                             )
inputs = it.product( range( 1, x-1 ),
                     range( 1, y-1 )
                     )
p = mp.Pool()
p.map( calc_inside, inputs )
#for i in inputs:
# calc_inside(i)
#plot arrays as surface plot to check values
Assume there is some additional initialization of the array arr with values other than that exemplary 1-s, so that the computation ( an iterative calculation of the temperature ) actually makes sense.
When I use the commented-out for-loop instead of the Pool.map() method, everything works fine and the array actually contains values. When using the Pool(), the variable new_arr just stays in its initialized state ( meaning it contains only the zeros it was originally initialised with ).
Q1 : Does that mean that Pool() prevents writing to global variables?
Q2 : Is there any other way to tackle this problem with parallelization?
A1 :
Your code actually does not use any variable declared using a syntax of global <variable>. The actual mechanics are elsewhere: every worker the Pool() spawns is a separate operating-system process with its own private copy of the parent's globals, so whatever a worker writes into its copy of new_arr never travels back into the parent process. Nevertheless, do not try to go into using global variables for this, and even less so when going into distributed processing.
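If a Pool()-based attempt is still wanted in spite of the add-on costs detailed below, a minimal sketch of a workable arrangement is to let each worker return its computed value and let the parent process do all the writes into new_arr. The helper calc_one() below is just an illustration and the setup values merely mirror the question:

import itertools as it
import multiprocessing as mp
import numpy as np

temp_env = 20
c        = 0.25
a        = 0.02
arr      = np.ones( ( 100, 100 ) )
new_arr  = np.zeros( arr.shape )

def calc_one( idx ):
    # compute and RETURN the new value instead of writing to a global;
    # any write inside a worker would land only in that worker's own copy
    i, j = idx
    return ( i, j, arr[i, j]
                   + c * ( arr[i+1, j] + arr[i-1, j]
                         + arr[i, j+1] + arr[i, j-1]
                         - 4 * arr[i, j] )
                   - 2 * a * ( arr[i, j] - temp_env ) )

if __name__ == '__main__':
    inputs = it.product( range( 1, arr.shape[0] - 1 ),
                         range( 1, arr.shape[1] - 1 ) )
    p = mp.Pool()
    for i, j, val in p.map( calc_one, inputs ):
        new_arr[i, j] = val            # all writes happen in the parent process
    p.close()
    p.join()

Whether such a setup ever pays off is exactly what the cost analysis below is about; for this little per-cell work it typically will not.

A2 :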
Yes, going parallel is possible, but better get a thorough sense of the costs of doing so first, before spending ( well, wasting ) efforts that the results will never justify.
Why start with the costs?
Would you pay your bank clerk $2.00 just to receive a $1.00 banknote in exchange?
Guess no one would ever do that.
The same goes for trying to go parallel.
The syntax is "free" and "promising"; the costs of actually executing that simple and nicely-looking syntax-constructor are not. Expect rather shocking surprises instead of any free dinner.
What are the actual costs? Benchmark. Benchmark. Benchmark!
The useful work
Your code actually does just a few memory accesses and a few floating point operations "inside" the block and exits. These FLOP-s take less than a few tens of [ns], at most a few units of [us], on recent CPU frequencies of ~ 2.6 ~ 3.4 [GHz]. Do benchmark it.
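A minimal benchmarking sketch, assuming the calc_inside() and the arrays defined in the question are already in scope ( the absolute figures will of course differ per platform ):

from timeit import timeit

# time one "inside" cell update, averaged over many repetitions
n_calls = 1000000
t_one   = timeit( lambda: calc_inside( ( 50, 50 ) ), number = n_calls ) / n_calls
print( "one cell update ~ %.1f [ns]" % ( t_one * 1e9 ) )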
So a pure [SERIAL]-process execution, on a Quad-core CPU, will not be worse than about a T+T+T+T ( the four parts having been executed one after another ).

What will happen, if the same amount of useful work is enforced to happen inside some form of the now [CONCURRENT]-process execution ( using multiprocessing.Pool's methods ), potentially using more CPU-cores for the same purpose? The actual compute-phase will not last less than T again, right? Why ever would it? Yes, never.

The overheads, i.e. the costs you will always pay per-call ( and that is indeed many times )
Subprocess setup + termination costs ( benchmark these to learn the scale of such costs ). Per-call parameter passing costs ( every index tuple and every return value gets transferred between processes ). Memory access costs ( latency, where zero cache re-use happens to help, as you exit right after the write ).
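A sketch of an end-to-end comparison of the two paths, re-using the question's definitions as-is ( run it under an if __name__ == '__main__': guard on platforms that spawn rather than fork; the absolute numbers are indicative only, the point is their ratio ):

import time

inputs = list( it.product( range( 1, x-1 ),      # materialise the index list once,
                           range( 1, y-1 ) ) )   # so both paths receive identical work

t0       = time.time()
for i in inputs:                                 # the pure [SERIAL] path
    calc_inside( i )
t_serial = time.time() - t0

t0     = time.time()
p      = mp.Pool()                               # subprocess setup costs start here
p.map( calc_inside, inputs )                     # + per-call parameter passing costs
p.close()
p.join()                                         # subprocess termination costs end here
t_pool = time.time() - t0

print( "serial for-loop : %.6f [s]" % t_serial )
print( "Pool().map()    : %.6f [s]" % t_pool )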
Epilogue
Hope the message is clear enough to ALWAYS start accounting for the accrued costs before deciding about any achievable benefits from re-engineering code into a distributed process-flow, however tempting a multi-core or even many-core [CONCURRENT] fabric may be, just by being available.