Hardware for Python multiprocessing


I have a task where I need to run the same function on many different pandas dataframes. I load all the dataframes into a list then pass it to Pool.map using the multiprocessing module. The function code itself has been vectorized as much as possible, contains a few if/else clauses and no matrix operations.
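A minimal sketch of the setup described above, for reference. The function `process_frame`, the column names `a` and `b`, and the synthetic data are hypothetical placeholders, since the real function and dataframes are not shown here.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd


def process_frame(df):
    # Hypothetical stand-in for the real per-dataframe function:
    # a vectorized if/else over two assumed columns, "a" and "b".
    out = df.copy()
    out["result"] = np.where(out["a"] > 0.5, out["a"] * 2.0, out["b"])
    return out


def make_frames(n_frames=4, n_rows=1000, seed=0):
    # Synthetic data; the real task loads existing dataframes instead.
    rng = np.random.default_rng(seed)
    return [
        pd.DataFrame({"a": rng.random(n_rows), "b": rng.random(n_rows)})
        for _ in range(n_frames)
    ]


def run_parallel(frames, workers=2):
    # Each dataframe is pickled, sent to a worker process, and the
    # result is pickled back, so very large frames add transfer cost.
    with mp.Pool(workers) as pool:
        return pool.map(process_frame, frames)


if __name__ == "__main__":
    results = run_parallel(make_frames(), workers=2)
    print(len(results), results[0].shape)
```

With this pattern the number of useful workers is bounded by the number of physical cores, which is why the question comes down to hardware.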

I'm currently using a 10-core Xeon and would like to speed things up, ideally moving from Pool(10) to Pool(xxx). I see two possibilities:

  • GPU processing. From what I have read, though, I'm not sure I can achieve what I want, and in any case it would need a lot of code modification.

  • Xeon Phi. I know it's being discontinued, but supposedly code adaptation is easier, and if that's really the case I'd happily get one.

Which path should I concentrate on? Any other alternatives?

Software: Ubuntu 18.04, Python 3.7. Hardware: X99 chipset, 10-core Xeon (no HT).

There are 2 best solutions below

Pavel Kovtun (best answer)

You can rely on the newer Intel LGA 2066 platform or a recent Xeon. With the latest AVX-512 support, Intel accelerated numpy processing a lot (numpy is the base of pandas). Check: https://software.intel.com/en-us/articles/the-inside-scoop-on-how-we-accelerated-numpy-umath-functions

First of all, try switching to numpy-based calculations (even just using .values on each series). This can improve processing speed by up to 10x.
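As a rough illustration of that switch, here is a sketch that computes the same if/else rule two ways: row by row through pandas, and on the raw arrays obtained with `.values`. The column names `a` and `b` and the rule itself are made up for the example; the actual speedup depends on the real function and data.

```python
import numpy as np
import pandas as pd


def score_pandas(df):
    # Row-wise apply: the if/else runs as one Python call per row.
    return df.apply(lambda r: r["a"] * 2.0 if r["a"] > 0.5 else r["b"], axis=1)


def score_numpy(df):
    # Pull out the underlying arrays once, then branch with np.where,
    # which evaluates the condition for the whole array at once.
    a = df["a"].values
    b = df["b"].values
    return np.where(a > 0.5, a * 2.0, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    df = pd.DataFrame({"a": rng.random(10_000), "b": rng.random(10_000)})
    # Both versions should agree; only the execution path differs.
    assert np.allclose(score_pandas(df).values, score_numpy(df))
```

Timing both functions on a representative dataframe (for example with `timeit`) is the quickest way to see how much of the gain applies to your own code.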

You can also try a dual-CPU motherboard to get more cores for parallel calculation.

In most situations, the bottleneck is not the processing of the data but the IO operations: reading from the drive into memory. This will be a problem when using a GPU too.
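One way to check this before buying hardware is to time the load step and the compute step separately. The sketch below is only an assumption about how that could look: the CSV file, the `compute` function, and the column names are placeholders for the real data and workload.

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd


def timed(fn, *args):
    # Return the function's result and the wall-clock seconds it took.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def compute(df):
    # Hypothetical stand-in for the real per-dataframe work.
    a = df["a"].values
    return np.where(a > 0.5, a * 2.0, df["b"].values).sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    df = pd.DataFrame({"a": rng.random(200_000), "b": rng.random(200_000)})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        df.to_csv(path, index=False)
        # Time reading the file from disk into a dataframe.
        loaded, load_s = timed(pd.read_csv, path)
    # Time the computation on the already-loaded dataframe.
    _, compute_s = timed(compute, loaded)
    print(f"load: {load_s:.3f}s  compute: {compute_s:.3f}s")
```

If the load time dominates, more cores or a GPU will help little, and faster storage or a faster file format is the better investment.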

alittlebluebug

It took a while, but after changing it all to numpy and achieving a little more vectorization I managed to get a speed increase of over 20x, so thanks, Paul. max9111, thanks too; I'll have a look into numba.