PyOpenCL, array filter: copy_if vs my own atomic-based implementation


I have an array of random integers, for example [132, 2, 31, 49, 15, 6, 70, 18 ... , 99, 1001]. I want to produce an array of all the numbers greater than some threshold, say 100, and also get the size of that array.

There are two ways:

  1. The new PyOpenCL feature copy_if. It is based on GenericScanKernel, which, if we go deeper, is built on prefix sums.
  2. A pure OpenCL solution that uses atomics.
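
For intuition, the prefix-sum idea behind the first approach can be sketched in plain Python. This is only a sequential illustration of scan-based compaction, not PyOpenCL's actual kernel. The helper name compact_with_scan is made up for this example, and the input uses only the values written out in the question.

```python
from itertools import accumulate


def compact_with_scan(values, predicate):
    """Sequential sketch of scan-based compaction (hypothetical helper)."""
    # Step 1: flag each element, 1 if it passes the predicate, else 0.
    flags = [1 if predicate(v) else 0 for v in values]
    # Step 2: inclusive prefix sum of the flags (the part a GPU runs as a parallel scan).
    inclusive = list(accumulate(flags))
    count = inclusive[-1] if inclusive else 0
    # Step 3: each kept element writes to slot (prefix sum - 1).
    # Every slot is written by exactly one element, so no atomics are needed.
    out = [None] * count
    for i, v in enumerate(values):
        if flags[i]:
            out[inclusive[i] - 1] = v
    return out, count


data = [132, 2, 31, 49, 15, 6, 70, 18, 99, 1001]
result, n = compact_with_scan(data, lambda x: x > 100)
print(result, n)  # [132, 1001] 2
```

Because the output slot of each element is fixed by the scan, the result keeps the input order and is the same on every run.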

Does copy_if always work properly? As far as I can see, copy_if doesn't use atomics. Is it possible to run into trouble using copy_if?

How does the performance of copy_if compare to the atomic approach?

Which would you choose, and why?

There is 1 solution below

I have never seen an error with copy_if. Always the same results; it seems very robust. (I haven't built unit tests, though.)

As for performance, copy_if should be much faster, especially if your GPU is fast. As others have said, atomics and GPUs are a bad combination (I have suffered too much to learn this...)
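
To make the atomic pattern concrete, here is a rough plain-Python simulation of it. It is not OpenCL code: the helper name compact_with_counter is hypothetical, the integer counter stands in for a global counter bumped with an atomic increment, and shuffling the indices is just an assumed stand-in for the unspecified order in which work-items run.

```python
import random


def compact_with_counter(values, predicate, seed=None):
    """Simulate the atomic-counter pattern (hypothetical helper, not OpenCL code)."""
    out = [None] * len(values)
    counter = 0  # stands in for the global counter updated atomically
    # Work-items finish in no guaranteed order; a shuffle mimics that here.
    order = list(range(len(values)))
    random.Random(seed).shuffle(order)
    for i in order:
        if predicate(values[i]):
            slot = counter  # the old value an atomic increment would return
            counter += 1    # every matching work-item goes through this one counter
            out[slot] = values[i]
    return out[:counter], counter


data = [132, 2, 31, 49, 15, 6, 70, 18, 99, 1001]
for seed in range(3):
    found, count = compact_with_counter(data, lambda x: x > 100, seed=seed)
    # The count and the set of values are stable; the order may differ per run.
    print(sorted(found), count)
```

In this sketch every matching element has to pass through the same counter, which is the kind of contention that tends to hurt on a GPU, and the order of the output depends on scheduling rather than on the input order.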

And if the number of expected results is small relative to your dataset, I have proposed a sparse_copy_if() method here, where you can also find a copy_if example.

Fork my code and the following should work:

# my_pyopencl_algorithm is the module from the forked code
import my_pyopencl_algorithm
final_gpu, evt = my_pyopencl_algorithm.sparse_copy_if(array_gpu, "ary[i] > 100", queue=queue)