I have an array of random integers, for example [132, 2, 31, 49, 15, 6, 70, 18, ..., 99, 1001]. I want to produce an array of all numbers greater than, say, 100, and get the size of that array.
There are two ways:
- A new feature of PyOpenCL, `copy_if`. It is based on `GenericScanKernel` and, if we go deeper, on prefix sums.
- A pure OpenCL solution that uses atomics.
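To make the prefix-sum idea concrete, here is a minimal pure-Python sketch of how a scan-based compaction can place every kept element without any atomics. It only illustrates the general technique and is not PyOpenCL's actual implementation; the helper name `compact_with_scan` and its `threshold` parameter are made up for this example.

```python
from itertools import accumulate


def compact_with_scan(values, threshold=100):
    """Hypothetical helper: keep values > threshold using a prefix sum."""
    # Step 1: flag each element (1 = keep, 0 = drop). On a GPU, every
    # work-item would compute its own flag independently.
    flags = [1 if v > threshold else 0 for v in values]

    # Step 2: inclusive prefix sum of the flags. positions[i] is the number
    # of kept elements in values[0..i], so the last entry is the total count.
    positions = list(accumulate(flags))
    count = positions[-1] if positions else 0

    # Step 3: scatter. Each kept element writes to slot positions[i] - 1.
    # Every slot is written by exactly one element, so no atomics are needed.
    out = [None] * count
    for i, v in enumerate(values):
        if flags[i]:
            out[positions[i] - 1] = v
    return out, count


# Sample values from the question, with the elided middle part left out.
data = [132, 2, 31, 49, 15, 6, 70, 18, 99, 1001]
result, n = compact_with_scan(data)
print(result, n)  # [132, 1001] 2
```

Note that the output keeps the input order. With an atomic counter, the order of the results depends on how the work-items happen to be scheduled.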
Does `copy_if` always work properly? As far as I can see, `copy_if` doesn't use atomics. Is it possible to run into trouble using `copy_if`?
How does the performance of `copy_if` compare to the atomic approach?
What would you choose and why?
I have never seen an error with `copy_if`. It always gives the same results; it seems very robust. (I haven't built unit tests, though.)
As for performance, `copy_if` should be much faster, especially if your GPU is fast. As others have said, atomics and GPUs are a bad combination (I have suffered too much to learn this...).
And if the number of expected results is small in relation to your dataset, I have proposed a `sparse_copy_if()` method here, where you can also find a `copy_if` example.
Fork my code and the following should work: