Pre 8.x equivalent of __reduce_max_sync() in CUDA

250 Views Asked by Serge Rogatch At 20 October 2024 at 08:48

cuda-memcheck has detected a race condition in the code that does the following:

condition = /*different in each thread*/;
shared int owner[nWarps];
/* ... owner[i] is initialized to blockDim.x+1 */
if(condition) {
    owner[threadIdx.x/32] = threadIdx.x;
}

So basically this code computes the owner thread for each warp based on some condition. For some warp there could be no owner, but for some the number of owners can be more than 1, and then a race condition happens because multiple threads assign a value to the same shared memory region.

After trying the docs, I think what I need can be done with:

const uint32_t mask = __ballot_sync(0xffffffff, condition);
if(mask != 0) {
    const unsigned max_owner = __reduce_max_sync(mask, threadIdx.x);
    if(threadIdx.x == max_owner) {
        // at most 1 thread assigns here per warp
        owner[threadIdx.x/32] = max_owner;
    }
}

However, my attempt has 2 issues:

I don't really need to find the max thread - it's enough to select any 1 thread for each warp if there is a thread with condition==true
It requires CUDA compute capability 8.x, while I need to support devices of 5.2 compute capability

Could you please help me solve the above issues?

Original Q&A

There are 1 best solutions below

Serge Rogatch On 16 October 2021 at 15:01

The following function seems to solve the problem:

void SetOwnerThread(int* dest, const bool condition) {
  const uint32_t mask = __ballot_sync(0xffffffff, condition);
  if(!mask) {
    return;
  }
  const uint32_t lowest_bit = mask & -mask;
  const uint32_t my_bit = (1 << (threadIdx.x & 31));
  if(lowest_bit == my_bit) {
    dest = threadIdx.x;
  }
}

Pre 8.x equivalent of __reduce_max_sync() in CUDA

There are 1 best solutions below

Related Questions in C++

Related Questions in PARALLEL-PROCESSING

Related Questions in CUDA

Related Questions in GPU-WARP

Related Questions in COMPUTE-CAPABILITY

Trending Questions

Popular # Hahtags

Popular Questions