CUDA atomicAdd_block is undefined


According to the CUDA Programming Guide, "Atomic functions are only atomic with respect to other operations performed by threads of a particular set ... Block-wide atomics: atomic for all CUDA threads in the current program executing in the same thread block as the current thread. These are suffixed with _block, e.g., atomicAdd_block"

However, I cannot use atomicAdd_block: the compiler reports it as undefined, while the same code compiles fine with atomicAdd. Is there a header I should include or a library I should link against?


There are 2 best solutions below

0
On BEST ANSWER

atomicAdd() has been supported for a long time, by early versions of CUDA and on older micro-architectures. However, atomicAdd_system() and atomicAdd_block() were introduced, if I am not mistaken, with the Pascal micro-architecture in 2016. The minimum Compute Capability that supports them is 6.0. If you are targeting CC 5.2 or earlier, or if your CUDA version is several years old, they may not be available to you.

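This is quite likely the cause here: in many CUDA releases (the 11.x series, for example), nvcc defaults to Compute Capability 5.2 if no other value is specified with -gencode or -arch (e.g. if you run nvcc -o out my_file.cu). No extra header or library is needed; the function only becomes available once the target architecture is 6.0 or higher.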
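To make this concrete, here is a minimal sketch of a kernel that uses atomicAdd_block() when compiled for CC 6.0 or higher and falls back to plain atomicAdd() otherwise. The names count_threads and count_in_block0 are made up for illustration, and the example assumes a CUDA-capable GPU is present.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every thread adds 1 to the counter slot that belongs to its own block.
// Only threads of the same block touch a given slot, so block scope suffices.
__global__ void count_threads(int *per_block_counts)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
    // Available only when compiling for Compute Capability 6.0 or higher.
    atomicAdd_block(&per_block_counts[blockIdx.x], 1);
#else
    // Fallback for older targets: device-wide atomic, always correct here.
    atomicAdd(&per_block_counts[blockIdx.x], 1);
#endif
}

// Hypothetical host helper: launches the kernel and returns the count
// recorded for block 0, or -1 if any CUDA call fails.
int count_in_block0(int num_blocks, int threads_per_block)
{
    int *d_counts = nullptr;
    size_t bytes = static_cast<size_t>(num_blocks) * sizeof(int);
    if (cudaMalloc(&d_counts, bytes) != cudaSuccess) return -1;
    if (cudaMemset(d_counts, 0, bytes) != cudaSuccess) {
        cudaFree(d_counts);
        return -1;
    }

    count_threads<<<num_blocks, threads_per_block>>>(d_counts);

    int result = -1;
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(&result, d_counts, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_counts);
    if (err != cudaSuccess || cudaGetLastError() != cudaSuccess) return -1;
    return result;
}

int main()
{
    int count = count_in_block0(4, 256);
    if (count < 0) {
        std::printf("CUDA error (no usable GPU, or unsupported target?)\n");
        return 1;
    }
    std::printf("threads counted in block 0: %d\n", count);
    return 0;
}
```

Without the #if guard, the atomicAdd_block() call fails to compile when the target is below CC 6.0, which matches the error in the question. To take the atomicAdd_block() path, pass an explicit architecture, e.g. nvcc -arch=sm_60 -o out my_file.cu, choosing the sm_XX value that matches your GPU and is still supported by your toolkit.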

4
On

As Robert said, the solution is to add -arch=sm_70 when compiling (or whichever sm_XX value matches your GPU, as long as it is 6.0 or higher). For those who use CMake, add set(CMAKE_CUDA_ARCHITECTURES 70) to your CMakeLists.txt.