CUDA atomic operations and concurrent kernel launch

1k Views Asked by user3128889 At 27 July 2025 at 17:21

I am currently developing a GPU-based program that uses multiple kernels, launched concurrently on multiple streams.

In my application, several of these kernels need to access a shared queue/stack, and I plan to use atomic operations to do so.

However, I do not know whether atomic operations work correctly across multiple concurrently launched kernels. I would appreciate help from anyone who knows the exact mechanism of atomic operations on the GPU or who has experience with this issue.


There is 1 best solution below

ArchaeaSoftware On 24 December 2013 at 10:52

Atomics are implemented in the L2 cache hardware of the GPU, through which all memory operations must pass. There is no hardware to ensure coherency between host and device memory, or between different GPUs; but as long as the kernels are running on the same GPU and using device memory on that GPU to synchronize, atomics will work as expected.
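To make the point above concrete, here is a minimal sketch of two kernels on different streams pushing into one stack in device memory. The names `DeviceStack`, `push_kernel`, and `run_concurrent_pushes` are hypothetical, invented for this illustration; only `atomicAdd` and the CUDA runtime calls are real API. It assumes a single GPU, a fixed-capacity buffer, and push-only access. Mixing pushes and pops concurrently needs more care than this shows.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical fixed-capacity stack that lives entirely in device memory.
struct DeviceStack {
    int *items;    // storage for `capacity` entries
    int *top;      // number of slots reserved so far
    int capacity;
};

// Each thread reserves a unique slot with atomicAdd, then writes into it.
// atomicAdd returns the value held *before* the add, so no two threads,
// even threads from different kernels, receive the same slot index.
__global__ void push_kernel(DeviceStack s, int base, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int slot = atomicAdd(s.top, 1);
    if (slot < s.capacity) s.items[slot] = base + i;
}

// Launches two push kernels on separate streams (per_kernel > 0 assumed).
// Returns the final element count if every value landed exactly once, else -1.
int run_concurrent_pushes(int per_kernel) {
    const int total = 2 * per_kernel;
    DeviceStack s{nullptr, nullptr, total};
    if (cudaMalloc(&s.items, total * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&s.top, sizeof(int)) != cudaSuccess) {
        cudaFree(s.items);
        return -1;
    }
    cudaMemset(s.top, 0, sizeof(int));
    cudaDeviceSynchronize();  // make sure the counter is zeroed before launching

    cudaStream_t a, b;
    cudaStreamCreate(&a);
    cudaStreamCreate(&b);
    const int threads = 256;
    const int blocks = (per_kernel + threads - 1) / threads;
    // The two launches may overlap on the GPU; overlap is not guaranteed,
    // but the result must be correct either way.
    push_kernel<<<blocks, threads, 0, a>>>(s, 0, per_kernel);
    push_kernel<<<blocks, threads, 0, b>>>(s, per_kernel, per_kernel);
    bool ok = cudaDeviceSynchronize() == cudaSuccess;

    int top = 0;
    std::vector<int> host(total);
    ok = ok && cudaMemcpy(&top, s.top, sizeof(int),
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    ok = ok && cudaMemcpy(host.data(), s.items, total * sizeof(int),
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    // Every value in [0, total) should appear exactly once, in some order.
    std::vector<char> seen(total, 0);
    for (int i = 0; ok && i < total; ++i) {
        int v = host[i];
        if (v < 0 || v >= total || seen[v]) ok = false;
        else seen[v] = 1;
    }

    cudaStreamDestroy(a);
    cudaStreamDestroy(b);
    cudaFree(s.items);
    cudaFree(s.top);
    return (ok && top == total) ? top : -1;
}
```

Calling `run_concurrent_pushes(1 << 16)` from `main` should return twice the per-kernel count when both kernels share the counter correctly. The order of elements in `items` is unspecified, since it depends on how the threads of the two kernels interleave.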
