Maximum number of concurrent kernels & virtual code architecture


So I found this wikipedia resource

Maximum number of resident grids per device (Concurrent Kernel Execution)

which, for each compute capability, lists a number of concurrent kernels — which I assume to be the maximum number of kernels that can execute concurrently.

Now I am getting a GTX 1060 delivered, which according to this NVIDIA CUDA resource has a compute capability of 6.1. From what I have learned about CUDA so far, you can specify the virtual compute architecture of your code at compile time with NVCC's -arch=compute_XX flag.
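For reference, this is roughly how the virtual and real architectures are specified with nvcc (the file name `kernels.cu` is just a placeholder):

```shell
# Target the compute_61 virtual architecture, generating SASS for sm_61 hardware
nvcc -arch=compute_61 -code=sm_61 kernels.cu -o kernels

# Common shorthand: -arch=sm_61 implies compute_61 as the virtual architecture
nvcc -arch=sm_61 kernels.cu -o kernels
```

The -arch flag only controls which instruction set and feature level the compiler targets; it is a compilation setting, not a hardware switch.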

So will my GPU be hardware constrained to 32 concurrent kernels or is it capable of 128 with the -arch=compute_60 flag?


2 Answers

BEST ANSWER

According to Table 13 in the NVIDIA CUDA Programming Guide, compute capability 6.1 devices have a maximum of 32 resident grids, i.e. 32 concurrent kernels.

Even if you use the -arch=compute_60 flag, you will be limited to the hardware limit of 32 concurrent kernels. Choosing particular architectures to compile for does not allow you to exceed the hardware limits of the machine.
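As a sketch of what this means in practice: the runtime reports only *whether* concurrent kernel execution is supported (via the `concurrentKernels` device property), not the resident-grid count, which comes from the programming guide's table. Launching more grids than the limit does not fail — the extra grids simply queue until resident slots free up. (The kernel and stream count below are illustrative, not from the question.)

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial placeholder kernel for illustration
__global__ void busy() {}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    // 1 if the device can run kernels concurrently at all; the numeric
    // resident-grid cap (32 on CC 6.1) is a separate hardware limit.
    printf("Concurrent kernels supported: %d\n", prop.concurrentKernels);

    const int N = 64;  // deliberately more launches than the 32-grid limit
    cudaStream_t streams[N];
    for (int i = 0; i < N; ++i) {
        cudaStreamCreate(&streams[i]);
        // Grids beyond the resident limit are queued, not rejected
        busy<<<1, 1, 0, streams[i]>>>();
    }
    cudaDeviceSynchronize();
    for (int i = 0; i < N; ++i) cudaStreamDestroy(streams[i]);
    return 0;
}
```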


Adding to the accepted answer: it is now Table 15 in the NVIDIA CUDA C Programming Guide as of April 2022, with the latest CUDA version being 12.1. Alternatively, you can simply search for "Technical Specifications per Compute Capability" in the docs.