I am using Cudafy to do some calculations on an NVIDIA GPU (Quadro K1100M, compute capability 3.0, if it matters).
My question is, when I use the following
cudaGpu.Launch(new dim3(44, 8, num), new dim3(8, 8)).MyKernel...
why are my z indices from the GThread instance always zero when I use this in my kernel?
int z = thread.blockIdx.z * thread.blockDim.z + thread.threadIdx.z;
Furthermore, if I instead do something like
cudaGpu.Launch(new dim3(44, 8, num), new dim3(8, 8, num)).MyKernel...
z does give different indices as it should, but num can't be very large because of the restriction on the number of threads per block. Any suggestions on how to work around this?
Edit
Another way to phrase it: can I use thread.blockIdx.z in my kernel (for anything useful) when the block size is only 2D?
On all currently supported hardware, CUDA allows the use of both three dimensional grids and three dimensional blocks. On compute capability 1.x devices (which are no longer supported), grids were restricted to two dimensions.
However, CUDAfy currently uses a deprecated runtime API function to launch kernels, and silently uses only gridDim.x and gridDim.y, not taking gridDim.z into account.
This can be seen in the function DoLaunch() in CudaGPU.cs.
So while you can specify a three-dimensional grid in CUDAfy, the third dimension is ignored during the kernel launch. Thanks to Florent for pointing this out!