How does CUDA interpret stride loops?

31 Views Asked by llllllll At 17 July 2023 at 04:14

I'm having trouble understanding how a stride loop actually works when it is used for general iteration over an array.

This is the example stride loop that I found, for a single block. The launch configuration and kernel are:

<<<1, 256>>>

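__global__
void add(int n, float *x, float *y)
{
  int index = threadIdx.x;   // this thread's position within the block
  int stride = blockDim.x;   // number of threads in the block
  // start at this thread's index and step by the block size
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}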

My guess is that it runs the i += stride step only once per block, and then runs the loop body once per thread. But nothing in the code actually specifies that: by normal C++ logic, the stride increment would run on every iteration of the loop.

Or does every single thread run the loop logic itself? It seems like that would hurt performance.
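
For what it's worth, the second reading matches the CUDA execution model: every thread runs the whole kernel function, including the loop condition and the i += stride step, with its own private copy of i. A minimal host-side sketch in plain C++ (not CUDA) may make that concrete. The names add_one_thread and simulate_single_block are hypothetical, and threadIdx_x / blockDim_x are ordinary parameters standing in for the built-in threadIdx.x / blockDim.x. The threads are run one after another here, whereas a GPU runs them concurrently.

```cpp
// Host-side stand-in for the kernel body, run by ONE simulated thread.
// threadIdx_x and blockDim_x are hypothetical parameters that replace
// CUDA's built-in threadIdx.x and blockDim.x.
// Returns how many loop iterations this one thread performed.
int add_one_thread(int threadIdx_x, int blockDim_x, int n,
                   const float *x, float *y)
{
    int iterations = 0;
    int index = threadIdx_x;
    int stride = blockDim_x;
    // Same loop as in the kernel: i is private to this thread, and the
    // comparison and increment execute on every iteration.
    for (int i = index; i < n; i += stride) {
        y[i] = x[i] + y[i];
        ++iterations;
    }
    return iterations;
}

// Emulates a <<<1, block_size>>> launch by calling the body once per thread.
// Returns the total number of loop iterations across all threads.
int simulate_single_block(int block_size, int n, const float *x, float *y)
{
    int total = 0;
    for (int t = 0; t < block_size; ++t)
        total += add_one_thread(t, block_size, n, x, y);
    return total;
}
```

With block_size = 256 and n = 1000, thread 0 handles elements 0, 256, 512 and 768, thread 1 handles 1, 257, 513 and 769, and so on, so each element is updated exactly once and no thread runs more than four iterations. The loop bookkeeping is a few integer operations per thread per iteration, which is usually small next to the memory accesses in the loop body.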
