Let's say I have an OpenGL compute shader with local_size=8*8*8. How do the invocations map to NVIDIA GPU warps? Would invocations with the same gl_LocalInvocationID.x be in the same warp? Or y? Or z? I don't mean every invocation, just how they are grouped in general.
I am asking for optimization purposes: at some point not all invocations have work to do, so I want the ones that do to end up in the same warp.
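According to https://www.khronos.org/opengl/wiki/Compute_Shader#Inputs, the local invocation index is computed as:
gl_LocalInvocationIndex = gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y + gl_LocalInvocationID.y * gl_WorkGroupSize.x + gl_LocalInvocationID.x
So gl_LocalInvocationID.x varies fastest, and it is fairly safe to assume that invocations which differ only in gl_LocalInvocationID.x (same y and z) are in the same warp. The OpenGL specification does not say how invocations are packed into warps, so treat this as an assumption about the driver rather than a guarantee.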
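
To see which invocations would share a warp under the linearization on the linked wiki page, here is a minimal Python sketch. It assumes 32-wide warps and that the driver fills warps in order of gl_LocalInvocationIndex; neither is guaranteed by OpenGL. The names local_invocation_index and warp_of are hypothetical helpers for illustration, not GL calls.

```python
# Assumptions (not guaranteed by the OpenGL specification):
#   - warps are 32 invocations wide
#   - warps are filled in order of gl_LocalInvocationIndex
WORK_GROUP_SIZE = (8, 8, 8)  # local_size from the question
WARP_SIZE = 32


def local_invocation_index(x, y, z, size=WORK_GROUP_SIZE):
    """Flatten a 3D local ID as gl_LocalInvocationIndex is defined:
    z * size.x * size.y + y * size.x + x
    """
    size_x, size_y, _ = size
    return z * size_x * size_y + y * size_x + x


def warp_of(x, y, z, size=WORK_GROUP_SIZE, warp_size=WARP_SIZE):
    """Hypothetical warp number of an invocation under the assumptions above."""
    return local_invocation_index(x, y, z, size) // warp_size


def invocations_in_warp(warp, size=WORK_GROUP_SIZE, warp_size=WARP_SIZE):
    """List every (x, y, z) local ID that lands in the given warp."""
    size_x, size_y, size_z = size
    return [
        (x, y, z)
        for z in range(size_z)
        for y in range(size_y)
        for x in range(size_x)
        if warp_of(x, y, z, size, warp_size) == warp
    ]
```

Under these assumptions the 8*8*8 work group splits into 16 warps, and each warp covers all 8 x values for 4 consecutive y values at a single z. For example, warp_of(3, 0, 0) and warp_of(3, 0, 5) differ, so sharing only the x value is not enough. If the idle invocations can be chosen by z (or by blocks of 4 in y), whole warps would go idle together.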