With recent NVIDIA micro-architectures, there's a new (?) taxonomy of warp stall reasons / warp scheduler states. One of these is:
Wait : Warp was stalled waiting on a fixed latency execution dependency.
As @GregSmith explains, fixed-latency instructions are: "Math, bitwise [and] register movement". But what are fixed-latency "execution dependencies"? Are these just "waiting for somebody else's fixed-latency instruction to conclude before we can issue it ourselves"?
Execution dependencies are dependencies that need to be resolved before the next instruction can be issued. These include register operands and predicates. The WAIT stall reason will be issued between instructions that have fixed latency. The compiler can choose to add additional waits between instructions to the same pipeline if the pipeline issue frequency is not 1 warp per cycle (e.g. FMA and ALU pipe can issue every other cycle on GV100 - GA100).
EXAMPLE 1 - No dependencies - compiler added waits
If the compiler did not add wait cycles then the stall reason would be math_throttle. This can also show up if the warp is ready to issue the instruction (all dependencies resolved) and another warp is issuing an instruction to the target pipeline.
EXAMPLE 2 - Wait stalls due to read after write dependency