Can a CUDA event be fired from device-side code?


Is there any way to fire an event from inside a CUDA kernel for benchmarking purposes, similar to the way CUDA events are recorded from host (CPU) code?

For example, suppose I would like to measure the time from kernel launch until the first thread begins its computation, and the time from the moment the last thread leaves the computation until control returns to the CPU.

Can I do that?


Best answer

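The device runtime API (used with dynamic parallelism) does have limited stream and event support, but event timing is not supported: events created in device code must use cudaEventCreateWithFlags() with the cudaEventDisableTiming flag, and cudaEventElapsedTime() is not available there.

So no, you can't do that.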

Second answer

An ugly workaround would be to write to some managed-memory location from the kernel and have a host-side thread poll it, firing the event when the value changes.
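A minimal sketch of that polling workaround might look like the following. It is only an illustration under several assumptions: the names Progress, work, measure and Timings are made up for this example, the "event" fired by the host is simply a std::chrono timestamp rather than a cudaEvent_t, and the device must report cudaDevAttrConcurrentManagedAccess (typically Pascal or newer on Linux), because otherwise the host may not touch managed memory while a kernel is running.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical progress flags, placed in managed memory so the host can poll them.
struct Progress {
    int started;   // set to 1 by the first thread that begins computing
    int finished;  // set to 1 by the last thread that leaves the computation
};

struct Timings { bool ok; double to_first_ms, compute_ms, to_return_ms; };

__global__ void work(volatile Progress* p, unsigned int* done, float* out)
{
    if (p->started == 0) p->started = 1;  // idempotent, so racing writers are harmless
    __threadfence_system();               // push the write out to the host

    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float x = tid;
    for (int i = 0; i < 100000; ++i) x = x * 0.999f + 1.0f;  // stand-in computation
    out[tid] = x;

    unsigned int total = gridDim.x * blockDim.x;
    if (atomicAdd(done, 1u) == total - 1) {  // every other thread has already left
        p->finished = 1;
        __threadfence_system();
    }
}

using Clock = std::chrono::steady_clock;

static double ms(Clock::time_point a, Clock::time_point b)
{
    return std::chrono::duration<double, std::milli>(b - a).count();
}

Timings measure()
{
    Timings t = {false, 0.0, 0.0, 0.0};
    int concurrent = 0;
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, 0);
    if (!concurrent) return t;  // host polling during the kernel would be illegal

    const int blocks = 64, threads = 256;
    Progress* p = nullptr; unsigned int* done = nullptr; float* out = nullptr;
    cudaMallocManaged(&p, sizeof(Progress));
    cudaMalloc(&done, sizeof(unsigned int));
    cudaMalloc(&out, blocks * threads * sizeof(float));
    p->started = 0;
    p->finished = 0;
    cudaMemset(done, 0, sizeof(unsigned int));

    // Spin until the flag flips; also stop if the stream is no longer busy,
    // so a failed launch cannot hang the host.
    auto wait_for = [](volatile int* flag) {
        while (*flag == 0 && cudaStreamQuery(0) == cudaErrorNotReady) {}
        return Clock::now();
    };

    volatile Progress* vp = p;
    auto t_launch = Clock::now();
    work<<<blocks, threads>>>(p, done, out);
    auto t_started  = wait_for(&vp->started);   // "event": first thread is running
    auto t_finished = wait_for(&vp->finished);  // "event": last thread has left
    cudaDeviceSynchronize();
    auto t_return = Clock::now();

    t.ok = (cudaGetLastError() == cudaSuccess);
    t.to_first_ms  = ms(t_launch, t_started);
    t.compute_ms   = ms(t_started, t_finished);
    t.to_return_ms = ms(t_finished, t_return);
    cudaFree(p); cudaFree(done); cudaFree(out);
    return t;
}

int main()
{
    Timings t = measure();
    if (!t.ok) { std::printf("Concurrent managed access unavailable or kernel failed.\n"); return 0; }
    std::printf("launch -> first thread: %.3f ms\n", t.to_first_ms);
    std::printf("first -> last thread:   %.3f ms\n", t.compute_ms);
    std::printf("last thread -> return:  %.3f ms\n", t.to_return_ms);
    return 0;
}
```

This should build with a plain nvcc invocation. The numbers are rough at best: each host poll of a managed page can trigger page migration, which both delays detection and perturbs the kernel being measured, so the workaround stays as ugly as described.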