I have once again inherited code that looks suspicious; it is basically this:
(void) nppiFilter...(...);
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess)
{
    std::cerr << cudaGetErrorString(err);
}
We ignore the NPP error but instead check for a CUDA error.
First, does NPP set the CUDA error flag on error? I'm pretty sure the answer is "not explicitly" so this code will miss NPP-only errors, but I want to check.
Second, is it necessary to check both errors or will this suffice:
NppStatus nppErr = nppiFilter...(...);
if (nppErr != NPP_NO_ERROR)
{
    std::cerr << "NPP error " << nppErr;
}
Or should I check both just in case? There is an NPP_CUDA_KERNEL_EXECUTION_ERROR, which suggests to me that checking cudaGetLastError() might be useful, but is it?
To the first question: no, NPP does not explicitly set the CUDA error state. The CUDA error state may be set by something NPP does under the hood, but NPP does not specifically set it.
It should be sufficient to just check the NPP status. However, if you wanted to do additional debug analysis, it might be useful to also check the CUDA error state. In fact I often run cuda-memcheck when I am looking for additional clues. The only normal value this would have is to provide "additional clues".
A safe assumption is that many CUDA libraries may have functions that launch work asynchronously. That is: underlying GPU activity may still be occurring even after the function has returned control to the CPU thread. In such cases, the expectation is that a well-designed library will catch errors due to asynchronous activity "later", when you do a subsequent library call or CUDA API call (perhaps to retrieve the calculated data from device to host).
In such cases, you wouldn't be able to rely on the function return value anyway. Therefore careful error checking throughout your program is the safest bet, and this includes both the library API level (e.g. NPP) and the CUDA API level. For production purposes, though, I would simply test at every opportunity, without necessarily inserting extra checks such as:
cudaGetLastError()
(unless it immediately follows a CUDA API call and that is your strategy**)
nor would I suggest arbitrarily inserting:
cudaDeviceSynchronize()
However, if you were designing a library, you might want to have some sort of explicit error checking of the above type at the entry to your functions.
This is obviously a matter of opinion to some degree. You may wish to carry error checking to an extreme level. It should not have much of an impact on your program, as long as you don't insert synchronizing calls to check for errors.
My comments above mostly pertain to how I would write production code. For learning purposes, or any time you are having trouble with a code you are writing, it's usually a good idea to be very rigorous about error checking, and indeed insert extra error checking to catch asynchronous errors for localization of the error to a particular function.
**You might wish to insert:
cudaGetLastError()
after every kernel call in your code. This will catch any kernel errors that are detectable at launch time, such as incorrect grid dimensions. This type of call should be relatively lightweight.
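One way to make such a post-launch check cheap to write, and to localize a failure to a particular call site, is a small checking macro. The sketch below is only an illustration under stated assumptions: `StubCudaError`, `stub_last_error` and `stub_get_last_error()` are stand-ins that imitate the "return the last error and reset it" behavior of `cudaGetLastError()` so the snippet compiles without the CUDA toolkit, and `throw_on_error` and `CHECK_CODE` are hypothetical names, not part of any CUDA or NPP API.

```cpp
#include <sstream>
#include <stdexcept>

// Stand-ins for cudaError_t and cudaGetLastError(); assumptions for illustration.
enum StubCudaError { stubSuccess = 0, stubInvalidConfiguration = 9 };
static StubCudaError stub_last_error = stubSuccess;

// Imitates cudaGetLastError(): returns the recorded error and resets it.
StubCudaError stub_get_last_error()
{
    const StubCudaError err = stub_last_error;
    stub_last_error = stubSuccess;
    return err;
}

// Hypothetical helper: throws if `code` differs from `ok`, tagging the
// message with the checked expression and its source location.
template <typename Code>
void throw_on_error(Code code, Code ok, const char* expr, const char* file, int line)
{
    if (code == ok)
    {
        return;
    }
    std::ostringstream msg;
    msg << file << ':' << line << ": " << expr << " returned " << static_cast<int>(code);
    throw std::runtime_error(msg.str());
}

// Evaluates `expr` exactly once and reports the call site on failure.
#define CHECK_CODE(expr, ok) throw_on_error((expr), (ok), #expr, __FILE__, __LINE__)
```

In real CUDA code the usage would be along the lines of `myKernel<<<grid, block>>>(...); CHECK_CODE(cudaGetLastError(), cudaSuccess);`. Throwing is a design choice made for this sketch; logging and continuing, as in the question's code, would work just as well.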