I'm writing some CUDA code, and I want it to behave differently depending on whether --use_fast_math was set. I also want to make that decision at compile time, not at run time.
It seems that NVCC does not add or change a preprocessor define when --use_fast_math is set. I checked this by comparing the output of:
nvcc -Xcompiler -dM -E -x cu -
with the output of
nvcc -Xcompiler -dM -E --use_fast_math -x cu -
and they're exactly the same, so that avenue seems to be blocked. If the user compiling the code invoked NVCC with --use_fast_math -DUSING_FAST_MATH, I could detect that; but suppose it's library code and we can't impose such a restriction on the user.
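For reference, the opt-in variant I'd like to avoid would look roughly like the sketch below. It assumes the build passes -DUSING_FAST_MATH by hand alongside --use_fast_math; the names using_fast_math and sine_tolerance and the tolerance values are purely illustrative.

```cpp
// True only when the build defines USING_FAST_MATH explicitly, e.g.
//   nvcc --use_fast_math -DUSING_FAST_MATH ...
// (assumption: nvcc does not define this macro by itself).
#ifdef USING_FAST_MATH
constexpr bool using_fast_math = true;
#else
constexpr bool using_fast_math = false;
#endif

// Hypothetical compile-time choice driven by the flag: a looser error
// tolerance when the approximate math functions are in use.
// The two values are placeholders, not measured error bounds.
constexpr float sine_tolerance = using_fast_math ? 1e-3f : 1e-6f;

// The same constant can gate static checks or template parameters.
static_assert(using_fast_math || sine_tolerance < 1e-5f,
              "precise builds are expected to use the tight tolerance");
```

This works, but only if every user remembers to pass both flags together, which is exactly the restriction I can't impose.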
Is there some other way for code undergoing compilation to notice that --use_fast_math is on?
Note: "Noticing" can mean using preprocessor #if or #ifdef directives, using SFINAE, using compiler-builtin values or constexpr functions - whatever is available at compile time.
The answer is almost certainly no. The fast math functions map to hardware instructions, and the substitution happens during code generation inside the CUDA device code compiler, not in the preprocessor. An example:
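A minimal sketch of how one might check this, assuming a file called kernel.cu (the file name, wave and wave_kernel are all hypothetical). The host/device guard is only there so the same source also builds with a plain C++ compiler; the commands to compare are in the trailing comment.

```cpp
#include <math.h>

// Execution-space qualifiers exist only under nvcc; expand to nothing elsewhere.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Single-precision sine: the kind of call --use_fast_math is expected to
// replace with an approximate hardware instruction in device code.
HOST_DEVICE inline float wave(float x) { return sinf(x); }

#ifdef __CUDACC__
// Hypothetical kernel, present only so the generated PTX contains the call.
__global__ void wave_kernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = wave(in[i]);
}
#endif

// Generate PTX both ways and diff the results (output names are placeholders):
//   nvcc -ptx kernel.cu -o precise.ptx
//   nvcc -ptx --use_fast_math kernel.cu -o fast.ptx
// fast.ptx should use sin.approx.ftz.f32 where precise.ptx carries the
// longer, more accurate sine sequence.
//
// To list the commands nvcc would run, including the arguments handed to
// the device compiler, without executing them:
//   nvcc --dryrun -ptx --use_fast_math kernel.cu
```

The source is identical in both builds; only the compiler invocation and the emitted PTX differ.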
You can see that there is no nvcc-driven preprocessor magic, just arguments passed to the device compiler, which result in PTX code with the requisite instructions in place. This means that, in theory, you might be able to mess around with LLVM hacks to intercept or identify the bytecode you are looking for, but I very much doubt that is what you had in mind.