I like to run my code with floating point exceptions enabled. I do this under Linux using:
feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );
So far so good.
The issue I am having is that the compiler (I use clang 8) sometimes decides to use SIMD instructions to do a scalar division. Fine, if that is faster even for a single scalar, why not.
But the result is that an unused lane in the SIMD register can contain a zero.
And when the SIMD division is executed, a floating point exception is thrown.
Does that mean that floating point exceptions cannot be used at all if you allow the compiler to use sse/avx extensions?
In my case, this line of C code:
float a0, min, a, d;
...
a0 = (min - a) / (d);
...is executed as:
divps %xmm2,%xmm3
Which then throws a:
Thread 1 "noisetuner" received signal SIGFPE, Arithmetic exception.
I think you have found a bug in clang or maybe in llvm.
I have reproduced it, and clang 10.0 emits the same code, i.e. it has that bug as well. Clearly, that vdivps instruction only has valid data in the first 2 lanes of the vectors, and in the upper 2 lanes it will compute 0.0 / 0.0, so you'll get a runtime exception if you unmask these exceptions in the mxcsr register like you're doing.
Microsoft, Intel and gcc don't emit divps for that code. If you can, switch to gcc and it should be good.
Update: Clang 10+ has an option controlling such optimizations, -ffp-exception-behavior=maytrap, take a look: https://godbolt.org/z/WG7bEE