I am pulling my hair out for the last couple of days with this "innocuous" piece of code (minimal reproducible example, part of a larger modular multiplication routine):
#include <iostream>
#include <limits>
using ubigint = unsigned long long int;
using bigint = long long int;
void modmul(bigint a, bigint b, ubigint p) {
ubigint ua = a < 0 ? -a : a;
ubigint ub = b < 0 ? -b : b;
ua %= p;
ub %= p;
std::cout << "ua: " << ua << '\n';
}
int main() {
bigint minbigint = std::numeric_limits<bigint>::min();
bigint maxbigint = std::numeric_limits<bigint>::max();
std::cout << "minbigint: " << minbigint << '\n';
std::cout << "maxbigint: " << maxbigint << '\n';
modmul(minbigint, maxbigint, 2314); // expect ua: 2036, got ua: 0
}
I am compiling on macOS 11.4 with clang 12.0 installed from Homebrew
clang version 12.0.0
Target: arm64-apple-darwin20.5.0
Thread model:posix
InstalledDir: /opt/homebrew/opt/llvm/bin
When compiling with clang -O1, the program spits out the expected result (in this case, 2036, I've checked with Wolfram Mathematica, Mod[9223372036854775808, 2314], and this is correct). However, when I compile with clang -O2 or clang -O3 (full optimization), somehow the variable ua is zeroed out (its value becomes 0). I am at a complete loss here, and have no idea why this happens. IMO, there's no UB, nor overflow, or anything dubious in this piece of code. I'd greatly appreciate any advice, or if you can reproduce the issue on your side.
PS: the code behaves as expected on any other platforms, including Windows/Linux/FreeBSD/Solaris), with any combination of compilers. I'm only getting this error on Apple M1 with clang 12 (didn't test with other compilers on M1).
UPDATE: As @harold pointed out in the comment section,
negqandsubqfrom 0 is exactly the same. So the my discussion related tonegqandsubqbelow is incorrect. Please disregard that part, sorry for not double checking before posting answer.About the original question, I recompile a slightly simpler version of the code godbolt and find out that the problematic compiler's optimization is in
mainnotmodmul. Inmain, clang see that all of its operands formodmulis constant so it decided to do the computation ofmodmulat compile time. When calculatingubigint ua = a < 0 ? -a : a;, clang find out that is signed integer overflow UB so it decided to return 0 and print out. That may seem to be a radical thing to do but it's legal because of UB. Moreover, since there is no mathematically correct answer due to the limitation of two's compliment system, return 0 is arguably as good (or as bad) as any other result.OLD ANSWER BELOWS
As some one pointed out in the comment section, the 2 lines below in your code is undefined behavior - signed integer overflow UB.
If you wonder what exactly clang does under the hood to produce 2 different results at 2 different optimization levels, consider a simple example as following.
When compile with -O0
Compile with -O3
At -O0, clang use normal
subqinstruction which perform binary subtraction of 0 and %rax and produce results with integer-wrap-around behavior.At -O3, clang can do better, it use
negqinstruction which only replace the operand with its two's complement (i.e flip all the bits and add 1). However, you can see that this optimization is only legal if signed integer overflow is undefined behavior (hence the compiler can just ignore overflow cases). If the standard required integer-wrap-around behavior, clang must fall back to the unoptimized version.