While I was studying C++, I found something weird...
I thought the code below would print a very large number (or at least not 1.1).
Instead, the result was 1.1.
Other compilers worked as expected.
But the clang compiler with aggressive optimization seems to ignore the while loop.
So my question is: what is the problem with my code? Or is this behaviour intended by clang?
I used the Apple clang compiler (v14.0.3).
#include <iostream>
#include <thread>
static bool should_terminate = false;
void infinite_loop() {
    long double i = 1.1;
    while(!should_terminate)
        i *= i;
    std::cout << i;
}
int main() {
    std::thread(infinite_loop).detach();
    std::cout << "main thread";
    for (int i = 0 ; i < 5; i++) {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        std::cout << ".";
    }
    should_terminate = true;
}
Assembly output from Compiler Explorer (clang v16.0.0, -O3):
This also seemed to skip the while loop.
_Z13infinite_loopv:                     # @_Z13infinite_loopv
    sub rsp, 24
    fld qword ptr [rip + .LCPI0_0]
    fstp tbyte ptr [rsp]
    mov rdi, qword ptr [rip + _ZSt4cout@GOTPCREL]
    call _ZNSo9_M_insertIeEERSoT_@PLT
    add rsp, 24
    ret
Your code has undefined behaviour:
should_terminate is not an atomic object, so writing to it in one thread and accessing it in another thread potentially concurrently (i.e. without any synchronization) is a data race, which is always undefined behaviour.
Practically speaking, this UB rule permits the compiler to make exactly the optimization you see here.
The compiler can assume that should_terminate will never change in the loop, because it cannot possibly be written to from another thread since that would be a data race. So when reaching the loop it is either false and stays false, so that the loop never terminates, or it is true, in which case the loop body doesn't execute at all.
Then, because an infinite loop that doesn't perform any atomic/IO/volatile/synchronization operation would also have UB, the compiler can further deduce that should_terminate must be (always) true when the loop is reached. Consequently the loop body can never be executed and removing the loop is a permitted optimization.
So Clang is behaving correctly here and your expectations are wrong.
should_terminate must be a std::atomic<bool> (or std::atomic_flag) so that writing to it unsynchronized with other accesses to it is not a data race.
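For illustration, here is a minimal sketch of what the fix could look like. It is not the only way to write it. The helper name run_until_stopped and its duration parameter are my own inventions for the example, and I join the thread instead of detaching it so that the result can be read safely after the worker has finished.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper (not from the question): lets the worker loop spin for
// roughly `duration`, then asks it to stop and returns the value it computed.
long double run_until_stopped(std::chrono::milliseconds duration) {
    std::atomic<bool> should_terminate{false};
    long double result = 1.1L;

    std::thread worker([&] {
        long double i = 1.1L;
        // Atomic load: a concurrent store from another thread is not a data
        // race, so the compiler may not assume the flag never changes.
        while (!should_terminate.load())
            i *= i;
        result = i;  // made visible to the caller by join() below
    });

    std::this_thread::sleep_for(duration);
    should_terminate.store(true);  // atomic store, observed by the worker's load
    worker.join();                 // thread completion synchronizes with join()
    return result;
}
```

From main you could call it as std::cout << run_until_stopped(std::chrono::seconds(5));. Since the loop now really runs, expect a huge value (on typical platforms the repeated squaring overflows to inf) rather than 1.1.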