I am reading the following article about sequence points in C: https://www.geeksforgeeks.org/sequence-points-in-c-set-1/
In it, there are several examples of undefined behavior, such as expressions that call two functions that modify a single global variable, or a single expression that increments the same variable more than once.
In theory, I understand the concept. However, no matter how many times I try to run the examples, the behavior is the same, and never "surprising."
For the purpose of getting a hands-on appreciation of undefined behavior, what's the easiest way to get the examples to be "surprising"?
(If it matters, I am using MINGW64.)
This is about the best I can come up with at short notice:
Source code:
Resulting assembly using gcc 8.3 -O3
See it in action: https://godbolt.org/z/E0XDYt
In particular, it relies on casting the address of an `int` to a `short*` and then accessing the `int` through that pointer, which breaks the strict aliasing rule and is therefore undefined behavior.

Start with the assembly of `undefined()`. The compiler assumes that since `a` and `b` point to different types, they cannot overlap, so it optimizes the `return *a;` into `mov eax,1`, even though the function would actually return zero if it fetched the value from memory. It does fetch from memory with optimization off, so this is one of those really insidious problems that only manifests in an optimized release build, and not when you try to debug it with a non-optimized debug build.

However, note how the code in `main()` does try to get it right: it inlines and then optimizes away the call to `undefined()`, and instead assumes `0` in `z` when it does the `xor eax,eax` just above the call to `printf`. So it's ignoring what it just figured out as the return value a few lines above, and is instead using a different value.

All in all, a very badly broken program. Exactly what you risk with undefined behavior.