In P2641r4: Checking if a union alternative is active, the author provides an implementation of an optional<bool>
as a motivating example and claims that this is well-formed.
    struct OptBool {
        union { bool b; char c; };
        OptBool() : c(2) { }
        OptBool(bool b) : b(b) { }
        auto has_value() const -> bool { return c != 2; }
        auto operator*() -> bool& { return b; }
    };
However, I am not convinced. Namely, has_value() doesn't look safe: if the bool is the active union member, then c != 2 accesses an inactive member and performs union type punning. To my knowledge, this is not allowed in C++.
The author explains that it can't be done because an inactive union member is being read, and provides the following implementation:
    constexpr auto has_value() const -> bool {
        if consteval {
            return std::is_within_lifetime(&b);
        } else {
            return c != 2;
        }
    }
What did the author mean by this? Does this mean that you cannot read the inactive union member in a constant expression but it would otherwise be permitted? Is this code totally well-formed or does it rely on compiler extensions that would permit union type punning at run-time?
Note: This is a sister question to "Is the author's implementation of an optional<bool> well-defined in P2641?", which discusses the other implementation.
I assume that operator* has a precondition, as usual, that the OptBool(bool b) overload has been used. Using operator* when the optional is empty is clearly UB, but it is also not the intended use.
When b is the active member, accessing c has undefined behavior because c must be out of lifetime.
The intent here is to look at the object representation, which can be achieved by adding a seemingly unnecessary cast:
The inner cast will yield a pointer to the OptBool object, because OptBool is standard-layout and pointer-interconvertible with the c subobject.
The outer cast will then produce a pointer to the OptBool object with expression type unsigned char*. Accessing through it is not an aliasing violation. However, it is currently not specified what value this access should read. The intention is for it to read the first byte of the object representation of the OptBool object (and also of the bool or char object), but that isn't specified to happen at the moment. P1839 is trying to fix that. In practice, this is the behavior everyone assumes, even though the standard doesn't say so at the moment, which is a defect.
In any case, the implementation of course assumes a specific implementation of bool, specifically its size, alignment, and object/value representations.
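Those assumptions can be written down as checks. The following is only a sketch: first_byte_of is a hypothetical helper introduced here for illustration, and the sentinel value 2 is taken from the OptBool constructor in the question.

```cpp
#include <cstring>

// Compile-time part: bool must have the same size and alignment as char
// for the union trick to inspect exactly one byte.
static_assert(sizeof(bool) == sizeof(char), "bool is expected to be one byte");
static_assert(alignof(bool) == alignof(char), "bool is expected to be byte-aligned");

// Hypothetical helper: copies out the single byte of a bool's object
// representation, which is well-defined via std::memcpy.
inline unsigned char first_byte_of(bool value) {
    unsigned char byte;
    std::memcpy(&byte, &value, sizeof byte);
    return byte;
}
```

If first_byte_of(true) or first_byte_of(false) were ever equal to 2 on some implementation, the sentinel would collide with a valid bool and has_value() would misreport.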