What are the breaking changes caused by rewritten comparison operators?


There are some new rules about rewritten comparison operators in C++20, and I'm trying to understand how they work. I've run into the following program:

struct B {};

struct A
{
    bool operator==(B const&);  // #1
};

bool operator==(B const&, A const&);  // #2

int main()
{
  B{} == A{};  // C++17: calls #2
               // C++20: calls #1
}

which actually breaks existing code. I'm a little surprised by this; #2 actually still looks better to me :p

So how do these new rules change the meaning of existing code?


There are 2 best solutions below

6
On

That particular aspect is a simple form of rewriting: reversing the operands. The primary operators == and <=> can be reversed; the secondary operators !=, <, >, <=, and >= can be rewritten in terms of the primaries.

The reversing aspect can be illustrated with a relatively simple example.

If you don't have a specific B::operator==(A) to handle b == a, you can use the reverse to do it instead: A::operator==(B). This makes sense because equality is a symmetric relationship: (a == b) => (b == a).

Rewriting for secondary operators, on the other hand, involves using different operators. Consider a > b. If the compiler cannot find a function to do that directly, such as A::operator>(B), it will go looking for things like A::operator<=>(B) and then simply calculate the result from that, as (a <=> b) > 0.

That's a simplistic view of the process but it's one that most of my students seem to understand. If you want more details, it's covered in the [over.match.oper] section of C++20, part of overload resolution. The wording below comes from an earlier draft (@ is a placeholder for the operator); the final standard rewrites != in terms of == rather than <=>:

For the relational and equality operators, the rewritten candidates include all member, non-member, and built-in candidates for the operator <=> for which the rewritten expression (x <=> y) @ 0 is well-formed using that operator<=>.

For the relational, equality, and three-way comparison operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each member, non-member, and built-in candidate for the operator <=> for which the rewritten expression 0 @ (y <=> x) is well-formed using that operator<=>.


Hence gone are the days of having to provide a real operator== and operator<, then boiler-plating:

operator!=      as      !  operator==
operator>       as      ! (operator== || operator<)
operator<=      as         operator== || operator<
operator>=      as      !  operator<

Don't complain if I've gotten one or more of those wrong, that just illustrates my point on how much better C++20 is, since you now only have to provide a minimal set (most likely just operator<=>, plus operator== unless the operator<=> is defaulted, plus whatever else you want for efficiency) and let the compiler look after it :-)


The question as to why one is being selected over the other can be discerned with this code:

#include <iostream>

struct B {};
struct A {
    bool operator==(B const&) { std::cout << "1\n"; return true; }
};
bool operator==(B const&, A const&) { std::cout << "2\n"; return true; }

int main() {
  auto b = B{}; auto a = A{};

           b ==          a;  // outputs: 1
  (const B)b ==          a;  //          1
           b == (const A)a;  //          2
  (const B)b == (const A)a;  //          2
}

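The output indicates that it's the const-ness of a that decides which is the better candidate. When a is non-const, the non-const member function binds it without adding const and wins; when a is const, that member function cannot be called on it at all, so only #2 remains.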

As an aside, you may want to have a look at this article, which offers a more in-depth look.

12
On

From a non-language-lawyer sense, it works like this. C++20 assumes that operator== computes whether the two objects are equal. Equality is symmetric: if A == B, then B == A. As such, if there are two operator== functions that could be called by C++20's argument reversal rules, then your code should behave identically either way.

Basically, what C++20 is saying is that if it matters which one gets called, you're defining "equality" incorrectly.


So let's get into the details. And by "the details", I mean the most horrifying chapter of the standard: function overload resolution.

[over.match.oper]/3 defines the mechanism by which the candidate function set for an operator overload is built. C++20 adds to this by introducing "rewritten candidates": a set of candidate functions discovered by rewriting the expression in a way that C++20 deems to be logically equivalent. This only applies to the relational, in/equality, and three-way comparison operators.

The set is built in accord with the following:

  • For the relational ([expr.rel]) operators, the rewritten candidates include all non-rewritten candidates for the expression x <=> y.
  • For the relational ([expr.rel]) and three-way comparison ([expr.spaceship]) operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each non-rewritten candidate for the expression y <=> x.
  • For the != operator ([expr.eq]), the rewritten candidates include all non-rewritten candidates for the expression x == y.
  • For the equality operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each non-rewritten candidate for the expression y == x.
  • For all other operators, the rewritten candidate set is empty.

Note the particular concept of a "synthesized candidate". This is standard-speak for "reversing the arguments".

The rest of the section details what it means if one of the rewritten candidates gets chosen (aka: how to synthesize the call). To find which candidate gets chosen, we must delve into the most horrifying part of the most horrifying chapter of the C++ standard:

Best viable function matching.

What matters here is this statement:

a viable function F1 is defined to be a better function than another viable function F2 if for all arguments i, ICSi(F1) is not a worse conversion sequence than ICSi(F2), and then

And that matters... because of this. Literally.

By the rules of [over.ics.rank], when two reference bindings differ only in cv-qualification, binding to the less cv-qualified type is the better conversion.

A{} is a prvalue, and... it's not const. Neither is the this parameter to the member function. So the member function binds it without adding a qualifier, which is a better conversion sequence than one that goes to the const A& of the non-member function.

Yes, there is a rule further down that explicitly ranks rewritten candidates below non-rewritten ones. But it is only a tie-breaker and doesn't matter here, because the rewritten call is a better match on function arguments alone.

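If you use explicit variables and declare one like this A const a{};, the non-const member function can no longer be called on it, so the reversed candidate is not viable and #2 is chosen. As seen here. Similarly, if you make the member function const, both candidates match the arguments equally well, [over.match.best]/2.8 gets involved and de-prioritizes the rewritten version, and you also get consistent behavior.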
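A sketch of the const-member fix, written so that it should compile as both C++17 and C++20. The last_called variable and the which_overload helper are bookkeeping invented for this example, not part of the original program.

```cpp
#include <cassert>

int last_called = 0;  // hypothetical bookkeeping: 1 = member, 2 = non-member

struct B {};

struct A {
    // const-qualified, unlike #1 in the question
    bool operator==(B const&) const { last_called = 1; return true; }
};

bool operator==(B const&, A const&) { last_called = 2; return true; }

// Reports which overload B{} == A{} selects.
int which_overload() {
    last_called = 0;
    (void)(B{} == A{});
    return last_called;
}

void check_consistent() {
    // C++17: only the non-member matches (B, A).
    // C++20: both candidates bind A{} to a const reference, so the tie-breaker
    // prefers the non-rewritten non-member over the reversed member.
    assert(which_overload() == 2);
}
```

Dropping the const from the member function brings back the behavior from the question, where C++20 picks the member instead.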