Why doesn’t Python provide default implementations of __le__ and __ge__?


The following mathematical relationships between comparison relations (=, ≠, <, >, ≤ and ≥) are always valid and therefore implemented by default in Python (except for the 2 union relationships, an omission which seems arbitrary and is the reason for this post):

  • 2 complementary relationships: "= and ≠ are each other’s complement";
  • 6 converse relationships*: "= is the converse of itself", "≠ is the converse of itself", "< and > are each other’s converse", and "≤ and ≥ are each other’s converse";
  • 2 union relationships: "≤ is the union of < and =" and "≥ is the union of > and =".

The following relationships between comparison relations are only valid for total orders and therefore not implemented by default in Python (but users can conveniently implement them when they are valid with the class decorator functools.total_ordering provided by the Python standard library):

  • 4 complementary relationships: "< and ≥ are each other’s complement" and "> and ≤ are each other’s complement".
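As a minimal sketch of that opt-in, the hypothetical Version class below (the name and its number attribute are made up for illustration) writes only __eq__ and __lt__ by hand and lets functools.total_ordering fill in the rest:

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical totally ordered type; only __eq__ and __lt__ are hand-written."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


# __le__, __gt__ and __ge__ are supplied by the decorator.
le_result = Version(1) <= Version(2)
gt_result = Version(2) > Version(1)
ge_result = Version(1) >= Version(2)
print(le_result, gt_result, ge_result)
```

The decorator is only appropriate when the type really is totally ordered, which is exactly the assumption the complementary relationships need.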

Why is Python only lacking the 2 union relationships above ("≤ is the union of < and =" and "≥ is the union of > and =")?

It should provide a default implementation of __le__ in terms of __lt__ and __eq__, and a default implementation of __ge__ in terms of __gt__ and __eq__, like these (but probably in C for performance, like __ne__):

def __le__(self, other):
    result_1 = self.__lt__(other)
    result_2 = self.__eq__(other)
    if result_1 is not NotImplemented and result_2 is not NotImplemented:
        return result_1 or result_2
    return NotImplemented

def __ge__(self, other):
    result_1 = self.__gt__(other)
    result_2 = self.__eq__(other)
    if result_1 is not NotImplemented and result_2 is not NotImplemented:
        return result_1 or result_2
    return NotImplemented

The 2 union relationships are always valid so these default implementations would free users from having to provide them all the time (like here).

Here is the paragraph of the Python documentation which states explicitly that the 2 union relationships are not currently implemented by default (bold emphasis mine):

By default, __ne__() delegates to __eq__() and inverts the result unless it is NotImplemented. There are no other implied relationships among the comparison operators, for example, the truth of (x<y or x==y) does not imply x<=y.
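The documented behaviour can be checked with a small sketch. The Point class here is hypothetical and defines only __eq__ and __lt__; != is derived from ==, but <= is not derived from < and ==:

```python
class Point:
    """Hypothetical class that defines only __eq__ and __lt__."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x

    def __lt__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x < other.x


a, b = Point(1), Point(2)

ne_derived = a != b   # works: the default __ne__ inverts __eq__
lt_defined = a < b    # works: __lt__ is defined explicitly

try:
    a <= b            # no __le__ is derived from __lt__ and __eq__
    le_outcome = "supported"
except TypeError:
    le_outcome = "TypeError"

print(ne_derived, lt_defined, le_outcome)
```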


* Converse relationships are implemented in Python through the NotImplemented protocol.
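A short sketch of that protocol, using a hypothetical Height class that defines only __lt__: the > operator still works, because Python falls back to the reflected __lt__ of the right-hand operand when __gt__ returns NotImplemented.

```python
class Height:
    """Hypothetical class that defines only __lt__."""

    def __init__(self, cm):
        self.cm = cm

    def __lt__(self, other):
        if not isinstance(other, Height):
            return NotImplemented
        return self.cm < other.cm


short, tall = Height(150), Height(190)

# The inherited object.__gt__ declines the comparison...
direct_hook = tall.__gt__(short)

# ...so the operator falls back to the converse hook, short.__lt__(tall).
gt_via_reflection = tall > short

print(direct_hook, gt_via_reflection)
```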


13
deceze On

Only the original author knows exactly why this decision was made, but the reasons can be inferred from these hints in the manual:

To automatically generate ordering operations from a single root operation, see functools.total_ordering().

While this decorator makes it easy to create well behaved totally ordered types, it does come at the cost of slower execution and more complex stack traces for the derived comparison methods. If performance benchmarking indicates this is a bottleneck for a given application, implementing all six rich comparison methods instead is likely to provide an easy speed boost.

Pair this with Python's mantra that explicit is better than implicit, and the following reasoning should be satisfactory:

Deriving __ne__ from __eq__ is virtually free: it's just the operation not o.__eq__(other), i.e. inverting a boolean.

However, deriving __le__ from the union of __lt__ and __eq__ means that both methods may need to be called, which could be a large performance hit if the comparison is complex enough, especially compared to an optimised single __le__ implementation. Python lets you opt into this convenience-over-performance trade-off explicitly by using the total_ordering decorator, but it won't implicitly inflict it on you.
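To make the cost concrete, here is a sketch that attaches the __le__ proposed in the question to a hypothetical Tracked class, which records every hook call. A single <= triggers both underlying comparisons:

```python
class Tracked:
    """Hypothetical class that logs which comparison hooks get called."""

    calls = []

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Tracked.calls.append("__lt__")
        return self.value < other.value

    def __eq__(self, other):
        Tracked.calls.append("__eq__")
        return self.value == other.value

    def __le__(self, other):
        # The default implementation proposed in the question.
        result_1 = self.__lt__(other)
        result_2 = self.__eq__(other)
        if result_1 is not NotImplemented and result_2 is not NotImplemented:
            return result_1 or result_2
        return NotImplemented


result = Tracked(1) <= Tracked(2)
print(result, Tracked.calls)
```

A hand-written __le__ for this class would need only one comparison of the value attributes.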

You could also argue for explicit errors when you attempt unimplemented comparisons, rather than implicitly derived comparisons which you didn't implement and which may create subtle bugs, depending on what you meant to do with your custom classes. Python won't make any guesses for you here and instead leaves it up to you to either explicitly implement the comparisons you want, or to explicitly opt into the derived comparisons.

23
MisterMiyagi On

TLDR: The comparison operators are not required to return bool. This means that results may not strictly adhere to "a <= b is a < b or a == b" or similar relations. Most importantly, the boolean or may fail to preserve their semantics.

Automatically generating special methods may silently lead to wrong behaviour, similar to how automatic __bool__ is not generally applicable. (This example also treats <= etc. as more than bool.)


An example is expressing time points via comparison operators. For example, the usim simulation framework (disclaimer: I maintain this package) defines time points that can be checked and waited for. We can use comparisons to describe "at or after" some point in time:

  • time > 2000 after 2000.
  • time == 2000 at 2000.
  • time >= 2000 at or after 2000.

(The same applies to < and ==, but the restriction is more difficult to explain.)

Notably, there are two features to each expression: Whether it is satisfied right now (bool(time >= 2000)) and when it will be satisfied (await (time >= 2000)). The first can obviously be evaluated for every case. However, the second cannot.

Waiting for == and >= can be done by waiting for/sleeping until an exact point in time. However, waiting for > requires waiting for a point in time plus some infinitely small delay. The latter cannot be accurately expressed, since there is no generic infinitely small but non-zero number for contemporary number types.

As such, the result of == and >= is fundamentally of a different kind than >. Deriving >= as "> or ==" would be wrong. Thus, usim.time defines == and >= but not > to avoid errors. Automatically defining comparison operators would prevent this, or wrongly define the operators.
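The following is only an illustrative sketch and not usim's actual API: the Condition class and its attributes are hypothetical. It shows why a boolean or cannot combine two comparison results that are more than plain bools, since or merely returns one of its operands unchanged:

```python
class Condition:
    """Hypothetical rich comparison result: has a truth value *and* a meaning."""

    def __init__(self, description, satisfied):
        self.description = description
        self.satisfied = satisfied

    def __bool__(self):
        # Whether the condition holds right now.
        return self.satisfied


after = Condition("after 2000", satisfied=False)
at = Condition("at 2000", satisfied=True)

# `or` does not build an "at or after 2000" condition;
# it just hands back the first truthy operand (or the last operand).
combined = after or at

print(combined.description)
```

The combined object is still the "at 2000" condition, so anything beyond its current truth value (such as what to wait for) would be wrong for "at or after".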

20
Martijn Pieters On

Your question is based on a number of incorrect assumptions. You start your question with:

The following mathematical relationships between comparison relations (=, ≠, <, >, ≤ and ≥) are always valid and therefore implemented by default in Python (except for the 2 union relationships, which seems arbitrary and is the reason of this post).

There is no default implementation for < or > either. There is no default __lt__ or __gt__ implementation, so there can't be a default implementation for __le__ or __ge__ either.*

This is covered in the expressions reference documentation under Value Comparisons:

The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).

A default order comparison (<, >, <=, and >=) is not provided; an attempt raises TypeError. A motivation for this default behavior is the lack of a similar invariant as for equality.
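Both defaults are easy to observe on plain object instances, as in this small sketch:

```python
a = object()
b = object()

same_identity_equal = a == a      # identity implies equality
different_identity_equal = a == b  # distinct objects compare unequal
not_equal = a != b                 # != is the inverse of ==

try:
    a < b                          # no default ordering exists
    ordering_outcome = "supported"
except TypeError:
    ordering_outcome = "TypeError"

print(same_identity_equal, different_identity_equal, not_equal, ordering_outcome)
```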

The behavior of the default equality comparison, that instances with different identities are always unequal, may be in contrast to what types will need that have a sensible definition of object value and value-based equality. Such types will need to customize their comparison behavior, and in fact, a number of built-in types have done that.

The motivation to not provide default behavior is included in the documentation. Note that these comparisons are between the value of each object, which is an abstract concept. From the same documentation section, at the start:

The value of an object is a rather abstract notion in Python: For example, there is no canonical access method for an object’s value. Also, there is no requirement that the value of an object should be constructed in a particular way, e.g. comprised of all its data attributes. Comparison operators implement a particular notion of what the value of an object is. One can think of them as defining the value of an object indirectly, by means of their comparison implementation.

So comparisons are between one notion of the value of an object. But what that notion is exactly, is up to the developer to implement. Python will not assume anything about the value of an object. That includes assuming that there is any ordering inherent in the object values.

The only reason that == is implemented at all, is because when x is y is true, then x and y are the exact same object, and so the value of x and the value of y are the exact same thing and therefore must be equal. Python relies on equality tests in a lot of different places (like testing for containment against a list), so not having a default notion of equality would make a lot of things in Python a lot harder. != is the direct inverse of ==; if == is true when the values of the operands are the same, then != is only true when == is false.

You can't say the same for <, <=, >= and > without help from the developer, because they require much more information about how the abstract notion of the value of an object is to be compared to other similar values. Here, x <= y is not simply the inverse of x > y, because there isn't any information about the values of x or y, or about how they relate to ==, !=, < or any other value comparison.

You also state:

The 2 union relationships are always valid so these default implementations would free users from having to provide them all the time

The 2 union relationships are not always valid. It may be that the > and < operator implementation is not making comparisons and an implementation is free to return results other than True or False. From the documentation on the __lt__ etc. methods:

However, these methods can return any value, so if the comparison operator is used in a Boolean context (e.g., in the condition of an if statement), Python will call bool() on the value to determine if the result is true or false.

If an implementation decides to give > and < between two objects a different meaning altogether, the developer should not be left with incorrect default implementations of __le__ and __ge__ that assume that __lt__ and __gt__ return booleans and therefore call bool() on their return values. This may not be desirable; the developer should be free to overload the meaning of __bool__ too!

The canonical example for this is the Numpy library, which was the primary driver for implementing these rich comparison hooks. Numpy arrays do not return booleans for comparison operations. Instead, they broadcast the operation between all contained values in the two arrays to produce a new array, so array_a < array_b produces a new array of boolean values for each of the paired values from array_a and array_b. An array is not a boolean value, so your default implementation would break: bool(array) raises an exception for an array holding more than one element. While in Numpy's case they also implemented __le__ and __ge__ to broadcast the comparisons, Python can't require all types to provide implementations for these hooks just to disable them when not desired.
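To illustrate without depending on Numpy itself, here is a sketch with a hypothetical Vector class that imitates the element-wise behaviour described above (it is not Numpy's implementation). Attaching the __le__ proposed in the question makes <= fail as soon as or asks for a truth value:

```python
class Vector:
    """Hypothetical stand-in for an array type with element-wise comparisons."""

    def __init__(self, *items):
        self.items = list(items)

    def __lt__(self, other):
        return Vector(*(a < b for a, b in zip(self.items, other.items)))

    def __eq__(self, other):
        return Vector(*(a == b for a, b in zip(self.items, other.items)))

    def __bool__(self):
        # Mirrors the idea that a multi-element result has no single truth value.
        raise ValueError("the truth value of a Vector is ambiguous")

    def __le__(self, other):
        # The default implementation proposed in the question.
        result_1 = self.__lt__(other)
        result_2 = self.__eq__(other)
        if result_1 is not NotImplemented and result_2 is not NotImplemented:
            return result_1 or result_2  # `or` calls bool(result_1)
        return NotImplemented


elementwise = (Vector(1, 5) < Vector(2, 3)).items

try:
    Vector(1, 5) <= Vector(2, 3)
    le_outcome = "no error"
except ValueError:
    le_outcome = "ValueError"

print(elementwise, le_outcome)
```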

You appear to be conflating mathematical relationships with Python's use of some of those relationships. The mathematical relationships apply to certain classes of values (numbers, mostly). They do not apply to other domains; it is up to the implementation of each type to decide whether to honour those mathematical relationships.

Finally, the complementary relationship between < and >=, and between > and <=, only applies to total order binary relations, as stated in the complement section of the Wikipedia article on binary relations:

For example, = and ≠ are each other's complement, as are ⊆ and ⊈, ⊇ and ⊉, and ∈ and ∉, and, for total orders, also < and ≥, and > and ≤.

Python can't make the assumption that all type implementations wish to create total order relations between their values.

The standard library set type, for example, does not define a total order between sets: set_a < set_b is true only when set_a is a proper subset of set_b. This means there can be a set_c that is a subset of set_b without being a subset or a superset of set_a. Set comparisons also lack connexity: set_a <= set_b and set_b <= set_a can both be false at the same time, when each set has elements that are not present in the other.
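A quick check with two overlapping sets (the particular values are arbitrary) shows that every ordering comparison between them is false, so "not <" cannot be read as ">=":

```python
set_a = {1, 2}
set_b = {2, 3}

a_lt_b = set_a < set_b    # not a proper subset
a_ge_b = set_a >= set_b   # not a superset either
a_le_b = set_a <= set_b
b_le_a = set_b <= set_a

# For comparison, a genuine proper subset does order as expected.
proper_subset = {1} < {1, 2}

print(a_lt_b, a_ge_b, a_le_b, b_le_a, proper_subset)
```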


* Note: the object.__lt__, object.__gt__, object.__le__ and object.__ge__ methods do have a default implementation, but only to return NotImplemented unconditionally. They exist only to simplify the implementation of the <, >, <= and >= operators, which for a [operator] b need to test a.__[hook]__(b) first, then try b.__[converse hook]__(a) if the first returns NotImplemented. If there was no default implementation, then the code would also need to check if the hook methods exist first. Using < or > or <= or >= on objects that do not provide their own implementations results in a TypeError, nonetheless. Do not regard these as default implementations; they do not make any value comparisons.
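The footnote can be verified directly by calling the hooks on plain object instances, as in this sketch:

```python
a, b = object(), object()

# Each inherited ordering hook declines the comparison unconditionally.
hook_results = [
    object.__lt__(a, b),
    object.__gt__(a, b),
    object.__le__(a, b),
    object.__ge__(a, b),
]
all_not_implemented = all(r is NotImplemented for r in hook_results)

# The operator tries both operands' hooks, gets NotImplemented twice, and raises.
try:
    a >= b
    operator_outcome = "supported"
except TypeError:
    operator_outcome = "TypeError"

print(all_not_implemented, operator_outcome)
```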