Equality and relative ordering of objects

6 years ago by Rudi Theunissen

The Comparable RFC (http://wiki.php.net/rfc/comparable) was written in 2010
and revised in 2015 by Adam Harvey, but was not conclusive. I would like to
take on some shared responsibility to push this forward and re-open the
discussion.

Why is this useful, and why should it be added to PHP?

The default comparison of objects can be expensive.

Objects are compared and tested for equality by recursively comparing their
properties. If an object has many properties, or properties that are objects
or have large arrays, this comparison may dive very deep.

It may not be necessary to do all this work to determine whether two objects
are equal. For example, if you have a data model that represents a database
entity, you're only concerned about the data attributes, and there's no need
to compare the database connection or other internal metadata.

Objects that have a defining value (like a number type) can simply compare
that value to the other object's (if it is also a number type) and skip
checking other properties. This has the potential to save a lot of
unnecessary comparisons.
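
A minimal userland sketch of this idea, assuming a hypothetical Money class whose equals() method stands in for the proposed __equals. The class, its properties and the method name are illustrative only:

```php
<?php
// Hypothetical value object: only $amount and $currency define its value.
final class Money
{
    public $amount;
    public $currency;
    public $audit = []; // bookkeeping that should not affect equality

    public function __construct(int $amount, string $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    // Userland stand-in for the proposed __equals().
    public function equals($other): bool
    {
        return $other instanceof self
            && $this->amount === $other->amount
            && $this->currency === $other->currency;
    }
}

$a = new Money(500, 'EUR');
$b = new Money(500, 'EUR');
$b->audit[] = 'loaded from cache';

var_dump($a == $b);       // bool(false): the default comparison also looks at $audit
var_dump($a->equals($b)); // bool(true): only the defining value is compared
```

The default == has to walk every property, including $audit, while equals() stops after the two that matter.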

The default comparison of objects can be risky.

There could be cases where two objects have the same public properties but
one of them (perhaps a subclass) has a private property. These two objects
could be considered equal to the outside world, but == would return
false. When a developer added this private property, it might not have
been obvious that they were affecting equality. This makes == checks on
objects very fragile, when equality could instead be explicitly defined and
controlled.
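
To illustrate with current PHP behaviour, here is a small sketch using a hypothetical Tag class with a private, lazily filled cache. It uses two instances of the same class rather than a subclass, since == already treats instances of different classes as unequal:

```php
<?php
// Hypothetical class: $cachedLabel is private, lazily populated state.
class Tag
{
    public $name;
    private $cachedLabel = null;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function label(): string
    {
        if ($this->cachedLabel === null) {
            $this->cachedLabel = strtoupper($this->name);
        }
        return $this->cachedLabel;
    }
}

$a = new Tag('php');
$b = new Tag('php');

$before = ($a == $b); // same class, same property values
$a->label();          // fills the private cache on $a only
$after = ($a == $b);  // the private property now differs

var_dump($before); // bool(true)
var_dump($after);  // bool(false)
```

Nothing visible from the outside changed between the two checks, yet the result of == did.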

There is no way to specify strict comparison.

With PHP becoming more type-safe since PHP 7 with strict mode and scalar
type-hints, the current behaviour feels out of date. For example:

$a = new stdClass();
$b = new stdClass();

$a->prop = 1;
$b->prop = true;

$a == $b; // true
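
For contrast, a sketch of what a strict check could look like in userland today. The helper strictly_equal() is hypothetical, and it only sees properties accessible from the calling scope (public ones here):

```php
<?php
$a = new stdClass();
$b = new stdClass();
$a->prop = 1;
$b->prop = true;

// Hypothetical helper: same class and identical (===) property arrays.
function strictly_equal(object $x, object $y): bool
{
    return get_class($x) === get_class($y)
        && get_object_vars($x) === get_object_vars($y);
}

var_dump($a == $b);               // bool(true): loose, 1 == true
var_dump($a === $b);              // bool(false): identity, different instances
var_dump(strictly_equal($a, $b)); // bool(false): int 1 is not identical to bool true
```

Neither built-in operator expresses "same values, same types"; === on objects only checks identity.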

Adding this functionality does not affect backwards-compatibility and could
probably be implemented in 5.x as well. To my knowledge, the only major
changes required are in the default object handlers. This would
automatically affect functions like array_search and sort.

Other major languages support this and there seems to be a decent amount of
public support for it.

What is the proposed API for this?

Magic methods. Objects that don't implement them maintain the current
behaviour.

/**
 * @returns -1 if less than,
 *     0 if equal to,
 *     1 if greater than the other value (in terms of relative ordering).
 */
__compareTo(mixed $other): int;

/**
 * @returns true if this instance is equal to the given value, or false if
 *     not.
 */
__equals(mixed $other): bool;

Why magic methods?

- No risk of breaking backwards compatibility because method names that
  start with __ are always reserved.
- All objects can already be compared and tested for equality, so checking
  if an object implements an interface would not give you any useful
  information. Alternatively, we could introduce a function like
  is_comparable.
- Python uses magic methods because there are no interfaces. Java has a
  Comparable interface because not all objects are comparable by default.
  It's important to note that we're not exposing the ability to make an
  object comparable, but instead to change the default behaviour when
  compared. All objects are technically already comparable.
- PHP already uses magic methods to change the default behaviour of
  objects.
- Interfaces like Iterable or IteratorAggregate must be interfaces because
  not all objects are iterable.

Why do we need __equals when __compareTo can just return 0?

1. Anything can be tested for equality, but not all things are comparable.
   For example, an apple can be equated with an orange and we can say that
   the apple does not equal the orange. However, comparing the apple to an
   orange makes no logical sense. There is nothing about either object that
   would allow us to say that the apple is greater or less than the orange.
2. Some things that are not equal have the same relative ordering. For
   example, a floating point value of 2.0 does not equal the integer 2, but
   their relative ordering is the same, and a comparison would return 0.
3. The contexts in which they are called are not the same. Testing for
   equality occurs when an object is compared with another using == or in
   functions like array_search. The question we're asking is whether the
   two values are considered equal, and we're not concerned with ordering.
   The context for comparison is when we need to determine the relative
   ordering of two values using <, >, or in functions like sort.
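
The 2.0 versus 2 case can be checked in current PHP. Note that the distinction only shows up under strict comparison; the loose == operator already treats the two as equal:

```php
<?php
$strict   = (2.0 === 2);  // different types, so not identical
$ordering = (2.0 <=> 2);  // neither is greater than the other
$loose    = (2.0 == 2);   // loose equality ignores the type difference

var_dump($strict);   // bool(false)
var_dump($ordering); // int(0)
var_dump($loose);    // bool(true)
```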

Issues to be discussed:

1. What happens when we use <= and >=? Does the "or equal to" part call
   __equals or does it check if __compareTo returns 0?
2. Should we also expose strict comparison, i.e. === ?
3. Naming:
   - __eq
   - __equalTo
   - __equals
   - __compareTo
   - __comparedTo
   - __compare
   - __cmp

6 years ago by Christoph M. Becker

> The Comparable RFC (http://wiki.php.net/rfc/comparable) was written in 2010
> and revised in 2015 by Adam Harvey, but was not conclusive. I would like to
> take on some shared responsibility to push this forward and re-open the
> discussion.

Thanks!

> Adding this functionality does not affect backwards-compatibility and could
> probably be implemented in 5.x as well.

New features should target master only, so this could only go into PHP
7.3 (note that feature freeze is scheduled for July 17th) or PHP 7.4.

--
Christoph M. Becker

6 years ago by michal@brzuchalski.com

2018-06-21 11:27 GMT+02:00 Rudi Theunissen rtheunissen@php.net:

> /**
>  * @returns -1 if less than,
>  *     0 if equal to,
>  *     1 if greater than the other value (in terms of relative ordering).
>  */
> __compareTo(mixed $other): int;

When comparing objects, it should be obvious to use the object type hint,
not mixed.

> /**
>  * @returns true if this instance is equal to the given value, or false if
>  *     not.
>  */
> __equals(mixed $other): bool;

If __compareTo($other) gives 0 then there is no justification for an
additional method.
IMO you're duplicating functionality here.


>
>
> Why magic methods?
>
> - No risk of breaking backwards compatibility because method names that
> start
> with `__` are always reserved.
>
> - All objects can already be compared and tested for equality, so checking
> if
> an object implements an interface would not give you any useful
> information.
> Alternatively, we could introduce a function like `is_comparable`.
>
> - Python uses magic methods because there are no interfaces. Java has a
> Comparable interface because not all objects are comparable by default.
> It's
> important to note that we're not exposing the ability to make an object
> comparable, but instead to change the default behaviour when compared. All
> objects are technically already comparable.
>
> - PHP already uses magic methods to change the default behaviour of
> objects.
>
> - Interfaces like `Iterable` or `IteratorAggregate` must be interfaces
> because
> not all objects are iterable.
>
>
> Why do we need `__equals` when `__compareTo` can just return `0`?
>
> 1. Anything can be tested for equality, but all things are not comparable.
> For example, an apple can be equated with an orange and we can say that the
> apple does not equal the orange. However, comparing the apple to an orange
> makes no logical sense. There is nothing about either object that would
> allow us to say that the apple is greater or less than the orange.
>
> 2. Some things that are not equal have the same relative ordering. For
> example,
>     a floating point value of `2.0` does not equal the integer `2`, but
> their
>     relative ordering is the same, and a comparison would return `0`.
>

That's true, but you want to compare objects not scalar types, right?


>
> 3. The contexts in which they are called are not the same. Testing for
> equality
> occurs when an object is compared with another using `==` or in functions
> like `array_search`. The question we're asking is whether the two values
> are
> considered equal, and we're not concerned with ordering. The context for
> comparison is when we need to determine the relative ordering of two values
> using `<`, `>`, or in functions like `sort`.
>
>
> Issues to be discussed:
>
> 1. What happens when we use `<=` and `>=`? Does the "or equal to" part call
> `__equals` or does it check if `__compareTo` returns 0?
>
> 2. Should we also expose strict comparison, ie. `===` ?
>
> 3. Naming:
> - `__eq`
> - `__equalTo`
> - `__equals`
> - `__compareTo`
> - `__comparedTo`
> - `__compare`
> - `__cmp`
>

Thanks for reanimating this subject.

-- 
regards / pozdrawiam,
--
Michał Brzuchalski
about.me/brzuchal
brzuchalski.com

6 years ago by Dan Ackroyd

> The Comparable RFC (http://wiki.php.net/rfc/comparable) was written in 2010
> but was not conclusive. I would like to take
> on some shared responsibility to push this forward and re-open the
> discussion.
>
> Why is this useful, and why should it be added to PHP?

I think if you want to push the RFC forward, a really quite strong
case needs to be made for why having it be a language level feature is
so much better (or even at all better) than having it be implemented
in userland.

For example, as it is available in other languages, giving an example
of some code that is much better than the equivalent would be in PHP
could be a good way of showing it is desirable.

cheers
Dan

6 years ago by Rudi Theunissen

> If __compareTo($other) gives 0 then there is no justification for
> additional method.
> IMO you're duplicating functionality here.

I tried my best to explain that this is not the case. There are perfectly
logical cases where two values are not equal, but have the same relative
ordering. The best example on my mind is a decimal type with a value of
2.0000 and an integer 2. They're not equal, but there's no clear winner for
ordering between them.

> When comparing objects should be obvious to use object type hint not
> mixed

This brings up an interesting talking point. Should an object's __equals
method only accept instances of the class (and subclasses)? So object == int
would always fail? Or should we allow the method to accept any value and
leave it to the user to do an instanceof?
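
The two options can be sketched in userland, assuming a hypothetical Decimal class. Here equals() accepts any value and checks with instanceof, while equalsTyped() restricts the parameter type; both method names are made up for illustration:

```php
<?php
// Hypothetical Decimal class; both methods stand in for the proposed __equals().
final class Decimal
{
    private $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    // Option A: accept anything, leave the instanceof check to the user.
    public function equals($other): bool
    {
        return $other instanceof self && $this->value === $other->value;
    }

    // Option B: only accept instances of the class (and subclasses).
    public function equalsTyped(Decimal $other): bool
    {
        return $this->value === $other->value;
    }
}

$d = new Decimal('2.0000');

var_dump($d->equals(new Decimal('2.0000'))); // bool(true)
var_dump($d->equals(2));                     // bool(false): not a Decimal

try {
    $d->equalsTyped(2);
    $threw = false;
} catch (TypeError $e) {
    $threw = true; // the type hint rejects the int before any comparison
}
var_dump($threw); // bool(true)
```

With option A an unrelated value is quietly "not equal"; with option B it is an error, which is the behaviour that would have to be decided for object == int.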

> That's true, but you want to compare objects not scalar types, right?

See above.

> I think if you want to push the RFC forward, a really quite strong
> case needs to be made for why having it be a language level feature is
> so much better (or even at all better) than having it be implemented
> in userland.
>
> For example, as it is available in other languages, giving an example
> of some code that is much better than the equivalent would be in PHP
> could be a good way of showing it is desirable.

- You can't override the behaviour of <, <=, >, >=, ==, != with a userland
  implementation.
- Therefore, you won't be able to affect the internals of array functions
  like in_array, sort etc.
- Having it as a language level feature dictates a standard, which is often
  hard to establish as a third party.
- There are some minor, but worth mentioning, performance advantages in
  having it be part of the language.

Not sure if that is a strong enough case, happy to elaborate some more and
provide some code.
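
As a rough illustration of the first two points, here is a sketch with a hypothetical Version class. Its compareTo() method is plain userland code, so the operators and built-in functions know nothing about it:

```php
<?php
// Hypothetical class with a userland compareTo() method.
final class Version
{
    public $major;
    public $label;

    public function __construct(int $major, string $label)
    {
        $this->major = $major;
        $this->label = $label;
    }

    // Only the major number is meant to matter for ordering and equality.
    public function compareTo(Version $other): int
    {
        return $this->major <=> $other->major;
    }
}

$v1 = new Version(1, 'zeta');
$v2 = new Version(1, 'alpha');

var_dump($v1->compareTo($v2)); // int(0): equal as far as the class is concerned
var_dump($v1 == $v2);          // bool(false): the operator compares all properties
var_dump(in_array($v2, [$v1])); // bool(false): in_array uses == internally
```

Functions that accept a callback, such as usort, can be pointed at compareTo(), but in_array and the operators themselves cannot.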

6 years ago by Levi Morrison

> You can't override the behaviour of <, <=, >, >=, ==, !=
> with a userland implementation.
>
> Therefore, you won't be able to affect the internals of array functions
> like in_array, sort etc.

In my opinion we should have functions which take comparator and/or
equality functions as parameters even if we can override these
operators. I'd like to see an outline of such a plan as part of this
RFC or as a precursor to it.

6 years ago by Levi Morrison

> > You can't override the behaviour of <, <=, >, >=, ==, !=
> > with a userland implementation.
> >
> > Therefore, you won't be able to affect the internals of array functions
> > like in_array, sort etc.
>
> In my opinion we should have functions which take comparator and/or
> equality functions as parameters even if we can override these operators.
> I'd like to see an outline of such a plan as part of this RFC or as a
> precursor to it.

(I know some already have this option, such as usort, but we don't
have an in_array that takes an equality callback, correct?)
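
A minimal sketch of what such a function might look like in userland. The name in_array_with() and its signature are hypothetical, not an existing or proposed PHP function:

```php
<?php
// Hypothetical helper: like in_array(), but the caller supplies the equality test.
function in_array_with($needle, array $haystack, callable $equals): bool
{
    foreach ($haystack as $item) {
        if ($equals($needle, $item)) {
            return true;
        }
    }
    return false;
}

// Case-insensitive string equality, defined by the caller.
$sameIgnoringCase = function (string $a, string $b): bool {
    return strcasecmp($a, $b) === 0;
};

$languages = ['php', 'go', 'rust'];

var_dump(in_array('PHP', $languages));                           // bool(false)
var_dump(in_array_with('PHP', $languages, $sameIgnoringCase));   // bool(true)
var_dump(in_array_with('Java', $languages, $sameIgnoringCase));  // bool(false)
```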

6 years ago by Rudi Theunissen

> In my opinion we should have functions which take comparator and/or
> equality functions as parameters even if we can override these
> operators.

I really like this idea, because it puts the responsibility of the
definition on the caller. It allows you to do things like "Is there an
instance of X in this array that has a value of Y greater than Z".

This is probably a candidate for a separate proposal though, perhaps even
in favour of this one. In saying that, I still believe that there's a place
for dynamic comparison and equality if used responsibly. I see the primary
value in objects that have a specific, obvious value, such as money or
dimensions.

6 years ago by Dan Ackroyd

> > I think if you want to push the RFC forward, a really quite strong
> > case needs to be made for why having it be a language level feature is
> > so much better (or even at all better) than having it be implemented
> > in userland.
>
> You can't override the behaviour of <, <=, >, >=, ==, != with
> a userland implementation.
>
> Therefore, you won't be able to affect the internals of array functions
> like in_array, sort etc.

Yes, that's the type of thing that I think needs to be included as
part of the RFC.

Including a list of all the (or at least the important) functions that
would be affected by this RFC should be made both for clarity and so
that people can think through any edge cases.

cheers
Dan

6 years ago by Rudi Theunissen

> Yes, that's the type of thing that I think needs to be included as
> part of the RFC.
>
> Including a list of all the (or at least the important) functions that
> would be affected by this RFC should be made both for clarity and so
> that people can think through any edge cases.

Absolutely. I was hoping to gather some thoughts and opinions first while I
work on the implementation before I submit an official RFC. I'll make sure
to include what you've mentioned, I completely agree.

6 years ago by Rudi Theunissen

What's the best place to override == internally? do_operation or a new
object handler? I'd like to separate equality from compare_function, or
should we ignore __equals and assume that the values are equal if
__compareTo returns 0?

Here's some context: I'm modifying is_equal_function to check for an
__equals implementation if the value is an object, which I think should
work, but it's not clear how an internal object (like the ds structures, for
example) should override ==.

do_operation seems like a good choice for this, but I wanted to check with
you all first.

6 years ago by Rudi Theunissen

On further investigation, I'm not sure if we need both __equals and
__compareTo, even with all the talk about contexts and the fact that an
object can be tested for equality and not necessarily ordering as well. If
we take out the __equals method and only include __compareTo, we can allow
the user to return NULL to indicate that the object doesn't support the
comparison that is being done. So the return values of __compareTo then
become:

 1:    Greater than
-1:    Less than
 0:    Equal to, ==
 NULL: Unsupported, fall back to default behaviour.

In the case of an object returning NULL, we can fall back to the default
behaviour, which is equivalent to the compare handler returning FAILURE,
which then falls through to the compare_objects handler.

I think this will be less confusing and much easier to implement.
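
The fallback rule described above can be approximated in userland. This is only a sketch: compare_with_fallback() and the Temperature class are hypothetical, and the real change would live in the engine's object handlers rather than in a function:

```php
<?php
// Hypothetical dispatcher mimicking the described behaviour in userland.
function compare_with_fallback($a, $b): int
{
    if (is_object($a) && method_exists($a, 'compareTo')) {
        $result = $a->compareTo($b);
        if ($result !== null) {
            return $result <=> 0; // normalise to -1, 0 or 1
        }
    }
    return $a <=> $b; // NULL (or no method): default comparison
}

// Hypothetical class that only knows how to order same-unit temperatures.
final class Temperature
{
    public $degrees;
    public $unit;

    public function __construct(float $degrees, string $unit)
    {
        $this->degrees = $degrees;
        $this->unit = $unit;
    }

    public function compareTo($other): ?int
    {
        if (!$other instanceof self || $other->unit !== $this->unit) {
            return null; // unsupported: fall back to the default behaviour
        }
        return $this->degrees <=> $other->degrees;
    }
}

$supported = compare_with_fallback(new Temperature(20.0, 'C'), new Temperature(25.5, 'C'));
$fallback  = compare_with_fallback(new Temperature(20.0, 'C'), new Temperature(68.0, 'F'));

var_dump($supported); // int(-1): decided by compareTo()
var_dump($fallback);  // whatever the default property comparison yields
```

In the second call the answer comes from the default property-by-property comparison, which may or may not be meaningful for the class.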

It's interesting to note that Java's Comparable interface considers a
0-return to indicate equality, where a == would compare references in the
same way PHP's === would. So in dropping the __equals method, we're slightly
more aligned with Java, even though that's not exactly the goal here. :p

6 years ago by Rudi Theunissen

Here is a WIP implementation of __compareTo, with some decent tests.

https://github.com/php/php-src/compare/master...rtheunissen:rt-compare-to-magic-method?diff=unified

6 years ago by Levi Morrison

> Here is a WIP implementation of __compareTo, with some decent tests.
>
> https://github.com/php/php-src/compare/master...rtheunissen:rt-compare-to-magic-method?diff=unified

Help me understand the nullable return. If you are defining this
operator then by definition you don't want the standard behavior, so
why is returning null helpful? Seems like implementors ought to throw.

6 years ago by Rudi Theunissen

The idea behind the nullable return is to allow for partial support, where
you might want to check for equality but ignore ordering. You can't return
nothing or NULL if we normalise to -1,0,1 because NULL would be 0 and
therefore indicate that the values are equal.

We could require that the user always returns an integer, and suggest -1 to
be returned when there's no clear definition for ordering. The null return
was to avoid returning that -1 when ordering isn't defined.

class Equatable {
    public function __compareTo($other) {
        if ($other instanceof self && $this->prop === $other->prop) {
            return 0;
        }

        // No logical ordering exists, so we're not concerned?
    }
}

As it stands, we have a few options here:

1. Allow anything to be returned and normalise to -1, 0 and 1.
2. Require an integer, error when anything else is returned, suggest -1 as
   a comparison-not-defined convention.
3. Use NULL (or void return) as an indication that the comparison couldn't
   be made, fall through to compare_objects.
4. Use NULL (or void return) as an indication that the comparison couldn't
   be made, internal error.

My vote at the moment would go to #3 or #4 because they're the most flexible
and the NULL is an edge-case.

6 years ago by Levi Morrison

> The idea behind the nullable return is to allow for partial support, where
> you might want to check for equality but ignore ordering. You can't return
> nothing or NULL if we normalise to -1,0,1 because NULL would be 0 and
> therefore indicate that the values are equal.
>
> We could require that the user always returns an integer, and suggest -1 to
> be returned when there's no clear definition for ordering. The null return
> was to avoid returning that -1 when ordering isn't defined.
>
> class Equatable {
>     public function __compareTo($other) {
>         if ($other instanceof self && $this->prop === $other->prop) {
>             return 0;
>         }
>
>         // No logical ordering exists, so we're not concerned?
>     }
> }
>
> As it stands, we have a few options here:
>
> 1. Allow anything to be returned and normalise to -1, 0 and 1.
> 2. Require an integer, error when anything else is returned, suggest -1 as
>    a comparison-not-defined convention.
> 3. Use NULL (or void return) as an indication that the comparison couldn't
>    be made, fall through to compare_objects.
> 4. Use NULL (or void return) as an indication that the comparison couldn't
>    be made, internal error.

In these schemes I am not understanding how to specify that something
is not equal, not ordered, and should not fall back to something else.
Consider things like UUIDs and other identifiers where equality is
desired and ordering does not make sense. Other languages (most? all?)
separate equality and ordering for this reason; I think we ought to do
the same.
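
To make the UUID case concrete, here is a sketch of a hypothetical Uuid class that defines equality and deliberately nothing else. The equals() method is a userland stand-in for a separate equality hook:

```php
<?php
// Hypothetical identifier type: equality is meaningful, ordering is not.
final class Uuid
{
    private $value;

    public function __construct(string $value)
    {
        $this->value = strtolower($value);
    }

    public function equals($other): bool
    {
        return $other instanceof self && $this->value === $other->value;
    }

    // Deliberately no compareTo(): there is no sensible -1 or 1 to return.
}

$a = new Uuid('6F9619FF-8B86-D011-B42D-00C04FC964FF');
$b = new Uuid('6f9619ff-8b86-d011-b42d-00c04fc964ff');
$c = new Uuid('00000000-0000-0000-0000-000000000000');

var_dump($a->equals($b)); // bool(true)
var_dump($a->equals($c)); // bool(false), with no claim about which is "greater"
```

With a single __compareTo, "not equal" for $a and $c would have to be reported as -1 or 1, which implies an ordering the type does not have.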

6 years ago by Rudi Theunissen

> Other languages (most? all?) separate equality and ordering for this
> reason.

Java doesn't really separate them. Their == always checks object reference,
so it is like PHP's ===. But they do have the .equals() method on all
objects (our ==) and the collections use that for equality.

> In these schemes I am not understanding how to specify that something
> is not equal, not ordered, and should not fall back to something else.
> Consider things like UUIDs and other identifiers where equality is
> desired and ordering does not make sense.

For something to be not equal, you return anything but 0. If something is
not ordered, you return -1 (LHS wins). If you don't want something to fall
back to any default behaviour, just return an integer and not NULL.

The realistic case, in my opinion, is that there will always be some
property to order by, whether it's the object's numeric value, a string that
can be alphabetical, or a time based element. We can argue that if the
object doesn't have a defined, logical ordering, then it also wouldn't make
sense to order a collection of them in the first place.

So with that in mind, maybe it's okay to take "NULL falls back to default
behaviour" idea out, and do a normalisation on whatever the user returns. If
it's NULL, then that'll be 0, and the value will be equal. The only thing I
don't like about that is that the user didn't specify "equal" with a 0, they
returned the absence of a direction, and we should therefore either throw an
exception internally, or fall back to the current comparison behaviour
(comparing properties).

The NULL return is a special case for me. Anything else is some value
that can be normalised ($v ? ($v < 0 ? -1 : 1) : 0).

Therefore, I believe it comes down to a simple question: how do we handle
the case where __compareTo returns NULL?

6 years ago by Rowan Collins

So with that in mind, maybe it's okay to take the "NULL falls back to
default behaviour" idea out, and normalise whatever the user returns. If
it's NULL, then that'll be 0, and the values will be equal. The only thing
I don't like about that is that the user didn't specify "equal" with a 0;
they returned the absence of a direction, and we should therefore either
throw an exception internally, or fall back to the current comparison
behaviour (comparing properties).

Since this is a method, and not just a callback passed somewhere, could
we simply mandate that the signature has an "int" return type?

That way, if the class is written with strict_types=1, it will throw a
TypeError for null, and the user will have a big hint that that's going
to happen when they write ": int".
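
A small sketch of what that would look like, using a hypothetical Shape
class. PHP has no __compareTo magic method today, so the method is called
directly here. The sketch leaves out declare(strict_types=1): a
non-nullable int return type on a user-defined method rejects null with
or without it.

```php
<?php

class Shape
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    // Hypothetical method: not a magic method in current PHP.
    public function __compareTo(Shape $other): int
    {
        if ($this->name === $other->name) {
            return 0;
        }

        return null; // No direction to give, so null slips through.
    }
}

$circle = new Shape('circle');
$square = new Shape('square');

try {
    $circle->__compareTo($square);
} catch (TypeError $e) {
    echo get_class($e), "\n"; // TypeError
}
```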

Regards,

--
Rowan Collins
[IMSoP]

6 years ago by Rudi Theunissen

Since this is a method, and not just a callback passed somewhere, could
we simply mandate that the signature has an "int" return type?

We can raise an error internally if the return value isn't an integer, and
the user is welcome to add a : int to the signature, but I'm not sure
whether you can specify argument info for magic methods.

My only concern is what the user would return when ordering doesn't make
sense. For example:

class Shape {
    ...

    public function __compareTo(Shape $other): int {
        if ($this->name === $other->name) {
            return 0;
        }

        /**
         * There's no logical ordering for a shape!
         * I have to either:
         *     - Compare alphabetically by name
         *     - Let the LHS win and return -1.
         *
         * To indicate that the other shape DOES NOT equal this
         * shape, we would need to return either 1 or -1.
         *
         * In cases where neither of them makes sense logically,
         * should we force the user to pick one or do we support
         * a nullable return which behaves like returning -1?
         */

        return -1; // Arbitrary
    }
}

A nullable integer return type to achieve this would be something like this:

class Shape {
    ...

    public function __compareTo(Shape $other): ?int {
        if ($this->name === $other->name) {
            return 0;
        }

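        /**
         * Return null because there's no logical direction.
         * This is the equivalent of returning -1 because the LHS
         * will always win. Sorting an array of shapes will have no
         * effect but functions like in_array will work as expected.
         *
         * The return must be explicit: with a ?int return type,
         * falling off the end of the method is a TypeError.
         */
        return null;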
    }
}

There is one more tricky case when it comes to scalar values, and this is
why the "fall back to default behaviour on null" idea came about:

What happens when we compare an object to null or false?

If we go with the : ?int scheme where returning null is the same as
returning -1, it would
mean that Object < false would return true, which is strange.

If we go with the : int scheme, we force the user to provide an arbitrary
direction in cases
where ordering isn't applicable, which really shouldn't be necessary.

But, if we change the behaviour of the null return to fall back to
default behaviour, we can
partially override comparability where applicable, and avoid weird cases
like Object < false.
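
For reference, this is the default behaviour that a null return would fall
back to when an object is compared with false or null: current PHP
converts both operands to bool, and an object is always true. A quick
check, using stdClass as a stand-in for any object:

```php
<?php

$object = new stdClass();

var_dump($object < false); // bool(false)
var_dump($object > false); // bool(true)
var_dump($object == true); // bool(true)
var_dump($object == null); // bool(false)
var_dump($object > null);  // bool(true)
```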

However, it might not be clear to the user that default behaviour is in
effect, which may lead to bugs that are difficult to debug or behaviour
that is difficult to trace. At this stage I'm confident that a nullable
integer return type is the way to go; we just have to determine how to
handle the null case.

The way I see it, we have two options:

1. Fall through without warning, because it's documented and consistent.
   The user can throw an exception themselves if their class is compared
   to something unexpected.
2. Fall through with a warning or notice, because it's probably unintended
   behaviour and the user should know about it.

class Shape {
    ...

    public function __compareTo($other) {
        if ($other instanceof Shape) {
            return $this->name <=> $other->name;
        }

        /**
         * Return nothing because we don't know how to
         * compare against other types, and we're happy
         * to fall back to default behaviour.
         *
         * Alternatively, the user can type-hint Shape
         * which will guard against unintended comparison
         * and make the instanceof check redundant.
         */
    }
}

At this stage, my vote would go for a mixed return type where the null
case falls
through to default behaviour without notice. All return values will be
normalised to
-1, 0 and 1, but null will fall through.
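
Roughly, the approach could be emulated in userland like this. It is only
a sketch: compare_with_fallthrough is a hypothetical stand-in for what the
engine would do on $a <=> $b, and __compareTo is called directly because
it is not a magic method in current PHP.

```php
<?php

class Shape
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    // Hypothetical hook: returns null when it has no opinion.
    public function __compareTo($other)
    {
        if ($other instanceof Shape) {
            return strcmp($this->name, $other->name); // May be any integer.
        }

        return null;
    }
}

// Hypothetical stand-in for the engine's handling of $a <=> $b.
function compare_with_fallthrough(Shape $a, $b): int
{
    $result = $a->__compareTo($b);

    if ($result === null) {
        return $a <=> $b; // Default comparison, no notice raised.
    }

    return $result ? ($result < 0 ? -1 : 1) : 0; // Normalise to -1, 0 or 1.
}

var_dump(compare_with_fallthrough(new Shape('circle'), new Shape('square'))); // int(-1)
var_dump(compare_with_fallthrough(new Shape('square'), new Shape('circle'))); // int(1)
var_dump(compare_with_fallthrough(new Shape('circle'), false));               // int(1)
```

The last call falls through to the default comparison, where the object
is converted to true and true is greater than false.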

I'm going to draft an RFC based on this approach with some clear examples. :)

6 years ago by Levi Morrison

Other languages (most? all?) separate equality and ordering for this reason.

Java doesn't really separate them. Their == always checks object reference so is like PHP's ===.
But they do have the .equals() method on all objects (our ==) and the collections use that for equality.

Java has .equals and .compareTo; these operations are separate. In
Java neither integrates with operators.

6 years ago by Rudi Theunissen

Java has .equals and .compareTo; these operations are separate. In
Java neither integrates with operators.

Yeah that's right. I was just pointing out that Java's == always checks
against the reference and you can't override it (so it's like PHP's ===).
Their .equals() method is like PHP's ==.

I agree that it's better to separate the two operations, which is why the
first message in this thread talked about __equals and __compareTo.
However, when I started implementing it, I couldn't see a nice way to
separate them internally as everything goes through compare_function.

6 years ago by Rudi Theunissen

The part I found difficult was in the handlers - we only have compare, no
equals. The only way we can have the handler differentiate between equality
and ordering is if we pass a parameter to the handler, which means we'd
have to change the header.
From:

typedef int (*zend_object_compare_zvals_t)(zval *result, zval *op1, zval *op2);

To:

typedef int (*zend_object_compare_zvals_t)(zval *result, zval *op1, zval *op2, int mode);

Or we could introduce a new handler? Not sure if that's something we can do
easily.
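
As a rough userland analogy only (the real change would be in C, in the
object handlers), a single compare entry point that takes a mode could
look like the sketch below. The Uuid class, the COMPARE_* constants and
the compare method are all made up for illustration.

```php
<?php

const COMPARE_EQUALITY = 0; // Hypothetical mode: caller only needs == or !=
const COMPARE_ORDERING = 1; // Hypothetical mode: caller needs <, > or <=>

class Uuid
{
    private $value;

    public function __construct(string $value)
    {
        $this->value = strtolower($value);
    }

    // One entry point; the mode says which question is being asked.
    public function compare(Uuid $other, int $mode): int
    {
        if ($mode === COMPARE_ORDERING) {
            throw new LogicException('UUIDs have no meaningful ordering');
        }

        return $this->value === $other->value ? 0 : 1;
    }
}

$a = new Uuid('123E4567-E89B-12D3-A456-426614174000');
$b = new Uuid('123e4567-e89b-12d3-a456-426614174000');

var_dump($a->compare($b, COMPARE_EQUALITY)); // int(0)

try {
    $a->compare($b, COMPARE_ORDERING);
} catch (LogicException $e) {
    echo $e->getMessage(), "\n"; // UUIDs have no meaningful ordering
}
```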

6 years ago by Rudi Theunissen

This discussion has moved to the RFC thread here:
https://externals.io/message/102473

6 years ago by Stanislav Malyshev

Hi!

I think if you want to push the RFC forward, a really quite strong
case needs to be made for why having it be a language level feature is
so much better (or even at all better) than having it be implemented
in userland.

I tend to agree here. There are not that many cases where < and > are
natural for objects - mostly for implementation of extended numeric
types like complex numbers, but that's not a very common thing people do
in PHP. And when they do, they usually come with custom methods, since
these types work differently from scalar types.

Sorting may be a major use case, but there I think custom sort
functions have it covered. So I'd like the RFC to make a more specific
case for why it's important to have operators and not specialized methods
for objects. Especially for the </> kind - I can see some case
for non-strict equality, but ordering seems to be a bit exotic (I may be
missing some cases though).
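
For comparison, this is the existing userland route for sorting: usort
with a custom comparison function. The Shape class here is only an
illustrative stand-in.

```php
<?php

class Shape
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

$shapes = [new Shape('square'), new Shape('circle'), new Shape('triangle')];

// The ordering lives in the callback, not in the class or an operator.
usort($shapes, function (Shape $a, Shape $b): int {
    return strcmp($a->name, $b->name);
});

$names = array_map(function (Shape $s) {
    return $s->name;
}, $shapes);

echo implode(', ', $names), "\n"; // circle, square, triangle
```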

There are also some complications there. For example, if you implement both
__compare and __equals, you essentially have two functions that do
"equals" - you probably won't want to call both for something like >=?
But that creates an opening for very weird bugs.
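
A sketch of how the two could drift apart. The Decimal class and both
methods are hypothetical and called directly, since neither is a magic
method in current PHP; the point is only that nothing forces the two
definitions of "equal" to agree.

```php
<?php

class Decimal
{
    public $raw;

    public function __construct(string $raw)
    {
        $this->raw = $raw;
    }

    // Hypothetical equality hook: exact string match.
    public function __equals(Decimal $other): bool
    {
        return $this->raw === $other->raw;
    }

    // Hypothetical ordering hook: numeric comparison.
    public function __compareTo(Decimal $other): int
    {
        return (float) $this->raw <=> (float) $other->raw;
    }
}

$a = new Decimal('1.0');
$b = new Decimal('1.00');

var_dump($a->__equals($b));          // bool(false): "not equal"
var_dump($a->__compareTo($b) === 0); // bool(true):  "equal"
var_dump($a->__compareTo($b) >= 0);  // bool(true):  ">=" holds via ordering
```

If >= were answered by the ordering hook alone, $a >= $b would be true
while $a == $b is false and $a > $b is false.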

Also note: Python had __cmp__ in Python 2, but they moved away from it
to specialized comparison methods in Python 3. It would be a good idea
to look into why they did it, so we could learn from their experience.
Doesn't mean our decision would be the same, but it looks like they had
something that made them change it, so I think we should at least
consider what it was.

Stas Malyshev
smalyshev@gmail.com