Hello again internals!
Thank you all for the feedback that you've provided on my initial inquiries
about this feature. Several bits of feedback I've received have resulted in
changes to the proposal as it stands and made this a better proposal.
First, the RFC has been moved to the wiki:
https://wiki.php.net/rfc/user_defined_operator_overloads
The RFC is very long, as it contains a lot of research into the topic,
which is part of the reason I'm bringing it to discussion so early. Work
has also begun on an implementation, for which the draft PR can be viewed
here:
https://github.com/php/php-src/pull/7388
As the PR indicates, there is still a ways to go before a finished
implementation is ready; however, the RFC document appears complete enough
now to be opened for discussion. Again, as I've indicated before, I'm
taking my time with this, so there won't be a vote prior to a final
implementation being reviewable.
It is possible that this RFC will involve some changes to opcache as well
as new opcodes for > and >= in order to maintain consistency of operand
precedence in execution, and that's something I want to take my time with
and listen to a lot of feedback on. Its impact is currently unknown as the
implementation for it is unfinished.
Some key points for this proposal as it is currently:
- TypeErrors from the operator methods are thrown immediately instead of
being suppressed.
- The parameter for the other operand must be explicitly typed; an omitted
type is not considered mixed for these methods. Reasoning explained in the
RFC.
- Type coercion still occurs as normal when strict types is not used.
- The return type of the == operator function is restricted to be a bool.
- The return type of the <=> operator function and all implied operators
(<, <=, >, >=) is restricted to be int and is normalized to -1, 0, 1.
- User classes are forced to implement overloads to work with the supported
operators, resulting in an InvalidOperator exception if unimplemented.
(This is a BC break, but impact is anticipated to be low)
  - The <=> operator instead falls back to the existing comparison logic
  if the overload is unimplemented.
  - The special cases of $obj == null and $obj == false return the
  expected bool value even if the overload is unimplemented.
- Internal classes silently fail-over to existing operator logic. (This
mainly affects ==).
- Enums are allowed to utilize operator overloads, but unlike other
classes, are not forced to and fall back to existing logic for comparisons.
- The proposal is for dynamic methods instead of static methods, with
reasoning explained in the RFC.
- The identity operator === is not overloadable to prevent possibly
terrible misuse.
- Interfaces are not suggested or proposed, with reasoning explained in the
RFC.
Open questions:
- Should this RFC include the bitwise operators or remain limited to the
minimal set of math operators?
- Should the == operator fail-over to existing behavior for all classes if
unimplemented, not just internal ones? (This would reduce BC break, but
make catching implementation errors more difficult for userland code.)
- Should implicit interfaces be provided for the overloads? (Actual
interfaces cannot be provided except for if my previous Never For Parameter
Types RFC were adopted.)
Thank you for all the feedback and help so far.
Jordan
This is an amazingly detailed and well-written RFC. Thank you!
Comments in no particular order:
- In the enum example, you can make it vastly simpler by copying my previous bitwise example from the previous thread, using ->value and ->from(). At the very least, all those if-else blocks should be match()es.
- As you seem to be a fan of math terminology (hi friend!), I believe the formal way to state commutativity is "multiplication is commutative over natural numbers". Some of the wording around that and associativity (Ibid.) would probably be cleaner with that phrasing.
- The "Separate codebases" risk is odd. As you note, "the same as anything else, duh, copy-pasta will bite you." I'm not sure why that's even listed.
- The larger risk would be competing implementations that use operators differently. Eg, I could see one collection implementation using / to indicate "remove these elements" (because that's the complement of modulo, which is "all elements except these"), and another to indicate "give me an array of these many equally-sized collection objects." That would be confusing if you bounce between two implementations on different projects. (Eg, Laravel and Symfony.) (We could argue that one of those implementations of / is dumb, but that doesn't mean they won't exist.)
- I don't really grok how $queue -= 1 would work. If that expands to $queue = $queue - 1, then the = operator would return the value assigned ($queue, after the value is removed), and you would get the next item... how? I don't know how you'd do that, making that syntax nominally valid, but useless.
- "In Python, the comparison operators are not directly commutative, but have a reflected pair corresponding with the swapped order. However, each object could implement entirely different logic, and thus no commutativity is enforced."
  I... have no idea what that sentence means. Please clarify.
- It doesn't seem to be explicitly stated, but I presume developers can type their operator methods as mixed if they are so inclined? (I don't see why they would, but they could.)
- You refer to InvalidOperator as an exception, although it extends Error. That's technically incorrect. It is a new throwable, which extends Error. Please use the correct terms. (The design is fine, IMO, but it should have the correct terminology.)
- I think sticking to compareTo() is probably wise. I can see all sorts of abuses of the other comparison operators if they were allowed... and I'd probably do some of that abuse myself, frankly. :-)
- Please make sure via tests that __compareTo() is triggered by sort() and friends, as that's the main use case for it: Making sorting objects just like sorting anything else, without having to provide a separate, independent comparison algorithm. (Basically, replacing usort() with __compareTo().)
Open questions:
- Should this RFC include the bitwise operators or remain limited to the
minimal set of math operators?
This should IMO be decided based on how easy they are to implement. If it's trivial to include, and uncontroversial, sure. If it's contentious or hard, punt to a later RFC.
- Should the == operator fail-over to existing behavior for all classes if
unimplemented, not just internal ones? (This would reduce BC break, but
make catching implementation errors more difficult for userland code.)
It would help if you documented the current behavior in the RFC, because I had to go test it to see what it was. :-)
I'd say yes, keep the existing behavior. I actually have a few tests right now in a serialization library that check if $foo == unserialize(serialize($foo)); (Not those functions, but in concept.)
- Should implicit interfaces be provided for the overloads? (Actual
interfaces cannot be provided except for if my previous Never For Parameter
Types RFC were adopted.)
I think the RFC makes a good case that, unlike Stringable, they're not really useful. The reason to have them would be to ask "does this object implement this magic method", but as you note that doesn't really tell you much. Stringable only has one meaningful operation, so it's meaningful to type against. Addable doesn't give you sufficient information. (Insert the usual comments about Generics here.)
Thanks!
--Larry Garfield
On Fri, Aug 20, 2021 at 9:49 AM Larry Garfield larry@garfieldtech.com
wrote:
- The "Separate codebases" risk is odd. As you note, "the same as
anything else, duh, copy-pasta will bite you." I'm not sure why that's
even listed.
- The larger risk would be competing implementations that use operators
differently. Eg, I could see one collection implementation using / to
indicate "remove these elements" (because that's the complement of modulo,
which is "all elements except these"), and another to indicate "give me an
array of these many equally-sized collection objects." That would be
confusing if you bounce between two implementations on different projects.
(Eg, Laravel and Symfony.) (We could argue that one of those
implementations of / is dumb, but that doesn't mean they won't exist.)
Agreed. I reworded and simplified that subsection.
- I don't really grok how $queue -= 1 would work. If that expands to
$queue = $queue - 1, then the = operator would return the value assigned
($queue, after the value is removed), and you would get the next item...
how? I don't know how you'd do that, making that syntax nominally valid,
but useless.
I added more details to clarify what the example was actually about. It was
intended to show how such usage is ambiguous and incorrect. Because a
class that implements __sub() can't prevent it from also working with the
reassignment operator -=, usage in this fashion will be either highly
discouraged or loudly communicated by the library, which in either case
should help promote good community standards around the feature. In this
example, the __sub() method didn't return an instance of Queue, it was
returning something like an array of int items from the queue, which
obviously behaves terribly with the reassignment operator.
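The problem can be sketched in current PHP by calling a method directly in place of the proposed overload. Everything here is an assumption made for illustration: the Queue class, its sub() method (standing in for __sub()), and the idea that `$queue -= 1` expands to `$queue = $queue - 1`, as discussed above.

```php
<?php
// Hypothetical Queue; sub() stands in for the proposed __sub() overload.
final class Queue
{
    public function __construct(private array $items) {}

    // Removes $count items from the front and returns them as an array,
    // not as a Queue. That return type is what breaks reassignment.
    public function sub(int $count): array
    {
        return array_splice($this->items, 0, $count);
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$queue = new Queue([10, 20, 30]);

// Plain use: take one item, the queue object survives.
$taken = $queue->sub(1);
$remaining = $queue->count();

// What `$queue -= 1` would amount to: the result overwrites the queue.
$queue = $queue->sub(1);
$queueIsNowArray = is_array($queue);
```

After the last assignment, $queue no longer holds a Queue at all, which is the ambiguity the RFC example was meant to show.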
- "In Python, the comparison operators are not directly commutative, but
have a reflected pair corresponding with the swapped order. However, each
object could implement entirely different logic, and thus no commutativity
is enforced. "
I... have no idea what that sentence means. Please clarify.
I added a more explicit explanation of this. Basically, comparisons are not
"commutative". They have a reversed relationship with some other
comparison. $x > $y gets reflected to $y < $x. Equality comparison is
its own "reflection". The reflection for the spaceship operator is
($x <=> $y) === ($y <=> $x) * -1.
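These reflection relationships can be checked with the built-in integer comparisons, with no overloads involved. This is only a sketch with arbitrary sample values; the variable names are chosen for illustration.

```php
<?php
// Arbitrary sample operands; any pair of integers behaves the same way.
$x = 3;
$y = 7;

// $x > $y is reflected to $y < $x: both sides always agree.
$greaterReflects = ($x > $y) === ($y < $x);

// Equality is its own reflection.
$equalsReflects = ($x == $y) === ($y == $x);

// Spaceship: swapping the operands flips the sign of the result.
$spaceshipReflects = ($x <=> $y) === ($y <=> $x) * -1;
```

The parentheses around each comparison are needed because PHP's comparison operators are non-associative and cannot be chained directly.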
- It doesn't seem to be explicitly stated, but I presume developers can
type their operator methods as mixed if they are so inclined? (I don't
see why they would, but they could.)
Yes, they are not prevented from doing this, but they must do so explicitly
with the mixed literal when typing the parameter. The return value is not
forced to be typed (that is, no type declaration is treated like mixed,
even though in the engine it's a separate flag), but can also be mixed if
they wish.
- You refer to InvalidOperator as an exception, although it extends Error.
That's technically incorrect. It is a new throwable, which extends
error. Please use the correct terms. (The design is fine, IMO, but it
should have the correct terminology.)
Thanks. I've cleaned up all the references I believe.
- I think sticking to compareTo() is probably wise. I can see all sorts of
abuses of the other comparison operators if they were allowed... and I'd
probably do some of that abuse myself, frankly. :-)
- Please make sure via tests that __compareTo() is triggered by sort() and
friends, as that's the main use case for it: Making sorting objects just
like sorting anything else, without having to provide a separate,
independent comparison algorithm. (Basically, replacing usort() with
__compareTo().)
Yes, I have three existing tests in the engine to resolve, then I will move
on to improving and expanding the tests for this feature. :)
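For context, this is roughly what sorting objects looks like today, where the comparison lives in a separate callback passed to usort(). The Version class and its fields are hypothetical, and nothing here uses the proposed __compareTo().

```php
<?php
// Hypothetical value class, used only for this illustration.
final class Version
{
    public function __construct(public int $major, public int $minor) {}
}

$versions = [new Version(2, 0), new Version(1, 5), new Version(1, 2)];

// Today the ordering rule is supplied separately from the class.
// Comparing two equal-length arrays with <=> goes element by element.
usort(
    $versions,
    fn (Version $a, Version $b): int =>
        [$a->major, $a->minor] <=> [$b->major, $b->minor]
);

$labels = array_map(
    fn (Version $v): string => "{$v->major}.{$v->minor}",
    $versions
);
```

Under the proposal, the idea would be for that ordering rule to live on the class itself so that sort() could pick it up without a callback.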
Open questions:
- Should this RFC include the bitwise operators or remain limited to the
minimal set of math operators?
This should IMO be decided based on how easy they are to implement. If
it's trivial to include, and uncontroversial, sure. If it's contentious or
hard, punt to a later RFC.
They are very trivial to implement. If anyone has an objection to this
specifically, I'll of course take that into consideration. I am not
planning on entertaining the idea of any of the miscellaneous operators,
reassignment operators, or logical/boolean operators in this RFC though.
That seems too free form, especially when the PHP community won't have any
pre-established community standards around using them prior to this feature.
- Should the == operator fail-over to existing behavior for all classes if
unimplemented, not just internal ones? (This would reduce BC break, but
make catching implementation errors more difficult for userland code.)
It would help if you documented the current behavior in the RFC, because I
had to go test it to see what it was. :-)
I'd say yes, keep the existing behavior. I actually have a few tests
right now in a serialization library that check if $foo ==
unserialize(serialize($foo)); (Not those functions, but in concept.)
Yes, I actually already committed the change for this and updated the RFC.
There are too many problems involved in trying to error unimplemented ==
operators, and that might not be desirable behavior anyways.
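For reference, the existing == behavior being kept can be seen with a small round-trip check similar to the serialization test mentioned above. The Point class is hypothetical and exists only for this sketch.

```php
<?php
// Hypothetical value class, used only to show today's == semantics.
final class Point
{
    public function __construct(public int $x, public int $y) {}
}

$a = new Point(1, 2);
$b = unserialize(serialize($a)); // a distinct copy with equal properties

// Loose equality: same class and equal property values.
$looseEqual = ($a == $b);

// Identity: true only for the very same instance.
$identical = ($a === $b);

// A differing property value makes == false.
$changed = ($a == new Point(1, 3));
```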
Jordan
Pierre and Dik,
The thread you were replying to is older and doesn't reference the current
RFC: https://wiki.php.net/rfc/user_defined_operator_overloads
This is the current discussion thread for the RFC.
In particular, I'd encourage reading about which operators are supported,
the restrictions on the implementations, and the failure mode behavior.
Pierre:
and hell, if "->" doesn't mean "access that object member" anymore, but
"hey, the implementor could do anything at all instead", sky will fall upon
us.
I hate to break it to you, but __call and __get already do this, and aren't
even a part of this RFC. I'd highly encourage you to read what operators
are supported and how, because this RFC definitely does not suggest opening
ALL operators to overloads. There are very deliberate omissions.
However, the object accessor -> is already overloadable to some extent
with __call and __get. The scope resolution operator :: is already
overloadable to some extent with __callStatic. The string concatenation
operator . is already overloadable to some extent with __toString. The
assignment operator = is already overloadable to some extent with __set.
These all have limitations in how they are called automatically however.
For instance, __set doesn't allow arbitrary overloading of the assignment
operator, instead it requires the left side to reference a class property.
Similar sorts of restrictions are proposed here. The fact that the
reassignment operators must be supported by any overload will very highly
discourage non-immutable implementations. The __equals overload must
return a bool. The __compareTo overload must return an int and will have
its return values automatically normalized to -1, 0, 1.
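As a rough sketch of the existing hooks listed above, the following shows how ->, = and . can already route into user code in current PHP. The Bag class and its behavior are invented for this illustration and are not taken from the RFC.

```php
<?php
// Hypothetical class for illustration only.
final class Bag
{
    private array $data = [];

    // `$bag->name = $value` on an inaccessible property lands here.
    public function __set(string $name, mixed $value): void
    {
        $this->data[$name] = $value;
    }

    // `$bag->name` on an inaccessible property lands here.
    public function __get(string $name): mixed
    {
        return $this->data[$name] ?? null;
    }

    // `$bag->anything(...)` for an undefined method lands here.
    public function __call(string $method, array $args): mixed
    {
        return $method . '(' . implode(', ', $args) . ')';
    }

    // String concatenation with the object triggers this conversion.
    public function __toString(): string
    {
        return 'Bag[' . implode(',', array_keys($this->data)) . ']';
    }
}

$bag = new Bag();
$bag->colour = 'red';          // routed through __set
$read = $bag->colour;          // routed through __get
$called = $bag->paint(1, 2);   // routed through __call
$joined = 'have: ' . $bag;     // routed through __toString
```

Note that __set only fires because the left side is a property access, which is the kind of restriction described above.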
The RFC document covers a lot of detail about the proposed implementation
and addresses many of the things you brought up.
Jordan
On 23/08/2021 at 15:37, Jordan LeDoux wrote:
Pierre and Dik,
The thread you were replying to is older and doesn't reference the current
RFC: https://wiki.php.net/rfc/user_defined_operator_overloads
This is the current discussion thread for the RFC.
In particular, I'd encourage reading about which operators are supported,
the restrictions on the implementations, and the failure mode behavior.
Pierre:
and hell, if "->" doesn't mean "access that object member" anymore, but
"hey, the implementor could do anything at all instead", sky will fall upon
us.
I hate to break it to you, but __call and __get already do this, and aren't
even a part of this RFC. I'd highly encourage you to read what operators
are supported and how, because this RFC definitely does not suggest opening
ALL operators to overloads. There are very deliberate omissions.
Yes, I'm well aware of that, but better to stick with the devil you know :)
Anyway, I always thought that __set, __get, __call and __invoke should
have been banned a long time ago. The road to hell is paved with good
intentions, and we cannot remove those since it would be too big a BC
break. At least we know those quite well and are used to them, but do it
with all operators and code reading will become a real mess (you have
to mentally switch from reading the line with an operator to grepping some
code in order to find the operator overload definition, whereas a nice
method call makes everything clear and navigable in any IDE).
I read the whole RFC, and I'm still not convinced it's a good thing to open
the door to operator overloading. In my opinion, messy use cases should
probably be deprecated and replaced by semantically equivalent methods
instead. Even though I have known for a long time that a few operators work
with DateTime, it always mind-fucks me in some way.
That's a subjective opinion of course, and all your arguments remain
legit and valid; you've probably thought about the problem much more deeply
than I did. At least, I give you that, it will revive the declined
equals() and compareTo() RFCs, and that's something really
positive in all that.
I'll give it a second read, I might not have seen all recent changes.
Thanks for the quick answer.
Regards,
--
Pierre
Anyway, I always thought that __set, __get __call and __invoke should
have been banned a long time ago.
Considering I use __invoke on a daily basis as it makes life so much easier
when you have single method (service) classes, I'm going to disagree on
this one. The other functions are also very useful tools to migrate legacy
code.
I agree that not being able to click through to operator overloading
methods is annoying; untraceable __toString is the worst. I do think that
this is something IDEs can handle, though I'm not sure how reviewing will go
when diffs still hide the implementation.
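A minimal sketch of the single-method service pattern mentioned above, using __invoke in current PHP. The SlugGenerator class and its slug rules are hypothetical and only meant to show the calling style.

```php
<?php
// Hypothetical single-purpose service; the slug rules are illustrative.
final class SlugGenerator
{
    public function __invoke(string $title): string
    {
        // Lowercase, collapse runs of other characters into '-', trim the ends.
        $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($title));

        return trim($slug, '-');
    }
}

$slugify = new SlugGenerator();

// The object is called like a function.
$slug = $slugify('Hello, Internals!');

// It can also be passed wherever a callable is expected.
$slugs = array_map($slugify, ['Operator Overloads', 'RFC Draft']);
```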