Hello internals,
I last brought this RFC up for discussion in August, and there was 
certainly interesting discussion. Since then there have been many 
improvements, and I'd like to re-open discussion on this RFC. I mentioned 
in the first email to the list that I was planning on taking a while before 
approaching a vote, however the RFC is much closer to vote-ready now, and 
I'd like to open discussion with that in mind.
RFC Link: https://wiki.php.net/rfc/user_defined_operator_overloads
There is a patch for this RFC, however the latest commits are not playable. 
It will build, but with various problems which are being worked on related 
to enums. The last playable commit can be found by checking out this commit:
https://github.com/JordanRL/php-src/commit/e044f53830a9ded19f7c16a9542521601ac3f331
This commit however does not have the enum for operator position described 
in the RFC. It uses a bool instead with true being the left side, and false 
being the right side.
Implementation details still left:
- There are issues related to opcache/JIT still, so if you want to play 
 around with the playable commit disable both.
- Reflection has not been updated, but the proposed updates necessary are 
 described in the RFC.
It is a long RFC, but operator overloads are a complicated topic if done 
correctly. Please review the FAQ section before asking a question, as it 
covers many of the main objections or inquiries to the feature. I'd be 
happy to expand on any of the answers there if prompted however.
Jordan
Hi Jordan,
Hello internals,
I last brought this RFC up for discussion in August, and there was
certainly interesting discussion. Since then there have been many
improvements, and I'd like to re-open discussion on this RFC.
In general I'm in favour of this RFC; a few months ago I was 
programming something and operator overloads would have been a good 
solution, but then I remembered I was using PHP, and they haven't been 
made possible yet.
However.....I think the new 'operator' keyword is probably not the way 
to go. Although it's cute, it has some significant downsides.
There are quite a few downstream costs for making a new type of 
methods in classes. All projects that analyze code (rector, 
codesniffer, PHPStorm, PhpStan/Psalm, PHPUnit's code coverage 
annotations etc) would have to add a non-trivial amount of code to not 
bork when reading the new syntax. Requiring more code to be added and 
maintained in PHP's builtin Reflection extension is also a cost. 
That's quite a bit of work for a feature that has relatively rare 
use-cases.
I just don't agree/understand with some of the reasoning in the RFC of 
why using symbols is preferable.
"In such a situation, using magic methods would not be desired, as any 
combination of symbols may be used for the new infix. The restrictions 
on function names, such as needing to reserve the & to mark a function 
as being by-reference, would place limitations on such future scope."
I don't get this. The magic methods in previous drafts of the RFC 
don't have a problem with & as the methods are named with 'two 
underscores' + name e.g. __bitwiseAnd. That doesn't appear to cause a 
problem with an ampersand?
"By representing the implementations by the symbols themselves, this 
RFC avoids forcing implementations to be mislabeled with words or 
names which do not match the semantic meaning of that symbol in the 
program context."
The name of the function (e.g. __add) always refers to the symbol used 
where it is used, not what it is doing.
If the code is $a + $b then that is an addition operator, when the 
code is read. If I was reading the code, and I saw that either  $a or 
$b were objects, I would know to go looking for an __add magic method.
" '// This function unions, it does not add'"
Then that is probably an example of an inappropriate use of operator 
overloads, and so shouldn't be used as a justification for a syntax 
choice.
"Non-Callable - Operand implementations cannot be called on an
instance of an object the way normal methods can."
I think this is just wrong, and makes the RFC unacceptable to me.
Although most of the code I write is code that just performs 
operations as I see fit, some of the time the operations need to be 
driven by user data. Even something simple like a 
calculator-as-a-service would need to call the operations dynamically 
from user provided data.
I also have an aesthetic preference when writing tests to be explicit 
as possible, rather than concise as possible e.g.
$foo->__add(5, OperandPosition::LeftSide); 
$foo->__add(5, OperandPosition::RightSide);
instead of:
$foo + 5; 
5 + $foo
As I find that easier to reason about.
cheers 
Dan 
Ack
/congratulations on stunning the audience into silence otherwise though.
Hi Jordan,
Hello internals,
I last brought this RFC up for discussion in August, and there was
certainly interesting discussion. Since then there have been many
improvements, and I'd like to re-open discussion on this RFC.
In general I'm in favour of this RFC; a few months ago I was
programming something and operator overloads would have been a good
solution, but then I remembered I was using PHP, and they haven't been
made possible yet.
I too far prefer this RFC to its predecessors, and hope it passes in some form.
However.....I think the new 'operator' keyword is probably not the way
to go. Although it's cute, it has some significant downsides.
There are quite a few downstream costs for making a new type of
methods in classes. All projects that analyze code (rector,
codesniffer, PHPStorm, PhpStan/Psalm, PHPUnit's code coverage
annotations etc) would have to add a non-trivial amount of code to not
bork when reading the new syntax. Requiring more code to be added and
maintained in PHP's builtin Reflection extension is also a cost.
That's quite a bit of work for a feature that has relatively rare
use-cases.
I just don't agree/understand with some of the reasoning in the RFC of
why using symbols is preferable.
"In such a situation, using magic methods would not be desired, as any
combination of symbols may be used for the new infix. The restrictions
on function names, such as needing to reserve the & to mark a function
as being by-reference, would place limitations on such future scope."
I don't get this. The magic methods in previous drafts of the RFC
don't have a problem with & as the methods are named with 'two
underscores' + name e.g. __bitwiseAnd. That doesn't appear to cause a
problem with an ampersand?
"By representing the implementations by the symbols themselves, this
RFC avoids forcing implementations to be mislabeled with words or
names which do not match the semantic meaning of that symbol in the
program context."
The name of the function (e.g. __add) always refers to the symbol used
where it is used, not what it is doing.
If the code is $a + $b then that is an addition operator, when the
code is read. If I was reading the code, and I saw that either $a or
$b were objects, I would know to go looking for an __add magic method.
" '// This function unions, it does not add'"
Then that is probably an example of an inappropriate use of operator
overloads, and so shouldn't be used as a justification for a syntax
choice.
I believe the intent is for things like dot product.
int * int
This is known as "multiplication", or we could abbreviate it __mul if we felt like it.
vector * vector
This is known as a "Dot product", or "scalar product", and is not really the same as multiplication. It uses effectively the same operator sigil, though.
There's also cross-product, which nominally uses x as a symbol in mathematics.  Which is... often also translated to * in code, but is a very different operation and is not multiplication as we know it, nor is it the same as dot product.
So __mul() would be an incorrect name for either one. Using symbols, however, would (with some future extension to make it extensible) allow for:
operator *(Vector $v, $left) { ... }
operator x(Vector $v, $left) { ... }
which (for someone who knows vector math) is a lot more self-explanatory than "which one is being mistranslated as multiply?"
At least, that's my understanding of the argument for it. The point about making meta programming more difficult is certainly valid, though. Personally I think I could go either way on this one, as I see valid arguments either direction.
"Non-Callable - Operand implementations cannot be called on an
instance of an object the way normal methods can."
I think this is just wrong, and makes the RFC unacceptable to me.
Although most of the code I write is code that just performs
operations as I see fit, some of the time the operations need to be
driven by user data. Even something simple like a
calculator-as-a-service would need to call the operations dynamically
from user provided data.
I largely agree here.  I don't know if it's because of the operator choice or not, but being able to call an operator dynamically is important in many use cases.  It doesn't have to be a pristine syntax, but some way to do that dynamically (without having a big match statement everywhere you need to) would be very welcome.
Another question: Can an interface or abstract class require an operator to be implemented? That's not currently discussed at all. (I would expect the answer to be Yes.)
--Larry Garfield
Using symbols, however, would (with some future extension to make it extensible) allow for:
I don't get how it's easier, other than being able to skip naming the 
symbol name. e.g. adding union and intersection operators
function __union(...){} 
function __intersection(...){}
vs
operator  ∪(...){} 
operator ∩(...){}
In fact, I find one of those quite a bit easier to read...
Larry Garfield wrote:
It uses effectively the same operator sigil, though.
Yes, that's what I was trying to say.
Danack wrote:
The name of the function (e.g. __add) always refers to the symbol used
where it is used, not what it is doing.
If the naming is taken from the sigil, then it's always appropriate.
So if operator * has the magic method __asterisk instead of __mul, it 
avoids any suggestion of what the operation actually means for the 
object.
btw, I don't really care about this naming problem. My concern is that 
it's being used as a reason for introducing a special new type 
function, when it's really not a big enough problem to deserve making 
the language have special new syntax.
cheers 
Dan 
Ack
Using symbols, however, would (with some future extension to make it extensible) allow for:
I don't get how it's easier, other than being able to skip naming the
symbol name. e.g. adding union and intersection operators
function __union(...){}
function __intersection(...){}
vs
operator ∪(...){}
operator ∩(...){}
In fact, I find one of those quite a bit easier to read...
If the list of operators is expanded by the engine, yes. The point is that IF it were decided in the future to allow user-space defined operators, that would be considerably easier with a separate keyword. Eg:
class Matrix { 
operator dot(Matrix $other, bool $left) { 
// ... 
} 
}
$result = $m1 dot $m2;
Whether that is something we want to do is another question, but the operator keyword makes that logistically easy, while using __mul or __asterisk makes it logistically hard.
Using an attribute instead to bind a method to an operator, as previously suggested, would also have that flexibility if we ever wanted it. There seems to be a lot of backpressure against using attributes for such things, though, and it wouldn't cleanly self-resolve the issue of keywords that make sense on methods being nonsensical on operators. (public, static, etc.). I'd probably be fine with it myself, but I cannot speak for others.
--Larry Garfield
If the list of operators is expanded by the engine, yes. The point is that IF it were decided in the future to allow user-space defined operators, that would be considerably easier with a separate keyword.
A real-life example of this approach would be PostgreSQL, where a 
user-defined operator can be (almost) any combination of + - * / < > = ~ 
! @ # % ^ & | ` ?
It would be possible to have an open-ended naming scheme for these, 
such as "function __atSign_rightAngle" for the operator @> (which 
conventionally means "contains" in PostgreSQL) but it would be rather 
awkward compared to "operator @>" or "#[Operator('@>')]".
Regards,
-- 
Rowan Tommins 
[IMSoP]
Danack wrote:
btw, I don't really care about this naming problem. My concern is that
it's being used as a reason for introducing a special new type
function, when it's really not a big enough problem to deserve making
the language have special new syntax.
Danack wrote:
I think you've taken the position that using the symbols is cool, and
you're reasoning about how the RFC should operate from that decision.
Ah, I see. That's a more fundamental objection than the technicals, I 
think. It sort of implies that any arguments I provide are justifications 
rather than arguments, which makes it difficult to have a productive 
conversation about it. You expressed a similar concern about your efforts 
to present arguments to me, which makes sense if this is your fundamental 
concern.
First, let me start off by saying that I fully acknowledge and document in 
the RFC that it is possible to provide a perfectly workable version of 
this RFC without the operator keyword. I mention as much in the RFC. If 
that is a true blocker for voters, I would at least consider it. However, I 
do believe that's the incorrect decision. Not because it's "cool". The code 
that handles the parsing of the new keyword is the only part of this RFC 
that I didn't write from scratch, it was contributed by someone more 
familiar with the parser. I feel like I could hardly have the "coolness" of 
the work being my motivating factor when I did not in fact write that part 
of the code.
But I do understand the concern. Adding complexity without reason is 
foolish, particularly on a project that impacts many people and is 
maintained by volunteers. As I immediately told you, I don't think your 
concern is without merit, and I don't think it's something that should be 
dismissed. But I clearly have (still) done a poor job communicating what I 
perceive as the factors that outweigh this concern. It's not that I think 
the concern is invalid or that it's small, it's that I view other things as 
being an acceptable tradeoff. So I'll attempt one more time to communicate 
why.
Forwards Compatibility
Other replies have touched on this, and the RFC talks about this too, but 
perhaps the language used has been skipping a couple of steps. This is, by 
far, the biggest driving factor for why I believe the operator keyword is 
the correct decision, so I will spend most of my time here.
There are two main kinds of forward compatibility achieved with a new 
keyword that are difficult to achieve with magic methods: compatibility 
with arbitrary symbol combinations, and behavior modifiers that can be 
scoped only to operators. You mention that the symbols could be replaced 
with their symbol names in English, which avoids the issue of misnaming the 
functions. But this would still require the engine to specifically support 
every symbol combination that is allowed.
Now, in this RFC I am limiting overloads to not only symbols which are 
already used, but to a specific subset of them which are predetermined. 
This is for several reasons:
- The PHP developer community will have no direct experience with operator 
 overloads unless they have experience with another language such as C# or
 Python which supports them. Giving developers an initial set of operators
 that address 90% of use cases but are limited allows the PHP developer
 community time to learn and experiment with the feature while avoiding some
 of the most dangerous possible misuses, such as accidentally redefining the
 && operator in a way that breaks boolean algebra.
- This reduces the change necessary to the VM, to class entries, and to 
 the behavior of existing opcodes. This PR is already very large, and I
 wanted to make sure that it wasn't impossible for the people who
 participate here on their own time to actually consider the changes being
 suggested.
- I am already aware of several people within internals that believe any 
 version of this feature will result in uncontrolled chaos in PHP codebases.
 I think this is false, as I do not see that kind of uncontrolled chaos in
 the languages which do have this feature. However I would think that
 allowing arbitrary overloads would increase that proportion.
- This is limited to operator combinations with objects, which ALL 
 currently result in an error. That means there is no code that was working
 on PHP 8.1 that will break with this included, as all such code currently
 results in a fatal error. The current error is even the parent class of the
 error after this RFC, so even the catch blocks, if they currently exist
 in PHP codebases, should continue to work as before.
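For illustration, the existing behavior that last point relies on can be checked on a stock PHP 8 build. This is only a minimal sketch and not part of the patch; the helper name tryAdd is made up for the example, and the exact error message may differ between versions.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: attempts `$value + 1` and reports what happened.
function tryAdd(mixed $value): string
{
    try {
        $result = $value + 1;
        return 'ok: ' . var_export($result, true);
    } catch (TypeError $e) {
        // PHP 8 throws a TypeError for arithmetic on a plain object.
        return get_class($e) . ': ' . $e->getMessage();
    }
}

echo tryAdd(2), PHP_EOL;              // an int operand works as usual
echo tryAdd(new stdClass()), PHP_EOL; // an object operand is rejected
```

Code that already catches TypeError around such expressions is the kind of catch block the point above has in mind.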
However, once a feature is added it is very difficult to change it. Not 
only for backward compatibility reasons, but for the sheer inertia of the 
massive impact that PHP has. I do not plan on ever proposing that arbitrary 
symbol combinations be allowed for overloads myself. But I cannot possibly 
know what internals might think of that possibility 10 years from now when 
this feature has been in widespread usage for a long time. Using magic 
methods makes it extremely difficult at any point in the future to allow 
PHP developers the option of an overload for say +=+. What would such a 
magic method be? __plus_equals_plus()? With some kind of magic in the 
compiler to rename symbols in certain circumstances?
That sounds far less maintainable to me. It seems more likely that even 
if it were a desired feature 10 years from now, it would be something that 
would be extremely difficult to implement, maintain, and pass.
I also elaborate in the RFC as to why I think allowing operator specific 
method modifiers is a very powerful bit of forwards compatibility as well. 
Method modifiers simply result in a change to the function flags mask, 
which is an extremely low cost lookup, which makes it very easy to 
implement such features in the future if they are desired. I want to make 
sure that once included, this feature doesn't result in a dead-end 
implementation that boxes internals out of improvements that can be made 
moving forward. I think that this is something that is far easier to do 
with the operator keyword than it is with magic methods.
Code That Promotes Correct Usage
Enums, as an example, are classes. Internally, they are classes in most 
respects. So why is a new keyword for enums useful? Not only for many of 
the same reasons listed above, but also because it is useful for the 
language to communicate to the developer that a certain thing should be 
treated differently, even if it shares a syntax. The fact that PHP 
developers can see that enums are different from classes in their code is 
not a trivial and unimportant matter.
In the same way, operator overloads are methods. Internally, they are 
methods in most respects. But it is useful for the language to 
communicate that these methods will change engine behavior. It is 
useful for it to communicate that they should be treated differently. The 
fact that PHP developers will be able to see that operators are different 
from methods will help avoid some of the concerns people have with misuse. 
It will communicate that these are areas where new maxims and new habits 
should apply, that new things must be learned and new rules followed.
This may seem like such an esoteric suggestion to some, but it follows from 
an entire field of study: human-centered design. This is a rigorous field 
which explores how technology can be designed to be used correctly.
Acceptance Of Restrictions
We can, of course, place restrictions on how operator overloads are used 
when we are concerned about causing trouble. But such restrictions will 
generate frustration and opposition in some circumstances. Enums are 
another great example. Methods on enums are simply not allowed to do things 
that will mutate the object. The engine simply prohibits it. This makes a 
lot of sense for enums, but would such restrictions be possible if enums 
were simply classes which have cases within them? Technically, certainly it 
would be possible. But while I do not hear a lot of PHP developers 
complaining about having method behavior restricted in enums, I expect that 
there would be a lot of this unnecessary noise if instead PHP developers 
saw them as "classes which have cases".
The fact that they are marked as a distinct construct simply makes such 
restrictions make more sense to the people who use them.
These are engine hooks. People should not be shoving lots of other logic 
into operator overloads. They should always be returning a result, they 
should nearly always be implemented immutably, they should document the 
logic of interaction with the given operator and nothing more. They 
shouldn't be directly called, because they should not contain the kind of 
logic that you want to directly call.
One of these restrictions that I included in this RFC was that typing the 
parameters is not optional. This is extremely useful for operator 
overloads, because you must document all the types that your implementation 
understands how to interact with, and the engine will simply not allow for 
undetermined or uncertain values to be handled. This restriction would feel 
very out of place to many in a function, because other PHP functions do not 
behave this way. But for a new thing, with a new keyword that marks itself 
as something separate? Well now it makes sense. New things have their own 
rules. Just like the restrictions on enum classes.
--
I think these things outweigh the cost of adding a new keyword, 
particularly a new keyword that is limited only to the class definition and 
that has behavior and syntax that is substantially similar to something 
developers are already familiar with. I truly believe this is the better 
way of doing this feature, I would not suggest it otherwise. And while an 
implementation that doesn't include this is possible and workable, I feel 
it is suboptimal and limiting. I feel that it is more likely to result in 
problematic usage, complaints, and buggy code from PHP developers.
This new keyword required very minimal changes to the parser, and no 
changes to the compiler. I think this is an acceptable tradeoff for the 
benefits it brings. That is the reason that I am arguing for it, and no 
other reason. I'm sorry if it seems like I am not listening to what you are 
saying. That is not the case, I take the feedback of others on this list 
very seriously. It's just that you haven't yet brought up a point which I 
haven't considered and personally decided was worth the benefits. I agree 
this will result in changes for tooling. I accept that those changes will 
be larger with a new keyword. I do not think that it is worth delivering an 
inferior version of this feature that is more prone to error and misuse, 
and is more restricted in future scope.
Jordan
Hi!
- I am already aware of several people within internals that believe any
version of this feature will result in uncontrolled chaos in PHP codebases.
I think this is false, as I do not see that kind of uncontrolled chaos in
the languages which do have this feature. However I would think that
allowing arbitrary overloads would increase that proportion.
Depends on how you define "uncontrolled chaos". I have encountered 
toolkits where the authors think it's cute to define "+" to mean 
something that has nothing to do with mathematical addition (go read the 
manual for an hour to figure what it actually does) or for += to mean 
something different than + and assignment. Some people even learn to 
love it, so far I haven't.
However, once a feature is added it is very difficult to change it. Not
only for backward compatibility reasons, but for the sheer inertia of the
massive impact that PHP has. I do not plan on ever proposing that arbitrary
symbol combinations be allowed for overloads myself. But I cannot possibly
know what internals might think of that possibility 10 years from now when
this feature has been in widespread usage for a long time. Using magic
methods makes it extremely difficult at any point in the future to allow
PHP developers the option of an overload for say +=+. What would such a
That's awesome. The only thing worse than a toolkit author that thinks 
it's cute to play with "+" is the one that thinks it's cute to invent 
some random combination of special characters and assign it some meaning 
that of course is obvious to the author, but unfortunately that's the 
only person in existence to whom it is obvious. Some people enjoy the 
code being a puzzle that you need to untangle, doing Sherlockian 
detective work to even begin to understand what is going on in this 
code. Other people have work to do. And again, what's the intuitive 
difference between operators +=+@-+ and ++--=!* ?
Of course, of course I know, every feature can be abused. The difference 
here is that it's hard to think of any use that wouldn't be an abuse.
That sounds far less maintainable to me. It seems more likely that even
if it were a desired feature 10 years from now, it would be something that
would be extremely difficult to implement, maintain, and pass.
I must note that "if" carries a lot of load here.
Stas Malyshev 
smalyshev@gmail.com
There are quite a few downstream costs for making a new type of
methods in classes. All projects that analyze code (rector,
codesniffer, PHPStorm, PhpStan/Psalm, PHPUnit's code coverage
annotations etc) would have to add a non-trivial amount of code to not
bork when reading the new syntax. Requiring more code to be added and
maintained in PHP's builtin Reflection extension is also a cost.
That's quite a bit of work for a feature that has relatively rare
use-cases.
While I'm not suggesting that this isn't an important consideration for 
voters with this RFC, (I think it should be weighed for sure), I think 
that all of the things you mentioned will need similar updates to work 
correctly with this RFC even if it was done with plain old magic methods 
instead. Again, I'm not saying it's not important, but isn't this true of 
many RFCs that add new syntax, such as enums or even things like array 
unpacking?
As for Reflection, I hadn't gotten to that yet, but I actually pushed the 
commit for it with full test coverage of the changes last night. It wasn't 
actually that big of a change, and the latest commit on the PR can be 
checked out and played if you want to test it. With support for the enum 
casing as well. Operators get an additional function flag ZEND_ACC_OPERATOR 
that made it fairly simple to implement the Reflection changes with minimal 
additional code. The new methods I mentioned are actually smaller than the 
implementations for the existing ones. As far as maintainability goes, 
removing this aspect doesn't make this RFC more maintainable in core in my 
opinion. It becomes harder to maintain it I think, as it requires more 
special casing in other places that is more obtuse and obscure.
I don't get this. The magic methods in previous drafts of the RFC
don't have a problem with & as the methods are named with 'two
underscores' + name e.g. __bitwiseAnd. That doesn't appear to cause a
problem with an ampersand?
This was referring to continuing to use symbol names, but without a new 
keyword.
The name of the function (e.g. __add) always refers to the symbol used
where it is used, not what it is doing.
If the code is $a + $b then that is an addition operator, when the
code is read. If I was reading the code, and I saw that either $a or
$b were objects, I would know to go looking for an __add magic method.
True, and I don't think it would be ambiguous this way. However, method 
names for other methods tend to describe what the function does.
" '// This function unions, it does not add'"
Then that is probably an example of an inappropriate use of operator
overloads, and so shouldn't be used as a justification for a syntax
choice.
Larry's example with dot-product and cross-product is a better example, but 
one that is less accessible to those who don't know vector math/linear 
algebra. I used this example because this is exactly how PHP treats + for 
arrays, so I would think most Collections would implement + as the union 
operator in order to remain consistent with internals.
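As a quick illustration of that array behavior (plain PHP as it works today, nothing from the RFC; the variable names are only for the example):

```php
<?php
$left  = ['a' => 1, 'b' => 2];
$right = ['b' => 20, 'c' => 30];

// + on arrays is a union by key: entries from the left operand win
// on key collisions, and only missing keys are taken from the right.
$union = $left + $right;

var_export($union);
```

Here 'b' keeps the left-hand value 2 and 'c' is appended from the right, so nothing is summed.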
I think this is just wrong, and makes the RFC unacceptable to me.
Although most of the code I write is code that just performs
operations as I see fit, some of the time the operations need to be
driven by user data. Even something simple like a
calculator-as-a-service would need to call the operations dynamically
from user provided data.
I'm confused why this wouldn't still be possible? First, you can still get 
the closure for the operator implementation from Reflection if you really, 
really need it. But second, with user data couldn't you use a setter to 
change the object state prior to the op, use a method specifically for 
calling as a method, or just combine them with the operator?
$obj->setValue($userData);
$result = $obj + $userData;
I also have an aesthetic preference when writing tests to be explicit
as possible, rather than concise as possible e.g.
$foo->__add(5, OperandPosition::LeftSide);
$foo->__add(5, OperandPosition::RightSide);
instead of:
$foo + 5;
5 + $foo
As I find that easier to reason about.
This I can understand. Again, I don't think you're wrong here, I think this 
is a matter of opinion and taste. I can understand your position, but I 
think the long-term maintainability and support the keyword offers is worth 
the short term pain.
Larry:
I largely agree here. I don't know if it's because of the operator
choice or not, but being able to call an operator dynamically is important
in many use cases. It doesn't have to be a pristine syntax, but some way
to do that dynamically (without having a big match statement everywhere you
need to) would be very welcome.
Overall, the point in making them non-callable was to force PHP developers 
to stop thinking about these as methods. They are modifications to engine 
behavior that are given directly to PHP devs. Using them as methods would 
usually indicate incorrect usage. If that's what you need, then probably 
you need a method, not an operator overload. If you need both, then what 
I would suggest is implementing the logic as a normal method, and then 
calling that method inside of the operator overload.
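A rough sketch of that suggestion, written so it runs on PHP 8.1 as released. The class Amount, its methods, and the applyOperation() helper are all hypothetical names for this example; the operator declaration appears only as a comment because that syntax exists only in the RFC patch.

```php
<?php
declare(strict_types=1);

final class Amount
{
    public function __construct(public readonly float $value) {}

    // Ordinary, directly callable methods hold the actual logic.
    public function add(Amount $other): Amount
    {
        return new Amount($this->value + $other->value);
    }

    public function multiply(Amount $other): Amount
    {
        return new Amount($this->value * $other->value);
    }

    // Under the RFC's proposed syntax the overload would only delegate:
    //   operator +(Amount $other, OperandPosition $operandPos): Amount
    //   {
    //       return $this->add($other);
    //   }
}

// Hypothetical dispatcher for operations chosen by user-provided data.
function applyOperation(string $symbol, Amount $left, Amount $right): Amount
{
    $method = match ($symbol) {
        '+' => 'add',
        '*' => 'multiply',
        default => throw new InvalidArgumentException("Unsupported operator: $symbol"),
    };

    return $left->$method($right);
}

echo applyOperation('+', new Amount(2), new Amount(3))->value, PHP_EOL;
```

This still needs one match from symbol to method name, but it lives in a single place rather than at every call site.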
Another question: Can an interface or abstract class require an operator
to be implemented? That's not currently discussed at all. (I would expect
the answer to be Yes.)
Yes. Both abstract classes and interfaces can require implementations of 
the operator keyword as they would with a method.
Jordan
I think that all of the things you mentioned will need similar
updates to work correctly with this RFC even if it was done
with plain old magic methods instead.
No, that's not true. Codesniffer and all the other tools parse magic 
methods just fine. Stuff like the coverage notation for PHPUnit would 
understand @covers BigNumber::__plus just fine.
The main piece of work each of them would need to do to support this 
RFC, if it was based on magic methods, is being able to understand 
that objects can work with operators:
$foo = new BigNumber(5);
$foo + 5; // Check that BigNumber implements the magic method __plus
That is far less work than having to add stuff to parse a new way to 
declare functions.
Jordan LeDoux wrote:
Danack wrote:
"Non-Callable - Operand implementations cannot be called on an
instance of an object the way normal methods can."
I think this is just wrong, and makes the RFC unacceptable to me.
First, you can still get the closure for the operator implementation
from Reflection if you really, really need it.
Sorry, but I just find that a bizarre thing to suggest. Introducing a 
new type of function that can only be called in a particular way needs 
to have really strong reasons for that, not "oh you can still call it 
through reflection".
I think you've taken the position that using the symbols is cool, and 
you're reasoning about how the RFC should operate from that decision.
I'm not sure I can make a reasonable argument against it that you 
would find persuasive, but to me it's adding a non-trivial amount of 
complexity, which tips the RFC from being acceptable, to not.
cheers 
Dan 
Ack
RFC Link: https://wiki.php.net/rfc/user_defined_operator_overloads
I'm not strongly opinionated on either approach (magic __add vs operator +) 
although I have a very slight preference for your proposed operator + syntax, 
however despite very occasionally thinking that this would be a useful 
feature, I always start to think about all the awful ways this could be 
used, making debugging and reasoning about code harder, since unexpected 
magic behaviour can (and probably will be) introduced.
The proposal is technically reasonable, pending consideration of reflection 
and so on, but I just think the desire to use this in ways it shouldn't be 
used will be too great, and we'll end up with horrendous complexity. 
Perhaps I'm too cynical.
The only mitigation for unnecessary complexity I can think of is to force 
overloaded operators to be "arrow functions" to encourage only minimal 
code, e.g.
operator +(Number $other, OperandPosition $operandPos): Number => return 
new Number ($this->value + $other->value);
I don't think that would be possible. As many of the examples in the RFC show, there are numerous cases where an operator function/callback/thing will need branching logic internally. Even if we did that, people could just sub-call to a single function which would be just as complex as if it were in the operator callback directly.
(Though I still would like to see "short functions" in the language, despite it having been rejected once already.)
--Larry Garfield
I don't know if this would actually be helpful, but you could force 
the operator definition to be an alias for a normal method. That would 
(at least partially) solve the "dynamic call" problem, because the 
underlying method would be available with the existing dynamic call syntax.
Perhaps we could use an Attribute to bind the operator to the method, 
which would also reduce the impact on tools that need to parse class 
definitions:
class Collection {
    #[Operator('+')]
    public function union(Collection $other, OperandPosition $operandPos) {}
}
An interesting extension would be to have an optional argument to the 
Attribute which binds separate methods for each direction of arguments, 
rather than exposing it as a parameter:
class Number {
    #[Operator('/', OperandPosition::LeftSide)]
    public function divideBy(Number $divisor) {}

    #[Operator('/', OperandPosition::RightSide)]
    public function fractionOf(Number $dividend) {}
}
Regards,
-- 
Rowan Tommins 
[IMSoP]
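To make the attribute idea concrete, here is a userland sketch of how such a binding could be discovered with Reflection. The Operator attribute class, the OperandPosition enum, and the findOperatorMethod() helper are all assumptions made for illustration (PHP 8.1+); the real engine-level lookup, if this design were adopted, might look quite different:

```php
<?php

// Hypothetical enum mirroring the one described in the RFC.
enum OperandPosition
{
    case LeftSide;
    case RightSide;
}

// Hypothetical attribute binding an operator symbol to a normal method.
#[Attribute(Attribute::TARGET_METHOD)]
final class Operator
{
    public function __construct(
        public readonly string $symbol,
        public readonly OperandPosition $position = OperandPosition::LeftSide,
    ) {}
}

final class Number
{
    public function __construct(public readonly float $value) {}

    #[Operator('/', OperandPosition::LeftSide)]
    public function divideBy(Number $divisor): Number
    {
        return new Number($this->value / $divisor->value);
    }

    #[Operator('/', OperandPosition::RightSide)]
    public function fractionOf(Number $dividend): Number
    {
        return new Number($dividend->value / $this->value);
    }
}

// Find the name of the method bound to a symbol and operand position.
function findOperatorMethod(object $obj, string $symbol, OperandPosition $position): ?string
{
    foreach ((new ReflectionObject($obj))->getMethods() as $method) {
        foreach ($method->getAttributes(Operator::class) as $attribute) {
            $operator = $attribute->newInstance();
            if ($operator->symbol === $symbol && $operator->position === $position) {
                return $method->getName();
            }
        }
    }

    return null;
}

$six = new Number(6.0);
$name = findOperatorMethod($six, '/', OperandPosition::LeftSide);
$result = $six->{$name}(new Number(3.0));
```

Because the bound methods are ordinary methods, they remain callable by name, which is the part of the "dynamic call" problem this approach addresses.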
The only mitigation for unnecessary complexity I can think of is to force
overloaded operators to be "arrow functions" to encourage only minimal
code, e.g.
The 'valid' use-cases for this aren't going to be minimal pieces of code.
Things like a matrix object that supports multiplying by the various 
things that matrices can be multiplied by won't fit into an arrow 
function.
Also, I fundamentally disagree with making stuff difficult to use to 
'punish' users who want to use that feature. If you want to enforce 
something like a max line length in a coding standard, or forbidding 
usage of a feature, that is up to you and any code style tool you use.
I just think the desire to use this in ways it shouldn't be
used will be too great,
It might be an idea to add a list of bad examples to the RFC, so 
people can refer to how not to use it, rather than each programming 
team having to make the same mistakes.
cheers 
Dan 
Ack
Hello internals,
I last brought this RFC up for discussion in August, and there was
certainly interesting discussion. Since then there have been many
improvements, and I'd like to re-open discussion on this RFC. I mentioned
in the first email to the list that I was planning on taking a while before
approaching a vote, however the RFC is much closer to vote-ready now, and
I'd like to open discussion with that in mind.
RFC Link: https://wiki.php.net/rfc/user_defined_operator_overloads
Hi Jordan,
Thanks a lot for your work on this RFC! I like the direction this is going.
One thing that may be worthwhile looking into is the query builder use 
case. I mentioned it before:
https://externals.io/message/115648#115771
Basically it would enable using plain PHP expressions instead of 
strings. So instead of
 $query->where('product.price < ?1')->setParameter(1, 100);
one could write:
 $query->where(Price < 100);
Here Price is a class that represents a database column which has a 
(static) overload of the '<' operator. The operator overload yields an 
object representing a database expression, which gets passed to the 
where() method.
In general I don't like this sort of clever construct which makes one 
wonder what on earth is going on. The reason I do like this particular 
use case is that it can simplify code and enable static analysis of 
query expressions.
Now I'm not suggesting to support this creative use of operator 
overloading in the current RFC. It may however be useful to consider if 
this use case could be supported by a future RFC in a backward 
compatible way. Perhaps the RFC could mention it as a possible future 
extension.
Kind regards, 
Dik Takken
$query->where(Price < 100);
Here Price is a class that represents a database column which has a
(static) overload of the '<' operator. The operator overload yields an
object representing a database expression, which gets passed to the
where() method.
The biggest problem with this particular example is not the operator 
overloading, but the bare word "Price", which is currently a constant 
lookup, not a class reference, as in:
const Price = 50; 
var_dump(Price < 100);
However, with any version of operator overloading that didn't limit the 
return values of the overloaded operator, you could do something like:
$query->where(Product::$price < 100)
Where the static property Product::$price is an object which overloads 
the "<" operator to return some kind of Condition object which can be 
used by the query builder.
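A sketch of what that overload would have to produce, written with a named method because the operator syntax does not exist in current PHP. Column, Condition, and lessThan() are hypothetical names; under an overloading design without return-type limits, the "<" overload would presumably just delegate to something like lessThan():

```php
<?php

// Hypothetical value object a query builder could consume.
final class Condition
{
    public function __construct(
        public readonly string $sql,
        public readonly array $params,
    ) {}
}

// Hypothetical column reference.
final class Column
{
    public function __construct(private string $name) {}

    // Stand-in for the "<" overload: returns a Condition, not a bool.
    public function lessThan(int|float $value): Condition
    {
        return new Condition("{$this->name} < ?", [$value]);
    }
}

final class Product
{
    public static Column $price;
}

Product::$price = new Column('product.price');

// Equivalent in spirit to: $query->where(Product::$price < 100)
$condition = Product::$price->lessThan(100);
```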
Regards,
-- 
Rowan Tommins 
[IMSoP]
Cool as that would be, it poses a problem as it would mean the Price object could either be directly comparable, or query-builder-comparable, but not both. There's no way to have multiple <=> overrides in different contexts.
Also, for <=> in particular, that one is restricted to only return -1 | 0 | 1 anyway, so it wouldn't be able to return a query builder object.
--Larry Garfield
This is not a use case I highlighted because it's one that would be 
difficult to support with this RFC. But as you say, it could be a good 
future expansion. In particular, putting a query builder object into core 
with some more advanced overloads built in may be the best way to 
accomplish this, particularly if it is built with the idea in mind that the 
entities themselves may also have overloads.
I can certainly add it to the future scope of this RFC however.
--
RE: The operator keyword and operator implementations being non-callable.
This was a limitation that I purposely placed on the operator keyword, as I 
didn't like the idea of allowing syntax of the style $obj->{'+'}(); or 
$obj->$op(); and I wanted developers to clearly understand that they 
shouldn't treat these as normal methods in the vast majority of 
circumstances. However, it seems this is one of the largest sticking points 
for those who would otherwise support the RFC. To that end, I'm considering 
removing that restriction on the operator keyword. If I were to do that, 
you'd no longer need to wrap the operator in a closure to call it, though 
the parser would still have problems with $obj->+(...);
I suppose my question then would be, is this an acceptable compromise on 
the operator keyword? It removes one of the more annoying hurdles that 
Danack mentioned and that others have pointed out, but retains many of the 
benefits of the keyword that I expressed in my last email.
Jordan
In reply to Jordan LeDoux's message of 16/12/2021, 05:01:
Hello,
I'm not an internals hacker nor someone who can vote, but here is my 
opinion about operator overloading: I don't like it. Nevertheless, if it 
has to be done, I'd like it to be less challenging for PHP newcomers or 
everyday developers.
An operator is not much more than a function shortcut, gmp examples 
prove that point quite well. I don't see why it is necessary to create a 
new syntax. It seems in the discussion that the magic method ship has 
sailed, but I'd much prefer it.
I don't see why a user wouldn't be able to call an operator 
method/function outside of the operator context. It has a signature: 
left operand and right operand are its parameters, and it has a return 
type. The engine internally will just call this function as userland 
code could do.
I think that adding a new syntax for it, and allowing weird function 
names which are the operator symbols will probably create some mind fuck 
in people's mind when reading the code. I like things being simple, and 
I'd love operator overloads to be simple functions, no more no less, not 
"a new thing". The more PHP syntax grows the more complex it is to learn 
and read.
Regarding the naming debate about operator names being different 
depending upon the context, I agree, but I don't care, if operators have 
a name in the engine, I'd prefer methods to carry the same name even if 
it yields a different name in the domain semantics of the object: if you 
are the low-level operator function developer, you know what you are 
doing, and furthermore you can still comment the code. If you're the end user, 
you will read the domain API documentation, not the code itself in many 
cases.
If the names are a problem, why not register those using an attribute? 
If I remember well it was mentioned somewhere in the mail thread; it 
would provide a way to explicitly register any method, with any name, as 
being an operator implementation, so userland could keep both syntaxes 
and use the one they wish (i.e. $newNumber = $number->add($otherNumber); 
or $newNumber = $number + $otherNumber;). It's not insane to keep this 
possibility open; on the contrary, it'd leverage the fact that operator 
overloading in some contexts is just a shiny, eye-candy way of writing 
some domain function shortcut.
That's my opinion and I don't know if it's worth a penny, but in my mind, an 
operator is a function, and there's no reason that it'd be a different 
thing.
Regards,
--
Pierre
Hello internals, 
some concerns I have about operator overloading. 
(I have seen and played with operator overloading a long time ago in 
C++; this is my background for these points.)
- Searchable names. 
 Methods and functions have searchable and clickable names. Operators don't.
 The "searchable" applies to grep searches in code, but also google
 searches for documentation and support.
 This adds to the concerns already raised by others, that we will see
 arbitrary operators "just because we can".
- Symmetry 
 Operators like "+" or "==" are often expected to be symmetric / commutative.
 (for "*" I would not universally expect this, e.g. matrix
 multiplication is not symmetric)
 https://en.wikipedia.org/wiki/Commutative_property
 https://en.wikipedia.org/wiki/Symmetric_function
 Having one side of the operator "own" the implementation feels wrong,
 and could lead to problems with inheritance down the line.
From C++ I remember that at the time, there was a philosophy of 
defining and implementing these kinds of operations outside of the 
objects that hold the data.
- Lack of real parameter overloading 
 Unlike C++ (or C?), we don't have real method/function overloading
 based on parameters.
 I also don't see it being added in this operator RFC (and I would
 disagree with adding it here, if we don't add it for functions/methods
 first).
 We have to solve this with inheritance override, and with if ($arg
 instanceof ...) in the implementation.
 What this does not give us is conditional return types:
This is what I would do in a language with parameter-based overloading:
class Matrix { 
operator * (Matrix $other): Matrix {..} 
operator * (float $factor): Matrix {..} 
operator * (Vector $vector): Vector {..} 
}
class Vector { 
operator * (Vector $vector): float {..} 
operator * (float $factor): Vector {..} 
}
(Or if we also have templates/generics, we could put dimension 
constraints on the types, so that we cannot multiply matrices where 
dimensions mismatch.)
Without real parameter overloading, we have to use instanceof instead, 
and we cannot have the conditional type hints.
-- Andreas
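For comparison, a sketch of the instanceof approach in current PHP, using a named multiply() method as a stand-in for a single "*" overload. The Matrix and Vector classes here are minimal hypothetical implementations; the point is the union return type, which the caller has to narrow:

```php
<?php

final class Vector
{
    /** @param list<float> $components */
    public function __construct(public readonly array $components) {}
}

final class Matrix
{
    /** @param list<list<float>> $rows */
    public function __construct(public readonly array $rows) {}

    // One implementation for all three cases; the declared return type
    // cannot say which case yields which result.
    public function multiply(Matrix|Vector|float $other): Matrix|Vector
    {
        if ($other instanceof Vector) {
            // Matrix * Vector = Vector
            return new Vector(array_map(
                fn (array $row): float => self::dot($row, $other->components),
                $this->rows
            ));
        }

        if ($other instanceof Matrix) {
            // Matrix * Matrix = Matrix: dot every row with every column.
            $columns = self::transpose($other->rows);

            return new Matrix(array_map(
                fn (array $row): array => array_map(
                    fn (array $column): float => self::dot($row, $column),
                    $columns
                ),
                $this->rows
            ));
        }

        // Matrix * float = Matrix
        return new Matrix(array_map(
            fn (array $row): array => array_map(fn (float $x): float => $x * $other, $row),
            $this->rows
        ));
    }

    private static function dot(array $a, array $b): float
    {
        return array_sum(array_map(fn ($x, $y) => $x * $y, $a, $b));
    }

    private static function transpose(array $rows): array
    {
        $columns = [];
        foreach ($rows as $i => $row) {
            foreach ($row as $j => $value) {
                $columns[$j][$i] = $value;
            }
        }

        return $columns;
    }
}

$m = new Matrix([[1.0, 2.0], [3.0, 4.0]]);
$scaled = $m->multiply(2.0);
$vector = $m->multiply(new Vector([1.0, 1.0]));
$square = $m->multiply($m);
```

Dimension mismatches are not checked in this sketch; a real implementation would need to validate them at run-time.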
--
To unsubscribe, visit: https://www.php.net/unsub.php
Methods and functions have searchable and clickable names. Operators don't.
The "searchable" applies to grep searches in code, but also google
That's one of the reasons why I prefer a magic methods based approach.
function __plus(...){}
can be searched for...and for future scope, something like:
function __union(...){}
is more self-documenting (imo) than:
operator ∪(...){}
Lack of real parameter overloading
Unlike C++ (or C?), we don't have real method/function overloading
based on parameters.
Java is probably a better comparison language than C.
I have a note on the core problem that method overloading would face 
for PHP here: https://phpopendocs.com/rfc_codex/method_overloading
But although they both involve the word 'overloading', operator 
overloading and method overloading are really separate features.
we cannot have the conditional type hints.
btw you can just say 'types'.
Unlike some lesser languages, in PHP parameter types are enforced at 
run-time; they aren't hints. I believe all references to hints (in 
relation to types at least) have been removed from the PHP manual.
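A small illustration of that run-time enforcement (the twice() function is a hypothetical example): passing a non-numeric string to an int parameter raises a TypeError instead of being silently accepted.

```php
<?php

// Hypothetical function with a declared parameter type.
function twice(int $n): int
{
    return $n * 2;
}

$caught = null;
try {
    // A non-numeric string cannot be coerced to int, so the call fails.
    twice('not a number');
} catch (TypeError $e) {
    $caught = $e;
}
```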
cheers 
Dan 
Ack
I don't mind using magic methods for this, compared to an operator keyword. 
It is also what I found is happening in Python.
However, this does not give us searchability in the calling place, 
only where it is declared / implemented.
I see the distinction in overloading based on the object type on the 
left, vs overloading based on parameter types.
For a method call $a->f($b), the implementation of ->f() is chosen 
based on the type of $a, but not $b. 
For an operator call "$a + $b", with the system proposed here, again, 
the implementation of "+" will be chosen based on the type of $a, but 
not $b. 
For native operator calls, the implementation is chosen based on the 
types of $a and $b, but in general they are cast to the same type 
before applying the operator. 
For global function calls f($a, $b), the implementation is always the same.
In a language with parameter-based overloading, the implementation can 
be chosen based on the types of $a and $b.
This brings me back to the "symmetry" concern. 
In a call "$a->f($b)", it is very clear that the implementation is owned by $a. 
However, in an operator expression "$a + $b", it looks as if both 
sides are on equal footing, whereas in reality $a "owns" the 
implementation.
Add to this that due to the weak typing and implicit casting, 
developers could be completely misled by looking at an operator 
invocation, if a value (in our case just the left side) has an 
unexpected type in some edge cases. 
Especially if it is not clear whether the value is a scalar or an object. 
With a named method call, at least it is constrained to classes that 
implement a method with that name.
we cannot have the conditional type hints.
btw you can just say 'types'.
Unlike some lesser languages, in PHP parameter types are enforced at
run-time; they aren't hints. I believe all references to hints (in
relation to types at least) have been removed from the PHP manual.
Ok, what I mean is return type declarations. 
In a class Matrix, operator(Matrix $other): Matrix {} can be declared 
to always return Matrix, and operator(float $factor): float {} can be 
declared to always return float. 
However, with a generic operator(mixed $other): Matrix|float {}, we 
cannot natively declare when the return value will be Matrix or float. 
(a tool like psalm could still do it)
But even for parameters, if I just say "type" it won't be clear if I 
mean the declared type on the parameter, or the actual type of the 
argument value.
The RFC covers all of this, and the way it works around it. Absent method overloading (which I don't expect any time soon, especially given how vehemently Nikita is against it), it's likely the best we could do.
In a class Matrix, operator(Matrix $other): Matrix {} can be declared
to always return Matrix, and operator(float $factor): float {} can be
declared to always return float.
However, with a generic operator(mixed $other): Matrix|float {}, we
cannot natively declare when the return value will be Matrix or float.
(a tool like psalm could still do it)
I... have no idea what you're talking about here. The RFC as currently written is not a "generic operator". It's
operator *(Matrix $other, bool $left): Matrix
The implementer can type both $other and the return however they want. That could be Matrix in both cases, or it could be Matrix|float, or whatever. That's... the same as every other return type we have now.
--Larry Garfield
Basically the same as others have been saying in more recent comments.
In a class Matrix, you might want to implement three variations of the
* operator:
- Matrix * Matrix = Matrix.
- Matrix * float = Matrix.
- Matrix * Vector = Vector.
Same for other classes and operators:
- Money / float = Money
- Money / Money = float
- Distance * Distance = Area
- Distance * float = Distance
Without parameter-based overloading, this needs union return types, IF 
we want to support all variations with operators:
- Matrix * (Matrix|float|Vector) = Matrix|Vector.
- Money / (Money|float) = float|Money
- Distance * (Distance|float) = Area|Distance
Which gives you a return type with some ambiguity.
With methods, you could have different method names with dedicated return types. 
The naming can be awkward, so I am giving different possibilities here.
- Matrix->mulFloat(float) = Matrix->scale(float) = Matrix
- Matrix->mul(Matrix) = Matrix::product(Matrix, Matrix) = Matrix
- Matrix->mulVector(Vector) = Vector
To me, the best seems a method name that somehow predicts the return type.
Possible solutions for the developer who is writing a Matrix class and 
who wants to use overloaded operators:
- Accept the ambiguity of the return type, and use tools like psalm to 
 be more precise.
- Only use the * operator for one or 2 of the 3 variations (those that 
 return Matrix), and introduce a regular function for the third:
 - Matrix * Matrix|float = Matrix
 - Matrix->mulVector(Vector) = Vector
 
This "concern" is not a complete blocker for the proposal. 
For math-related use cases like the above, the natural expectation to 
use operators can be so strong that we can live with some return type 
ambiguity.
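The second option can be sketched in Python, which this thread uses as a comparison point and which can run operator overloads today. This is only an illustration under assumptions: the 2x2 Matrix, the two-element Vector and the mul_vector name are hypothetical and do not come from the RFC. The * operator covers the two variations that return Matrix, and a named method covers the one that returns Vector, so each entry point has exactly one return type.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    x: float
    y: float


@dataclass(frozen=True)
class Matrix:
    # Row-major 2x2 matrix [[a, b], [c, d]] (hypothetical minimal shape).
    a: float
    b: float
    c: float
    d: float

    def __mul__(self, other: Matrix | float) -> Matrix:
        # Matrix * Matrix and Matrix * float both return Matrix.
        if isinstance(other, Matrix):
            return Matrix(
                self.a * other.a + self.b * other.c,
                self.a * other.b + self.b * other.d,
                self.c * other.a + self.d * other.c,
                self.c * other.b + self.d * other.d,
            )
        if isinstance(other, (int, float)):
            return Matrix(self.a * other, self.b * other,
                          self.c * other, self.d * other)
        return NotImplemented

    def mul_vector(self, v: Vector) -> Vector:
        # Named method for the one variation that returns another type.
        return Vector(self.a * v.x + self.b * v.y,
                      self.c * v.x + self.d * v.y)


m = Matrix(1, 2, 3, 4)
identity = Matrix(1, 0, 0, 1)

scaled = m * 2.0                    # Matrix
product = m * identity              # Matrix
image = m.mul_vector(Vector(1, 1))  # Vector
```

Folding the Vector case into __mul__ as well would force the annotation to become Matrix | Vector, which is the ambiguity described above.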
-- Andreas
--Larry Garfield
--
To unsubscribe, visit: https://www.php.net/unsub.php
In a class Matrix, you might want to implement three variations of the
* operator:
- Matrix * Matrix = Matrix.
- Matrix * float = Matrix.
- Matrix * Vector = Vector.
Same for other classes and operators:
- Money / float = Money
- Money / Money = float
- Distance * Distance = Area
- Distance * float = Distance
These are bad examples and a nightmare to maintain, I think even more so with 
loosely typed languages. Matrix * float is better implemented as a method here.
On Thu, Dec 16, 2021 at 4:21 AM Andreas Hennings andreas@dqxtech.net 
wrote:
Having one side of the operator "own" the implementation feels wrong,
and could lead to problems with inheritance down the line.
From C++ I remember that at the time, there was a philosophy of
defining and implementing these kinds of operations outside of the
objects that hold the data.
This makes sense in C++ because they are all statically typed and compiled, 
but the types of many things in PHP are not necessarily known until 
runtime, so it's a bit different. First, there is no situation in which the 
overloads will ever be called outside the context of an object instance. 
Any situation that would trigger a lookup for overloads will involve an 
object instance that has a value of some kind.
$val = $obj + 5;
They are not static methods because they aren't static methods in the 
engine execution context. Making them static would break scope needlessly 
in direct calls (if those are ever made), and would be a lie within the 
engine. Python is a language that handles them exactly in this manner.
Under future scope I have a section titled Polymorphic Handler Resolution 
that addresses the issue of inheritance. I decided to make that a separate 
future scope for two reasons:
- It has some unique efficiency and optimization concerns around resolving 
 the inheritance structure during an operator opline. In general developers
 expect operators to be very, very fast operations. Overloaded operators
 will always be slower, but adding inheritance checks would exacerbate that.
 I wanted to consider that separately so that it could receive its own focus.
- For those who have limited personal experience with operator overloads 
 in the context of object instances (basically, those who haven't used
 Python or something similar), the reason for that requirement might be
 difficult to see. It will be easier to demonstrate the need to voters on
 this list after this RFC is implemented I think. There's already a lot of
 domain-specific background to this. The + operator isn't commutative in
 PHP right now, for instance. If you want to see how, consider array + array.
 Whether an operator is commutative is always domain dependent.
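To make the array + array remark concrete: in PHP, the + operator on arrays is a union in which, for keys present in both operands, the left operand's value is kept. The sketch below emulates that rule with Python dicts; php_array_union is a hypothetical helper name used only for this illustration.

```python
def php_array_union(left: dict, right: dict) -> dict:
    """Emulate PHP's `array + array`: keys already in `left` win."""
    result = dict(left)
    for key, value in right.items():
        # Only keys missing from the left operand are taken from the right.
        result.setdefault(key, value)
    return result


a = {"x": 1, "y": 2}
b = {"y": 20, "z": 30}

a_plus_b = php_array_union(a, b)  # "y" comes from a
b_plus_a = php_array_union(b, a)  # "y" comes from b
```

Swapping the operands changes the value stored under the shared key, so the operation is not commutative.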
The main edge-case that will be addressed by this future scope is:
$val = $parent + $child;
Where $child is a descendent of $parent, and both implement different 
overloads. In such a case, you want to execute the overload of the child 
class. An example of this would be a Number class and a Fraction class 
that extends Number. You would want to execute the overload on Fraction 
regardless of whether it was the left or right operand. This is how the 
instanced overloads behave in Python as well. I plan on bringing that as a 
follow up RFC before 8.2 feature freeze if this RFC passes.
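The Python behaviour referred to here can be sketched as follows. When the right operand's class is a subclass of the left operand's class and overrides the reflected method (__radd__), Python tries that reflected method first. The Number and Fraction classes below are hypothetical stand-ins that store a single value, and handled_by exists only to record which method ran.

```python
class Number:
    def __init__(self, value):
        self.value = value
        self.handled_by = None

    def __add__(self, other):
        result = Number(self.value + other.value)
        result.handled_by = "Number.__add__"
        return result

    def __radd__(self, other):
        result = Number(other.value + self.value)
        result.handled_by = "Number.__radd__"
        return result


class Fraction(Number):
    def __add__(self, other):
        result = Fraction(self.value + other.value)
        result.handled_by = "Fraction.__add__"
        return result

    def __radd__(self, other):
        result = Fraction(other.value + self.value)
        result.handled_by = "Fraction.__radd__"
        return result


parent = Number(1)
child = Fraction(0.5)

child_left = child + parent   # child is on the left
child_right = parent + child  # child is on the right, child still handles it
plain = parent + Number(2)    # no subclass involved
```

In both mixed cases the Fraction implementation runs, which is the outcome the polymorphic handler resolution in the future scope is meant to provide.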
--
As a note, I have been swayed by the comments of many others at this point 
to make operator implementations callable directly on objects, and remove 
the callable restriction on them. The RFC has been updated to reflect this, 
and I will update the PR for it as well when I have the time.
Jordan
Hi Jordan,
Thanks for the RFC. I have a couple questions:
Suppose I have classes Foo and Bar, and I want to support the following 
operations:
- Foo * Bar (returns Foo)
- Bar * Foo (returns Foo)
If I understand correctly, there are three possible ways I could implement 
this:
a) Implement the * operator in Foo, accepting a Foo|Bar, and use the 
OperandPosition to determine if I am doing Foo * Bar or Bar * Foo and 
implement the necessary logic accordingly. 
b) Implement the * operator in Bar, accepting a Foo|Bar, and use the 
OperandPosition to determine if I am doing Foo * Bar or Bar * Foo and 
implement the necessary logic accordingly. 
c) Implement the * operator in Foo, accepting a Bar (handles Foo * Bar 
side); Implement the * operator in Bar, accepting a Foo (handles Bar * Foo 
side)
Is this understanding correct? If so, which is the preferred approach and 
why? If not, can you clarify the best way to accomplish this?
Next, suppose I also want to support int * Foo (returns int). To do this, I 
must implement * in Foo, which would look like one of the following 
(depending on which approach above)
public operator *(Foo|int $other, OperandPos $pos): Foo|int { ... } 
public operator *(Foo|Bar|int $other, OperandPos $pos): Foo|int { ... }
Now, suppose I have an operation like 42 * $foo, which as described 
above, should return int. It seems it is not possible to enforce this via 
typing, is that correct? i.e. every time I use this, I am forced to do:
$result = 42 * $foo; 
if (is_int($result)) { 
// can't just assume it's an int because * returns Foo|int 
}
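Option (a) can be sketched in Python, where the left/right distinction that the RFC passes as an operand-position argument is instead split across two methods, __mul__ (this object is on the left) and __rmul__ (this object is on the right). The Foo and Bar bodies and their value field are hypothetical. The sketch does not solve the typing question: __rmul__ still returns Foo or int depending on the other operand.

```python
class Bar:
    def __init__(self, value: int) -> None:
        self.value = value


class Foo:
    def __init__(self, value: int) -> None:
        self.value = value

    def __mul__(self, other):
        # Foo is the left operand: Foo * Bar -> Foo.
        if isinstance(other, Bar):
            return Foo(self.value * other.value)
        return NotImplemented

    def __rmul__(self, other):
        # Foo is the right operand: Bar * Foo -> Foo, int * Foo -> int.
        if isinstance(other, Bar):
            return Foo(other.value * self.value)
        if isinstance(other, int):
            return other * self.value
        return NotImplemented


foo, bar = Foo(2), Bar(3)

left = foo * bar   # handled by Foo.__mul__
right = bar * foo  # Bar defines no operator, so Foo.__rmul__ handles it
scalar = 42 * foo  # int cannot handle Foo, so Foo.__rmul__ handles it
```

Bar needs no operator code at all here, which matches the case where Bar is never used with operators on its own.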
Thanks, 
--Matt
Hi!
Hi Jordan,
Thanks for the RFC. I have a couple questions:
Suppose I have classes Foo and Bar, and I want to support the following
operations:
- Foo * Bar (returns Foo)
- Bar * Foo (returns Foo)
If I understand correctly, there are three possible ways I could implement
this:
And that's one of the reasons I feel so uneasy with this. When reading 
this code: $foo * $bar - how do I know which of the ways you took and 
where should I look for the code that is responsible for it? When I see 
$foo->times($bar) it's clear who's in charge and where I find the code. 
Terse code is nice but not at the expense of making it write-only.
-- 
Stas Malyshev 
smalyshev@gmail.com
On Fri, Dec 17, 2021 at 10:36 AM Stanislav Malyshev smalyshev@gmail.com 
wrote:
And that's one of the reasons I feel so uneasy with this. When reading
this code: $foo * $bar - how do I know which of the ways you took and
where should I look for the code that is responsible for it? When I see
$foo->times($bar) it's clear who's in charge and where I find the code.
Terse code is nice but not at the expense of making it write-only.
I think that something on php.net that focuses on best practices and things 
to watch out for could go a long way towards this. In general, when people 
search for information on how to do something, if that information isn't in 
the PHP manual, they'll end up getting a random answer from stackoverflow. 
I'd definitely be willing to put in some work to help on such documentation.
I very much expect that this feature will result in community developed 
standards, such as a PSR.
Jordan
When reading
this code: $foo * $bar - how do I know which of the ways you took and
where should I look for the code that is responsible for it? When I see
$foo->times($bar) it's clear who's in charge and where I find the code.
Terse code is nice but not at the expense of making it write-only.
Well, there's only two places to look with operator overloads, but 
yes you're right, using operator overloads for a single operation is not 
a good example of how they make code easier to read. The more 
complicated example from the introduction to the RFC 
https://wiki.php.net/rfc/user_defined_operator_overloads#introduction 
shows how they make complex maths easier to read.
The exact position of where that trade-off is 'worth it' is going to 
be different for different people. But one of the areas where PHP is 
'losing ground' to Python is how Python is better at processing data 
with maths, and part of that is how even trivial things, such as 
complex numbers, are quite difficult to implement and/or use in 
userland PHP.
Stanislav Malyshev wrote:
And again, what's the intuitive
difference between operators +=+@-+ and ++--=!* ?
That's not part of the RFC.
There's enough trade-offs to discuss already; people don't need to 
imagine more that aren't part of what is being proposed.
I have encountered
toolkits where the authors think it's cute to define "+" to mean
something that has nothing to do with mathematical addition
Rather than leaving everyone to make the same mistakes again, this RFC 
might be improved by having a list of stuff that it really shouldn't 
be used for. At least then anyone who violates those guidelines does 
so at their own risk. Having guidelines would also help junior devs 
point out to more senior devs that "you're trying to be clever and the 
whole team is going to regret this".
I started a 'Guidelines for operator overloads' here 
(https://github.com/Danack/GuidelinesForOperatorOverloads/blob/main/guidelines.md);
if anyone has horrible examples they'd like to add, PRs are welcome.
cheers 
Dan 
Ack
When reading
this code: $foo * $bar - how do I know which of the ways you took and
where should I look for the code that is responsible for it? When I see
$foo->times($bar) it's clear who's in charge and where I find the code.
Terse code is nice but not at the expense of making it write-only.
Well, there's only two places to look with operator overloads, but
yes you're right, using operator overloads for a single operation is not
a good example of how they make code easier to read. The more
complicated example from the introduction to the RFC
https://wiki.php.net/rfc/user_defined_operator_overloads#introduction
shows how they make complex maths easier to read.
I think the example in the RFC is interesting, but not ideal to 
advertise the RFC. 
The example is with native scalar types and built-in operator implementations. 
(I don't know how GMP works internally, but for an average user of PHP 
it does not make sense to call this "overloaded")
In fact, if we add overloaded operators as in the RFC, the example 
becomes less easy to read, because now we can no longer be sure by 
just looking at the snippet:
- Are those variables scalar values, or objects?
- Are the operators using the built-in implementation or some custom 
 overloaded implementation? (depends on the operand types)
- Are the return values or intermediate values scalars or objects?
We need really good variable names, and/or other contextual 
information, to answer those questions.
This said, I am sure we can find good examples. 
In this thread, people already mentioned Matrix/Vector, Money/Currency 
and Time/Duration. 
Others would be various numbers with physical measuring units.
The exact position of where that trade-off is 'worth it' is going to
be different for different people. But one of the areas where PHP is
'losing ground' to Python is how Python is better at processing data
with maths, and part of that is how even trivial things, such as
complex numbers, are quite difficult to implement and/or use in
userland PHP.
Could be interesting to look for examples in Python. 
I was not lucky so far, but there must be something..
Stanislav Malyshev wrote:
And again, what's the intuitive
difference between operators +=+@-+ and ++--=!* ?
That's not part of the RFC.
There's enough trade-offs to discuss already; people don't need to
imagine more that aren't part of what is being proposed.
I have encountered
toolkits where the authors think it's cute to define "+" to mean
something that has nothing to do with mathematical addition
Rather than leaving everyone to make the same mistakes again, this RFC
might be improved by having a list of stuff that it really shouldn't
be used for. At least then anyone who violates those guidelines does
so at their own risk. Having guidelines would also help junior devs
point out to more senior devs that "you're trying to be clever and the
whole team is going to regret this".
I started a 'Guidelines for operator overloads' here
(https://github.com/Danack/GuidelinesForOperatorOverloads/blob/main/guidelines.md);
if anyone has horrible examples they'd like to add, PRs are
welcome.
I think it is a good start. 
I would avoid appealing to "common sense" or "logical sense" though, 
this can mean different things to different people, and is also 
somewhat tautological, like "do good things, avoid bad things". 
More meaningful terms can be "familiar", "expectations", 
"predictable", "non-ambiguous". 
(I see this language is coming from the C++ document, but either way I 
don't like it)
Possible alternative language snippets:
For designing operators:
- Replicate familiar notations from the subject domain, e.g. maths, 
 physics, commerce. (this has some overlap with the first point)
- Return the type and value that people expect based on their 
 expectations and mental models.
- Use identifiers (class names, method names, variable names) from the 
 same subject domain language that inspires the operators.
- Avoid ambiguity: If different people will have different 
 expectations for return type and value, introduce well-named methods
 instead of overloaded operators.
- Completeness trade-off: Understand the full range of operators, and 
 type combinations for the same operator, that is common in the subject
 domain. Then decide which of those should be supported with operator
 overloads, and which should be supported with methods instead.
- Take inspiration from code examples outside of PHP.
For using operators: 
Use descriptive variable names, method names, other identifiers, and 
other hints (@var comments etc), so that the type and role of each 
variable and value can be easily understood. 
E.g. "$duration = $tEnd - $tStart;".
"If you provide constructive operators, they should not change their operands."
I think we should use the term "immutable": 
Operators should be immutable to the operands. 
If operands are objects, the operator should return a new instance 
instead of modifying the existing instance.
For +=, we should still recommend immutable behavior. It is just so 
much better :)
$price = new PriceInDollar(5); 
assert($price->amount() === 5); 
$old_price = $price; 
assert($price === $old_price);  // same object. 
$price *= 2;  // Shortcut for "$price = $price * 2;", creating a new instance. 
assert($price->amount() === 10); 
assert($old_price->amount() === 5); 
assert($price !== $old_price);  // different object.
Btw it would be really interesting to find such a list of 
recommendations for Python. 
The reference you added, 
https://isocpp.org/wiki/faq/operator-overloading#op-ov-rules, is for 
C++, which is less comparable to PHP than Python is.
-- Andreas
cheers
Dan
Ack
On Mon, Dec 20, 2021 at 4:43 PM Andreas Hennings andreas@dqxtech.net 
wrote:
The exact position of where that trade-off is 'worth it' is going to
be different for different people. But one of the areas where PHP is
'losing ground' to Python is how Python is better at processing data
with maths, and part of that is how even trivial things, such as
complex numbers, are quite difficult to implement and/or use in
userland PHP.
Could be interesting to look for examples in Python.
I was not lucky so far, but there must be something..
...
Btw it would be really interesting to find such a list of
recommendations for Python.
The reference you added,
https://isocpp.org/wiki/faq/operator-overloading#op-ov-rules, is for
C++, which is less comparable to PHP than Python is.
During my research phase of this RFC I was able to review many different 
takes on this from the Python space. Here is one example of a community 
discussion about it:
One of the interesting things here is that most of the warnings for 
Python users discussed at the link are actually designed to not be issues 
within this RFC. That's on purpose of course, I tried to think about how 
some of the design issues in Python could be improved.
In Python, the +, +=, and ++ operators are implemented independently. In 
this RFC, you may only overload the + operator, and then the VM handles the 
appropriate surrounding logic for the other operators. For instance, with 
++$obj or $obj++, you want to return either a reference or a copy, 
depending on if it's a pre- or post-increment. In this RFC, the handling of 
when the ZVAL is returned is handled by the VM automatically, and a 
subordinate call to the opcode for + is made when appropriate. The 
reassignment += works similarly, with the ZVAL's being assigned 
automatically and a subordinate call. This vastly reduces the surface for 
inconsistency.
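The inconsistency that independent implementations allow can be sketched in Python for + and += (Python has no ++ operator, so only those two are shown). Both classes are hypothetical. InPlaceMoney defines __add__ and __iadd__ separately and lets them disagree about mutation. DerivedMoney defines only __add__, so Python evaluates x += y as x = x + y, which is close to the derived behaviour described above.

```python
class InPlaceMoney:
    """Defines + and += separately, as Python allows."""

    def __init__(self, amount: int) -> None:
        self.amount = amount

    def __add__(self, other: "InPlaceMoney") -> "InPlaceMoney":
        return InPlaceMoney(self.amount + other.amount)

    def __iadd__(self, other: "InPlaceMoney") -> "InPlaceMoney":
        # Mutates the existing object, unlike __add__ above.
        self.amount += other.amount
        return self


class DerivedMoney:
    """Defines only +; `x += y` then falls back to `x = x + y`."""

    def __init__(self, amount: int) -> None:
        self.amount = amount

    def __add__(self, other: "DerivedMoney") -> "DerivedMoney":
        return DerivedMoney(self.amount + other.amount)


a = InPlaceMoney(5)
a_alias = a
a += InPlaceMoney(1)  # the shared object is mutated, so a_alias changes too

d = DerivedMoney(5)
d_alias = d
d += DerivedMoney(1)  # d is rebound to a new object, d_alias is untouched
```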
Another warning that is discussed is around overloading the == operator. A 
big reason for this is that the Python overloads do NOT require the == 
overload to return a particular type. Because of this, overloading the == 
operator can result in situations in Python where it is difficult to 
compare objects for equality. However, in this RFC the == operator can only 
be overloaded to return a boolean, so the semantic meaning of the operator 
remains the same. Though you could of course do something terrible and 
mutate the object during an equality comparison, you must return a boolean 
value, ensuring that the operator cannot be co-opted for other purposes 
easily. Additionally, the != and == cannot be independently implemented in 
this RFC, but can in Python.
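A short Python sketch of the == concern: __eq__ may return any type, and __ne__ is a separate method that nothing ties to it. The Column class is hypothetical and loosely imitates libraries that compare element by element.

```python
class Column:
    """Hypothetical container whose == compares element by element."""

    def __init__(self, *values):
        self.values = list(values)

    def __eq__(self, other):
        # Python accepts any return type here, not just bool.
        return [a == b for a, b in zip(self.values, other.values)]

    def __ne__(self, other):
        # Implemented independently of __eq__; nothing forces consistency.
        return [a != b for a, b in zip(self.values, other.values)]


result = Column(1, 2, 3) == Column(1, 0, 3)

# A non-empty list is truthy, so this "equality" passes an `if` check
# even though every element differs.
misleading = bool(Column(1, 2, 3) == Column(9, 9, 9))
```

Requiring a boolean return, as this RFC does, rules out this kind of result for ==.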
In Python the inequality operators can be implemented independently: >, >=, 
<=, <. They also are not required to return a boolean value. In this RFC, 
independent overloads for the different comparisons are not provided. 
Instead, you must implement the <=> operator and return an int. Further, 
the int value you return is normalized to -1, 0, 1 within the engine. This 
ensures that someone could not repurpose the > operator to pull something 
out of a queue, for instance. (They could still repurpose >> to do so, but 
since the shift left and shift right operators are not used in the context 
of boolean algebra often in PHP, that's far less dangerous.) A future scope 
that I plan on working on is actually having an Ordering enum that must be 
returned by the <=> overload instead, that even more explicitly defines 
what sorts of states can be returned from this overload.
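The single-comparison design can be sketched in Python as follows. This is an assumption-laden illustration, not the engine's code: normalize stands in for the clamping to -1, 0, 1 that the RFC describes, compare stands in for a <=> overload, and Version is a hypothetical class. All four inequality operators are derived from the one comparison, so none of them can be given an unrelated meaning.

```python
def normalize(n: int) -> int:
    """Clamp any int to -1, 0 or 1."""
    return (n > 0) - (n < 0)


class Version:
    def __init__(self, major: int, minor: int) -> None:
        self.major, self.minor = major, minor

    def compare(self, other: "Version") -> int:
        # Stand-in for a `<=>` overload; may return any int.
        return (self.major - other.major) * 1000 + (self.minor - other.minor)

    # Every inequality operator is derived from the single comparison.
    def __lt__(self, other): return normalize(self.compare(other)) == -1
    def __le__(self, other): return normalize(self.compare(other)) <= 0
    def __gt__(self, other): return normalize(self.compare(other)) == 1
    def __ge__(self, other): return normalize(self.compare(other)) >= 0


older, newer = Version(1, 2), Version(1, 10)
is_older = older < newer
is_newer = Version(2, 0) > Version(1, 99)
```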
A lot of years of language design experience have been invested into 
operator overloading across various languages. I wanted to at least try to 
take advantage of all this experience when writing this RFC. It's why I say 
that PHP will end up with the most restrictive operator overloads of any 
language I'm aware of. There will still be pain points (returning union 
types is not an easy thing to eliminate without full compile time type 
resolution), but as far as buggy or problematic code, there's a lot about 
this RFC that works to prevent it.
A determined programmer can still create problems, but I find this 
(personally) an uncompelling argument against the feature. There are many 
features in PHP that a determined programmer can create problems with. The 
__get, __set, __call, and __callStatic magic methods can actually allow you 
to overload the assignment operator for certain contexts. The __toString 
magic method can already be used to mutate the object through a simple 
concatenation. The ArrayAccess interface forces you to deal with the union 
of all types (mixed), even when that doesn't make sense. And these are 
just the PHP features that in some way already interact with operators and 
objects in special circumstances.
Jordan
The exact position of where that trade-off is 'worth it' is going to
be different for different people. But one of the areas where PHP is
'losing ground' to Python is how Python is better at processing data
with maths, and part of that is how even trivial things, such as
complex numbers, are quite difficult to implement and/or use in
userland PHP.
Could be interesting to look for examples in Python.
I was not lucky so far, but there must be something..
...
Btw it would be really interesting to find such a list of
recommendations for Python.
The reference you added,
https://isocpp.org/wiki/faq/operator-overloading#op-ov-rules, is for
C++, which is less comparable to PHP than Python is.
During my research phase of this RFC I was able to review many different takes on this from the Python space. Here is one example of a community discussion about it:
One of the interesting things here is that most of the warnings for Python users discussed at the link are actually designed to not be issues within this RFC. That's on purpose of course, I tried to think about how some of the design issues in Python could be improved.
Right. 
Your RFC might be the best we can do in the current PHP world, and 
better than what exists in some other languages. 
The remaining concerns would apply to any operator overloading RFC in 
current PHP, and would not be special to this one.
In Python, the +, +=, and ++ operators are implemented independently. In this RFC, you may only overload the + operator, and then the VM handles the appropriate surrounding logic for the other operators. For instance, with ++$obj or $obj++, you want to return either a reference or a copy, depending on if it's a pre- or post-increment. In this RFC, the handling of when the ZVAL is returned is handled by the VM automatically, and a subordinate call to the opcode for + is made when appropriate. The reassignment += works similarly, with the ZVAL's being assigned automatically and a subordinate call. This vastly reduces the surface for inconsistency.
I see the "Implied Operators" section. 
I assume this means that a new instance will be created, and stored on 
the same variable, if the original operator is written in an 
immutable way (which it should be)?
E.g.
$money = new Money(5); 
$orig = $money; 
$m2 = $money + new Money(2); 
assert($money === $orig);  // Checking object identity. 
assert($m2 !== $orig); 
$money += new Money(1);  // Equivalent to $money = $money + new Money(1); 
assert($money !== $orig); 
assert($orig->amount() === 5);
I think we need a strong recommendation to implement operators as immutable.
Another warning that is discussed is around overloading the == operator. A big reason for this is that the Python overloads do NOT require the == overload to return a particular type. Because of this, overloading the == operator can result in situations in Python where it is difficult to compare objects for equality. However, in this RFC the == operator can only be overloaded to return a boolean, so the semantic meaning of the operator remains the same. Though you could of course do something terrible and mutate the object during an equality comparison, you must return a boolean value, ensuring that the operator cannot be co-opted for other purposes easily. Additionally, the != and == cannot be independently implemented in this RFC, but can in Python.
In Python the inequality operators can be implemented independently: >, >=, <=, <. They also are not required to return a boolean value. In this RFC, independent overloads for the different comparisons are not provided. Instead, you must implement the <=> operator and return an int. Further, the int value you return is normalized to -1, 0, 1 within the engine. This ensures that someone could not repurpose the > operator to pull something out of a queue, for instance. (They could still repurpose >> to do so, but since the shift left and shift right operators are not used in the context of boolean algebra often in PHP, that's far less dangerous.) A future scope that I plan on working on is actually having an Ordering enum that must be returned by the <=> overload instead, that even more explicitly defines what sorts of states can be returned from this overload.
A lot of years of language design experience have been invested into operator overloading across various languages. I wanted to at least try to take advantage of all this experience when writing this RFC. It's why I say that PHP will end up with the most restrictive operator overloads of any language I'm aware of. There will still be pain points (returning union types is not an easy thing to eliminate without full compile time type resolution), but as far as buggy or problematic code, there's a lot about this RFC that works to prevent it.
A determined programmer can still create problems, but I find this (personally) an uncompelling argument against the feature. There are many features in PHP that a determined programmer can create problems with. The __get, __set, __call, and __callStatic magic methods can actually allow you to overload the assignment operator for certain contexts. The __toString magic method can already be used to mutate the object through a simple concatenation. The ArrayAccess interface forces you to deal with the union of all types (mixed), even when that doesn't make sense. And these are just the PHP features that in some way already interact with operators and objects in special circumstances.
Jordan
On Tue, Dec 21, 2021 at 5:47 AM Andreas Hennings andreas@dqxtech.net 
wrote:
I see the "Implied Operators" section.
I assume this means that a new instance will be created, and stored on
the same variable, if the original operator is written in an
immutable way (which it should be)?
E.g.
$money = new Money(5);
$orig = $money;
$m2 = $money + new Money(2);
assert($money === $orig); // Checking object identity.
assert($m2 !== $orig);
$money += new Money(1); // Equivalent to $money = $money + new Money(1);
assert($money !== $orig);
assert($orig->amount() === 5);
I think we need a strong recommendation to implement operators as
immutable.
Yes. The documentation for operator overloads should be much larger than 
this RFC, and if this passes my focus for the rest of 8.2 will be on two 
things:
- Working on a few smaller follow up RFCs (sorting/ordering enum, 
 polymorphic handler resolution)
- Working to help on the documentation of this feature
All of the examples in the documentation should be for immutable 
implementations, and there should be an explicit recommendation for 
immutable implementations as well. With operators, mutable versions are 
created with the operators under the "Implied" section instead of by 
creating a mutable implementation of the operator itself.
Jordan
I see the "Implied Operators" section.
I assume this means that a new instance will be created, and stored on
the same variable, if the original operator is written in an
immutable way (which it should be)?
E.g.
$money = new Money(5);
$orig = $money;
$m2 = $money + new Money(2);
assert($money === $orig); // Checking object identity.
assert($m2 !== $orig);
$money += new Money(1); // Equivalent to $money = $money + new Money(1);
assert($money !== $orig);
assert($orig->amount() === 5);
I think we need a strong recommendation to implement operators as immutable.
Yes. The documentation for operator overloads should be much larger than this RFC, and if this passes my focus for the rest of 8.2 will be on two things:
- Working on a few smaller follow up RFCs (sorting/ordering enum, polymorphic handler resolution)
- Working to help on the documentation of this feature
All of the examples in the documentation should be for immutable implementations, and there should be an explicit recommendation for immutable implementations as well. With operators, mutable versions are created with the operators under the "Implied" section instead of by creating an immutable implementation of the operator itself.
Right. But even for the "implied" operators, I would say "mutable" 
should refer to the variable, but not to the object. 
This is what I tried to communicate with the code example.
Jordan
I think the example in the RFC is interesting, but not ideal to
advertise the RFC.
The example is with native scalar types and built-in operator implementations.
(I don't know how GMP works internally, but for an average user of PHP
it does not make sense to call this "overloaded")
I think you have misunderstood the example. GMP doesn't work with scalar 
types, it works with its own objects; the general approach is to call 
gmp_init() with a string describing a large number that cannot be 
represented by a PHP integer. This gives you an object which doesn't 
have any methods (it replaced a resource in older versions), but can be 
used with the gmp_* functions, and with mathematical operators 
overloaded in the engine.
So the questions you posed are not hypothetical:
- Are those variables scalar values, or objects?
- Are the operators using the built-in implementation or some custom
 overloaded implementation? (depends on the operand types)
- Are the return values or intermediate values scalars or objects?
They are objects, using an overloaded implementation of the operators, 
and returning more objects.
The only difference is that right now, you can only overload operators 
in an extension, not in userland code.
Regards,
-- 
Rowan Tommins 
[IMSoP]
I think the example in the RFC is interesting, but not ideal to
advertise the RFC.
The example is with native scalar types and built-in operator implementations.
(I don't know how GMP works internally, but for an average user of PHP
it does not make sense to call this "overloaded")
I think you have misunderstood the example. GMP doesn't work with scalar
types, it works with its own objects; the general approach is to call
gmp_init() with a string describing a large number that cannot be
represented by a PHP integer. This gives you an object which doesn't
have any methods (it replaced a resource in older versions), but can be
used with the gmp_* functions, and with mathematical operators
overloaded in the engine.
Wow, you are right. I should read more before I post. 
Thank you Rowan! 
Sorry everybody for the distraction.
So the questions you posed are not hypothetical:
Indeed. 
The "concern" already applies for those extension-provided operator overloads.
- Are those variables scalar values, or objects?
- Are the operators using the built-in implementation or some custom
 overloaded implementation? (depends on the operand types)
- Are the return values or intermediate values scalars or objects?
They are objects, using an overloaded implementation of the operators,
and returning more objects.
Well the initial values could be scalar or GMP. As soon as we hit any 
gmp_*() function, the return type is going to be GMP. 
In the rewritten example using mostly operators, the gmp_invert() is 
the only part that guarantees the return type to be GMP. 
Without that gmp_invert(), the return value could as well be scalar, 
if all initial variables are.
float|GMP * float|GMP = float|GMP 
gmp_mul(float|GMP, float|GMP) = GMP
The only difference is that right now, you can only overload operators
in an extension, not in userland code.
So the same "concern" already applies here, 
But it can be outweighed by the benefit.
Regards,
--
Rowan Tommins
[IMSoP]
Hi Jordan,
Thanks for the RFC. I have a couple questions:
Suppose I have classes Foo and Bar, and I want to support the following
operations:
- Foo * Bar (returns Foo)
- Bar * Foo (returns Foo)
If I understand correctly, there are three possible ways I could implement
this:
a) Implement the * operator in Foo, accepting a Foo|Bar, and use the
OperandPosition to determine if I am doing Foo * Bar or Bar * Foo and
implement the necessary logic accordingly.
b) Implement the * operator in Bar, accepting a Foo|Bar, and use the
OperandPosition to determine if I am doing Foo * Bar or Bar * Foo and
implement the necessary logic accordingly.
c) Implement the * operator in Foo, accepting a Bar (handles Foo * Bar
side); Implement the * operator in Bar, accepting a Foo (handles Bar * Foo
side)
Is this understanding correct? If so, which is the preferred approach and
why? If not, can you clarify the best way to accomplish this?
You are correct in your understanding. All three of these would accomplish 
what you want, but would have varying levels of maintainability. Which you 
choose would depend on the specifics of the Foo and Bar class. For 
instance, if the Bar class was one that you didn't ever expect to use on 
its own with operators, only in combination with Foo, then it would make 
sense to use option a). The inverse would be true if Bar was the only one 
you ever expected to use with operators on its own.
The better way, in general, would be for Foo and Bar to extend a common 
class that implements the overload in the same way for both. In most 
circumstances, (but not all), if you have two different objects used with 
each other with operators, they should probably share a parent class or be 
instances of the same class. Like I said, this isn't always true, but for 
the majority of use cases I would expect it is.
Next, suppose I also want to support int * Foo (returns int). To do this,
I must implement * in Foo, which would look like one of the following
(depending on which approach above):
public operator *(Foo|int $other, OperandPos $pos): Foo|int { ... }
public operator *(Foo|Bar|int $other, OperandPos $pos): Foo|int { ... }
Now, suppose I have an operation like 42 * $foo, which as described
above, should return int. It seems it is not possible to enforce this via
typing, is that correct? i.e. every time I use this, I am forced to do:
$result = 42 * $foo;
if (is_int($result)) {
    // can't just assume it's an int because * returns Foo|int
}
In general I would say that returning a union from an operator overload is 
a recipe for problems. I would either always return an int, or always 
return an instance of the calling class. Mostly, this is because any scalar 
can be easily represented with a class as well.
Jordan
On Fri, Dec 17, 2021 at 10:37 AM Jordan LeDoux jordan.ledoux@gmail.com 
wrote:
Hi Jordan,
Thanks for the RFC. I have a couple questions:
Suppose I have classes Foo and Bar, and I want to support the following
operations:
- Foo * Bar (returns Foo)
- Bar * Foo (returns Foo)
If I understand correctly, there are three possible ways I could
implement this:
a) Implement the * operator in Foo, accepting a Foo|Bar, and use the
OperandPosition to determine if I am doing Foo * Bar or Bar * Foo and
implement the necessary logic accordingly.
b) Implement the * operator in Bar, accepting a Foo|Bar, and use the
OperandPosition to determine if I am doing Foo * Bar or Bar * Foo and
implement the necessary logic accordingly.
c) Implement the * operator in Foo, accepting a Bar (handles Foo * Bar
side); Implement the * operator in Bar, accepting a Foo (handles Bar * Foo
side)
Is this understanding correct? If so, which is the preferred approach and
why? If not, can you clarify the best way to accomplish this?
You are correct in your understanding. All three of these would accomplish
what you want, but would have varying levels of maintainability. Which you
choose would depend on the specifics of the Foo and Bar class. For
instance, if the Bar class was one that you didn't ever expect to use on
its own with operators, only in combination with Foo, then it would make
sense to use option a). The inverse would be true if Bar was the only one
you ever expected to use with operators on its own.
The better way, in general, would be for Foo and Bar to extend a common
class that implements the overload in the same way for both. In most
circumstances, (but not all), if you have two different objects used with
each other with operators, they should probably share a parent class or be
instances of the same class. Like I said, this isn't always true, but for
the majority of use cases I would expect it is.
Next, suppose I also want to support int * Foo (returns int). To do this,
I must implement * in Foo, which would look like one of the following
(depending on which approach above):
public operator *(Foo|int $other, OperandPos $pos): Foo|int { ... }
public operator *(Foo|Bar|int $other, OperandPos $pos): Foo|int { ... }
Now, suppose I have an operation like 42 * $foo, which as described
above, should return int. It seems it is not possible to enforce this via
typing, is that correct? i.e. every time I use this, I am forced to do:
$result = 42 * $foo;
if (is_int($result)) {
    // can't just assume it's an int because * returns Foo|int
}
In general I would say that returning a union from an operator overload is
a recipe for problems. I would either always return an int, or always
return an instance of the calling class. Mostly, this is because any scalar
can be easily represented with a class as well.
Jordan
Hi Jordan,
Thanks for the info. I share Stas's unease with having many different 
places we must look in order to understand what $foo * $bar actually 
executes. I'm also uneasy with the requirement of union typing in order for 
an operator to support multiple types. This will lead to implementations 
which are essentially many methods packed into one: one "method" for each 
type in the union, and potentially one "method" for each LHS vs. RHS. When 
combined, these two issues will make readability difficult. It will be 
difficult to know what $foo * $bar actually executes, and once we find it, 
the implementation may be messy.
I agree that returning a union is a recipe for a problem, but the fact that 
the input parameter must be a union can imply that the return value must 
also be a union. For example, Num * Num may return Num, but Num * Vector3 
may return Vector3, or Vector3 * Vector3 may represent dot product and 
return Num. But let's not get hung up on specific scenarios; it's a problem 
that exists in the general sense, and I believe that if PHP is to offer 
operator overloading, it should do so in a way that is type safe and 
unambiguous.
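To make the "many methods packed into one" concern concrete, here is a minimal sketch in current PHP (8.1+). It is not the RFC syntax: an ordinary method mul() stands in for an overloaded *, float stands in for the Num type mentioned above, and the OperandPosition enum is defined locally as a stand-in for the one the RFC describes.

```php
<?php
declare(strict_types=1);

// Stand-in for the enum described in the RFC; defined here so the sketch runs.
enum OperandPosition
{
    case LeftSide;
    case RightSide;
}

final class Vector3
{
    public function __construct(
        public readonly float $x,
        public readonly float $y,
        public readonly float $z,
    ) {}

    // Ordinary method standing in for an overloaded "*" (assumption, not RFC syntax).
    public function mul(float|Vector3 $other, OperandPosition $pos): float|Vector3
    {
        // One branch per type in the union: effectively two methods in one body.
        if ($other instanceof Vector3) {
            // Vector3 * Vector3: dot product, so operand order does not matter.
            return $this->x * $other->x + $this->y * $other->y + $this->z * $other->z;
        }

        // Vector3 * float or float * Vector3: scaling, also order-independent.
        return new Vector3($this->x * $other, $this->y * $other, $this->z * $other);
    }
}

$v = new Vector3(1.0, 2.0, 3.0);
$dot = $v->mul(new Vector3(4.0, 5.0, 6.0), OperandPosition::LeftSide);
$scaled = $v->mul(2.0, OperandPosition::RightSide);

// The declared return type is a union, so the caller has to narrow it.
var_dump(is_float($dot), $scaled instanceof Vector3);
```

The union on the parameter leads directly to a union on the return type, and the caller cannot know statically which branch ran.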
Method overloading could address both issues (LHS always "owns" the 
implementation, and has a separate implementation for each type allowed on 
the RHS). But I see this as a non-starter because it would not allow scalar 
types on the LHS.
It's difficult to think of a solution that addresses both of these issues 
without introducing more. One could imagine something like the following:
register_operator(*, function (Foo $lhs, Bar $rhs): Foo { ...}); 
register_operator(*, function (Bar $lhs, Foo $rhs): Foo { ...}); 
register_operator(*, function (int $lhs, Foo $rhs): int { ...});
But this just brings a new set of problems, including visibility issues 
(i.e. can't use private fields in the implementation), and the fact that 
this requires executing a function at runtime rather than being defined at 
compile time.
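A rough userland approximation of such a registry is sketched below, purely to illustrate the two drawbacks just named. OperatorRegistry, its methods, and the Foo class are all hypothetical; nothing like register_operator exists in PHP, and a real version would hook into the engine rather than require an explicit apply() call.

```php
<?php
declare(strict_types=1);

// Hypothetical userland registry; nothing like this exists in PHP itself.
final class OperatorRegistry
{
    /** @var array<string, callable> handlers keyed by "lhsType op rhsType" */
    private static array $handlers = [];

    public static function register(string $op, string $lhsType, string $rhsType, callable $fn): void
    {
        self::$handlers["$lhsType $op $rhsType"] = $fn;
    }

    public static function apply(string $op, mixed $lhs, mixed $rhs): mixed
    {
        // get_debug_type() yields "int", "float", or the class name for objects.
        $key = get_debug_type($lhs) . " $op " . get_debug_type($rhs);
        if (!isset(self::$handlers[$key])) {
            throw new TypeError("No handler registered for $key");
        }

        return (self::$handlers[$key])($lhs, $rhs);
    }
}

final class Foo
{
    public function __construct(public readonly int $value) {}
}

// Registration happens at runtime, and the closure only sees Foo's public API.
OperatorRegistry::register('*', 'int', 'Foo', fn (int $lhs, Foo $rhs): int => $lhs * $rhs->value);

echo OperatorRegistry::apply('*', 42, new Foo(2)), "\n";
```

Each signature gets its own precisely typed function, including one with a scalar on the left, but the handlers live outside the classes they operate on.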
I don't have any ideas that address all of these issues, but I do think 
they deserve further thought.
Thanks, 
--Matt
Hello internals,
register_operator(*, function (Foo $lhs, Bar $rhs): Foo { ...});
register_operator(*, function (Bar $lhs, Foo $rhs): Foo { ...});
register_operator(*, function (int $lhs, Foo $rhs): int { ...});
But this just brings a new set of problems, including visibility issues
(i.e. can't use private fields in the implementation), and the fact that
this requires executing a function at runtime rather than being defined at
compile time.
Since this is going deeply into magic land anyways, we could go another step further 
and make this a builtin/"macro" that does happen at compile-time, but also can 
impose additional restrictions on what is 
allowed - namely, that the registered function must not be inlined but a static 
method on one of the arguments.
For example: (syntax completely imaginary here but slightly inspired by rust):
register_operator!(+, lhs: Bar, rhs: Foo, ret: Bar, Bar::addFooBar); 
register_operator!(+, lhs: Bar, rhs: Bar, ret: Bar, Bar::addBar); 
register_operator!(+, lhs: int, rhs: Bar, ret: int, Bar::addBarInt); 
register_operator!(+, lhs: Foo, rhs: int, ret: Foo, Foo::addFooInt, commutative: true);
with
class Bar { 
    ... 
    public static function addFooBar(Bar $bar, Foo $foo): Bar { } 
    // etc. 
}
Advantages:
- Explicitly named methods that can be called/tested separately
- Slightly improved searchability - grepping "register_operator" will show all operator 
 combinations inside a code base
- Cannot implement operators for arbitrary classes that one does not own - the method 
 must be from one of the operands
- Multiple distinct methods per operand/class without full method overloading
- No restrictions around having scalar types only as rhs
If I am not mistaken, the engine should also be able to typecheck the methods 
to ensure that the types are correct, and additionally also be able to disallow 
overlaps (e.g. defining Foo+Bar commutatively as well as Bar+Foo), 
which should throw an error as soon as the second definition is encountered.
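The ownership and overlap checks described here could look roughly like the following. This is a runtime sketch with invented names (OperatorTable, register); the proposal above imagines the engine doing this at compile time, which this example does not attempt.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch of the checks; class and method names are invented.
final class OperatorTable
{
    /** @var array<string, array{0: string, 1: string}> signature => [class, method] */
    private array $entries = [];

    public function register(string $op, string $lhs, string $rhs, array $method, bool $commutative = false): void
    {
        [$class, $name] = $method;

        // The handler must be a static method on one of the operand types.
        if ($class !== $lhs && $class !== $rhs) {
            throw new LogicException(sprintf('%s::%s is not owned by %s or %s', $class, $name, $lhs, $rhs));
        }

        // A commutative definition claims both operand orders.
        $signatures = ["$lhs $op $rhs"];
        if ($commutative && $lhs !== $rhs) {
            $signatures[] = "$rhs $op $lhs";
        }

        // Reject the whole registration if any claimed signature is taken.
        foreach ($signatures as $signature) {
            if (isset($this->entries[$signature])) {
                throw new LogicException("Overlapping definition for $signature");
            }
        }
        foreach ($signatures as $signature) {
            $this->entries[$signature] = [$class, $name];
        }
    }
}

$table = new OperatorTable();
$table->register('+', 'Foo', 'Bar', ['Foo', 'addFooBar'], commutative: true);

try {
    // Bar + Foo is already covered by the commutative definition above.
    $table->register('+', 'Bar', 'Foo', ['Bar', 'addBarFoo']);
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";
}
```

Because the table only stores class and method names as strings, it sidesteps the on-demand class loading question raised below rather than answering it.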
Disadvantage: This sounds like a lot of work to implement, and I am not sure 
if the checks are even possible the way I'm imagining them (with classes being 
loaded on demand, etc.). 
Also, this syntax would definitely need work, I just wanted to point out that on the 
drawing board, many of these design problems are solvable.
Whether they are worth the effort, and whether this is a good idea at all, is left 
for others to decide.
Regards, 
Mel
Thanks for the info. I share Stas's unease with having many different
places we must look in order to understand what $foo * $bar actually
executes. I'm also uneasy with the requirement of union typing in order for
an operator to support multiple types. This will lead to implementations
which are essentially many methods packed into one: one "method" for each
type in the union, and potentially one "method" for each LHS vs. RHS. When
combined, these two issues will make readability difficult. It will be
difficult to know what $foo * $bar actually executes, and once we find it,
the implementation may be messy.
I agree that returning a union is a recipe for a problem, but the fact
that the input parameter must be a union can imply that the return value
must also be a union. For example, Num * Num may return Num, but Num *
Vector3 may return Vector3, or Vector3 * Vector3 may represent dot product
and return Num. But let's not get hung up on specific scenarios; it's a
problem that exists in the general sense, and I believe that if PHP is to
offer operator overloading, it should do so in a way that is type safe and
unambiguous.
Method overloading could address both issues (LHS always "owns" the
implementation, and has a separate implementation for each type allowed on
the RHS). But I see this as a non-starter because it would not allow scalar
types on the LHS.
It's difficult to think of a solution that addresses both of these issues
without introducing more. One could imagine something like the following:
register_operator(*, function (Foo $lhs, Bar $rhs): Foo { ...});
register_operator(*, function (Bar $lhs, Foo $rhs): Foo { ...});
register_operator(*, function (int $lhs, Foo $rhs): int { ...});
But this just brings a new set of problems, including visibility issues
(i.e. can't use private fields in the implementation), and the fact that
this requires executing a function at runtime rather than being defined at
compile time.
I don't have any ideas that address all of these issues, but I do think
they deserve further thought.
With respect, these are not things that were overlooked. Method overloading 
is something that I understand to be a complete non-starter within PHP. I 
do not want to speak for other people, but I have been told multiple times 
by multiple people that this is a feature which there is significant 
resistance to, to the point of being something which should be avoided. 
Certainly, it is a separate feature from operator overloading, and 
shouldn't be included as part of this RFC.
As you noted, all of the alternatives have multiple other issues. I 
considered many different ways to implement this, and I decided that this 
particular way of doing it presented the fewest problems. The reason I made 
that decision was that problems such as visibility issues would affect 
nearly every implementation. But the issue of non-sibling type resolution 
is something which would only affect a small subset of very complicated 
programs in general. So I chose to confine the issues to the more complex 
implementations, because these are likely also the ones where the developer 
is more experienced or has more resources to solve the issues presented.
In general, unioning types should be seen as a "code smell" with this 
feature in my personal opinion. If you start to see 4, 5, 6 different types 
in your parameters, it should be a signal that you want to re-examine how 
you are implementing them. I think it works well for this purpose, as many 
developers already try to refactor code which has very complicated type 
unions. Given that method overloads were off the table, and that the only 
realistic way to provide for visibility concerns was to place the overloads 
on classes, I see the requirement of union typing the operators as a guard 
rail to help developers avoid implementations which are prone to error or 
make the program excessively complex to understand.
If we created something instead that was a global register of type 
combinations, such as those suggested by Mel, the implementations would 
likely be all in one place (some kind of bootstrap or header file), but now 
would be completely separated from the actual implementations.
I did consider all these issues quite extensively. I think that the 
solution I'm presenting creates the smallest amount of issues for the 
smallest set of users. In practice, the two most common usages for this 
feature (in my estimation) are likely to be userland scalar object 
implementations, and currency objects. Both of these are very 
self-contained, and unlikely to want to interact with external objects. The 
main applications that would be interested in doing that are complex 
mathematical libraries (the kind of application that would fit your example 
of Vector * Num). Such libraries are very likely to make subordinate calls 
within the operator overloads, as the implementations of the mathematics 
themselves are already very complex and likely used in multiple ways at 
different times (spoken from experience as someone who maintains a complex 
mathematics library). For those kinds of applications, the library itself 
is inherently complex, and I very much doubt that operator overloads will 
be the main source of complexity and confusion. When dealing with such 
math, the more difficult parts to use are things that are related to the 
math itself, such as the idea that complex numbers don't have a <=> 
relationship to other numbers but do have a == relationship, or the concept 
of stochastic rounding for applications such as machine learning.
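The complex-number case can be sketched in current PHP with plain methods standing in for the == and <=> overloads. The Complex class and its method names are assumptions made for this example and are not taken from the RFC or from any particular library.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; equals() and compareTo() stand in for == and <=>.
final class Complex
{
    public function __construct(
        public readonly float $re,
        public readonly float $im,
    ) {}

    // Equality is well defined: both parts must match.
    public function equals(Complex $other): bool
    {
        return $this->re === $other->re && $this->im === $other->im;
    }

    // Ordering is not: the type refuses to answer instead of inventing an order.
    public function compareTo(Complex $other): int
    {
        throw new LogicException('Complex numbers cannot be ordered');
    }
}

$a = new Complex(1.0, 2.0);
$b = new Complex(1.0, 2.0);

var_dump($a->equals($b));

try {
    $a->compareTo($b);
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";
}
```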
I am definitely open to improvements and suggestions, I just want to be 
clear that this wasn't overlooked. As you wrote out, the alternatives that 
are obvious to explore present problems that would be experienced on a more 
widespread basis, and I felt it was best to avoid that. I looked at how 
other languages implement this feature as well, including Python, R, and 
C++, to examine how those programming communities interact with different 
language designs. This RFC is closest to the design of Python, as the 
concerns within Python are much more similar to the concerns within PHP. If 
you find another alternative to explore I am happy to discuss it. These 
same trade-offs exist in other languages which have this feature. Again, 
I'd look at Python for the closest analogue to this RFC, where operator 
overloads are used extensively by many of the applications you would 
expect, but do not appear to present these unstoppable complexity problems 
to most applications.
They are more widely problematic in C++, but several of the most common 
sources of pain with C++ operator overloading are entirely avoided (on 
purpose) in this RFC. You cannot overload the assignment operator, you 
cannot overload the logical operators, you cannot implement == and != with 
different logic. Even Python allows for you to define > and < with 
different logic (it doesn't even require a boolean return value). If this 
RFC were to be accepted, PHP would have some of the most restrictive and 
logically consistent operator overloads of any language I've investigated 
as part of this RFC.
Is my proposal perfect? I very much doubt that. There is always room for 
improvement. But an extreme amount of care went into trying to limit the 
amount of "gunk" this feature will generate, some of it not obvious at 
first glance of the RFC.
Jordan
In general, unioning types should be seen as a "code smell" with this
feature in my personal opinion. If you start to see 4, 5, 6 different types
in your parameters, it should be a signal that you want to re-examine how
you are implementing them. I think it works well for this purpose, as many
developers already try to refactor code which has very complicated type
unions.
I'm not sure this argument really makes sense in context, because the 
usual way to refactor a method with a lot of unioned types would be to 
create multiple methods with different names; with operator overloads, 
you clearly can't do that.
In one of the previous discussions, I shared a real life C# Money 
example: https://externals.io/message/115648#115666 I thought it would 
be interesting to see how that would look in the current proposal. Most 
of the operators are straight-forward:
public operator - (Money $other, OperandPosition $operandPos): Money 
public operator + (Money $other, OperandPosition $operandPos): Money 
public operator * (float $multiple, OperandPosition $operandPos): Money 
public operator == (Money $other, OperandPosition $operandPos): bool 
public operator <=> (Money $other, OperandPosition $operandPos): int
The division cases however are a little awkward:
/** 
 * @param float|Money $divisor A float to calculate a fraction, or another Money to calculate a ratio 
 * @return Money|float Money if $divisor is float, float if $divisor is Money 
 * @throws TypeError if $divisor is float, and OperandPosition is OperandPosition::RightSide 
 */ 
public operator / (float|Money $divisor, OperandPosition $operandPos): Money|float
The intent is to support Money / float returning Money, and Money / 
Money returning float, but not float / Money.
I don't think this kind of type list would be unusual, but it may be a 
compromise we have to live with given PHP's type system.
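For comparison, the same division logic written as an ordinary method in current PHP might look like the sketch below. The method name dividedBy, the integer cents representation, and the rounding choice are assumptions for illustration; the operand-position check is omitted because a method call always has the Money on the left.

```php
<?php
declare(strict_types=1);

// Hypothetical Money type storing an integer number of cents.
final class Money
{
    public function __construct(public readonly int $cents) {}

    // Stand-in for the "/" overload: the return type depends on the argument type.
    public function dividedBy(float|Money $divisor): Money|float
    {
        if ($divisor instanceof Money) {
            // Money / Money: a unitless ratio.
            return (float) ($this->cents / $divisor->cents);
        }

        // Money / float: a fraction of the amount, rounded to whole cents.
        return new Money((int) round($this->cents / $divisor));
    }
}

$price = new Money(1000);
$quarter = $price->dividedBy(4.0);
$ratio = $price->dividedBy(new Money(250));

// Static analysis sees Money|float for both results; only the docblock says which.
var_dump($quarter instanceof Money, is_float($ratio));
```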
Regards,
-- 
Rowan Tommins 
[IMSoP]
If the names are a problem, why not register those using an attribute?
If there is a strong reason to use attributes, then the argument 
should start from there.
Starting from "well we could just use an attribute" and then putting 
the pressure on other people to find a reason to not use an 
attribute, is a terrible design process.
Every language that has annotations ends up with far too many of them; 
PHP is likely to end up with too many of them also. The time to push 
back against using them is now, not when the damage has been done.
But to repeat, I don't think the names of magic methods are a problem. 
Documenting that 'the name refers to the operator sigil, not to what 
the function does', avoids it being a problem to be solved.
cheers 
Dan 
Ack
On Thu, Dec 9, 2021 at 12:11 PM Jordan LeDoux jordan.ledoux@gmail.com 
wrote:
Hello internals,
I last brought this RFC up for discussion in August, and there was
certainly interesting discussion. Since then there have been many
improvements, and I'd like to re-open discussion on this RFC. I mentioned
in the first email to the list that I was planning on taking a while before
approaching a vote, however the RFC is much closer to vote-ready now, and
I'd like to open discussion with that in mind.
RFC Link: https://wiki.php.net/rfc/user_defined_operator_overloads
There is a patch for this RFC, however the latest commits are not
playable. It will build, but with various problems which are being worked
on related to enums. The last playable commit can be found by checking out
this commit:
https://github.com/JordanRL/php-src/commit/e044f53830a9ded19f7c16a9542521601ac3f331
This commit however does not have the enum for operator position described
in the RFC. It uses a bool instead with true being the left side, and false
being the right side.
Implementation details still left:
- There are issues related to opcache/JIT still, so if you want to play
 around with the playable commit disable both.
- Reflection has not been updated, but the proposed updates necessary are
 described in the RFC.
It is a long RFC, but operator overloads are a complicated topic if done
correctly. Please review the FAQ section before asking a question, as it
covers many of the main objections or inquiries to the feature. I'd be
happy to expand on any of the answers there if prompted however.
Jordan
It seems that most of the discussion and questions have happened. As such, 
I'll be opening voting on the RFC on January 3rd unless anyone believes 
there are further outstanding issues which should be discussed prior.
I've put together a small set of rules for operator overloads, guidelines 
for implementations, that the PHP community could use to start learning the 
limitations of this feature while the implementation is being finished and 
the documentation for PHP.net is being worked on: 
https://github.com/JordanRL/operator-overloads-in-php/blob/master/README.md
Jordan