From: me@daveyshafik.com (Davey Shafik)
Newsgroups: php.internals
Subject: Re: [PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads (again)
Date: Tue, 17 Sep 2024 10:55:23 -0700
To: Jordan LeDoux
Cc: "Rowan Tommins [IMSoP]", internals@lists.php.net
Message-ID: <8C83F906-5B45-4CB9-8E6B-D85D43E74A63@daveyshafik.com>
References: <2551c06a-ec1f-4870-a590-aeb5752fc944@rwec.co.uk>

On Sep 17, 2024, at 10:15, Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

> On Tue, Sep 17, 2024 at 1:18 AM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:
>> On 14/09/2024 22:48, Jordan LeDoux wrote:
>> >
>> > 1. Should the next version of this RFC use the `operator` keyword, or
>> > should that approach be abandoned for something more familiar? Why do
>> > you feel that way?
>> >
>> > 2. Should the capability to overload comparison operators be provided
>> > in the same RFC, or would it be better to separate that into its own
>> > RFC? Why do you feel that way?
>> >
>> > 3. Do you feel there were any glaring design weaknesses in the
>> > previous RFC that should be addressed before it is re-proposed?
>> >
>>
>> I think there are two fundamental decisions which inform a lot of the
>> rest of the design:
>>
>> 1. Are we over-riding *operators* or *operations*? That is, is the user
>> saying "this is what happens when you put a + symbol between two Foo
>> objects", or "this is what happens when you add two Foo objects together"?
>
> If we allow developers to define arbitrary code which is executed as a result of an operator, we will always end up allowing the first one.
>
>> 2. How do we despatch a binary operator to one of its operands? That is,
>> given $a + $b, where $a and $b are objects of different classes, how do
>> we choose which implementation to run?


> This is something not many other people have been interested in so far, but interestingly there is a lot of prior art on this question in other languages! :)
>
> The best approach, from what I have seen and from developer usage in other languages, is somewhat complicated to follow, but I will do my best to make sure it is understandable to anyone who happens to be following this thread on internals.
>
> The approach I plan to use for this question has a name: Polymorphic Handler Resolution. The overload that is executed will be decided by the following series of decisions:
>
> 1. Are both of the operands objects? If not, use the overload on the one that is. (NOTE: if neither is an object, the new code will be bypassed entirely, so I do not need to handle this case.)
> 2. If they are both objects, are they both instances of the same class? If they are, use the overload of the one on the left.
> 3. If they are not objects of the same class, is one of them a direct descendant of the other? If so, use the overload of the descendant.
> 4. If neither of them is a direct descendant of the other, use the overload of the object on the left. Does it produce a type error because it does not accept objects of the type in the other position? Return the error and abort instead of retrying with the overload on the right.
>
> This results from what it means to `extend` a class. Suppose you have a class `Foo` and a class `Bar` that extends `Foo`. If both `Foo` and `Bar` implement an overload, that means `Bar` inherited an overload. It is either the same as the overload from `Foo`, in which case it shouldn't matter which is executed, or it has been updated with even more specific logic which is aware of the extra context that `Bar` provides, in which case we want to execute the updated implementation.
>
> So the implementation on the left would almost always be executed, unless the implementation on the right comes from a class that is a direct descendant of the class on the left.
>
> `Foo + Bar`
> `Bar + Foo`
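
The four resolution steps quoted above can be sketched in today's PHP, without any new syntax. This is only an illustration of the decision order, not the RFC's implementation: `resolveOverloadOwner()` and the `Unrelated` class are hypothetical names, and the sketch assumes "descendant" means any subclass rather than only an immediate child.

```php
<?php
declare(strict_types=1);

// Hypothetical classes, borrowed from the Foo/Bar example in the thread.
class Foo {}
class Bar extends Foo {}
class Unrelated {}

/**
 * Returns the operand whose overload would be called for `$left <op> $right`,
 * or null when neither operand is an object (the engine's normal path).
 */
function resolveOverloadOwner(mixed $left, mixed $right): ?object
{
    // Step 1: if only one operand is an object, it owns the overload.
    if (!is_object($left) || !is_object($right)) {
        return is_object($left) ? $left : (is_object($right) ? $right : null);
    }

    // Step 2: same class, so the left operand wins.
    if ($left::class === $right::class) {
        return $left;
    }

    // Step 3: a descendant on the right wins despite its position.
    // Assumption: "descendant" means any subclass, not only an immediate child.
    if (is_subclass_of($right, $left::class)) {
        return $right;
    }

    // Step 3 (left is the descendant) and step 4 (unrelated classes) both
    // resolve to the left operand; a type error there is not retried.
    return $left;
}

$foo = new Foo();
$bar = new Bar();
var_dump(resolveOverloadOwner($foo, $bar) === $bar); // bool(true)
var_dump(resolveOverloadOwner($bar, $foo) === $bar); // bool(true)
```

In both `Foo + Bar` and `Bar + Foo` the sketch selects the `Bar` operand, which matches the stated goal of running the most specific implementation.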

> In practice, you would very rarely (if ever) use two classes from entirely different class inheritance hierarchies in the same overload. That would closely tie the two classes together in a way that most developers try to avoid, because the implementation would need to be aware of how to handle the classes it accepts as an argument.
>
> The exception to this that I can imagine is something like a container, which may not care what class the other object is because it doesn't mutate it, only stores it.
>
> But for virtually every real-world use case, executing the overload for the child class regardless of its position would be preferred, because overloads will tend to be confined to the core types of PHP plus the classes that are part of the hierarchy the overload is designed to interact with.
>
>> Finally, a very quick note on the OperandPosition enum: I think just a
>> "bool $isReversed" would be fine - the "natural" expansion of "$a+$b" is
>> "$a->operator+($b, false)"; the "fallback" is "$b->operator+($a, true)"
>>
>> Regards,
>>
>> --
>> Rowan Tommins
>> [IMSoP]
>
> This is similar to what I originally designed, and I actually moved to an enum based on feedback. The argument was that something like `$isReversed` or `$left` is somewhat ambiguous, while the enum makes it extremely explicit.
>
> However, it's not a design detail I am committed to. I just want to let you know why it was done that way.
>
> Jordan
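
To make the `bool $isReversed` versus `OperandPosition` trade-off concrete, here is a small sketch in current PHP. Everything in it is an assumption for illustration: the enum's case names, the `Vector2` class, and the two method names are hypothetical, and ordinary methods stand in for the proposed `operator-` syntax.

```php
<?php
declare(strict_types=1);

// Assumed shape of the enum; the case names here are illustrative.
enum OperandPosition
{
    case LeftSide;
    case RightSide;
}

// Hypothetical value object; subtraction is used because it does not commute.
final class Vector2
{
    public function __construct(
        public readonly float $x,
        public readonly float $y,
    ) {}

    // Flag style: a call such as `$v->subtractWithFlag(1.0, true)` does not
    // say what `true` refers to without reading the signature.
    public function subtractWithFlag(float $scalar, bool $isReversed): static
    {
        return $isReversed
            ? new static($scalar - $this->x, $scalar - $this->y)
            : new static($this->x - $scalar, $this->y - $scalar);
    }

    // Enum style: the position of $this in the expression is spelled out.
    public function subtractWithEnum(float $scalar, OperandPosition $position): static
    {
        return match ($position) {
            OperandPosition::LeftSide => new static($this->x - $scalar, $this->y - $scalar),
            OperandPosition::RightSide => new static($scalar - $this->x, $scalar - $this->y),
        };
    }
}

$v = new Vector2(3.0, 5.0);
$fromFlag = $v->subtractWithFlag(1.0, true);                       // models 1.0 - $v
$fromEnum = $v->subtractWithEnum(1.0, OperandPosition::RightSide); // models 1.0 - $v
```

Both calls compute the same result; the difference is only in how much the call site tells the reader.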

To be clear: I’m very much in favor of operator overloading. I frequently work with both Money value objects and DateTime objects that I need to manipulate through arithmetic with others of the same type.

What if I wanted to create a generic `add($a, $b)` function? How would I type hint the params to ensure that I only get “addable” things? I would expect that to be:

- Ints
- Floats
- Objects of classes with “operator+” defined

I think that an interface is the right solution for that, and you can just union it with int/float type hints: `add(int|float|Addable ...$operands)` (or `add(int|float|(Foo&Addable) ...$operands)`).
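
As a rough sketch of what that could look like, written in current PHP: the `Addable` interface, its `plus()` method (standing in for the proposed `operator+`), and the `Money` class below are all hypothetical, and nothing here is taken from the RFC itself.

```php
<?php
declare(strict_types=1);

// Hypothetical interface; under the RFC the engine might supply something
// like it for classes that define `operator+`.
interface Addable
{
    // Stand-in for the proposed `operator+`, which today's PHP cannot express.
    public function plus(int|float|Addable $other): static;
}

final class Money implements Addable
{
    public function __construct(public readonly int $cents) {}

    public function plus(int|float|Addable $other): static
    {
        if (!$other instanceof self) {
            throw new TypeError('Money can only be added to Money');
        }
        return new static($this->cents + $other->cents);
    }
}

// Accepts only "addable" things: ints, floats, and Addable objects.
function add(int|float|Addable ...$operands): int|float|Addable
{
    $sum = array_shift($operands) ?? 0;
    foreach ($operands as $operand) {
        if ($sum instanceof Addable) {
            $sum = $sum->plus($operand);
        } elseif ($operand instanceof Addable) {
            // Reversed position; this sketch assumes addition commutes.
            $sum = $operand->plus($sum);
        } else {
            $sum += $operand;
        }
    }
    return $sum;
}

$total = add(new Money(150), new Money(250)); // Money with 400 cents
$plain = add(1, 2.5);                         // 3.5
```

Note that the signature alone cannot stop `add(new Money(1), 2)`; whether that mix is valid is decided inside `Money::plus()`, which is exactly the question raised next.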

Is this type of behavior even allowed? I think the intention is that it must be; otherwise the decision over which overload method gets called would be drastically simplified.

Perhaps for a first iteration, operator overloads only work between objects of the same type or their descendants, and if a descendant overrides the overload, the descendant's version is used regardless of whether it is on the left or the right.

I suspect this will reduce the complexity of the magic, and solve the majority of cases where operator overloading is desired.
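
A minimal sketch of that restricted rule, again in current PHP with a plain `plus()` method standing in for `operator+`. The `Amount`, `CappedAmount`, and `OtherAmount` classes and the `addRestricted()` helper are hypothetical names chosen for this example only.

```php
<?php
declare(strict_types=1);

// Hypothetical value objects; `plus()` stands in for the proposed `operator+`.
class Amount
{
    public function __construct(public readonly int $cents) {}

    public function plus(Amount $other): static
    {
        return new static($this->cents + $other->cents);
    }
}

class CappedAmount extends Amount
{
    // Overrides the parent's overload with more specific logic.
    public function plus(Amount $other): static
    {
        return new static(min($this->cents + $other->cents, 10_000));
    }
}

class OtherAmount extends Amount {}

function addRestricted(Amount $left, Amount $right): Amount
{
    if (is_subclass_of($right, $left::class)) {
        // The right operand is the descendant, so its overload wins.
        // A real design would also pass the operand position for
        // non-commutative operators.
        return $right->plus($left);
    }
    if (is_a($left, $right::class)) {
        // Same class, or the left operand is the descendant.
        return $left->plus($right);
    }
    // Neither class descends from the other: not supported in this iteration.
    throw new TypeError(sprintf('Cannot add %s and %s', $left::class, $right::class));
}

$base = new Amount(9_000);
$capped = new CappedAmount(5_000);
echo addRestricted($base, $capped)->cents, "\n"; // 10000
echo addRestricted($capped, $base)->cents, "\n"; // 10000
```

Sibling classes such as `CappedAmount` and `OtherAmount` fall through to the `TypeError`, so the resolution never has to guess between two unrelated implementations.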

- Davey