Newsgroups: php.internals
From: jordan.ledoux@gmail.com (Jordan LeDoux)
Date: Tue, 17 Sep 2024 13:14:33 -0700
Subject: Re: [PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads (again)
To: "Rowan Tommins [IMSoP]"
Cc: internals@lists.php.net


On Tue, Sep 17, 2024 at 12:27 PM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

> On 17/09/2024 18:15, Jordan LeDoux wrote:
>
>>> 1. Are we over-riding *operators* or *operations*? That is, is the user
>>> saying "this is what happens when you put a + symbol between two Foo
>>> objects", or "this is what happens when you add two Foo objects together"?
>>
>> If we allow developers to define arbitrary code which is executed as a result of an operator, we will always end up allowing the first one.
>
> I don't think that's really true. Take the behaviour of comparisons in your previous RFC: if that RFC had been accepted, the user would have had no way to make $a < $b and $a > $b have different behaviour, because the same overload would be called, with the same parameters, in both cases.
>
> Slightly less strict is requiring groups of operators: the Haskell "Num" typeclass (roughly similar to an interface) requires definitions for all of "+", "*", "abs", "signum", "fromInteger", and either unary or binary "-". It also defines the type signatures for each. If this was the only way to overload the "+" operator, users would have to really go out of their way to use it to mean something unrelated to addition.
>
> As it happens, Haskell *does* allow arbitrary operator overloads, and in fact goes to the other extreme and allows entirely new operators to be invented. The same is true in PostgreSQL - you can implement the <<//-^+^-//>> operator if you want to.
>
> I think it's absolutely possible - and desirable - to choose a philosophical position on that spectrum, and use it to drive design decisions. The choice of "__add" vs "operator+" is one such decision.



Ah, I see. I suppose I never really entertained an idea like this because in my mind it can't even handle non-trivial math, let alone the sorts of things that people might want to use overloads for. Once you get past arithmetic with real numbers into almost any other kind of math, which operators are meaningful, and what they mean exactly, begins to depend a lot on context. This is why I felt like even if we were limiting the use cases to math projects, things like commutativity should not necessarily be enforced.

The expressions `$a + $b` and `$b + $a` are SUPPOSED to give different results for certain types of math objects, for instance. The expressions `$a - $b` and `$b - $a` more obviously give different results to most people, because subtraction is not commutative even for real numbers.
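
To make the non-commutative `+` case concrete, here is a minimal sketch using ordinal numbers below omega squared, where 1 + omega equals omega but omega + 1 does not. The choice of ordinals is an illustrative assumption (the paragraph above does not name a specific kind of object), the class `SmallOrdinal` is hypothetical, and the `add()` method only stands in for an overloaded `+`, which PHP does not offer today.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration: an ordinal of the form omega*a + b, with a, b >= 0.
final class SmallOrdinal
{
    public function __construct(
        public readonly int $omegaCoeff,
        public readonly int $finitePart,
    ) {}

    // Stand-in for an overloaded `+`; $this is the left operand.
    public function add(SmallOrdinal $right): SmallOrdinal
    {
        if ($right->omegaCoeff > 0) {
            // The finite tail of the left operand is absorbed by omega on the right.
            return new SmallOrdinal($this->omegaCoeff + $right->omegaCoeff, $right->finitePart);
        }
        return new SmallOrdinal($this->omegaCoeff, $this->finitePart + $right->finitePart);
    }

    public function __toString(): string
    {
        if ($this->omegaCoeff === 0) {
            return (string) $this->finitePart;
        }
        $omega = $this->omegaCoeff === 1 ? 'w' : "w*{$this->omegaCoeff}";
        return $this->finitePart === 0 ? $omega : "{$omega} + {$this->finitePart}";
    }
}

$one = new SmallOrdinal(0, 1);
$omega = new SmallOrdinal(1, 0);

echo $one->add($omega), "\n";  // prints "w"
echo $omega->add($one), "\n";  // prints "w + 1"
```

An engine that silently swapped the operands of `+` would give the wrong answer for a type like this.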

My personal opinion is that the RFC should not assume the overloads are used in a particular domain (like real number arithmetic), and thus should not attempt to enforce these kinds of behaviors. But opinions like this are actually what I was hoping to receive from this thread. This could be the way forward that voters are more interested in, even if it wouldn't be my own first preference, as it would be highly limiting to the applicable domains.

>> The approach I plan to use for this question has a name: Polymorphic Handler Resolution. The overload that is executed will be decided by the following series of decisions:
>>
>> 1. Are both of the operands objects? If not, use the overload on the one that is. (NOTE: if neither are objects, the new code will be bypassed entirely, so I do not need to handle this case)
>> 2. If they are both objects, are they both instances of the same class? If they are, use the overload of the one on the left.
>> 3. If they are not objects of the same class, is one of them a direct descendant of the other? If so, use the overload of the descendant.
>> 4. If neither of them are direct descendants of the other, use the overload of the object on the left. Does it produce a type error because it does not accept objects of the type in the other position? Return the error and abort instead of re-trying by using the overload on the right.
>
> This is option (g) in my list, with the additional "prefer sub-classes" rule (step 3), which I agree would be a good addition.
>
> As noted, it doesn't provide symmetry, because step 4 depends on the order in the source code. Option (c) is the same algorithm without step 4, so guarantees that $a + $b and $b + $a will always call the same method.
>
> Options (d), (e), and (f) each add an extra step: one operand can signal "I don't know" and the other operand gets a chance to answer. They're essentially ways to "partially implement" an operator.
>
> Options (a) and (b) perform the same kind of polymorphic resolution on *both* operands, which is how many languages work for functions and/or methods already.
>
> Reading the C# spec, if there is more than one candidate overload which is equally specific, an error is raised. I guess you could do the same even with one implementation per class, by replacing step 4 in your algorithm:
>
> > 4. If neither of them are direct descendants of the other, and only one implements the operator, use it.
> > 5. If neither of them are direct descendants of the other, and both implement the operator, throw an error.
>
> Let's call that option (h) :)
>
> By the way, searching online for the phrase "Polymorphic Handler Resolution" finds no results other than you saying it is the name for this algorithm.



Hmmm, I will see if I can find where I came across the term in my original research then. I did about 4 months of research for my RFC, but that was several years ago at this point, so I might be mistaken.

So I understand here that you're looking for commutativity in *which overload is actually called*, even if it doesn't create commutativity in the result of the operation. That is, the *executed overload* should be the same no matter the order of the operands.

This was something I also was interested in, but I could not find a solution I was happy with. All of the things you have detailed here have tradeoffs that I'm unsure about. This is an open question of design that I feel requires more input and more voices from others who are interested, because I don't feel like any of these approaches (including the one that I went with) is better; they are just different.
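
For anyone who wants to experiment with the trade-offs, the four-step resolution quoted in this thread can be sketched in userland PHP roughly as below. This is only an approximation under stated assumptions: every name (`OperandPosition`, `Addable`, `resolveSide()`, `applyAdd()`, `Length`, `SurveyedLength`) is hypothetical, the `add()` method stands in for a real overload, and `is_subclass_of()` matches any descendant rather than only a direct child.

```php
<?php
declare(strict_types=1);

// All names below are hypothetical; nothing here is part of PHP or of any RFC text.
enum OperandPosition
{
    case Left;
    case Right;
}

interface Addable
{
    // $position records where $this stood in the original `left + right`.
    public function add(mixed $other, OperandPosition $position): mixed;
}

// Decide which operand's overload runs; null means "no objects, use built-in +".
function resolveSide(mixed $left, mixed $right): ?OperandPosition
{
    // Step 1: if only one operand is an object, it handles the operation.
    if (!is_object($left) && !is_object($right)) {
        return null;
    }
    if (!is_object($right)) {
        return OperandPosition::Left;
    }
    if (!is_object($left)) {
        return OperandPosition::Right;
    }
    // Step 2: same class, so the left operand wins.
    if ($left::class === $right::class) {
        return OperandPosition::Left;
    }
    // Step 3: a descendant wins over its ancestor. is_subclass_of() accepts
    // any descendant, not only a direct child (an assumption of this sketch).
    if (is_subclass_of($right, $left::class)) {
        return OperandPosition::Right;
    }
    // Step 3 (left is the descendant) and step 4 (unrelated classes) both pick the left.
    return OperandPosition::Left;
}

function applyAdd(mixed $left, mixed $right): mixed
{
    $side = resolveSide($left, $right);
    if ($side === null) {
        return $left + $right;
    }
    [$handler, $other] = $side === OperandPosition::Left ? [$left, $right] : [$right, $left];
    if (!($handler instanceof Addable)) {
        throw new TypeError('Operand does not implement the + overload');
    }
    // Step 4: a TypeError thrown by the handler propagates; the other operand is never retried.
    return $handler->add($other, $side);
}

class Length implements Addable
{
    public function __construct(public readonly float $metres) {}

    public function add(mixed $other, OperandPosition $position): mixed
    {
        if (!($other instanceof Length)) {
            throw new TypeError('Length can only be added to another Length');
        }
        return new static($this->metres + $other->metres);
    }
}

class SurveyedLength extends Length {}

$sum = applyAdd(new Length(1.5), new SurveyedLength(2.0));
echo $sum::class, ' ', $sum->metres, "\n"; // handled by the SurveyedLength operand
```

Swapping the two arguments of `applyAdd()` in the last lines still selects the `SurveyedLength` operand, which is the symmetry step 3 provides; for two unrelated classes the choice falls back to source order, which is the asymmetry in step 4.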

>> This is similar to what I originally designed, and I actually moved to an enum based on feedback. The argument was that something like `$isReversed` or `$left` and so on is somewhat ambiguous, while the enum makes it extremely explicit.
>
> Ah, fair enough. Explicitness vs conciseness is always a trade-off. My thinking was that the "reversed" form would be far more rarely called than the "normal" form; but that depends a lot on which resolution algorithm is used.
>
> Regards,
>
> --
> Rowan Tommins
> [IMSoP]

It would also depend on whether it is used with scalars.

For instance, `$numObj - 5` and `5 - $numObj`. For both of these, you want to call the overload on `$numObj`, because it's the only avenue that won't result in a fatal error (assuming that the overload knows how to work with int values). The case of an object with an overload being used with an operand that is a non-object will most likely result in reversed calls quite frequently. This will be a prominent issue for some use cases (like arbitrary precision math), and an almost non-existent issue for other use cases (like currency or time).
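
A small sketch of why the reversed call matters for a non-commutative operator. The names are hypothetical: `Num` is a toy stand-in for an arbitrary-precision type (it just wraps an int), `OperandPosition` is an assumed shape for the enum discussed above, and `subtract()` stands in for an overloaded `-`.

```php
<?php
declare(strict_types=1);

// Hypothetical enum: where the object stood in the original expression.
enum OperandPosition
{
    case Left;
    case Right;
}

// Toy stand-in for an arbitrary-precision number.
final class Num
{
    public function __construct(public readonly int $value) {}

    // Stand-in for an overloaded `-` that also accepts plain ints.
    public function subtract(int|Num $other, OperandPosition $position): Num
    {
        $otherValue = $other instanceof Num ? $other->value : $other;

        // When $this was the right-hand operand, the operands must be swapped back.
        return $position === OperandPosition::Left
            ? new Num($this->value - $otherValue)
            : new Num($otherValue - $this->value);
    }
}

$numObj = new Num(12);

// `$numObj - 5`: the object is on the left, the normal call.
echo $numObj->subtract(5, OperandPosition::Left)->value, "\n";  // prints 7

// `5 - $numObj`: the int has no overload, so the object is called "reversed".
echo $numObj->subtract(5, OperandPosition::Right)->value, "\n"; // prints -7
```

An overload that ignored the position argument would return 7 for both expressions.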

Jordan