Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:125600
Feedback-ID: id5114917:Fastmail
Content-Type: multipart/alternative;
 boundary="------------UoRLhg2uzwzXU6Zpc6nRHVZz"
Message-ID: <0f0444eb-8fc5-4c56-8528-5aa528988e73@rwec.co.uk>
Date: Tue, 17 Sep 2024 20:25:43 +0100
Precedence: bulk
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads
 (again)
To: internals@lists.php.net
References: <CAMrTa2E2m00gOoGy5FbvkrLwM1AUP3ZmJXHkxh_R2qEszQ=NWA@mail.gmail.com>
 <2551c06a-ec1f-4870-a590-aeb5752fc944@rwec.co.uk>
 <CAMrTa2FLJwjwy9uGpRJJJFMUD1vs=vqzv8x2VxMsYeBqMN4m9Q@mail.gmail.com>
Content-Language: en-GB
In-Reply-To: <CAMrTa2FLJwjwy9uGpRJJJFMUD1vs=vqzv8x2VxMsYeBqMN4m9Q@mail.gmail.com>
From: imsop.php@rwec.co.uk ("Rowan Tommins [IMSoP]")

This is a multi-part message in MIME format.
--------------UoRLhg2uzwzXU6Zpc6nRHVZz
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 17/09/2024 18:15, Jordan LeDoux wrote:
>
>     1. Are we over-riding *operators* or *operations*? That is, is the
>     user
>     saying "this is what happens when you put a + symbol between two Foo
>     objects", or "this is what happens when you add two Foo objects
>     together"?
>
>
> If we allow developers to define arbitrary code which is executed as a 
> result of an operator, we will always end up allowing the first one.


I don't think that's really true. Take the behaviour of comparisons in 
your previous RFC: if that RFC had been accepted, the user would have 
had no way to make $a < $b and $a > $b have different behaviour, because 
the same overload would be called, with the same parameters, in both cases.
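
To illustrate, here is a rough sketch (in Python, purely for 
illustration; "compare", "less_than" and "greater_than" are names I've 
made up, standing in for a single comparison overload and for what the 
engine would do with < and >):

```python
class Money:
    def __init__(self, amount):
        self.amount = amount

    # Hypothetical single comparison overload: returns -1, 0 or 1.
    def compare(self, other):
        return (self.amount > other.amount) - (self.amount < other.amount)


# Stand-ins for the engine's handling of "<" and ">". Both route through
# the same method with the same arguments, so the class author has no
# hook to make the two operators disagree.
def less_than(a, b):
    return a.compare(b) < 0


def greater_than(a, b):
    return a.compare(b) > 0
```

Whatever "compare" returns, the two operators can only ever interpret 
the same result.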

Slightly less strict is requiring groups of operators: the Haskell "Num" 
typeclass (roughly similar to an interface) requires definitions for all 
of "+", "*", "abs", "signum", "fromInteger", and either unary or binary 
"-". It also defines the type signatures for each. If this were the only 
way to overload the "+" operator, users would have to really go out of 
their way to use it to mean something unrelated to addition.

As it happens, Haskell *does* allow arbitrary operator overloads, and in 
fact goes to the other extreme and allows entirely new operators to be 
invented. The same is true in PostgreSQL - you can implement the 
<<//-^+^-//>>  operator if you want to.

I think it's absolutely possible - and desirable - to choose a 
philosophical position on that spectrum, and use it to drive design 
decisions. The choice of "__add" vs "operator+" is one such decision.


> The approach I plan to use for this question has a name: Polymorphic 
> Handler Resolution. The overload that is executed will be decided by 
> the following series of decisions:
>
> 1. Are both of the operands objects? If not, use the overload on the 
> one that is. (NOTE: if neither are objects, the new code will be 
> bypassed entirely, so I do not need to handle this case)
> 2. If they are both objects, are they both instances of the same 
> class? If they are, use the overload of the one on the left.
> 3. If they are not objects of the same class, is one of them a direct 
> descendant of the other? If so, use the overload of the descendant.
> 4. If neither of them are direct descendants of the other, use the 
> overload of the object on the left. Does it produce a type error 
> because it does not accept objects of the type in the other position? 
> Return the error and abort instead of re-trying by using the overload 
> on the right.


This is option (g) in my list, with the additional "prefer sub-classes" 
rule (step 3), which I agree would be a good addition.

As noted, it doesn't provide symmetry, because step 4 depends on the 
order in the source code. Option (c) is the same algorithm without step 
4, so it guarantees that $a + $b and $b + $a will always call the same 
method.
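
A rough sketch of the four quoted steps, to make the asymmetry concrete 
(Python, purely for illustration; "PhpObject", "handler_class" and the 
example classes are my own invented names, and I'm assuming "direct 
descendant" means any subclass):

```python
class PhpObject:
    """Stand-in for 'is an object'; plain ints etc. count as scalars."""


def handler_class(left, right):
    """Return the class whose overload would be called, or None."""
    l_obj = isinstance(left, PhpObject)
    r_obj = isinstance(right, PhpObject)
    # Step 1: exactly one operand is an object, so use that one.
    if l_obj != r_obj:
        return type(left) if l_obj else type(right)
    if not l_obj:
        return None  # neither is an object: overloads are bypassed
    # Step 2: same class, so use the left operand's overload.
    if type(left) is type(right):
        return type(left)
    # Step 3: one is a subclass of the other, so use the subclass
    # (assumption: any descendant counts, not only an immediate child).
    if isinstance(left, type(right)):
        return type(left)
    if isinstance(right, type(left)):
        return type(right)
    # Step 4: unrelated classes, so fall back to source order.
    return type(left)


class Money(PhpObject): pass
class Euro(Money): pass
class Vector(PhpObject): pass
```

With these, handler_class(Money(), Euro()) and handler_class(Euro(), 
Money()) both pick Euro, but swapping Money() and Vector() changes the 
answer, because only step 4 looks at which operand is on the left.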

Options (d), (e), and (f) each add an extra step: one operand can signal 
"I don't know" and the other operand gets a chance to answer. They're 
essentially ways to "partially implement" an operator.
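
Python's NotImplemented is one existing mechanism of this kind. I'm not 
claiming it matches any one of (d), (e), or (f) exactly; this is just a 
minimal sketch of the general idea, with invented example classes:

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        # "I don't know": Python will now ask the right-hand operand.
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Reached for `other + self` only after other.__add__ declined.
        if isinstance(other, Metres):
            return Metres(other.value + self.value * 0.3048)
        return NotImplemented


total = Metres(1) + Feet(10)  # handled by Feet.__radd__
```

If both operands decline, Python raises a TypeError, so Metres only has 
to implement the cases it actually understands.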

Options (a) and (b) perform the same kind of polymorphic resolution on 
*both* operands, which is how many languages work for functions and/or 
methods already.


Reading the C# spec, if there is more than one candidate overload which 
is equally specific, an error is raised. I guess you could do the same 
even with one implementation per class, by replacing step 4 in your 
algorithm:

 > 4. If neither of them are direct descendants of the other, and only 
 > one implements the operator, use it.
 > 5. If neither of them are direct descendants of the other, and both 
 > implement the operator, throw an error.

Let's call that option (h) :)
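
As a sketch of just those two replacement steps (Python again, purely 
for illustration; "implements_add", "resolve_unrelated" and 
"AmbiguousOperatorError" are hypothetical names, and returning None when 
neither side implements the operator is my own assumption, since the 
steps above don't cover that case):

```python
class AmbiguousOperatorError(TypeError):
    """Hypothetical error type for step 5."""


def implements_add(obj):
    # Hypothetical check: the object's class defines an "add" overload.
    return callable(getattr(obj, "add", None))


def resolve_unrelated(left, right):
    """Steps 4 and 5 only; assumes steps 1-3 found no answer."""
    candidates = [o for o in (left, right) if implements_add(o)]
    if len(candidates) == 1:
        return candidates[0]  # step 4: the only implementation wins
    if len(candidates) == 2:
        # Step 5: both implement it, so refuse to guess.
        raise AmbiguousOperatorError("both operands implement the operator")
    return None  # assumption: nothing to call, left to the engine


class Money:
    def add(self, other): ...

class Vector:
    def add(self, other): ...

class Label:
    pass
```

Here resolve_unrelated(Money(), Label()) and resolve_unrelated(Label(), 
Money()) pick the same Money object, so the result no longer depends on 
operand order, while Money() with Vector() raises.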


By the way, searching online for the phrase "Polymorphic Handler 
Resolution" finds no results other than you saying it is the name for 
this algorithm.


> This is similar to what I originally designed, and I actually moved to 
> an enum based on feedback. The argument was something like 
> `$isReversed` or `$left` or so on is somewhat ambiguous, while the 
> enum makes it extremely explicit.


Ah, fair enough. Explicitness vs conciseness is always a trade-off. My 
thinking was that the "reversed" form would be far more rarely called 
than the "normal" form; but that depends a lot on which resolution 
algorithm is used.


Regards,

-- 
Rowan Tommins
[IMSoP]

--------------UoRLhg2uzwzXU6Zpc6nRHVZz
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 8bit

<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 17/09/2024 18:15, Jordan LeDoux
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAMrTa2FLJwjwy9uGpRJJJFMUD1vs=vqzv8x2VxMsYeBqMN4m9Q@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr"><br>
        <div class="gmail_quote">
          <blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            1. Are we over-riding *operators* or *operations*? That is,
            is the user <br>
            saying "this is what happens when you put a + symbol between
            two Foo <br>
            objects", or "this is what happens when you add two Foo
            objects together"?<br>
          </blockquote>
          <div><br>
          </div>
          <div>If we allow developers to define arbitrary code which is
            executed as a result of an operator, we will always end up
            allowing the first one.<br>
          </div>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>I don't think that's really true. Take the behaviour of
      comparisons in your previous RFC: if that RFC had been accepted,
      the user would have had no way to make $a &lt; $b and $a &gt; $b
      have different behaviour, because the same overload would be
      called, with the same parameters, in both cases.</p>
    <p>Slightly less strict is requiring groups of operators: the
      Haskell "Num" typeclass (roughly similar to an interface) requires
      definitions for all of "+", "*", "abs", "signum", "fromInteger",
      and either unary or binary "-". It also defines the type
      signatures for each. If this were the only way to overload the "+"
      operator, users would have to really go out of their way to use it
      to mean something unrelated to addition.</p>
    <p>As it happens, Haskell *does* allow arbitrary operator overloads,
      and in fact goes to the other extreme and allows entirely new
      operators to be invented. The same is true in PostgreSQL - you can
      implement the &lt;&lt;//-^+^-//&gt;&gt;  operator if you want to.<br>
    </p>
    <p>I think it's absolutely possible - and desirable - to choose a
      philosophical position on that spectrum, and use it to drive
      design decisions. The choice of "__add" vs "operator+" is one such
      decision.<br>
    </p>
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAMrTa2FLJwjwy9uGpRJJJFMUD1vs=vqzv8x2VxMsYeBqMN4m9Q@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <div> </div>
          The approach I plan to use for this question has a name:
          Polymorphic Handler Resolution. The overload that is executed
          will be decided by the following series of decisions:
          <div><br>
          </div>
          <div>1. Are both of the operands objects? If not, use the
            overload on the one that is. (NOTE: if neither are objects,
            the new code will be bypassed entirely, so I do not need to
            handle this case)</div>
          <div>2. If they are both objects, are they both instances of
            the same class? If they are, use the overload of the one on
            the left.</div>
          <div>3. If they are not objects of the same class, is one of
            them a direct descendant of the other? If so, use the
            overload of the descendant.</div>
          <div>4. If neither of them are direct descendants of the
            other, use the overload of the object on the left. Does it
            produce a type error because it does not accept objects of
            the type in the other position? Return the error and abort
            instead of re-trying by using the overload on the right.</div>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>This is option (g) in my list, with the additional "prefer
      sub-classes" rule (step 3), which I agree would be a good
      addition.<br>
    </p>
    <p>As noted, it doesn't provide symmetry, because step 4 depends on
      the order in the source code. Option (c) is the same algorithm
      without step 4, so it guarantees that $a + $b and $b + $a will
      always call the same method.</p>
    <p>Options (d), (e), and (f) each add an extra step: one operand can
      signal "I don't know" and the other operand gets a chance to
      answer. They're essentially ways to "partially implement" an
      operator.</p>
    <p>Options (a) and (b) perform the same kind of polymorphic
      resolution on *both* operands, which is how many languages work
      for functions and/or methods already. </p>
    <p><br>
    </p>
    <p>Reading the C# spec, if there is more than one candidate overload
      which is equally specific, an error is raised. I guess you could
      do the same even with one implementation per class, by replacing
      step 4 in your algorithm:<br>
    </p>
    <p>&gt; 4. If neither of them are direct descendants of the other,
      and only one implements the operator, use it.<br>
      &gt; 5. If neither of them are direct descendants of the other,
      and both implement the operator, throw an error.</p>
    <p>Let's call that option (h) :)<br>
    </p>
    <p><br>
    </p>
    <p>By the way, searching online for the phrase "Polymorphic Handler
      Resolution" finds no results other than you saying it is the name
      for this algorithm.<br>
    </p>
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAMrTa2FLJwjwy9uGpRJJJFMUD1vs=vqzv8x2VxMsYeBqMN4m9Q@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <div>This is similar to what I originally designed, and I
            actually moved to an enum based on feedback. The argument
            was something like `$isReversed` or `$left` or so on is
            somewhat ambiguous, while the enum makes it extremely
            explicit.</div>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>Ah, fair enough. Explicitness vs conciseness is always a
      trade-off. My thinking was that the "reversed" form would be far
      more rarely called than the "normal" form; but that depends a lot
      on which resolution algorithm is used.<br>
    </p>
    <p><br>
    </p>
    <p>Regards,<br>
    </p>
    <pre class="moz-signature" cols="72">-- 
Rowan Tommins
[IMSoP]</pre>
  </body>
</html>

--------------UoRLhg2uzwzXU6Zpc6nRHVZz--