Newsgroups: php.internals
Xref: news.php.net php.internals:108311
From: nikita.ppv@gmail.com (Nikita Popov)
To: jan.h.boehmer@gmx.de
Cc: PHP internals
Date: Wed, 29 Jan 2020 10:20:16 +0100
Subject: Re: [PHP-DEV] Operator overloading for userspace objects
In-Reply-To: <00ea01d5d630$b18d4f20$14a7ed60$@gmx.de>

On Wed, Jan 29, 2020 at 12:14 AM wrote:

> Hello everybody,
>
> over the last few days I have experimented a bit with operator overloading
> in userspace classes (redefining the meaning of arithmetic operations like
> +, -, *, etc. for your own classes).
>
> This could be useful for libraries which implement custom arithmetic
> objects (like money values, tensors, etc.)
> or things like the Symfony string component's (concatenate) operator,
> because it improves readability a lot:
>
> $x * ($a + $b) instead of $x->multiply($a->add($b))
>
> Four years ago there was an RFC on this topic
> (https://wiki.php.net/rfc/operator-overloading), which was discussed a bit
> (https://externals.io/message/89967), but there was no real outcome.
>
> I have tried to implement a proof of concept of the RFC and encountered
> some problems when implementing the operator functions as (non-static)
> class members that are passed only the “other” argument: what happens when
> we encounter an expression like 2/$a, and how can the class distinguish
> this from $a/2? Also, not every operation on every structure is
> commutative (e.g. for matrices A*B =/= B*A). So I tried a C#-like
> approach, where the operator implementations are static functions in the
> class and both arguments are passed. In my PHP implementation this would
> look something like this:
>
> class X {
>     public static function __add($lhs, $rhs) {
>         //...
>     }
> }
>
> The class function can thus decide what to do based on both operands (so
> it can tell whether the developer wrote 2/$a or $a/2). Also, that way an
> implementor cannot return $this by accident, which could lead to
> unintended side effects if the result of the operation is somehow mutated.

Using static methods sounds reasonable to me.

> I have taken over the idea of defining a magic function for each operation
> (like Python does), because I think it is the clearest way to see which
> operators a class implements (could be useful for static analysis).
> The downside of this approach is that it greatly increases the number of
> magic functions (my PoC code defines 13 additional magic functions, and
> the unary operators are still missing), so some people in the original
> discussion suggested defining a single (magic) function, to which the
> operator is passed and where the user code decides what to do. The
> advantage is that this is very extensible (with the right parser
> implementation, you could even define your own new operators), at the
> cost that this method becomes very complex for data structures which use
> multiple operators (large if-else or switch constructions, which delegate
> the logic to the appropriate functions). Another idea mentioned was to
> extract interfaces with common functionality (like Arithmetically,
> Comparable, etc.), as done with the ArrayAccess or Countable interfaces.
> The problem that I see here is that this approach is rather inflexible and
> it would be difficult to extract really universal interfaces (e.g. vectors
> do not need a division (/) operation, but the concatenation . could be
> really useful for implementing the dot product). This would lead either to
> only parts of the interfaces being implemented (with the others just
> throwing exceptions) or to interfaces containing only one or two functions
> (so we would have many interfaces instead of magic functions in the end).

Yes, I don't think it makes sense to group these operations in interfaces;
the use cases are just too diverse. It's possible to define one interface
per operator (e.g. what Rust does), though I don't think this is going to
be particularly useful in PHP. I would not want to see functions accepting
int|float|(Add&Mul) show up because someone is trying to be overly generic
in their interfaces ;)

As to whether it should be a single method or multiple, I would go for
multiple methods, as that makes it clearer which operators are overloaded
from an API perspective.
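
To make the static, one-method-per-operator design concrete, here is a minimal sketch in plain PHP. The Ratio class, its lift() helper and the method name __div are assumptions made up for illustration (only __add appears in the quoted mail). Stock PHP does not dispatch + or / to such methods, so the sketch calls them explicitly where the engine would.

```php
<?php
declare(strict_types=1);

// Hypothetical value class, for illustration only.
final class Ratio
{
    /** @var int */
    public $num;
    /** @var int */
    public $den;

    public function __construct(int $num, int $den = 1)
    {
        $this->num = $num;
        $this->den = $den;
    }

    // Promote plain integers so that either operand may be a scalar.
    private static function lift($value): self
    {
        return $value instanceof self ? $value : new self($value);
    }

    // One static method per operator; both operands are passed in order.
    public static function __add($lhs, $rhs): self
    {
        $l = self::lift($lhs);
        $r = self::lift($rhs);
        return new self($l->num * $r->den + $r->num * $l->den, $l->den * $r->den);
    }

    // __div is an assumed name; the operand order decides the result.
    public static function __div($lhs, $rhs): self
    {
        $l = self::lift($lhs);
        $r = self::lift($rhs);
        return new self($l->num * $r->den, $l->den * $r->num);
    }
}

$a = new Ratio(3, 4);
$x = Ratio::__div(2, $a);  // stands in for 2 / $a, giving 8/3
$y = Ratio::__div($a, 2);  // stands in for $a / 2, giving 3/8
$z = Ratio::__add($a, 1);  // stands in for $a + 1, giving 7/4
```

Because both operands arrive as arguments, the method never needs to guess on which side of the operator its own class appeared, and there is no $this to return by accident.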
> On the topic of which operators should be overloadable: my PoC
> implementation has magic functions for the arithmetic operators (+, -, *,
> /, %, **), string concatenation (.), and bit operations (>>, <<, &, |, ^).
> Comparison and equality checks are implemented using a common __compare()
> function, which acts like an overload of the spaceship operator. The
> comparison operators (<, >, <=, >=, ==) are evaluated depending on whether
> it returns -1, 0 or +1. I think this way we can enforce that the assumed
> standard logic of comparison (e.g. !($a<$b)=($a>=$b) and ($a<$b)=($b>$a))
> is implemented. Also, I don’t think this would restrict real-world
> applications much (if you have an example where separate definitions of <
> and >= could be useful, please comment).

I would recommend not handling overloading of comparisons in the same
proposal. Comparison is more widely useful than other overloading and has a
more complex design space (especially when it comes to accommodating
objects that can only be compared for equality/inequality, for example).
Comparison may also benefit more from having an interface than the other
operators.

> Unlike the original idea, I don’t think it should be possible to overwrite
> the identity operator (===), because it should always be possible to check
> whether two objects are really identical (also, every case should be
> coverable by equality). The same applies to the logic operators (!, ||,
> &&); I think they should always work as intended (other languages like
> Python and C# handle it that way too).

I agree that === should not be overloadable. || and && are
short-circuiting, so overloading them in any meaningful way would be pretty
hard anyway (we'd have to implicitly wrap the RHS into a closure or ...
something?)
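
For reference, the single-__compare() scheme from the quoted mail can be sketched in plain PHP as below. The Version class and the evaluateComparison() helper are hypothetical names introduced here; the helper only imitates what the engine would do with the three-way result, it is not an existing PHP API.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class Version
{
    /** @var int */
    public $major;
    /** @var int */
    public $minor;

    public function __construct(int $major, int $minor)
    {
        $this->major = $major;
        $this->minor = $minor;
    }

    // Three-way comparison: negative, zero or positive, like <=>.
    public static function __compare(Version $lhs, Version $rhs): int
    {
        return [$lhs->major, $lhs->minor] <=> [$rhs->major, $rhs->minor];
    }
}

// Assumed mapping from each comparison operator to a test on the
// __compare() result; every operator derives from the same value.
function evaluateComparison(string $op, Version $lhs, Version $rhs): bool
{
    $c = Version::__compare($lhs, $rhs);
    switch ($op) {
        case '<':
            return $c < 0;
        case '>':
            return $c > 0;
        case '<=':
            return $c <= 0;
        case '>=':
            return $c >= 0;
        case '==':
            return $c === 0;
    }
    throw new InvalidArgumentException("Unsupported operator: $op");
}

$old = new Version(7, 4);
$new = new Version(8, 0);
$isOlder = evaluateComparison('<', $old, $new);
$isNotOlder = evaluateComparison('>=', $old, $new);
```

Since all five operators read the same integer, !($a<$b) and ($a>=$b) cannot disagree. The flip side is the limitation raised in the reply: a type that supports only equality has no sensible ordering to return.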
> For the shorthand assignment operators like += and -= the situation is a
> bit more complicated: on the one hand, the user has learned that $a+=1 is
> just an abbreviation of $a=$a+1, so this logic should apply to overloaded
> operators as well (in C# it is implemented like this). On the other hand,
> it could be useful to differentiate between the two cases, so you can
> mutate the object itself (in the += case) instead of returning a new
> object instance (the class cannot know it is assigned to its own
> reference when $a + 1 is called). Personally I don’t think that this would
> be a big problem, so my PoC code does not provide a way to override the
> shorthand operators. For the increment/decrement operators ($a++) it is
> similar: it would be nice if it were possible to overload them, but on the
> other hand their use cases are really limited beyond integer
> incrementation, and if you want to trigger something more complex you
> should call a method to make your intent clear.

At least to start with, I don't think we should offer separate overloading
of $a += 1; we should treat it as $a = $a + 1, as the existing operator
overloading implementation does. Operators currently only work on values
that use by-value passing semantics, so if you write something like

$b = $a = 1; $a += 1;

then $a will be 2, but $b will be 1. Using the $a = $a + 1 expansion for
operator overloading ensures that this is also the case for objects.

Of course there are performance concerns here, and it could in some cases
be significantly more efficient to perform an in-place modification. It is
possible to allow that while still keeping the above semantics by only
allowing an in-place modification if $a has no other users (something that
we can check in the VM). But I don't think this should be part of an
initial proposal.
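
A small sketch of why the expansion matters for objects, in plain PHP. Counter and addInPlace() are hypothetical names used only for this illustration, and the explicit Counter::__add() call stands in for what $a += 1 would expand to under the proposal.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class Counter
{
    /** @var int */
    public $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }

    // Returns a new instance; neither operand is modified.
    public static function __add($lhs, $rhs): self
    {
        $l = $lhs instanceof self ? $lhs->value : $lhs;
        $r = $rhs instanceof self ? $rhs->value : $rhs;
        return new self($l + $r);
    }

    // In-place variant, shown only to contrast with the expansion.
    public function addInPlace(int $n): void
    {
        $this->value += $n;
    }
}

$a = new Counter(1);
$b = $a;                     // $a and $b refer to the same object
$a = Counter::__add($a, 1);  // stands in for $a += 1 as $a = $a + 1
// $a->value is now 2, while $b->value is still 1

$c = new Counter(1);
$d = $c;                     // $c and $d refer to the same object
$c->addInPlace(1);           // the mutation is visible through $d too
```

With the expansion, objects behave like the integer example above: $b keeps the old value. An in-place overload would change the object that $d still refers to, which is why it would only be safe when the operand has no other users.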
> On the topic of the order in which the operators should be executed:
> besides the normal priority (defined by PHP), my code checks whether the
> element on the left side is an object and tries to call the appropriate
> magic function on it. If this is not possible, the same is done for the
> right argument. This should cover most of the use cases, with some
> exceptions: consider an expression like $a / $b, where $a and $b have
> different classes (class A and class B). If class B knows how to divide
> class A, but class A does not know about class B, we encounter a problem
> when evaluating just from left to right (and checking whether the magic
> method exists). A solution would be that object $a can express that it
> does not know how to handle class B (e.g. by returning null or throwing a
> special exception), so that PHP can call the handler on object $b. I'm not
> sure how common this problem would be, so I have no idea how useful this
> feature would be.

That sounds reasonable to me.

> My proof-of-concept implementation can be found here:
>
> https://github.com/jbtronics/php-src

Unfortunately, this implementation goes in the wrong direction: PHP already
has full internal support for operator overloading through the do_operation
object handler. Operator overloading should be exposed to userland through
that handler as well.

> Here you can find some basic demo code using it:
>
> https://gist.github.com/jbtronics/ee6431e52c161ddd006f8bb7e4f5bcd6
>
> I would be happy to hear some opinions on this concept, and on the idea of
> overloadable operators in PHP in general.

Thanks for working on this :) I think overloaded operators are a reasonable
addition to the language at this point. I think the main concern people
tend to have in this area is that operator overloading is going to be
abused (see for example << in C++).
There are many very good use cases for operator overloading though (as
mentioned, vector/matrix calculations, complex, rationals, money, ...).
Some of those are not common in PHP, but maybe the lack of operator
overloading is part of the problem there ;)

Regards,
Nikita