Newsgroups: php.internals
Message-ID: <55C25E7C-596F-46E8-A812-A7DF86A17546@benramsey.com>
Date: Tue, 28 Jan 2020 17:47:30 -0600
From: ben@benramsey.com (Ben Ramsey)
To: jan.h.boehmer@gmx.de
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Operator overloading for userspace objects
In-Reply-To: <00ea01d5d630$b18d4f20$14a7ed60$@gmx.de>
References: <00ea01d5d630$b18d4f20$14a7ed60$@gmx.de>

> On Jan 28, 2020, at 17:14, jan.h.boehmer@gmx.de wrote:
>
> Hello everybody,
>
> the last days I have experimented a bit with operator overloading in
> userspace classes (redefining the meaning of arithmetic operations like +, -,
> *, etc. for your own classes).
>
> This could be useful for libraries that implement custom arithmetic objects
> (like money values, tensors, etc.) or for things like the Symfony string
> component (concatenation operator), because it improves readability a lot:
>
> $x * ($a + $b) instead of $x->multiply($a->add($b))
>
> 4 years ago, there was an RFC about this topic
> (https://wiki.php.net/rfc/operator-overloading), which was discussed a bit
> (https://externals.io/message/89967), but there was no real outcome.
>
> I have tried to implement a proof of concept of the RFC. I encountered some
> problems when implementing the operator functions as (non-static) class
> members and passing them only the "other" argument: what happens when we
> encounter an expression like 2/$a, and how can the class distinguish this
> from $a/2? Also, not every operation on every structure is commutative
> (e.g. for matrices A*B =/= B*A). So I tried a C#-like approach, where the
> operator implementations are static functions in the class, and both
> arguments are passed. In my PHP implementation this would look something
> like this:
>
> class X {
>     public static function __add($lhs, $rhs) {
>         //...
>     }
> }
>
> The class function can then decide what to do based on both operands (so it
> can tell whether the developer wrote 2/$a or $a/2). Also, that way an
> implementor cannot return $this by accident, which could lead to unintended
> side effects if the result of the operation is somehow mutated.
>
> I have taken over the idea of defining a magic function for each operation
> (like Python does), because I think that is the clearest way to see
> what operators a class implements (could be useful for static analysis).
> The downside to this approach is that it greatly increases the number of
> magic functions (my PoC code defines 13 additional magic functions, and the
> unary operators are still missing), so some people in the original
> discussion suggested defining a single (magic) function, to which the
> operator is passed, and letting the user code decide what to do. The
> advantage is that this is very extensible (with the right parser
> implementation, you could even define your own new operators), at the cost
> that this method will become very complex for data structures which use
> multiple operators (large if-else or switch constructions, which delegate
> the logic to the appropriate functions). Another idea mentioned was to
> extract interfaces with common functionality (like Arithmetically,
> Comparable, etc.), as is done with the ArrayAccess or Countable interfaces.
> The problem that I see here is that this approach is rather inflexible, and
> it would be difficult to extract really universal interfaces (e.g. vectors
> do not need a division (/) operation, but the concatenation operator .
> could be really useful for implementing the dot product). This would lead
> either to only parts of the interfaces being implemented (with the rest
> just throwing exceptions) or to interfaces containing only one or two
> functions (so we would have many interfaces instead of magic functions in
> the end).
>
> On the topic of which operators should be overloadable: my PoC
> implementation has magic functions for the arithmetic operators (+, -, *, /,
> %, **), string concatenation (.), and bit operations (>>, <<, &, |, ^).
> Comparison and equality checks are implemented using a common __compare()
> function, which acts like an overload of the spaceship operator. Based on
> whether -1, 0 or +1 is returned, the comparison operators (<, >, <=, >=, ==)
> are evaluated.
> I think this way we can enforce that the assumed standard logic of
> comparison (e.g. !($a<$b)=($a>=$b) and ($a<$b)=($b>$a)) is implemented.
> Also, I don't think this would restrict real-world applications much (if
> you have an example where a separate definition of < and >= could be
> useful, please comment on it).
>
> Unlike the original idea, I don't think it should be possible to overwrite
> the identity operator (===), because it should always be possible to check
> if two objects are really identical (also, every case should be coverable
> by equality). The same applies to the logic operators (!, ||, &&); I think
> they should always work as intended (other languages like Python and C#
> handle it that way too).
>
> For the shorthand assignment operators like +=, -= the situation is a bit
> more complicated: on the one hand, the user has learned that $a+=1 is just
> an abbreviation of $a=$a+1, so this logic should apply to overloaded
> operators as well (in C# it is implemented like this). On the other hand, it
> could be useful to differentiate between the two cases, so you can mutate
> the object itself (in the += case) instead of returning a new object
> instance (the class cannot know it is assigned to its own reference when
> $a + 1 is called). Personally, I don't think that this would be a big
> problem, so my PoC code does not provide a possibility to override the
> shorthand operators. For the increment/decrement operators ($a++) it is
> similar: it would be nice if it were possible to overload these operators,
> but on the other hand the use cases for them are really limited besides
> integer incrementation, and if you want to trigger something more complex,
> you should call a method to make your intent clear.
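To make the static, two-operand idea and the `$a = $a + 1` reading of `+=` concrete, here is a minimal sketch that runs on current PHP. It assumes a hypothetical `Money` class and uses an ordinary static method, `Money::add()`, as a stand-in for the proposed `__add()` handler, which PHP does not have. All names are made up for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical value object. The static add() method stands in for the
// proposed __add() handler; it is NOT an existing PHP magic method.
final class Money
{
    /** @var int amount in cents */
    private $cents;

    public function __construct(int $cents)
    {
        $this->cents = $cents;
    }

    public function cents(): int
    {
        return $this->cents;
    }

    // Receives both operands, so "int + Money" and "Money + int" both work.
    // Always returns a new instance; neither operand is modified.
    public static function add($lhs, $rhs): Money
    {
        $l = $lhs instanceof self ? $lhs->cents : (int) $lhs;
        $r = $rhs instanceof self ? $rhs->cents : (int) $rhs;

        return new self($l + $r);
    }
}

$a = new Money(100);
$alias = $a; // second handle to the same object

// What `$a += 50` would amount to if it simply means `$a = $a + 50`.
$a = Money::add($a, 50);

echo $a->cents(), "\n";     // 150
echo $alias->cents(), "\n"; // 100: the original object was not mutated
```

Because the handler is static and builds a new object, there is no `$this` to return by accident, and anything else still holding the old object keeps seeing the old value.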
>
> On the topic of the order in which the operators should be executed:
> besides the normal priority (defined by PHP), my code checks if the element
> on the left side is an object and tries to call the appropriate magic
> function on it. If this is not possible, the same is done for the right
> argument. This should cover most of the use cases, with some exceptions:
> consider an expression like $a / $b, where $a and $b have different classes
> (class A + class B). If class B knows how to divide class A, but class A
> does not know about class B, we encounter a problem when evaluating just
> from left to right (and checking if the magic method exists). A solution
> for that would be that object $a can express that it does not know how to
> handle class B (e.g. by returning null or throwing a special exception), and
> PHP can then call the handler on object $b. I'm not sure how common this
> problem would be, so I don't know how useful this feature would be.
>
> My proof-of-concept implementation can be found here:
> https://github.com/jbtronics/php-src
>
> Here you can find some basic demo code using it:
> https://gist.github.com/jbtronics/ee6431e52c161ddd006f8bb7e4f5bcd6
>
> I would be happy to hear some opinions on this concept, and on the idea of
> overloadable operators in PHP in general.

On the subject of mutation, it seems awkward to me that `$a + 1` would alter
the value of $a or that `2/$b` should alter $b. Rather, I would expect a new
value to be *returned* as a result of this operation.

If you take mutation off the table, then things become easier, IMO. We only
need two magic methods:

* __toInteger(): int
* __toFloat(): float

Then, in any mathematical context, PHP could call the appropriate method and
use the number returned in the calculation.

So, we could have something like this:

    class MyNumber
    {
        private $number;

        // Constructor implied by `new MyNumber(1)` below.
        public function __construct($number) { $this->number = $number; }

        public function __toInteger(): int
        {
            return (int) $this->number;
        }
    }

    $x = new MyNumber(1);
    $y = $x + 1;

And the value of $y would be 2.

Of course, there's the question of what we do if a class defines both
__toInteger() and __toFloat(), so perhaps a __toNumber() is more appropriate,
though that leads us into discussions of what the return type of this method
should be and whether a `number` scalar type is needed, but I think I'm
getting ahead of the discussion here.

TL;DR: mutating an object in the context of a mathematical operation (unless
I'm explicitly calling a method on the object with the expectation of
mutating it) could produce confusing and unexpected results for programmers.

Cheers,
Ben
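Since `__toInteger()` is only a suggestion and not something PHP provides, the conversion idea can be approximated today with an ordinary method. The sketch below assumes a hypothetical `IntegerConvertible` interface and a helper `toOperand()` that plays the role the engine would play in a mathematical context. None of these names exist in PHP itself.

```php
<?php
declare(strict_types=1);

// Hypothetical interface: toInteger() stands in for the suggested
// __toInteger() magic method, which PHP does not provide.
interface IntegerConvertible
{
    public function toInteger(): int;
}

final class MyNumber implements IntegerConvertible
{
    /** @var int */
    private $number;

    public function __construct(int $number)
    {
        $this->number = $number;
    }

    public function toInteger(): int
    {
        return $this->number;
    }
}

// Emulates the conversion the engine would apply to each operand
// before doing the arithmetic.
function toOperand($value): int
{
    return $value instanceof IntegerConvertible
        ? $value->toInteger()
        : (int) $value;
}

$x = new MyNumber(1);
$y = toOperand($x) + toOperand(1);

var_dump($y);              // int(2)
var_dump($x->toInteger()); // int(1): $x itself is unchanged
```

The result is a plain integer rather than another MyNumber, which is what keeps mutation out of the picture: the operation only reads from the object.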