Newsgroups: php.internals
Xref: news.php.net php.internals:122476
Message-ID: <790b5b4e-f51b-4050-a12a-5fa903d0568f@app.fastmail.com>
Date: Thu, 22 Feb 2024 23:56:05 +0000
From: larry@garfieldtech.com ("Larry Garfield")
To: "php internals"
Subject: Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
List-Id: internals.lists.php.net

On Wed, Feb 21, 2024, at 11:02 PM, Matthew Weier O'Phinney wrote:
> On Wed, Feb 21, 2024 at 12:57 PM Larry Garfield wrote:
>> After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.
>>
>> Barring any major changes, we plan to bring this to a vote in mid-March.
>>
>> https://wiki.php.net/rfc/property-hooks
>>
>> It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.
>
> Once again in reading the proposal, the first thing I'm struck by are the magic "$field" and "$value" variables inside accessors.
> The first time they are used, they're used without explanation, and they're jarring.
>
> Additionally, once you start defining the behavior of accessors... you don't start with the basics, but instead jump into some of the more esoteric usage, which does nothing to help with the questions I have.
>
> So, first:
>
> - Start with the most basic, most expected usage for each of reading and writing properties.
> - I need a better argument for why the $field and $value variables exist. Saying they're macros doesn't help those not deep into internals. As a user, why do they exist?

For $field, it's not a requirement. It's mostly for copy-paste convenience. A number of people have struggled on this point, so if the consensus is to leave out $field and just use $this->propName directly, we can accept that. It can be re-added if reusable hook packages are added in the future (as noted in Future Scope).

For $value, it's to avoid boilerplate. For the majority case, you'll be just operating on an individual value trivially: checking its range, or uppercasing it, or whatever. Requiring the developer to provide a name explicitly is just extra work; it's much the same as how PHP doesn't require you to pass $this as the first argument to a method explicitly, the way Python and Rust do. It's just understood that $this exists, and once you learn that it's obvious.

On the occasions where you do want to specify an alternate name for some reason, or more likely you're providing a wider type, you can. But in the typical case it would just be one more thing for the dev to have to type out. This is especially true in what I expect to be a common case, which is promoted constructor arguments with an extra validator set hook on them.

Requiring a name also introduces some ambiguity. If I specify only the name, does that mean I'm widening the type to mixed? Or just that I'm omitting the name?
If specifying the name is rare, that's not really a big deal. If it's required in every case, it's a confusion point in every case.

In the interest of transparency, for comparison:

* Kotlin appears to require an argument name, but by convention recommends using "value".
* Swift makes it optional, with a default name of "newValue". (Same logic as in the RFC.)
* C# ... as far as I can tell, doesn't support a custom name at all; it's always called "value", implicitly.

> Second: you don't have examples of defining BOTH get and set OTHER than when using expressions for both accessors or a mix. I'm actually unclear what the syntax is when both are defined. Is there supposed to be a `;` terminating each? Or a `,`? Or just an empty line? Again, this is one of the more common scenarios. It needs to be covered early, and clearly.

... huh. I thought we had one in there somewhere. I will add one, thanks. Though to clarify, there's no separator character.

public string $foo {
    get {
        // ...
    }
    set {
        // ...
    }
}

> Third: the caveats around usage with arrays... give me pause. While I'm personally trying to not use arrays as much as possible, a lot of code I see or contribute to still does, and the fact that an array property that uses a write accessor doesn't allow the same level of access as a normal array property is something I see leading to a lot of confusion and errors. I don't have a solution, but I worry that this one thing alone could be enough to prevent the passage of the RFC.

We completely agree that it's a suboptimal situation. But as explained, it is the way it is because it's not possible (as far as we can tell) to fully support hooks on array properties. If you can think of a way, please share, because we'd love to make this part better. I don't like it either, but we haven't found a way around it.
And that caveat doesn't seem like a good enough reason to not support hooks everywhere we actually can.

> Fourth: the syntax around inheritance is not intuitive, as it does not work in the same way as the rest of the language. I'm talking about this:
>
> public int $x {
>     get: 2 * parent::$x::get()
> }
>
> Why do we need to use the accessors here? Why wouldn't it just be `parent::$x`?

Almost. Ilija spent some time looking into this, and it's possible with caveats.

First, there's then no way to differentiate between "access parent hook" and "read the static property $x on the parent". Arguably that's not a common case, but it is a point of confusion.

The larger issue is that parent::$x can't be just a stand-in for the backing value in all cases. While supporting that for the = operator is straightforward enough, it wouldn't give us access to ++, --, <=, and the dozen or so other operators that could conceivably apply. In theory those could all be implemented manually, but Ilija described that as "hundreds to thousands of lines of code" to do, which... is not time or code well spent. :-) Especially as this is a very edge-case situation to begin with. (In practice, I expect auto-generated ORM proxy code to be the primary user of accessing a parent property from a child hook. I can think of very few other cases where I'd want to use it.)

So we have the choice between making $a = parent::$prop and parent::$prop = $a work, but *nothing else*, inexplicably (creating confusion), or the slightly longer syntax that wouldn't support those other operations anyway, so there's no confusion.

We feel the current approach is the better trade-off, but if the consensus generally is for the shorter-but-inconsistent syntax, that can be changed.

> I want to be clear: I really like the idea behind this feature, and overall, I appreciate the design.
> From a user perspective, though, the above are things that I found jarring as they vary quite a bit from our current language design.
>>
>> Note the FAQ question at the end, which explains some design choices.
>>
>> There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.
>>
>> However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.
>
> I personally would go with just "accessors".

On Thu, Feb 22, 2024, at 12:06 AM, Deleu wrote:
> This is a long, huge and comprehensive work, congratz to the authors.
>
> It clearly shows that so much thought and work has been put into it that it makes me cautious to even ask for further clarification.
>
>> Javascript have similar features via a different syntax, although that syntax would not be viable for PHP
>
> Why not?
> It feels quite a natural syntax for PHP and from someone oblivious to the internal work, it appears to be a slight marginal change to the existing RFC. Given the extensive work of this RFC, it seems pretty obvious that this syntax will not work, I just don't know why.

I've added an FAQ section explaining why the Python/JS approach wouldn't really work.
To be clear, Ilija and I spent 100+ hours doing research and design before we started implementation (back in mid-late 2022). We did seriously consider the JS-style syntax, but in the end we found it created more problems than it solved. For the type of language PHP is (explicit typed properties), doing it on the property itself is a much cleaner approach.

On Thu, Feb 22, 2024, at 1:14 AM, Robert Landers wrote:
> I apologize if this has already been covered:
>
>> There are two shorthand notations supported, beyond the optional argument to set.
>> First, if a hook's body is a single expression, then the { } and return statement may be omitted and replaced with =>, just like with arrow functions.
>
> Does => do any special auto-capturing of variables like arrow functions or is it just a shorthand?

No, there is nothing to capture. Inside the hook body you have $this the same as any other method.

> Also, is this a meaningful shorthand to the example a little further down:
>
> public string $phone {
>     set = $this->sanitizePhone(...);
> }
>
> or do we always have to write it out?
>
> public string $phone {
>     set => $field = $this->sanitizePhone($value);
> }

Currently no, the set is always void; you have to write the value yourself. See the FAQ section on the topic for a detailed explanation.

However, I just had a long discussion with Ilija and there is one possibility we could consider: Use the return value only on the shorthand (arrow-function-like) syntax.

So you could do either of these, which would be equivalent:

set {
    $this->phone = $this->sanitizePhone($value);
}

set => $this->sanitizePhone($value);

This would have the advantage of offering a return-to-set mechanism, as well as even shorter syntax in the simple case (and no question of $field vs $this->propName). But it would have the disadvantage of being potentially inconsistent between the short and long version.
It also would = mean the short version is incompatible with virtual properties; using a = short-set would create a backing value, so it's non-virtual. But since = "simple validation, maybe in a promoted constructor property" is likely = to be one of the main use cases, it would simplify that case. Not sure if that's a trade up or not. Thoughts from the list? > Would __PROPERTY__ be set inside sanitizePhone() as well? No. Like __CLASS__, it is materialized at compile time and has no meani= ng outside of its intended scope. > You mention several ways values are displayed (whether or not they use > the get-hook), but what does the default implementation of > `__debugInfo()` look like now (or is that out of scope or a silly > question?) var_dump() shows the underlying backing value, bypassing get, since it's= debugging the object state. If you implement __debugInfo(), that's a m= ethod like any other and you can do what you like, though you'll be read= ing through get hooks (just like using __serialize()). > For attributes, it would be nice to be able to target hooks > specifically with attributes instead of also all methods (e.g., a > Attribute::TARGET_GET_HOOK const). For example, if I were writing a > serialization library, I may want to specify #[UseRawValue] only on > getters to ensure that only the raw value is serialized instead of the > getter (which may be specific to the application logic, or > #[GetFromMethod] to tell the serialization library to get the value > from a completely different method. It wouldn't make sense to target > just any method with that attribute. This feels very niche, honestly. Those would naturally have to be sub-c= ases of TARGET_METHOD anyway, so method-targeted attributes would need t= o be supported regardless. That makes hook-specific-targeting an easy a= nd non-breaking add-on for a future RFC if it turns out to be useful in = practice. --Larry Garfield