Newsgroups: php.internals
Message-ID: <2299271f-50ea-48c1-81fb-b64fa10c9bbb@app.fastmail.com>
In-Reply-To: <59619244-917d-4936-8f21-2854840a9bf8@rwec.co.uk>
References: <59619244-917d-4936-8f21-2854840a9bf8@rwec.co.uk>
Date: Tue, 27 Feb 2024 23:17:35 +0000
To: "php internals"
Subject: Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
From: larry@garfieldtech.com ("Larry Garfield")

On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
> [Including my full previous reply, since the list and gmail currently
> aren't being friends. Apologies that this leads to rather a lot of
> reading in one go...]

Eh, I'd prefer a few big emails that come in slowly to lots of little emails that come in fast. :-)

>> On 21/02/2024 18:55, Larry Garfield wrote:
>>> Hello again, fine Internalians.
>>>
>>> After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC.
>>
>>
>> Hello, and a huge thanks to both you and Ilija for the continued work
>> on this. I'd really like to see this feature make it into PHP, and
>> agree with a lot of the RFC.
>>
>>
>> My main concern is the proliferation of things that look the same but
>> act differently, and things that look different but act the same:

*snip*

>> - a and b are both what we might call "traditional" properties, and
>> equivalent to each other; a uses legacy syntax which we haven't
>> removed for some reason

I don't know why we haven't removed `var` either.
I can't recall the last time I saw it in real code. But that's out of scope here.

*snip*

> I think there's some really great functionality in the RFC, and would
> love for it to succeed in some form, but I think it would benefit from
> removing some of the "magic".
>
>
> Regards,
>
> --
> Rowan Tommins
> [IMSoP]

I'm going to try and respond to a couple of different points together here, including from later in the thread, as it's just easier.

== Re design philosophy:

> In C#, all "properties" are virtual - as soon as you have any
> non-default "get", "set" or "init" definition, it's up to you to declare
> a separate "field" to store the value in. Swift's "computed properties"
> are similar: if you have a custom getter or setter, there is no backing
> store; to add behaviour to a "stored property", you use the separate
> "property observer" hooks.
>
> Kotlin's approach is philosophically the opposite: there are no fields,
> only properties, but properties can access a hidden "backing field" via
> the special keyword "field". Importantly, omitting the setter doesn't
> make the property read-only, it implies set(value) { field = value }

A little history here to help clarify how we ended up where we are: The original RFC as we designed it was modeled very closely on Swift, with 4 hooks. Using get/set at all would create a virtual property and you were on your own, while the beforeSet/afterSet hooks would not. We ran that design by some PHP Foundation sponsors a year ago (I don't actually know who, Roman did it for us), and the general feedback was "we like the idea, but woof this is complicated with all these hooks and having to make my own backing property for all these little things. Couldn't this be simplified?" We thought a bit more, and I off-handedly suggested to Ilija "I mean, would it be possible to just detect if a get/set hook is using a backing store and make it automatically?
Then we could get rid of the before/after hooks." He gave it a quick try and found that was straightforward, so we pivoted to that simplified version. We then realized that we had... mostly just recreated Kotlin's design, so shrugged happily and went on with life.

As noted in an earlier email, C#, Kotlin, and Swift all have different stances on the variable name for the incoming value. We originally modeled on Swift so had that model (optional newVal name), and also because we liked how compact it was. When we switched to the simplified, incidentally Kotlin-esque approach, we just kept the optional variable as it works.

I think where that ended up is pretty nice, personally, even if it is not a direct map of any particular other language.

== Re asymmetric typing:

This is a capability already present today if using a setter method.

class Person {
    private $name;

    public function setName(UnicodeString|string $name) {
        $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }
}

And widening the parameter type in a child class is also entirely legal. As the goal of the RFC is, essentially, "make most common getter/setter patterns easy to add to a property without making an API-breaking method, so people don't have to add redundant just-in-case getters and setters all the time," covering an easy-to-cover use case seems like a good thing to do.

It also ties into the question of the explicit/implicit name, for the reason you mentioned earlier (unspecified means mixed), not by intent. More on that in another section.

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a `virtual` keyword. Ilija also suggested a `backed` keyword, which forces a backed property to exist even if it's not used in the hook itself.

* Adding `virtual` adds more work for the developer, but more clarity.
  It would also mean $this->$propName or $this->{__PROPERTY__} would work "as expected", since there's no auto-detection for virtual-ness. On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected. If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation, which could be unexpected. (Mostly this would be for a virtual-get that accidentally has a default setter because you forgot to mark it `virtual`.)

* Doing autodetection as now, but with an added "make a backing value anyway" flag, would resolve the "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem. It would also allow for $this->$propName to work if a property is explicitly backed. On the flipside, it's one more thing to think about, and the above example it solves would be trivially solved by having the method just return the value to set and letting the set hook do the actual write, which is arguably better and more reliable code anyway.

* The status quo (auto-detection based on the presence of $this->propName). This has the advantage that it "just works" in the 95% case, without having to think about anything extra. The downside is it does have some odd edge cases, like needing $this->propName to be explicitly used.

I don't think any is an obvious winner. My personal preference would be for status quo (auto-detect) or explicit-virtual always. I could probably live with either, though I think I'd personally favor status quo. Thoughts from others?
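For readers following along, here is a minimal sketch of what the status-quo auto-detection means in practice. It assumes the draft hook syntax discussed in this thread (so it needs a PHP build with hooks support), and the `User` class and its property names are hypothetical, not taken from the RFC.

```php
<?php
// Hypothetical class for illustration; syntax assumes the draft RFC as discussed here.
class User
{
    public function __construct(
        private string $first,
        private string $last,
    ) {}

    // Backed: the set hook mentions $this->username by name,
    // so a backing value is created automatically.
    public string $username {
        set {
            $this->username = strtolower($value);
        }
    }

    // Virtual: no hook mentions $this->fullName, so nothing is stored for it.
    public string $fullName {
        get => $this->first . ' ' . $this->last;
    }
}

$user = new User('Ada', 'Lovelace');
$user->username = 'ADA';   // goes through the set hook
```

Under the explicit-`virtual` option, `$fullName` would instead need the keyword to avoid getting an unused backing value; under the status quo nothing extra is written.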
== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there. That is, it makes the validation unreliable. A major footgun. The question is, do we favor caveat-emptor flexibility or correct-by-construction safety? Personally I always lean toward the latter, though PHP in general is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's easier to enable that loophole in the future if we really want it than to get rid of it if it turns out to have been a bad idea.

There is one edge case that *might* make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook. I worry that is "one more quirky edge case", though, so as above it may be better to skip for now as it's easier to add later than remove. But if the consensus is to do that, we're open to it. (Question for everyone.)

== Re arrays

> Regarding arrays, have you considered allowing array-index writes if an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42; unset($_temp);"

That's already discussed in the RFC:

> The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook. This would have a large and obscure performance penalty, and a set implementation would have no way of knowing which values had changed, leading to, for instance, code needing to revalidate all elements even if only one has changed.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.
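The bypass concern in the two sections above can be reproduced today with ordinary methods, no hooks required. The following is only a sketch of the general problem, using a hypothetical `Settings` class with a validating setter and a by-reference getter; it is not how the RFC implements anything.

```php
<?php
// Hypothetical class for illustration; plain methods stand in for get/set hooks.
final class Settings
{
    private array $limits = [];

    // Validating "setter": rejects anything that is not a non-negative int.
    public function setLimits(array $limits): void
    {
        foreach ($limits as $name => $limit) {
            if (!is_int($limit) || $limit < 0) {
                throw new InvalidArgumentException("Invalid limit for $name");
            }
        }
        $this->limits = $limits;
    }

    // By-reference "getter": hands out a reference to the backing array.
    public function &getLimits(): array
    {
        return $this->limits;
    }
}

$settings = new Settings();
$settings->setLimits(['upload' => 10]);

// Going through the setter, the invalid value is rejected.
$rejected = false;
try {
    $settings->setLimits(['upload' => -5]);
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

// Writing through the reference never calls setLimits(), so validation is skipped.
$limits = &$settings->getLimits();
$limits['upload'] = -5;
unset($limits);
```

After this runs, the object holds a value its own setter would have refused, which is the "untrustworthy code" outcome described above.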
== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

- Drop the "top level" shorthand, for get-only hooks.
- Keep the => shorthand for the get hook itself.
- For a set hook, the {} form has no return value; set the value yourself however you want.
- For a set hook, the => form implies a backed value and will set the property to whatever value that evaluates to.

So these are equivalent:

public $foo {
    set {
        $this->foo = $value;
    }
}

public $foo {
    set => $value;
}

These are equivalent:

public string $foo {
    get {
        return strtoupper($this->foo);
    }
}

public string $foo {
    get => strtoupper($this->foo);
}

And this goes away:

public string $foo => strtoupper($this->foo);

That covers the common cases with an arrow-function-like syntax that behaves as you'd expect (it returns things), and allows a longer version with arbitrarily complex logic if desired. It also means that each syntax variant does mean something importantly different.

Would that be an acceptable compromise? (Question for everyone.)

== Re the $value variable in set

Honestly, Rowan's earlier point here is the strongest argument in favor for me of the current RFC approach. Anywhere else in PHP, something that looks like a parameter and has no type, like `($param)`, means its type is `mixed`. It would be weird and confusing to be different here. That's above and beyond the issue of forcing people to retype something obvious every time. (I cite again, recent PHP's trend toward removing needless boilerplate, which is very good.) Requiring that the type be specified, for consistency, makes little sense if the type is not allowed to vary. You're just repeating a string from earlier on the same line, for no particular benefit.

I genuinely don't understand the pushback on $value. It's something you learn once and never have to think about again. It's consistent.
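As a quick illustration of the "no type means `mixed`" point, here is how an untyped parameter behaves in PHP today. The `describe()` function is hypothetical and exists only for this sketch.

```php
<?php
// Hypothetical function for illustration; not part of the RFC.
function describe($param): string
{
    // get_debug_type() reports the runtime type of whatever was passed in.
    return get_debug_type($param);
}

// Reflection reports no declared type for an untyped parameter.
$declared = (new ReflectionFunction('describe'))->getParameters()[0]->getType();

// With no declared type, any value is accepted.
$seen = [describe(1), describe(1.5), describe('x'), describe(null)];
```

A `set ($value)` form with no type would read the same way to most PHP developers, which is why an untyped explicit name on a typed property would be misleading.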
Ilija jokingly suggested making it always $value, unconditionally, and allowing only the type to be specified if widening:

public int $foo {
    set(int|float) => floor($value);
}

Though I suspect that won't go over well, either. :-)

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo {
    set (int $value) => $value + 1;
}

public int $foo {
    set => $value + 1;
}

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo {
    set(int|float $num) => floor($num) + 1;
}

This "all or nothing" approach seems like it strikes the best balance: it gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

The alternative that gives the most future-flexibility is to do neither: The variable is called $value, period, you can't change it, and you can't change the type, either. There is no () after set, ever. Punt both of those to a later follow-up. I'd prefer to include both now, but including neither now is the next-safer option.

## Regarding $field

Sigh, now y'all like it. :-P Most of the feedback on this has been negative, so I'm inclined to leave it out at this point, unless there's a major swing in feedback to bring it back. But the RFC seems more likely to pass without it than with right now.

--Larry Garfield