From: landers.robert@gmail.com (Robert Landers)
Date: Mon, 4 Mar 2024 16:45:21 +0100
Subject: Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
To: Stephen Reay
Cc: Larry Garfield, php internals

On Mon, Mar 4, 2024 at 4:40 PM Stephen Reay wrote:
>
> > On 28 Feb 2024, at 06:17, Larry Garfield wrote:
> >
> > On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
> >> [Including my full previous reply, since the list and gmail currently
> >> aren't being friends. Apologies that this leads to rather a lot of
> >> reading in one go...]
> >
> > Eh, I'd prefer a few big emails that come in slowly to lots of little
> > emails that come in fast. :-)
> >
> >>> On 21/02/2024 18:55, Larry Garfield wrote:
> >>>> Hello again, fine Internalians.
> >>>>
> >>>> After much on-again/off-again work, Ilija and I are back with a more
> >>>> polished property access hooks/interface properties RFC.
> >>>
> >>>
> >>> Hello, and a huge thanks to both you and Ilija for the continued work
> >>> on this. I'd really like to see this feature make it into PHP, and
> >>> agree with a lot of the RFC.
> >>>
> >>>
> >>> My main concern is the proliferation of things that look the same but
> >>> act differently, and things that look different but act the same:
> >
> > *snip*
> >
> >>> - a and b are both what we might call "traditional" properties, and
> >>> equivalent to each other; a uses legacy syntax which we haven't
> >>> removed for some reason
> >
> > I don't know why we haven't removed `var` either. I can't recall the
> > last time I saw it in real code. But that's out of scope here.
> >
> > *snip*
> >
> >> I think there's some really great functionality in the RFC, and would
> >> love for it to succeed in some form, but I think it would benefit from
> >> removing some of the "magic".
> >>
> >>
> >> Regards,
> >>
> >> --
> >> Rowan Tommins
> >> [IMSoP]
> >
> >
> > I'm going to try and respond to a couple of different points together
> > here, including from later in the thread, as it's just easier.
> >
> > == Re, design philosophy:
> >
> >> In C#, all "properties" are virtual - as soon as you have any
> >> non-default "get", "set" or "init" definition, it's up to you to declare
> >> a separate "field" to store the value in. Swift's "computed properties"
> >> are similar: if you have a custom getter or setter, there is no backing
> >> store; to add behaviour to a "stored property", you use the separate
> >> "property observer" hooks.
> >>
> >> Kotlin's approach is philosophically the opposite: there are no fields,
> >> only properties, but properties can access a hidden "backing field" via
> >> the special keyword "field". Importantly, omitting the setter doesn't
> >> make the property read-only, it implies set(value) { field = value }
> >
> > A little history here to help clarify how we ended up where we are: The
> > original RFC as we designed it modeled very closely on Swift, with 4
> > hooks. Using get/set at all would create a virtual property and you were
> > on your own, while the beforeSet/afterSet hooks would not. We ran that
> > design by some PHP Foundation sponsors a year ago (I don't actually know
> > who, Roman did it for us), and the general feedback was "we like the
> > idea, but woof this is complicated with all these hooks and having to
> > make my own backing property for all these little things. Couldn't this
> > be simplified?" We thought a bit more, and I off-handedly suggested to
> > Ilija "I mean, would it be possible to just detect if a get/set hook is
> > using a backing store and make it automatically?
> > Then we could get rid of the before/after hooks." He gave it a quick try
> > and found that was straightforward, so we pivoted to that simplified
> > version. We then realized that we had... mostly just recreated Kotlin's
> > design, so shrugged happily and went on with life.
> >
> > As noted in an earlier email, C#, Kotlin, and Swift all have different
> > stances on the variable name for the incoming value. We originally
> > modeled on Swift so had that model (optional newVal name), and also
> > because we liked how compact it was. When we switched to the simplified,
> > incidentally Kotlin-esque approach, we just kept the optional variable
> > as it works.
> >
> > I think where that ended up is pretty nice, personally, even if it is
> > not a direct map of any particular other language.
> >
> > == Re asymmetric typing:
> >
> > This is capability already present today if using a setter method.
> >
> > class Person {
> >     private $name;
> >
> >     public function setName(UnicodeString|string $name)
> >     {
> >         $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
> >     }
> > }
> >
> > And widening the parameter type in a child class is also entirely legal.
> > As the goal of the RFC is, essentially, "make most common getter/setter
> > patterns easy to add to a property without making an API-breaking
> > method, so people don't have to add redundant just-in-case getters and
> > setters all the time," covering an easy-to-cover use case seems like a
> > good thing to do.
> >
> > It also ties into the question of the explicit/implicit name, for the
> > reason you mentioned earlier (unspecified means mixed), not by intent.
> > More on that in another section.
> >
> > == Re virtual properties:
> >
> > Ilija and I talked this through, and there's pros and cons to a
> > `virtual` keyword. Ilija also suggested a `backed` keyword, which forces
> > a backed property to exist even if it's not used in the hook itself.
> >
> > * Adding `virtual` adds more work for the developer, but more clarity.
> >   It would also mean $this->$propName or $this->{__PROPERTY__} would
> >   work "as expected", since there's no auto-detection for virtual-ness.
> >   On the downside, if you have a could-be-virtual property but never
> >   actually use the backing value, you have an extra backing value
> >   hanging around in memory that is inaccessible normally, but will still
> >   show up in some serialization formats, which could be unexpected. If
> >   you omit one of the hooks and forget to mark it virtual, you'll still
> >   get the default of the other operation, which could be unexpected.
> >   (Mostly this would be for a virtual-get that accidentally has a
> >   default setter because you forgot to mark it `virtual`.)
> > * Doing autodetection as now, but with an added "make a backing value
> >   anyway" flag would resolve the use case of "My set hook just calls a
> >   method, and that method sets the property, but since the hook doesn't
> >   mention the property it doesn't get created" problem. It would also
> >   allow for $this->$propName to work if a property is explicitly backed.
> >   On the flipside, it's one more thing to think about, and the above
> >   example it solves would be trivially solved by having the method just
> >   return the value to set and letting the set hook do the actual write,
> >   which is arguably better and more reliable code anyway.
> > * The status quo (auto-detection based on the presence of
> >   $this->propName). This has the advantage it "just works" in the 95%
> >   case, without having to think about anything extra. The downside is it
> >   does have some odd edge cases, like needing $this->propName to be
> >   explicitly used.
> >
> > I don't think any is an obvious winner. My personal preference would be
> > for status quo (auto-detect) or explicit-virtual always. I could
> > probably live with either, though I think I'd personally favor status
> > quo. Thoughts from others?
> >
>
> I agree that a flag to make the field *virtual* (thus disabling the
> backing store) makes more sense than a flag to make it backed; It's also
> easier to understand when comparing hooked properties with regular
> properties (essentially, backed is the default, you have to opt-in to it
> being virtual). I don't think the edge cases of "auto" make it worthwhile
> just to not need "virtual".
>
> > == Re reference-get
> >
> > Allowing backed properties to have a reference return creates a
> > situation where any writes would then bypass the set hook, and thus any
> > validation implemented there. That is, it makes the validation
> > unreliable. A major footgun. The question is, do we favor caveat-emptor
> > flexibility or correct-by-construction safety? Personally I always lead
> > toward the latter, though PHP in general is... schizophrenic about it,
> > I'd say. :-)
> >
> > At this point, we'd much rather leave it blocked to avoid the issue;
> > it's easier to enable that loophole in the future if we really want it
> > than to get rid of it if it turns out to have been a bad idea.
> >
> > There is one edge case that *might* make sense: If there is no set hook
> > defined, then there's no set hook to worry about bypassing. So it may be
> > safe to allow &get on backed properties IFF there is no set hook. I
> > worry that is "one more quirky edge case", though, so as above it may be
> > better to skip for now as it's easier to add later than remove. But if
> > the consensus is to do that, we're open to it. (Question for everyone.)
> >
>
> I don't have strong feeling about this, but in general I usually tend to
> prefer options that are consistent, and give power/options to the
> developer. If references are opt-in anyway, I see that as accepting the
> trade-offs. If a developer doesn't want to allow by-ref modifications of
> the property, why would they make it referenceable in the first place?
> This sounds a bit like disallowing regular public properties because they
> might be modified outside the class - that's kind of the point, surely.
>
> > == Re
> >
> > == Re arrays
> >
> >> Regarding arrays, have you considered allowing array-index writes if
> >> an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
> >> as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
> >> unset($_temp);"
> >
> > That's already discussed in the RFC:
> >
> >> The simplest approach would be to copy the array, modify it
> >> accordingly, and pass it to set hook. This would have a large and
> >> obscure performance penalty, and a set implementation would have no way
> >> of knowing which values had changed, leading to, for instance, code
> >> needing to revalidate all elements even if only one has changed.
> >
> > Unless we were OK with that bypassing the set hook entirely if defined,
> > which, as noted above, means any safety guarantees provided by a set
> > hook are bypassed, leading to untrustworthy code.
> >
> > == Re hook shorthands and return values
> >
> > Ilija and I have been discussing this for a bit, and we've both budged a
> > little. :-) Here's our counter-proposal:
> >
> > - Drop the "top level" shorthand, for get-only hooks.
> > - Keep the => shorthand for the get hook itself.
> > - For a set hook, the {} form has no return value; set the value
> >   yourself however you want.
> > - For a set hook, the => form implies a backed value and will set the
> >   property to whatever value that evaluates to.
> >
> > So these are equivalent:
> >
> >     public $foo { set { $this->foo = $value; } }
> >     public $foo { set => $value; }
> >
> > These are equivalent:
> >
> >     public string $foo {
> >         get {
> >             return strtoupper($this->foo);
> >         }
> >     }
> >     public string $foo { get => strtoupper($this->foo); }
> >
> > And this goes away:
> >
> >     public string $foo => strtoupper($this->foo);
> >
> > That covers the common cases with an arrow-function-like syntax that
> > behaves as you'd expect (it returns things), and allows a longer version
> > with arbitrarily complex logic if desired. It also means that each
> > syntax variant does mean something importantly different.
> >
> > Would that be an acceptable compromise? (Question for everyone.)
> >
>
> I think the examples given are clear, and the lack of the top-level short
> closure-esque version makes it more obvious. Forgive me, I must have
> missed some of the previous comments - is there a reason the 'full' setter
> can't return a value, for the sake of consistency? I understand that you
> don't want "return to set" to be the *only* option, for the sake of e.g.
> change/audit logging type functionality (i.e. set and then some action to
> record that the change was made), but it seems a little odd and
> inconsistent to me that the return value of a short closure would be used
> when the return value of the long version isn't. This isn't really a major
> issue, I'm just curious if there was some explanation about it?
>
> > == Re the $value variable in set
> >
> > Honestly, Rowan's earlier point here is the strongest argument in favor
> > for me of the current RFC approach. Anywhere else in PHP, something that
> > looks like a parameter and has no type, like `($param)`, means its type
> > is `mixed`. It would be weird and confusing to be different here. That's
> > above and beyond the issue of forcing people to retype something obvious
> > every time.
> > (I cite again, recent PHP's trend toward removing needless boilerplate,
> > which is very good.) Requiring that the type be specified, for
> > consistency, makes little sense if the type is not allowed to vary.
> > You're just repeating a string from earlier on the same line, for no
> > particular benefit.
> >
> > I genuinely don't understand the pushback on $value. It's something you
> > learn once and never have to think about again. It's consistent.
> >
> > Ilija jokingly suggested making it always $value, unconditionally, and
> > allowing only the type to be specified if widening:
> >
> >     public int $foo { set(int|float) => floor($value); }
> >
> > Though I suspect that won't go over well, either. :-)
> >
> > So what makes the most sense to me is to keep $value optional, but IF
> > you specify an alternate name, you must also specify a type (which may
> > be wider). So these are equivalent:
> >
> >     public int $foo { set (int $value) => $value + 1; }
> >     public int $foo { set => $value + 1; }
> >
> > And only those forms are legal. But you could also do this, if the
> > situation called for it:
> >
> >     public int $foo { set(int|float $num) => floor($num) + 1; }
> >
> > This "all or nothing" approach seems like it strikes the best balance,
> > gives the most flexibility where needed while still having the least
> > redundancy when not needed, and when a name/type is provided, its
> > behavior is the same as for a method being inherited.
> >
> > Does that sound acceptable? (Again, question for everyone.)
> >
>
> My only question with this is the same as I had in an earlier reply (and
> I'm not sure it was ever answered directly?), and you allude to this
> yourself: everywhere *else*, `($var)` means a parameter with type `mixed`.
> Why is the type *required* here, when you've specifically said you want
> to avoid boilerplate?
> If we're going to assume people can understand that `(implicit
> property-type $value)` is implicit, surely we can also assume that they
> will understand "specifying a parameter without a type" means the
> parameter has no type (i.e. is `mixed`).
>
> Again, for myself I'd be likely to type it (or regular parameters,
> properties, etc) as `mixed` if that's what I want *anyway*, but the
> inconsistency here seems odd, unless there's some until-now unknown drive
> to deprecate type-less parameters/properties/etc.
>
> > The alternative that gives the most future-flexibility is to do
> > neither: The variable is called $value, period, you can't change it, and
> > you can't change the type, either. There is no () after set, ever. Punt
> > both of those to a later follow-up. I'd prefer to include both now, but
> > including neither now is the next-safer option.
> >
> >
> > ## Regarding $field
> >
> > Sigh, now y'all like it. :-P Most of the feedback on this has been
> > negative, so I'm inclined to leave it out at this point, unless there's
> > a major swing in feedback to bring it back. But the RFC seems more
> > likely to pass without it than with right now.
> >
> > --Larry Garfield
> >
>
> Cheers
>
> Stephen

I would think that simply using return-to-set would be the simplest
solution. If you need to run something after it's set, you can use the
regular way of running code after a return:

try {
    return $value + 100;
} finally {
    // this runs after returning
}

Robert Landers
Software Engineer
Utrecht NL
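
For anyone who wants to try the try/finally ordering on a current PHP
build, here is a minimal sketch using a plain method in place of a hook.
The names Counter, computeTotal and setTotal are hypothetical and not from
the RFC, and the sketch assumes that a return-to-set hook would have the
engine write the property only after the hook has returned:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a "return-to-set" hook, written as an ordinary
// method so it does not depend on property hook syntax.
final class Counter
{
    public int $total = 0;

    /** @var list<string> */
    public array $log = [];

    // Returns the value to store; the caller performs the actual write,
    // which is the assumed behaviour of a return-to-set hook.
    private function computeTotal(int $value): int
    {
        try {
            return $value + 100;
        } finally {
            // Runs after the return expression has been evaluated, but
            // before control goes back to the caller, so before the write.
            $this->log[] = "finally saw total={$this->total}";
        }
    }

    public function setTotal(int $value): void
    {
        $this->total = $this->computeTotal($value);
        $this->log[] = "after write total={$this->total}";
    }
}

$c = new Counter();
$c->setTotal(1);
print_r($c->log);
```

Under that assumption the finally block still sees the old value, so code
that needs the newly stored value (audit logging, for example) would have
to run after the write rather than inside finally.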