Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:122537 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by qa.php.net (Postfix) with ESMTPS id 6D1C21AD8F6 for ; Fri, 1 Mar 2024 11:17:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1709274520; bh=WPwZjvYQ2utDFrKN0uCADYzBiUGWcSnPB/zsVIEf12A=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=klmBqBP3x6d6UhbJrZdwu7eLSmFvyee2Zo5+yFRhDQwBmyR7uZZVicOt5JhdWZQ4k UNb82FQi73Su5ETIPwQ0M7BoDwvSPC6nnowFby0M5hal+b+/GkVpUBxB/9vMf2t8SX dAf+QDkPCfpnp9N1caPiWV4kl5YJn5mXiV6Zt8oLsExWmA5l05+rJAh+4P5o+DoEpc U7IVw6RPXHxzmepu7ZOhm9sjQQ6+hxmoOHGNyAvBTOWQFTkjPZggBWml01TwdWdZ5w P+C5Rub5hlQOFKU7w1Fl8jOw6O3VsPOcD8DNVrD2LxUWUIDdQdQu6ZniAZFfxHRq0C z6BBiv/2Gto3Q== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 630DC187AF3 for ; Fri, 1 Mar 2024 06:28:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DMARC_MISSING, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=4.0.0 X-Spam-Virus: No X-Envelope-From: Received: from mail1.25mail.st (mail1.25mail.st [206.123.115.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Fri, 1 Mar 2024 06:28:38 +0000 (UTC) Received: from smtpclient.apple (unknown [49.48.219.94]) by mail1.25mail.st (Postfix) with ESMTPSA id 2F82F6056A; Fri, 1 Mar 2024 06:28:21 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Precedence: bulk list-help: list-post: List-Id: internals.lists.php.net Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.400.31\)) Subject: Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2 In-Reply-To: Date: Fri, 1 Mar 2024 13:28:08 +0700 Cc: php internals Content-Transfer-Encoding: quoted-printable Message-ID: <115C4A93-120E-45A4-85E2-D91649B66E70@koalephant.com> References: <59619244-917d-4936-8f21-2854840a9bf8@rwec.co.uk> <2299271f-50ea-48c1-81fb-b64fa10c9bbb@app.fastmail.com> <1204BFC3-B976-4FEE-BE01-E668699C84E2@koalephant.com> To: Larry Garfield X-Mailer: Apple Mail (2.3774.400.31) From: php-lists@koalephant.com (Stephen Reay) > On 28 Feb 2024, at 21:25, Larry Garfield = wrote: >=20 > On Wed, Feb 28, 2024, at 7:49 AM, Stephen Reay wrote: >=20 > *snip* >=20 >>> =3D=3D Re virtual properties: >>>=20 >>> Ilija and I talked this through, and there's pros and cons to a = `virtual` keyword. Ilija also suggested a `backed` keyword, which = forces a backed property to exist even if it's not used in the hook = itself. >>>=20 >>> * Adding `virtual` adds more work for the developer, but more = clarity. It would also mean $this->$propName or $this->{__PROPERTY__} = would work "as expected", since there's no auto-detection for = virtual-ness. On the downside, if you have a could-be-virtual property = but never actually use the backing value, you have an extra backing = value hanging around in memory that is inaccessible normally, but will = still show up in some serialization formats, which could be unexpected. = If you omit one of the hooks and forget to mark it virtual, you'll still = get the default of the other operation, which could be unexpected. = (Mostly this would be for a virtual-get that accidentally has a default = setter because you forgot to mark it `virtual`.) >>> * Doing autodetection as now, but with an added "make a backing = value anyway" flag would resolve the use case of "My set hook just calls = a method, and that method sets the property, but since the hook doesn't = mention the property it doesn't get created" problem. It would also = allow for $this->$propName to work if a property is explicitly backed. = On the flipside, it's one more thing to think about, and the above = example it solves would be trivially solved by having the method just = return the value to set and letting the set hook do the actual write, = which is arguably better and more reliable code anyway. >>> * The status quo (auto-detection based on the presence of = $this->propName). This has the advantage it "just works" in the 95% = case, without having to think about anything extra. The downside is it = does have some odd edge cases, like needing $this->propName to be = explicitly used. =20 >>>=20 >>> I don't think any is an obvious winner. My personal preference = would be for status quo (auto-detect) or explicit-virtual always. I = could probably live with either, though I think I'd personally favor = status quo. Thoughts from others? >>>=20 >>=20 >> I agree that a flag to make the field *virtual* (thus disabling the=20= >> backing store) makes more sense than a flag to make it backed; It's=20= >> also easier to understand when comparing hooked properties with = regular=20 >> properties (essentially, backed is the default, you have to opt-in to=20= >> it being virtual). I don't think the edge cases of "auto" make it=20 >> worthwhile just to not need "virtual". >=20 > Uh, I can't tell if you're saying "use status quo" or "use the virtual = keyword." Please clarify. :-) I'm saying I think an explicit `virtual` keyword is the better option of = the three (because the benefits of the 'auto' mode don't outweigh the = edge cases it introduces). >=20 >>> =3D=3D Re reference-get >>>=20 >>> Allowing backed properties to have a reference return creates a = situation where any writes would then bypass the set hook, and thus any = validation implemented there. That is, it makes the validation = unreliable. A major footgun. The question is, do we favor = caveat-emptor flexibility or correct-by-construction safety? Personally = I always lead toward the latter, though PHP in general is... = schizophrenic about it, I'd say. :-) >>>=20 >>> At this point, we'd much rather leave it blocked to avoid the issue; = it's easier to enable that loophole in the future if we really want it = than to get rid of it if it turns out to have been a bad idea. >>>=20 >>> There is one edge case that *might* make sense: If there is no set = hook defined, then there's no set hook to worry about bypassing. So it = may be safe to allow &get on backed properties IFF there is no set hook. = I worry that is "one more quirky edge case", though, so as above it may = be better to skip for now as it's easier to add later than remove. But = if the consensus is to do that, we're open to it. (Question for = everyone.) >>>=20 >>=20 >> I don't have strong feeling about this, but in general I usually tend=20= >> to prefer options that are consistent, and give power/options to the=20= >> developer. If references are opt-in anyway, I see that as accepting = the=20 >> trade-offs. If a developer doesn't want to allow by-ref modifications=20= >> of the property, why would they make it referenceable in the first=20 >> place? This sounds a bit like disallowing regular public properties=20= >> because they might be modified outside the class - that's kind of the=20= >> point, surely. >>=20 >>> =3D=3D Re=20 >>>=20 >>> =3D=3D Re arrays >>>=20 >>>> Regarding arrays, have you considered allowing array-index writes = if >>> an &get hook is defined? i.e. "$x->foo['bar'] =3D 42;" could be = treated >>> as semantically equivalent to "$_temp =3D& $x->foo; $_temp['bar'] =3D = 42; >>> unset($_temp);" >>>=20 >>> That's already discussed in the RFC: >>>=20 >>>> The simplest approach would be to copy the array, modify it = accordingly, and pass it to set hook. This would have a large and = obscure performance penalty, and a set implementation would have no way = of knowing which values had changed, leading to, for instance, code = needing to revalidate all elements even if only one has changed. >>>=20 >>> Unless we were OK with that bypassing the set hook entirely if = defined, which, as noted above, means any safety guarantees provided by = a set hook are bypassed, leading to untrustworthy code. >>>=20 >>> =3D=3D Re hook shorthands and return values >>>=20 >>> Ilija and I have been discussing this for a bit, and we've both = budged a little. :-) Here's our counter-proposal: >=20 > *snip* >=20 >> I think the examples given are clear, and the lack of the top-level=20= >> short closure-esque version makes it more obvious. Forgive me, I must=20= >> have missed some of the previous comments - is there a reason the=20 >> 'full' setter can't return a value, for the sake of consistency? I=20 >> understand that you don't want "return to set" to be the *only* = option,=20 >> for the sake of e.g. change/audit logging type functionality (i.e. = set=20 >> and then some action to record that the change was made), but it = seems=20 >> a little odd and inconsistent to me that the return value of a short=20= >> closure would be used when the return value of the long version = isn't.=20 >> This isn't really a major issue, I'm just curious if there was some=20= >> explanation about it? >=20 > Mainly because it introduces a lot more complexity. As far as I'm = aware, determining at runtime whether or not to make use of the return = value is impossible. (Ie, we cannot differentiate between "return", "no = return statement", and "return null".) So it would require compile time = branching to generate two different pathways after lexically detecting = if there is a "return" token present somewhere in the hook body. That = is probably doable, technically, but introduces more complexity to an = already necessarily-large RFC. It's also something that is simple = enough to add later as an option without any BC breaks or implications = for other parts of the design, so it's safe to punt. (Some things are = harder to punt on than others, as noted, but this one is easy/safe to = skip for now.) >=20 Right, i hadn't considered that whole "everything implicitly returns, = even if it's null" scenario. Makes sense. The inconsistency is still a = bit jarring but I understand the reasoning now, thanks. >>> =3D=3D Re the $value variable in set >=20 > *snip* >=20 >>> So what makes the most sense to me is to keep $value optional, but = IF you specify an alternate name, you must also specify a type (which = may be wider). So these are equivalent: >>>=20 >>> public int $foo { set (int $value) =3D> $value + 1 } >>> public int $foo { set =3D> $value + 1 } >>>=20 >>> And only those forms are legal. But you could also do this, if the = situation called for it: >>>=20 >>> public int $foo { set(int|float $num) =3D> floor($num) + 1; } >>>=20 >>> This "all or nothing" approach seems like it strikes the best = balance, gives the most flexibility where needed while still having the = least redundancy when not needed, and when a name/type is provided, its = behavior is the same as for a method being inherited. >>>=20 >>> Does that sound acceptable? (Again, question for everyone.) >>>=20 >>=20 >> My only question with this is the same as I had in an earlier reply=20= >> (and I'm not sure it was ever answered directly?), and you allude to=20= >> this yourself: everywhere *else*, `($var)` means a parameter with = type=20 >> `mixed`. Why is the type *required* here, when you've specifically = said=20 >> you want to avoid boilerplate? If we're going to assume people can=20 >> understand that `(implicit property-type $value) is implicit, surely = we=20 >> can also assume that they will understand "specifying a parameter=20 >> without a type" means the parameter has no type (i.e. is `mixed`). >>=20 >> Again, for myself I'd be likely to type it (or regular parameters,=20 >> properties, etc) as `mixed` if that's what I want *anyway*, but the=20= >> inconsistency here seems odd, unless there's some until-now unknown=20= >> drive to deprecate type-less parameters/properties/etc. >=20 > If we went this route, then an untyped set param would likely imply = "mixed", just like on methods. Which, since mixed is the super type of = everything, would still technically work, but would weaken the type = enforcement and thus static analysis potential. (Just as a method param = can be widened in a child class all the way to mixed/omitted, and it = would be unwise for all the same reasons.) >=20 Having a `mixed` param in the set hook shouldn't weaken the actual = backing parameter though - when the hook writes to `$this->prop`, the = parent type is still enforced, *surely*? If not, why not? As for how static analysis tools handle this concept - I'd have thought = it's too early to suggest what static analysis tools will or won't = support given how much they already support based on less-formal syntax = like docblocks. It's *already* possible to have a property that is = reported (to static analysis tools/IDEs/etc) as a fixed type, but = accepts a wider type on write, with the current mish-mash of typed = properties, docblocks, magic getters and setters, and the bizarre = `unset` behaviour. This is simply converting that into standardised = syntax. The RFC itself proposes a scenario where a wider type is = accepted in the set hook. I find it hard to believe that a static = analysis tool can model "this property is `Foo` but it accepts = `string|Foo` on write" but not "this property is `Foo` but it accepts = `mixed` on write". Heck if that's such a problem what is said tool going = to do when someone *explicitly* widens the parameter to `mixed` in a set = hook?=20 I would argue that if the language is providing better support for typed = properties by adding 'hooks' like this, the need for static analysis of = those specific parts reduces greatly - if someone wants to accept = `mixed` when storing as a string, and convert it in the hook, and the = language can **enforce those types at runtime**, why should some = hypothetical static analysis be a hangup for that?=20 > In the RFC as currently written, omitted means "derive from the = property," which is a concept that doesn't exist in methods; the closest = equivalent would be if omitting a type in a child method parameter meant = "use the parent's type implicitly," which is not how that works right = now. For the third time: I'm well aware of how parameter types work = everywhere else, and that's why I'm asking why the same behaviour isn't = being followed here?=20 - You've said you want to avoid boilerplate; and - You've said you would expect most people to just use the implicit = `same-type $value` parameter; and - You've acknowledged that the existing 'standard' is that a parameter = without a type is considered `mixed`; and - You've acknowledged in your RFC that there is a use-case for wanting = to accept a wider type than what a property stores internally. So why then is it so unacceptable that the existing standard be = followed, such that a set hook with an "untyped" parameter would be = treated as `mixed` as it is everywhere else?=20 Yes, I know you said "widening to `mixed` is unwise". I don't seem to = recall amongst all the type-related previous RFCs, any that suggested = that child parameters widening to `mixed` (either explicitly or = implicitly) should be deprecated, so I'm sorry but I don't see much = value in that argument. >=20 > --Larry Garfield >=20 Cheers Stephen=20=