Consensus gathering: allowing unsetting of backed property hooks

4 months ago by tight.fork3192@fastmail.com

Hello! I started a GitHub issue requesting that property hooks be allowed to be unset(): https://github.com/php/php-src/issues/17922

While discussing in the issue it was suggested I pitch this to the mailing list, so here I am.

You can review the GitHub issue so I won't rehash the whole thing here, but I'll summarize the main points for unset() of backed property hooks:

- It is well-established in PHP that one can unset() object properties. If we cannot unset() property hooks - which to a consumer are indistinguishable from plain properties - then the consumer must now be aware of the object's internal implementation before doing what used to be explicitly allowed. This is both poor encapsulation and not compatible with the commonly-understood behavior of unset(), leading to surprising cases where a consumer of an object could cause a fatal error by trying to unset() a property that unbeknownst to them is implemented as a property hook.
- Getters may perform a calculation, or fetch from a database, and cache the result in the backing value, but without unset() we can end up in a state where the logic to determine that value can never be executed again. The workaround is exposing a public method that contains such logic, but that defeats the point of putting logic in property hooks in the first place, and exposes internal class implementation details that should not concern the consumer. Getting and caching values is a common case in ORM layers, one of the biggest users of PHP.
- It would be convenient to be able to unset() backed properties when changing data elsewhere in a class, for example to trigger a recalculation or re-fetching from the database. It's not enough to set such properties to null because 1) the actual type of the property may in fact not be nullable, leading to inconsistency in the property's PHP type and the database column type, and 2) in practice null is a real value that can come from the database, which is a distinct concept from "uninitialized at the PHP level". (This is the old isset() vs is_initialized() problem, a tired struggle going on decades now.)
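
To make the first point concrete, here is a minimal sketch of what a consumer runs into today. It assumes PHP 8.4 or later; the Product class and its property names are made up for illustration and are not taken from the GitHub issue.

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only (PHP 8.4+).
final class Product
{
    // Plain typed property: unset() is allowed.
    public string $sku = '';

    // Backed property with a set hook: looks identical from the outside.
    public string $name {
        set => trim($value);
    }
}

$product = new Product();
$product->name = ' Widget ';

// Works: the plain property simply becomes uninitialized.
unset($product->sku);

// Throws an Error, although the consumer cannot tell the two properties apart.
$caught = null;
try {
    unset($product->name);
} catch (Error $e) {
    $caught = $e;
}
```

From the consumer's side both properties are just public strings, yet only one of the two unset() calls succeeds.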

Some arguments raised against were internal implementation details; that unset() itself is a bad design choice so we should not cater to it; and that catering to uninitialized makes typing more complex.

I can't comment on implementation details.

To the last two points, I would say that for better or worse, unset() and uninitialized are part of PHP and I assume they're here to stay. Therefore if we're extending the language, we should extend it in ways that embrace, not ignore, its warts, because they're what make the language and they're what decades of developers are accustomed to. Ignoring the warts, or leaving them in the language but calling them bad practice that must be avoided, results in footguns and a language that is inconsistent with itself. (As I'm sure you're all aware, inconsistency is one of the favorite charges leveled at PHP by its detractors.)

And while uninitialized may make types more complex, it can often be very useful, especially when filling objects from database rows - one of the most common tasks a web developer will ever do. It might be technically purer to require that objects only be created with lengthy constructors that initialize every property with strongly typed values, and unset() and isset() never be used, but for better or worse that's just not how PHP evolved. Having the flexibility of objects with uninitialized, and therefore unsettable, properties is very useful. This flexibility is part of what makes PHP so attractive and what made it so popular in the first place.

In the GitHub thread it was mentioned that an RFC would be required. I've never written one, nor am I familiar with PHP internals, but I can try putting one together if there's interest.

4 months ago by Rowan Tommins [IMSoP]

> Hello! I started a GitHub issue requesting that property hooks be allowed to be unset(): https://github.com/php/php-src/issues/17922
>
> While discussing in the issue it was suggested I pitch this to the mailing list, so here I am.

Place me firmly in the "unset() is already weird enough" camp.

I actually started writing an RFC to rationalise some of this behaviour,
but gave up because it's such a mess.

Here are some of the things that might happen as a result of
unset($foo->bar):

- Nothing at all, because unsetting a non-existent property isn't even a
  Warning
- A dynamic property (not declared, but created because it was assigned
  a value) is deleted without trace
- A property declared with no type becomes "undefined", and appears to
  be deleted (e.g. disappears from var_dump) ...
  - ... but retains its visibility (e.g. you'll still get an error writing
    to it from outside the class if it's private)
- A property declared with a type (even if that type is "mixed")
  instead becomes "uninitialized" - it still appears in var_dump, and
  accessing it is an Error, instead of a Warning
- An __unset() handler might be called ...
  - ... even if the property is declared, but is currently "undefined" or
    "uninitialized" ...
  - ... and that handler could do absolutely anything, including throwing
    an exception
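
A rough sketch of three of the cases listed above (non-existent, untyped, typed), assuming PHP 7.4 or later and a class without __get or __unset handlers. The Example class is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical class for illustration only.
class Example
{
    public $untyped = 1;
    public int $typed = 1;
}

$example = new Example();

// Unsetting a property that does not exist does nothing, not even a Warning.
unset($example->doesNotExist);

unset($example->untyped, $example->typed);

// Untyped: reading raises a Warning (suppressed here with @) and yields null.
$untypedValue = @$example->untyped;

// Typed: reading throws an Error instead.
$typedError = null;
try {
    $unused = $example->typed;
} catch (Error $e) {
    $typedError = $e->getMessage();
}
```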

If we allow a hooked property to directly "unset" the backing value,
what exactly will it do? And what if it's a virtual property, so there
is nothing to unset? If we add an "unset" hook, what's the default
behaviour if one isn't defined?

I think it's one of those features that sounds simple when you look at a
single use case, but actually specifying the behaviour in all cases is
going to be a lot of work.

--
Rowan Tommins
[IMSoP]

4 months ago by tight.fork3192@fastmail.com

> I actually started writing an RFC to rationalise some of this behaviour

I'm glad I'm not the only one who considers this an issue worth pursuing!

> Here are some of the things that might happen as a result of unset($foo->bar):

I don't disagree that there's a lot of weirdness. But for better or worse that's where PHP is now - it's a fundamentally weird language. I think it's better to be consistently weird than to be inconsistently weird. It would be inconsistent to allow unsetting some types of properties but not others, and which ones can and can't be unset are indistinguishable to a 3rd-party consumer. That's a footgun - the fact that it's caused by an evolution of weirdness is neither here nor there.

> If we allow a hooked property to directly "unset" the backing value, what exactly will it do?

I can't comment from an implementation perspective, but as a user of PHP I would expect unsetting a backed property to return the property to the "uninitialized" state, and subsequent access would proceed as if it were the first access of the uninitialized property.

Unsetting a virtual property could simply do nothing, but not result in a fatal error. I don't think a warning is even necessary because no action is taken.

> If we add an "unset" hook, what's the default behaviour if one isn't defined?

Adding an unset hook could be out of scope of this proposal, but if there's sentiment that one should be added, not defining one would result in the behavior described above. I certainly don't want to be required to define an unset hook for every single backed property; rather unset() should have a default behavior.

> I think it's one of those features that sounds simple when you look at a single use case, but actually specifying the behaviour in all cases is going to be a lot of work.

Definitely. I do think that hard things are worth pursuing in the interest of consistency and functionality. But then again I'm not the one implementing it!

4 months ago by Larry Garfield

Most of the use cases you talk about are easily emulated from within the class. If you're calling unset($foo->bar) from outside of the class, I would argue the code is bad and needs to be rethought to begin with.

From within the class, the RFC linked to several examples, the first few of which involved caching of derived properties.

https://github.com/Crell/php-rfcs/blob/master/property-hooks/examples.md

Most notably for now:

class User
{
    private array $cache = [];

    // ...

    public string $fullName { get => $this->cache[__PROPERTY__] ??= $this->first . " " . $this->last; }
}

And then internal to the class, unset($this->cache['fullName']) works perfectly. Problem solved.
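
For anyone who wants to run it, here is one way the elided parts might be filled in. This is only a sketch assuming PHP 8.4 or later: the constructor and the set hooks on $first and $last are additions made for illustration, not part of the RFC example.

```php
<?php
declare(strict_types=1);

class User
{
    private array $cache = [];

    // Assumed for illustration: changing a name part drops the cached full name.
    public string $first {
        set {
            $this->first = $value;
            unset($this->cache['fullName']);
        }
    }

    public string $last {
        set {
            $this->last = $value;
            unset($this->cache['fullName']);
        }
    }

    // Virtual property: computed on first read, then served from the cache.
    public string $fullName {
        get => $this->cache[__PROPERTY__] ??= $this->first . " " . $this->last;
    }

    public function __construct(string $first, string $last)
    {
        $this->first = $first;
        $this->last = $last;
    }
}

$user = new User('Ada', 'Lovelace');
$before = $user->fullName;  // computed and cached
$user->first = 'Augusta';   // invalidates the cache entry
$after = $user->fullName;   // recomputed from the new values
```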

And as noted, if you're calling unset() on a property from outside the object, PHP isn't the problem. Bad code design is the problem. Fix that instead.

--Larry Garfield

4 months ago by tight.fork3192@fastmail.com

> Most of the use cases you talk about are easily emulated from within
> the class. If you're calling unset($foo->bar) from outside of the
> class, I would argue the code is bad and needs to be rethought to begin
> with.

That may be so - we can judge the code PHP enables all we want, but the fact remains that PHP does allow this kind of thing and it can and will happen in the future. As the language evolves we should not be adding more inconsistency - like unset() now only working on some things and not others, and which ones work and which ones throw fatal errors being opaque to the consumer. That's leaving a footgun lying around and a violation of encapsulation.

> From within the class, the RFC linked to several examples, the first
> few of which involved caching of derived properties.
> Problem solved.

Of course, one can work around almost anything. I'm hoping to envision a solution in which workarounds are not necessary by following established PHP idioms.

In the workarounds in the RFC, we either 1) double the number of properties in the class by including separate private cache variables, which is annoying to maintain and extend, and becomes a code convention instead of natural PHP. (Ironically, we unset() those private cache variables!!)

Or we 2) introduce a more complex situation with private cache arrays and magic constants. If we're talking about bad code, magic constants are certainly up there.

IMHO it's much more natural to simply unset the property from inside the class, like PHP has always allowed us to do and is what one of the very workarounds in the RFC does. The fact that the workaround in the RFC does it demonstrates how useful it is!

4 months ago by Rowan Tommins [IMSoP]

> > I actually started writing an RFC to rationalise some of this behaviour
>
> I'm glad I'm not the only one who considers this an issue worth pursuing!

Sorry, I wasn't clear: I was looking at the existing typed vs untyped, undefined vs uninitialised mess, before property hooks even existed.

> > Here are some of the things that might happen as a result of unset($foo->bar):
>
> I don't disagree that there's a lot of weirdness. But for better or worse that's where PHP is now - it's a fundamentally weird language. I think it's better to be consistently weird than to be inconsistently weird.

My point is that it's already inconsistently weird.

> It would be inconsistent to allow unsetting some types of properties but not others, and which ones can and can't be unset are indistinguishable to a 3rd-party consumer.

unset() can already have a bunch of different effects depending on the implementation of the class. As well as __unset being able to do absolutely anything, I missed from my list "readonly" properties, which reasonably enough always throw an error for unset, just like hooked properties.
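
A quick sketch of the readonly case, assuming PHP 8.1 or later and an already-initialized property being unset from outside the class. The Point class is hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only.
final class Point
{
    public function __construct(public readonly int $x)
    {
    }
}

$point = new Point(1);

// unset() on an initialized readonly property throws an Error.
$readonlyError = null;
try {
    unset($point->x);
} catch (Error $e) {
    $readonlyError = $e;
}
```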

> I can't comment from an implementation perspective, but as a user of PHP I would expect unsetting a backed property to return the property to the "uninitialized" state, and subsequent access would proceed as if it were the first access of the uninitialized property.

An untyped property is currently never in the "uninitialised" state, only a different "undefined" state. Presumably this inconsistency would need to be preserved (for consistency).

> Unsetting a virtual property could simply do nothing, but not result in a fatal error. I don't think a warning is even necessary because no action is taken.

I don't see how that would be useful. The user presumably expected it to do something, so informing them that it didn't seems preferable to silently ignoring their request. It would also be inconsistent: a virtual property with no "set" hook throws an error when you try to set it, it doesn't silently discard the value.
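
As a sketch of the existing behaviour I mean (PHP 8.4 or later; the Name class is made up for illustration):

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only.
final class Name
{
    public function __construct(
        private string $first,
        private string $last,
    ) {
    }

    // Virtual: the get hook never touches $this->full, so nothing is stored,
    // and there is no set hook.
    public string $full {
        get => $this->first . ' ' . $this->last;
    }
}

$name = new Name('Ada', 'Lovelace');

// Writing does not silently discard the value; it throws an Error.
$writeError = null;
try {
    $name->full = 'Someone Else';
} catch (Error $e) {
    $writeError = $e;
}
```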

> I certainly don't want to be required to define an unset hook for every single backed property; rather unset() should have a default behavior.

I think you're focusing too closely on one use case, rather than all the ways people will want to use hooked properties. Imagine you have two properties which you want to keep in sync: setting either of them recalculates the other, using set hooks.

- It would be really surprising to the class author if a user of the class could "reach in" and invalidate the state by calling unset() on one of the properties.
- It would be really surprising to the user if doing so worked on one of the properties, but silently did nothing on the other because it was implemented as virtual.
- It might be appropriate for the class author to add "unset" hooks to both properties, and for the user to see that unsetting one unset the other, just as setting one sets the other.

That's just one scenario, I'm sure there are others where you could picture different expectations, particularly accounting for some of the other behaviour of unset(). That's what I mean by it being hard to specify the behaviour; nothing to do with the implementation.
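
One possible shape of that two-property scenario, as a sketch only (PHP 8.4 or later). The Temperature class is hypothetical, and it deliberately makes one property backed and the other virtual, since that is the split a user of the class cannot see.

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only.
final class Temperature
{
    // Backed: holds the canonical value; the set hook validates it.
    public float $celsius = 0.0 {
        set {
            if ($value < -273.15) {
                throw new RangeException('Below absolute zero');
            }
            $this->celsius = $value;
        }
    }

    // Virtual: both hooks go through $celsius, so nothing is stored here.
    public float $fahrenheit {
        get => $this->celsius * 9 / 5 + 32;
        set {
            $this->celsius = ($value - 32) * 5 / 9;
        }
    }
}

$temperature = new Temperature();
$temperature->fahrenheit = 212.0;     // also updates $celsius
$celsius = $temperature->celsius;

// Today both unset() calls throw, so the pair cannot get out of sync.
$errors = 0;
foreach (['celsius', 'fahrenheit'] as $property) {
    try {
        unset($temperature->$property);
    } catch (Error $e) {
        $errors++;
    }
}
```

Under the proposal as described, the first unset() would leave $celsius uninitialized (so reading $fahrenheit would then fail), while the second would silently do nothing.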

Regards,
Rowan Tommins
[IMSoP]

4 months ago by Rob Landers

I do think it makes sense to have an unset hook though, so long as it is thought out well. For example, would the unset hook be called automatically during garbage collection? Is it only called via unset() or are there other cases where it could be called too?

Regardless of how wise it is to unset from outside a class, doing the suggested workaround from inside the class seems a bit odd as well.

— Rob

4 months ago by Ilija Tovilo

> I do think it makes sense to have an unset hook though, so long as it is thought out well. For example, would the unset hook be called automatically during garbage collection?

I would be very wary of adding more ways for destruction to be
observable. Destructors themselves cause enough issues in the engine
as is.

Ilija