[RFC][Discussion] Harmonise "untyped" and "typed" properties

1 year ago by Jakub Zelenka

Hi,

On Thu, Nov 16, 2023 at 8:41 PM Rowan Tommins rowan.collins@gmail.com
wrote:

Hi all,

I have finally written up an RFC I have been considering for some time:
Harmonise "untyped" and "typed" properties

RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

Currently, changing a property declaration from "private $foo" to
"private mixed $foo" changes its behaviour in significant ways, even
though no actual type checks are added. This RFC seeks to remove those
differences, by extending the "uninitialized" state currently reserved
for typed properties to cover all declared properties:

Properties without an initial value will be "uninitialized", not "null"

Calling "unset" on a declared property will make it "uninitialized",
rather than the current complex behaviour
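
For readers who want to see the difference concretely, here is a minimal sketch of the current behaviour, assuming PHP 8.0 or later (needed for "mixed"). The class names Untyped and Typed are made up for the example.

```php
<?php
// Hypothetical classes, for illustration only.
class Untyped { public $foo; }        // implicitly initialised to null
class Typed   { public mixed $foo; }  // starts out "uninitialized"

$u = new Untyped();
$untypedValue = $u->foo;              // null, no error

$t = new Typed();
$typedError = null;
try {
    $tmp = $t->foo;                   // reading before assignment throws an Error
} catch (Error $e) {
    $typedError = $e->getMessage();
}

var_dump($untypedValue);
echo $typedError, "\n";
```

No type check is involved in either class; the only difference is the initial state of the property.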

There are a handful of open questions still in the RFC, but I wanted to
present an initial draft to start the discussion, because the current
behaviour is quite hard to explain in a short e-mail.

Please let me know your thoughts.

This sounds like a huge BC break. Probably bigger and a bit harder to fix
than disallowing dynamic props. I have seen is_null and similar checks done
on uninitialized props many times and this will certainly break such cases.
You can say that it just requires adding a default null value, but that might
not be that simple if you depend on some library or the code base is big.
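
A sketch of the kind of code that would be affected, assuming PHP 8.0 or later. The classes Cache and StrictCache are hypothetical; StrictCache uses "mixed" only to stand in for how an untyped property would behave if it started out uninitialized.

```php
<?php
// Hypothetical example of a common legacy pattern.
class Cache {
    private $data;                        // untyped: implicitly null today

    public function get() {
        if (is_null($this->data)) {       // relies on the implicit null
            $this->data = ['loaded' => true];
        }
        return $this->data;
    }
}

// Same code, but the property starts out uninitialized.
class StrictCache {
    private mixed $data;

    public function get() {
        if (is_null($this->data)) {       // throws Error before is_null() runs
            $this->data = ['loaded' => true];
        }
        return $this->data;
    }
}

$legacyResult = (new Cache())->get();

$strictFailed = false;
try {
    (new StrictCache())->get();
} catch (Error $e) {
    $strictFailed = true;
}
```

Adding "= null" to the declaration in StrictCache makes both classes behave the same again.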

Maybe it would be better to do this as some sort of opt in behaviour using
class attributes which could even disallow untyped props completely as it
would be opt in...

Cheers

Jakub

1 year ago by Rowan Tommins

This sounds like a huge BC break. Probably bigger and a bit harder to fix
than disallowing dynamic props.

More common, maybe; but trivial to fix: add "=null" at the end of all untyped property declarations that don't already have an initializer. It would be trivial to automate with something like Rector.

Maybe it would be better to do this as some sort of opt in behaviour

We already have the opt-in behaviour: add "mixed" to a property declaration, and it no longer gets initialized to null. The aim of the RFC is to eliminate that distinction, not to encourage typed properties.

I do however agree that the initial value part is quite disruptive, and am open to suggestions on how to minimise that. A couple that have occurred to me:

- Change the unset() behaviour, which we're already planning to make produce errors in 9.0; but keep the implicit initializer. In other words, make "public $foo;" equivalent to "public mixed $foo=null;" The obvious downside is that it's just as weird a special-case for users to learn as what we have now.
- Have a longer timeline: fix the unset() behaviour in 9.0, and the initializer in 10.0. But that just postpones both the pain and the gain.
- Actually require all properties to have a type at some point in the future, so the behaviour of untyped properties becomes a moot point. I'm pretty sure this is a non-starter since we didn't even manage to remove "var" in favour of "public", but I wanted to say it for completeness.

Maybe someone can come up with some other variation or compromise?

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Claude Pache

On 17 Nov 2023, at 11:47, Rowan Tommins rowan.collins@gmail.com wrote:

This sounds like a huge BC break. Probably bigger and a bit harder to fix
than disallowing dynamic props.

More common, maybe; but trivial to fix: add "=null" at the end of all untyped property declarations that don't already have an initializer. It would be trivial to automate with something like Rector.

The fix is trivial, but the number of files to be touched is huge.

Maybe it would be better to do this as some sort of opt in behaviour

We already have the opt-in behaviour: add "mixed" to a property declaration, and it no longer gets initialized to null. [...]

Yes, except that an untyped (respectively mixed) property cannot be redeclared as mixed (resp. untyped) in a subclass. A small step in the right direction is to allow that.

—Claude

1 year ago by Rowan Tommins

Yes, except that an untyped (respectively mixed) property cannot be redeclared as mixed (resp. untyped) in a subclass. A small step in the right direction is to allow that.

Huh, I didn't know that. I'll add it to the RFC, at least to consider.

The RFC to add "mixed" gives an example of removing the type as invariance, but doesn't seem to justify why "untyped" and "mixed" should be considered different, from a type system point of view. https://wiki.php.net/rfc/mixed_type_v2

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Claude Pache

On 17 Nov 2023, at 14:53, Rowan Tommins rowan.collins@gmail.com wrote:

Yes, except that an untyped (respectively mixed) property cannot be redeclared as mixed (resp. untyped) in a subclass. A small step in the right direction is to allow that.

Huh, I didn't know that. I'll add it to the RFC, at least to consider.

The RFC to add "mixed" gives an example of removing the type as invariance, but doesn't seem to justify why "untyped" and "mixed" should be considered different, from a type system point of view. https://wiki.php.net/rfc/mixed_type_v2

Note that untyped is different from mixed in the case of return values of functions: in that context, untyped is equivalent to mixed|void. In all other contexts, untyped and mixed are effectively equivalent, because void is void of sense.
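
A small sketch of the return-type case, assuming PHP 8.0 or later. The classes Base and Child are hypothetical, and the commented-out line reflects my understanding of the "mixed" RFC rather than something demonstrated by running it here.

```php
<?php
// Hypothetical classes, for illustration only.
class Base {
    public function run() {}                        // untyped return: may return nothing
    public function load(): mixed { return null; }  // mixed: must return a value
}

class Child extends Base {
    public function run(): void {}                  // allowed: untyped narrowed to void
    // public function load(): void {}              // expected to be a fatal error:
    //                                              // void is not a subtype of mixed
}

$baseRunType  = (new ReflectionMethod(Base::class, 'run'))->getReturnType();
$childRunType = (new ReflectionMethod(Child::class, 'run'))->getReturnType()->getName();
```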

—Claude

1 year ago by Larry Garfield

This sounds like a huge BC break. Probably bigger and a bit harder to fix
than disallowing dynamic props.

More common, maybe; but trivial to fix: add "=null" at the end of all
untyped property declarations that don't already have an initializer.
It would be trivial to automate with something like Rector.

Maybe it would be better to do this as some sort of opt in behaviour

We already have the opt-in behaviour: add "mixed" to a property
declaration, and it no longer gets initialized to null. The aim of the
RFC is to eliminate that distinction, not to encourage typed properties.

I do however agree that the initial value part is quite disruptive, and
am open to suggestions on how to minimise that. A couple that have
occurred to me:

Change the unset() behaviour, which we're already planning to make
produce errors in 9.0; but keep the implicit initializer. In other
words, make "public $foo;" equivalent to "public mixed $foo=null;" The
obvious downside is that it's just as weird a special-case for users to
learn as what we have now.

This seems like the easiest way forward, with the fewest breaks. In particular, I would therefore expect reflection to treat public $foo as though it had been public mixed $foo = null, and thus normalize how the object looks. That's one less edge case for me to deal with, so I support that.

As noted, though, there's still readonly to consider.

Of note, both ?? and ??= currently treat uninitialized the same as null; this is a very good feature that is extraordinarily useful, and has the nice side effect of reducing the impact of this change.
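
A minimal sketch of that point, assuming PHP 8.0 or later; the Config class is made up for the example.

```php
<?php
// Hypothetical class with a property that starts out uninitialized.
class Config {
    public mixed $timeout;
}

$c = new Config();

// ?? treats the uninitialized property like null: no Error is thrown.
$fallback = $c->timeout ?? 30;

// ??= assigns because the property has no value yet.
$c->timeout ??= 60;
$afterAssign = $c->timeout;

// A second ??= leaves the existing value alone.
$c->timeout ??= 90;
$afterSecond = $c->timeout;
```

Code that already reads properties through ?? or ??= would therefore not notice whether the property started as null or as uninitialized.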

--Larry Garfield

1 year ago by Ilija Tovilo

Hi Rowan,

Thanks for the RFC.

I have finally written up an RFC I have been considering for some time:
Harmonise "untyped" and "typed" properties

RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

Unifying the unset behavior sounds sensible. I don't see a good reason
for a declared property to be hidden through unset. If this behavior
is desired, the property should be declared dynamically in the
constructor with #[AllowDynamicProperties] added to the class.
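
A sketch of what that alternative could look like. The Proxy class and its property name are hypothetical; the attribute itself exists from PHP 8.2 onwards.

```php
<?php
// Hypothetical class: the property is created dynamically, not declared.
#[AllowDynamicProperties]
class Proxy {
    public function __construct() {
        $this->target = 'real value';     // dynamic property
    }

    public function __get($name) {
        return "lazy:$name";              // used once the property is gone
    }
}

$p = new Proxy();
$before = $p->target;                     // reads the dynamic property
unset($p->target);                        // removes it entirely
$after = $p->target;                      // now falls through to __get
```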

Like Jakub, I am also worried about the BC break of changing the
default value to uninitialized. However, I don't think opt-in behavior
is worthwhile (because I don't believe people make sufficient use of
it, while adding something new we have to support forever). If we pick
either of the options you presented (1. initialize all nullable
properties to null, 2. Make all properties uninitialized) I'd vote for
the first one. Then again, given (it seems) people are happier with
the strict behavior of typed properties, do we need to unify the
behavior at all if it means losing that? Currently, they can choose
the behavior they prefer by adding or omitting mixed.

Furthermore, the first approach clashes somewhat with readonly.
Readonly can't have null as a default value because that would make it
legal to access the value before the property is explicitly
initialized, and the property would no longer be uninitialized, making
the explicit assignment illegal. We could just not add the default
value for readonly, but that means more special rules that we're
trying to avoid.

Ilija

1 year ago by Deleu

On Thu, Nov 16, 2023 at 5:41 PM Rowan Tommins rowan.collins@gmail.com
wrote:

Hi all,

I have finally written up an RFC I have been considering for some time:
Harmonise "untyped" and "typed" properties

RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

Currently, changing a property declaration from "private $foo" to
"private mixed $foo" changes its behaviour in significant ways, even
though no actual type checks are added. This RFC seeks to remove those
differences, by extending the "uninitialized" state currently reserved
for typed properties to cover all declared properties:

Properties without an initial value will be "uninitialized", not "null"

Calling "unset" on a declared property will make it "uninitialized",
rather than the current complex behaviour

There are a handful of open questions still in the RFC, but I wanted to
present an initial draft to start the discussion, because the current
behaviour is quite hard to explain in a short e-mail.

Please let me know your thoughts.

Regards,

--
Rowan Tommins
[IMSoP]

--

To unsubscribe, visit: https://www.php.net/unsub.php

I have mixed feelings about this. Reading just the email gives me the
impression of an unimaginable BC Break with the potential of holding back
PHP upgrades A LOT. But reading the RFC, it is pretty clear that what we
have now is a consequence of entropy and what is being proposed is to
explicitly make a plan with all existing cases in mind. The fix for the BC
is something trivial: null was implicit but now will require explicitness.
The behavior which makes existing code work seems easy to get in place.
However, the amount of code that would need this and the fact that Rector
is not used by every PHP Developer is something that makes this a big deal.

The thing I like the most about the proposal is making a sensible choice to
cover the scenarios that are intertwined. Considering what has been said
about opt-in making things harder with no desired mass adoption, perhaps
what could be done instead is an easy-way-out (opt-out). An attribute that
can be added to the class and have it automatically default untyped
uninitialized properties to null. This allows the original proposal to
stand and allow for a wide adoption while still giving a way-out for
computer systems written 30~15 years ago that are still running and still
being slowly refactored / rewritten.

One thing to make it explicitly clear about my suggestion, I don't want or
intend to have an attribute that keeps the current engine as-is + develop
the engine with the new behavior because in my mind that could mean
multiple areas of the engine and it could be quite complex to handle. My
proposal is more about an ad-hoc self-contained Annotation that simply goes
through the class and automatically sets everything to null before the
engine does its thing. In a way, it could still be a BC-break no matter
what - 2 different behaviors of the language, but I'm thinking that such
an attribute could make things behave as-is 99% of the time and allow legacy
systems to still breathe.

--
Marco Deleu

1 year ago by Rowan Tommins

Hi all,

I have finally written up an RFC I have been considering for some
time: Harmonise "untyped" and "typed" properties

RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

I've revised the RFC; it now proposes to keep the implicit "= null" for
untyped properties, although I'm still interested in suggestions for
other strategies around that. I have also added discussion of variance
checks (thanks Claude for the tips on that).

While doing so, I checked Reflection, and am unsure how to proceed.
Currently ReflectionParameter shows a difference between "function
foo($bar)" and "function foo(mixed $bar)", even though these are
analysed as equivalent in inheritance checks. Should ReflectionProperty
also retain this distinction? Was the possibility discussed when "mixed"
was introduced of using a ReflectionType of mixed for both cases?
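
For reference, a short sketch of the distinction Reflection currently reports, assuming PHP 8.0 or later. The function and class names are made up for the example.

```php
<?php
// Hypothetical functions and class, for illustration only.
function untypedParam($bar) {}
function mixedParam(mixed $bar) {}

class Props {
    public $a;
    public mixed $b;
}

$p1 = (new ReflectionFunction('untypedParam'))->getParameters()[0];
$p2 = (new ReflectionFunction('mixedParam'))->getParameters()[0];

$untypedParamType = $p1->getType();              // no type object at all
$mixedParamType   = $p2->getType()->getName();   // a ReflectionNamedType for mixed

$untypedPropHasType = (new ReflectionProperty(Props::class, 'a'))->hasType();
$mixedPropHasType   = (new ReflectionProperty(Props::class, 'b'))->hasType();
```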

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Larry Garfield

Hi all,

I have finally written up an RFC I have been considering for some
time: Harmonise "untyped" and "typed" properties

RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

I've revised the RFC; it now proposes to keep the implicit "= null" for
untyped properties, although I'm still interested in suggestions for
other strategies around that. I have also added discussion of variance
checks (thanks Claude for the tips on that).

Thanks. It's looking pretty good, and should simplify things considerably.

While doing so, I checked Reflection, and am unsure how to proceed.
Currently ReflectionParameter shows a difference between "function
foo($bar)" and "function foo(mixed $bar)", even though these are
analysed as equivalent in inheritance checks. Should ReflectionProperty
also retain this distinction? Was the possibility discussed when "mixed"
was introduced of using a ReflectionType of mixed for both cases?

I don't recall any discussion of ReflectionType when mixed was added, no.

My initial gut reaction is that both ReflectionParameter and ReflectionProperty should treat "omitted" as "mixed", and just evaluate their type to "mixed". It is in practice a distinction without meaning, or will be after this RFC. That said, I also have an itch telling me that there are a few small-but-important edge cases where you would care about the difference between mixed and none in reflection, and that I'd be the one to run into them, but I cannot think of what they would be.

--Larry Garfield

1 year ago by Robert Landers

Hi all,

I have finally written up an RFC I have been considering for some
time: Harmonise "untyped" and "typed" properties

RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

I've revised the RFC; it now proposes to keep the implicit "= null" for
untyped properties, although I'm still interested in suggestions for
other strategies around that. I have also added discussion of variance
checks (thanks Claude for the tips on that).

Thanks. It's looking pretty good, and should simplify things considerably.

While doing so, I checked Reflection, and am unsure how to proceed.
Currently ReflectionParameter shows a difference between "function
foo($bar)" and "function foo(mixed $bar)", even though these are
analysed as equivalent in inheritance checks. Should ReflectionProperty
also retain this distinction? Was the possibility discussed when "mixed"
was introduced of using a ReflectionType of mixed for both cases?

I don't recall any discussion of ReflectionType when mixed was added, no.

My initial gut reaction is that both ReflectionParameter and ReflectionProperty should treat "omitted" as "mixed", and just evaluate their type to "mixed". It is in practice a distinction without meaning, or will be after this RFC. That said, I also have an itch telling me that there are a few small-but-important edge cases where you would care about the difference between mixed and none in reflection, and that I'd be the one to run into them, but I cannot think of what they would be.

--Larry Garfield

Hmmm,

That said, I also have an itch telling me that there are a few small-but-important edge cases where you would care about the difference between mixed and none in reflection,

I'd also probably run into them, and the only one I can think of will
be moot after this RFC:

unset($this->var) to use magic methods instead of the properties
in some older proxies

Other than that, I can't think of anything, though that isn't really
important for reflection...

Robert Landers
Software Engineer
Utrecht NL

1 year ago by Nicolas Grekas

Hi Rowan, Larry,

Thanks for the RFC.

I think there is an inaccuracy that needs to be fixed in the after-unset
state: as noted later in the RFC, magic accessors are called after an
unset($this->typedProps). This means the state cannot be described as
identical ("uninitialized") before and after unset() in the first table in
the RFC. Isn't there some vocabulary in the source that we can use to
describe those states more accurately?

My initial gut reaction is that both ReflectionParameter and
ReflectionProperty should treat "omitted" as "mixed", and just evaluate
their type to "mixed". It is in practice a distinction without meaning, or
will be after this RFC. That said, I also have an itch telling me that
there are a few small-but-important edge cases where you would care about
the difference between mixed and none in reflection, and that I'd be the
one to run into them, but I cannot think of what they would be.

I think this needs to be clarified. I can't think of when this difference
could matter. I never wrote any code that could give any meaning to mixed
vs untyped properties, and many here know that I can write unusual PHP
code. I actually always wondered why we have this difference and I
therefore welcome the RFC.

Maybe this can be evaluated up to the point where we realize that the
change could go into 8.4?
I'd be happy to run a patched PHP on some codebases I maintain to see how
it goes.

Nicolas

1 year ago by Rowan Tommins

I think there is an inaccuracy that needs to be fixed in the after-unset
state: as noted later in the RFC, magic accessors are called after an
unset($this->typedProps). This means the state cannot be described as
identical ("uninitialized") before and after unset() in the first table in
the RFC. Isn't there some vocabulary in the source that we can use to
describe those states more accurately?

Oh. Wow. That's more than just inaccurate terminology...

I always assumed the rule was "access to uninitialised properties triggers __get", not that there was yet another magical state buried in the implementation. From a user point of view, I find that frankly terrible:

Typed properties start off as uninitialized, but if you use unset(), you can make them super-uninitialized.

There's no way to actually see if something's uninitialized or super-uninitialized; and once you've assigned a value, you can't go back to the original uninitialized, only to super-uninitialized.

Accessing an uninitialized property always throws an error, whereas accessing a super-uninitialized property will first check for __get.
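
The three points above can be reproduced with a short sketch, assuming PHP 7.4.1 or later. The Demo class is hypothetical; its __get returns an int so that the value is compatible with the declared property type.

```php
<?php
// Hypothetical class: a typed property plus a __get that records its calls.
class Demo {
    public int $value;
    public array $calls = [];

    public function __get($name) {
        $this->calls[] = $name;
        return 42;
    }
}

$d = new Demo();

// Never assigned: the Error wins, __get is not consulted.
$threwBeforeUnset = false;
try {
    $tmp = $d->value;
} catch (Error $e) {
    $threwBeforeUnset = true;
}
$callsBeforeUnset = count($d->calls);

// After unset(): the same read is routed to __get.
unset($d->value);
$afterUnset = $d->value;

// Assign, then unset again: still the "after unset" state, not the original one.
$d->value = 5;
unset($d->value);
$afterSecondUnset = $d->value;
$callsTotal = count($d->calls);
```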

I'm not sure choosing a different name from "super-uninitialized" makes much difference to how that reads.

I'm probably going to regret asking this, but is there some reason it works that way? Is there any chance of changing it to just:

Typed properties start off as uninitialized.

Once you've assigned a value, you can't go back to the original uninitialized state using unset()

Accessing an uninitialized property will first check for __get, and throw an error if that isn't defined.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Claude Pache

On 22 Nov 2023, at 23:17, Rowan Tommins rowan.collins@gmail.com wrote:

I'm probably going to regret asking this, but is there some reason it works that way? Is there any chance of changing it to just:

Typed properties start off as uninitialized.

Once you've assigned a value, you can't go back to the original uninitialized state using unset()

Accessing an uninitialized property will first check for __get, and throw an error if that isn't defined.

Hi Rowan,

What you describe in the last sentence is what was initially designed and implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2 (section Overloaded Properties).

However, it was later changed to the current semantics (unset() needed in order to trigger __get()) in https://github.com/php/php-src/pull/4974

—Claude

1 year ago by Rowan Tommins

What you describe in the last sentence is what was initially designed and implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2 (section Overloaded Properties).

However, it was later changed to the current semantics (unset() needed in order to trigger __get()) in https://github.com/php/php-src/pull/4974

Good find. So not only is it not specified this way in the RFC, it actually made it into a live release, then someone complained and we rushed out a more complicated version "to avoid WTF". That's really unfortunate.

I'm not at all convinced by the argument in the linked bug report - whether you get an error or an unexpected call to __get, the solution is to assign a valid value to the property. And making the behaviour different after unset() just hides the user's problem, which is that they didn't expect to ever have a call to __get for that property.

But I guess I'm 4 years too late to make that case.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Robert Landers

What you describe in the last sentence is what was initially designed and implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2 (section Overloaded Properties).

However, it was later changed to the current semantics (unset() needed in order to trigger __get()) in https://github.com/php/php-src/pull/4974

Good find. So not only is it not specified this way in the RFC, it actually made it into a live release, then someone complained and we rushed out a more complicated version "to avoid WTF". That's really unfortunate.

I'm not at all convinced by the argument in the linked bug report - whether you get an error or an unexpected call to __get, the solution is to assign a valid value to the property. And making the behaviour different after unset() just hides the user's problem, which is that they didn't expect to ever have a call to __get for that property.

But I guess I'm 4 years too late to make that case.

Regards,

--
Rowan Tommins
[IMSoP]

Heh

But I guess I'm 4 years too late to make that case.

4 years ago, I was using this behavior to write proxies that actually
called things over the network instead of the properties. I didn't
file that bug, but I probably would have filed something similar.
Nowadays, those proxies are dead as we long since generated code based
on interfaces (so they'll pass PHP type checks). I venture once we
have property hooks (assuming that is still a thing coming along),
this behavior can be removed completely as you can just use property
hooks instead of unset to call __get. I'm actually quite excited about
property hooks and hope they pass.

Personally, I haven't used this behavior since 7.4-ish, and I'd be
surprised to see it relied on in modern PHP: code generation is too
easy these days.

Robert Landers
Software Engineer
Utrecht NL

1 year ago by Nicolas Grekas

Hi Rowan,

On Thu, 23 Nov 2023 at 08:56, Rowan Tommins rowan.collins@gmail.com
wrote:

On 23 November 2023 01:37:06 GMT, Claude Pache claude.pache@gmail.com
wrote:

What you describe in the last sentence is what was initially designed and
implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2
(section Overloaded Properties).

However, it was later changed to the current semantics (unset() needed in
order to trigger __get()) in https://github.com/php/php-src/pull/4974

Good find. So not only is it not specified this way in the RFC, it
actually made it into a live release, then someone complained and we rushed
out a more complicated version "to avoid WTF". That's really unfortunate.

I'm not at all convinced by the argument in the linked bug report -
whether you get an error or an unexpected call to __get, the solution is to
assign a valid value to the property. And making the behaviour different
after unset() just hides the user's problem, which is that they didn't
expect to ever have a call to __get for that property.

But I guess I'm 4 years too late to make that case

Sorry this comes as a surprise to you but you're rewriting history here.
The current behavior, the one that was fixed in that commit, matches how
PHP behaved before typed properties, so this commit brought consistency.

About the behavior, it's been in use for many years to build lazy proxies.
I know two major use cases that leverage this powerful capability: Doctrine
entities and Symfony lazy services. There are more as any code that
leverages ocramius/proxy-manager relies on this.
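
For readers unfamiliar with the technique, here is a minimal sketch of the general idea, assuming PHP 7.4.1 or later. It is not the code used by Doctrine, Symfony or ocramius/proxy-manager; the LazyUser class, its loader callback and the loads counter are all made up for the example.

```php
<?php
// Hypothetical lazy object: the real value is only computed on first read.
class LazyUser {
    public string $name;
    public int $loads = 0;
    private $loader;

    public function __construct(callable $loader) {
        $this->loader = $loader;
        unset($this->name);               // from now on, reads consult __get
    }

    public function __get($prop) {
        if ($prop !== 'name') {
            throw new LogicException("Unknown property $prop");
        }
        $this->loads++;
        $this->name = ($this->loader)();  // initialise the real property
        return $this->name;
    }
}

$user = new LazyUser(fn() => 'Ada');
$loadsBefore = $user->loads;              // nothing loaded yet
$first  = $user->name;                    // triggers __get and the loader
$second = $user->name;                    // plain property read, no __get
$loadsAfter = $user->loads;
```

Without the unset() in the constructor, the first read would throw the "uninitialized" Error instead of reaching __get.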

About the vocabulary, the source tells us that "uninitialized" properties
that are unset() become "undefined". I know that's not super accurate since
a typed property is always defined semantically, but that's nonetheless the
flag that is used in the source. Maybe this could help with the RFC.

Cheers,
Nicolas

1 year ago by Nikita Popov

Hi Rowan,

On Thu, 23 Nov 2023 at 08:56, Rowan Tommins rowan.collins@gmail.com
wrote:

On 23 November 2023 01:37:06 GMT, Claude Pache claude.pache@gmail.com
wrote:

What you describe in the last sentence is what was initially designed and
implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2
(section Overloaded Properties).

However, it was later changed to the current semantics (unset() needed in
order to trigger __get()) in https://github.com/php/php-src/pull/4974

Good find. So not only is it not specified this way in the RFC, it
actually made it into a live release, then someone complained and we rushed
out a more complicated version "to avoid WTF". That's really unfortunate.

I'm not at all convinced by the argument in the linked bug report -
whether you get an error or an unexpected call to __get, the solution is to
assign a valid value to the property. And making the behaviour different
after unset() just hides the user's problem, which is that they didn't
expect to ever have a call to __get for that property.

But I guess I'm 4 years too late to make that case

Sorry this comes as a surprise to you but you're rewriting history here.
The current behavior, the one that was fixed in that commit, matches how
PHP behaved before typed properties, so this commit brought consistency.

About the behavior, it's been in use for many years to build lazy proxies.
I know two major use cases that leverage this powerful capability: Doctrine
entities and Symfony lazy services. There are more as any code that
leverages ocramius/proxy-manager relies on this.

About the vocabulary, the source tells us that "uninitialized" properties
that are unset() become "undefined". I know that's not super accurate since
a typed property is always defined semantically, but that's nonetheless the
flag that is used in the source. Maybe this could help with the RFC.

This. The lazy initialization use case is the only reason why we still allow declared properties to be unset at all.

Our long term plan was to find an alternative way to support lazy initialization for properties, and then forbid calling unset() on declared properties. However, we still don't have that alternative today.

Regards,
Nikita

1 year ago by Rowan Tommins

On Thu, 23 Nov 2023 at 08:48, Nicolas Grekas nicolas.grekas+php@gmail.com
wrote:

Sorry this comes as a surprise to you but you're rewriting history here.
The current behavior, the one that was fixed in that commit, matches how
PHP behaved before typed properties, so this commit brought consistency.

The question of "what does __get do with a property that has been declared
but not assigned a value?" has no answer before PHP 7.4, because that
situation simply never happened. So there isn't really one answer to what
is "consistent".

If I understand rightly, your position, and Nikita's, is that the behaviour
should be consistent with the statement "if you have declared a property,
access doesn't trigger __get unless you explicitly call unset()". This is
what the change that slipped into 7.4.1 "fixed".

The reason it surprised me is that I expected it to be consistent with a
different statement: "if you have an __get method, this takes precedence
over 'undefined variable' notices/warnings/errors". That statement was true
in 7.3, and still true in the original implementation in 7.4.0 (including
"uninitialized" alongside "undefined"), but was broken by the change in
7.4.1: now, the "uninitialized property" error takes precedence over __get
in the specific case of never having assigned a value.

About the behavior, it's been in use for many years to build lazy proxies.
I know two major use cases that leverage this powerful capability: Doctrine
entities and Symfony lazy services. There are more as any code that
leverages ocramius/proxy-manager relies on this.

Just to be clear, it is not the behaviour after calling unset which I am
concerned about, it is the behaviour before calling unset or assigning
any value. I was aware of the interaction with __get, but wrongly assumed
that the rule was simply "uninitialized properties trigger __get".

About the vocabulary, the source tells us that "uninitialized" properties
that are unset() become "undefined". I know that's not super accurate since
a typed property is always defined semantically

Just "undefined" is not sufficiently unambiguous; you have to distinguish
four different states:

1. Never declared, or added dynamically and then unset
2. Declared without a type, then unset
3. Declared with a type, not yet assigned any value
4. Declared with a type, then unset

The messages presented to the user refer to both (1) and (2) as
"undefined", and both (3) and (4) as "uninitialized". As it stands, the
RFC would replace all instances of (2) with (4), but that still leaves us
with two names for three states.
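
A sketch that collects the message for each of the four states, assuming PHP 7.4 or later and a class without __get. The States class and its property names are hypothetical; an error handler is installed only to capture the warnings as strings.

```php
<?php
// Hypothetical class covering states 2, 3 and 4; state 1 uses an undeclared name.
class States {
    public $untyped;          // state 2 once unset
    public int $typed;        // state 3: never assigned
    public int $typedUnset;   // state 4 once unset
}

$messages = [];
set_error_handler(function (int $no, string $msg) use (&$messages) {
    $messages[] = $msg;       // capture warnings and notices
    return true;
});

$s = new States();
unset($s->untyped, $s->typedUnset);

foreach (['neverDeclared', 'untyped', 'typed', 'typedUnset'] as $prop) {
    try {
        $tmp = $s->$prop;
    } catch (Error $e) {
        $messages[] = $e->getMessage();   // capture thrown errors the same way
    }
}

restore_error_handler();

foreach ($messages as $i => $m) {
    echo ($i + 1), ': ', $m, "\n";
}
```

The first two messages use the word "Undefined" and the last two refer to initialization, which is the two-names-for-several-states situation described above.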

Claude Pache wrote:

However, it is not a problem in practice, because users of classes
implementing (1) but not (2) do not unset declared properties, ever.

Nikita Popov wrote:

... and then forbid calling unset() on declared properties

Right now, unset() is also the only way to break a reference, other than
another assign-by-reference, so it's quite reasonable to write
"unset($this->foo)" instead of "$this->foo" to ensure a property is truly
reset to a known state. Maybe we need a new function or operator to
atomically break the reference and assign a new value, e.g.
unreference($this->foo, 42); or $this->foo := 42; to replace
unset($this->foo); $this->foo=42;
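
A minimal sketch of the reference-breaking effect, using today's behaviour. The Holder class is made up for the example; unreference() and := above remain hypothetical and are not used here.

```php
<?php
// Hypothetical class with one untyped property.
class Holder {
    public $foo = 1;
}

$h = new Holder();
$alias =& $h->foo;            // the property is now part of a reference set

$h->foo = 2;                  // plain assignment writes through the reference
$aliasWhileBound = $alias;

unset($h->foo);               // breaks the reference
$h->foo = 42;                 // fresh value, no longer shared
$aliasAfterUnset = $alias;
$propertyAfterUnset = $h->foo;
```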

Regards,

Rowan Tommins
[IMSoP]

1 year ago by Claude Pache

On 23 Nov 2023, at 08:56, Rowan Tommins rowan.collins@gmail.com wrote:

What you describe in the last sentence is what was initially designed and implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2 (section Overloaded Properties).

However, it was later changed to the current semantics (unset() needed in order to trigger __get()) in https://github.com/php/php-src/pull/4974

Good find. So not only is it not specified this way in the RFC, it actually made it into a live release, then someone complained and we rushed out a more complicated version "to avoid WTF". That's really unfortunate.

I'm not at all convinced by the argument in the linked bug report - whether you get an error or an unexpected call to __get, the solution is to assign a valid value to the property. And making the behaviour different after unset() just hides the user's problem, which is that they didn't expect to ever have a call to __get for that property.

But I guess I'm 4 years too late to make that case.

Hi,

I think that the legitimacy of the current behaviour is not something one can be convinced of by abstract reasoning like we are doing now (otherwise, it probably would not have waited for a live release to be implemented in a rush), but something that can be understood only by considering the conflicting uses of __get(), which lead to conflicting expectations, and how they managed to live together despite a fundamental contradiction:

1. __get() is used to implement virtual properties (as mentioned in the linked bug report). For this use case, access to a declared property should not call __get() (whose implementation may be hidden away in a superclass).
2. __get() is used to implement lazy properties. For this use case, access to an uninitialised (or unset, or undeclared) property should call __get().

The two expectations did live together as long as declared properties were also initialised (implicitly, to null). It is true that they are formally incompatible; when you say:

And making the behaviour different after unset() just hides the user's problem, which is that they didn't expect to ever have a call to __get for that property.

you are referring to the expectation given by use case 1, which contradicts the expectation given by use case 2. However, it is not a problem in practice, because users of classes implementing (1) but not (2) do not unset declared properties, ever.

The conflict became evident with the advent of typed properties, which cannot be implicitly initialised to null in general. As both use cases are firmly rooted in practice, both should be taken into account. The current, cumbersome and unfortunate behaviour allows keeping sane semantics for those classes using bad practices (use case 1), while not invalidating naughty hacks used by other classes (use case 2).

—Claude

1 year ago by Claude Pache

On 21 Nov 2023, at 00:08, Rowan Tommins rowan.collins@gmail.com wrote:

I've revised the RFC; it now proposes to keep the implicit "= null" for untyped properties, although I'm still interested in suggestions for other strategies around that.

Hi,

If you really want untyped properties not to be implicitly initialised to “null” in the long term, here is my suggestion:

1. In 8.next, deprecate untyped properties without an initialiser, suggesting to add an explicit “= null” (one deprecation notice per class definition containing at least one untyped property without initialiser);
2. In 9.0, an untyped property without initialiser becomes a syntax error;
3. Later, reintroduce untyped properties without initialiser with the new semantics.

—Claude