[RFC] Property accessor hooks, take 2

1 year ago by Larry Garfield

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.

However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.

--
Larry Garfield
larry@garfieldtech.com

1 year ago by Erick de Azevedo Lima

It's really nice!

We are willing to do so if the consensus is that it would be
beneficial, but want to ask before putting in the effort.
I really think that it's not necessary, as it still hooks into the
default behavior of the properties.

Even though I don't have voting privileges, I'm rooting a lot for it. Thank
you again for this amazing work!

--
Erick

1 year ago by Pierre

On 21/02/2024 at 19:55, Larry Garfield wrote:

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.

However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.

Yes please! Pass!

I don't have voting rights, but we need this.

Cheers,

Pierre R.

1 year ago by Robert Landers

I apologize if this has already been covered:

There are two shorthand notations supported, beyond the optional argument to set.
First, if a hook's body is a single expression, then the { } and return statement may be omitted and replaced with =>, just like with arrow functions.

Does => do any special auto-capturing of variables like arrow
functions or is it just a shorthand? Also, is this a meaningful
shorthand to the example a little further down:

public string $phone {
    set = $this->sanitizePhone(...);
}

or do we always have to write it out?

public string $phone {
    set => $field = $this->sanitizePhone($value);
}

Would __PROPERTY__ be set inside sanitizePhone() as well?

You mention several ways values are displayed (whether or not they use
the get-hook), but what does the default implementation of
__debugInfo() look like now (or is that out of scope or a silly
question?)

For attributes, it would be nice to be able to target hooks
specifically with attributes instead of also all methods (e.g., a
Attribute::TARGET_GET_HOOK const). For example, if I were writing a
serialization library, I may want to specify #[UseRawValue] only on
getters to ensure that only the raw value is serialized instead of the
getter (which may be specific to the application logic, or
#[GetFromMethod] to tell the serialization library to get the value
from a completely different method. It wouldn't make sense to target
just any method with that attribute.

1 year ago by Matthew Weier O'Phinney

On Wed, Feb 21, 2024 at 12:57 PM Larry Garfield larry@garfieldtech.com
wrote:

After much on-again/off-again work, Ilija and I are back with a more
polished property access hooks/interface properties RFC. It’s 99%
unchanged from last summer; the PR is now essentially complete and more
robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could
think of. Properties involve dealing with both references and inheritance,
both of which have complex implications. We believe we’ve identified the
most logical handling for all cases, though.

Once again in reading the proposal, the first thing I'm struck by are the
magic "$field" and "$value" variables inside accessors. The first time they
are used, they're used without explanation, and they're jarring.

Additionally, once you start defining the behavior of accessors... you
don't start with the basics, but instead jump into some of the more
esoteric usage, which does nothing to help with the questions I have.

So, first:

Start with the most basic, most expected usage for each of reading and
writing properties.
I need a better argument for why the $field and $value variables exist.
Saying they're macros doesn't help those not deep into internals. As a
user, why do they exist?

Honestly, it's not obvious at all that I can just set something to the
variable "$field", and if I didn't know about the feature and stumbled
across a class that used accessors, I'd be wondering what "$field" is, and
how the property ever gets set. (And yes, I've read the FAQ. Saying it's
used in another language doesn't automatically make it good design.)

I know you say in the narrative that you can use $this->propertyName =
to set the value, but you also say it's not recommended (though you do
not indicate why). To somebody coming primarily from userland who's been
doing OOP in PHP for over 20 years, it's far more clear what's happening.
Alternately, I'd argue that both the accessors should require an argument
that represents the reference to the property:

public string $username {
    set($field, string|Stringable $value) {
        $field = (string) $value;
    }
    get ($field) => strtolower($field);
}

The same is true for $value. I'd recommend only ever allowing the construct
set ($value) {}, as it's immediately clear that $value is the argument
and value being provided to the accessor. When it's implicit, it's easy to
lose context.

Second: you don't have examples of defining BOTH get and set OTHER than
when using expressions for both accessors or a mix. I'm actually unclear
what the syntax is when both are defined. Is there supposed to be a ;
terminating each? Or a ,? Or just an empty line? Again, this is one of
the more common scenarios. It needs to be covered early, and clearly.

Third: the caveats around usage with arrays... give me pause. While I'm
personally trying to not use arrays as much as possible, a lot of code I
see or contribute to still does, and the fact that an array property that
uses a write accessor doesn't allow the same level of access as a normal
array property is something I see leading to a lot of confusion and errors.
I don't have a solution, but I worry that this one thing alone could be
enough to prevent the passage of the RFC.

Fourth: the syntax around inheritance is not intuitive, as it does not work
in the same way as the rest of the language. I'm talking about this:

public int $x {
    get: 2 * parent::$x::get()
}

Why do we need to use the accessors here? Why wouldn't it just be
parent::$x?

I want to be clear: I really like the idea behind this feature, and
overall, I appreciate the design. From a user perspective, though, the
above are things that I found jarring as they vary quite a bit from our
current language design.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask:
Originally, this RFC was called “property accessors,” which is the
terminology used by most languages. During early development, when we had
4 accessors like Swift, we changed the name to “hooks” to better indicate
that one was “hooking into” the property lifecycle. However, later
refinement brought it back down to 2 operations, get and set. That makes
the “hooks” name less applicable, and inconsistent with what other
languages call it.

However, changing it back at this point would be a non-small amount of
grunt work. There would be no functional changes from doing so, but it’s
lots of renaming things both in the PR and the RFC. We are willing to do so
if the consensus is that it would be beneficial, but want to ask before
putting in the effort.

I personally would go with just "accessors".

--
Matthew Weier O'Phinney
mweierophinney@gmail.com
https://mwop.net/
he/him

1 year ago by tim@bastelstu.be

Once again in reading the proposal, the first thing I'm struck by are the
magic "$field" and "$value" variables inside accessors. The first time they
are used, they're used without explanation, and they're jarring.

[…]

Second: you don't have examples of defining BOTH get and set OTHER than
when using expressions for both accessors or a mix. I'm actually unclear
what the syntax is when both are defined. Is there supposed to be a ;
terminating each? Or a ,? Or just an empty line? Again, this is one of
the more common scenarios. It needs to be covered early, and clearly.

On a similar topic with regard to syntax and shorthands, I'd like to
quote myself from my previous year's email (Message-ID:
17d7983b-68b7-e273-a445-f8399c2510cc@bastelstu.be,
https://externals.io/message/120213#120216):

(5) I strongly dislike the doubly abbreviated form of public string $fullName => $this->first . " " . $this->last;. Having just the extra
'>' in there to distinguish it from a regular property feels non-obvious.

It's always possible to follow up with syntax that allows for additional
brevity; the inverse is not true, and I believe that for such a semantically
complex feature, having clear syntax would be beneficial.

Best regards
Tim Düsterhus

1 year ago by Larry Garfield

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.

Once again in reading the proposal, the first thing I'm struck by are
the magic "$field" and "$value" variables inside accessors. The first
time they are used, they're used without explanation, and they're
jarring.

Additionally, once you start defining the behavior of accessors... you
don't start with the basics, but instead jump into some of the more
esoteric usage, which does nothing to help with the questions I have.

So, first:

Start with the most basic, most expected usage for each of reading
and writing properties.

I need a better argument for why the $field and $value variables
exist. Saying they're macros doesn't help those not deep into
internals. As a user, why do they exist?

For $field, it's not a requirement. It's mostly for copy-paste convenience. A number of people have struggled on this point so if the consensus is to leave out $field and just use $this->propName directly, we can accept that. They can be re-added if reusable hook packages are added in the future (as noted in Future Scope).

For $value, it's to avoid boilerplate. In the majority case, you'll just be operating on an individual value trivially: checking its range, or uppercasing it, or whatever. Requiring the developer to provide a name explicitly is just extra work; it's much the same as how PHP doesn't require you to pass $this as the first argument to a method explicitly, the way Python and Rust do. It's just understood that $this exists, and once you learn that it's obvious.

On the occasions where you do want to specify an alternate name for some reason, or more likely you're providing a wider type, you can. But in the typical case it would just be one more thing for the dev to have to type out. This is especially true in what I expect to be a common case, which is promoted constructor arguments with an extra validator set hook on them.

It also introduces some ambiguity. If I specify only the name, does that mean I'm widening the type to mixed? Or just that I'm omitting the name? If specifying the name is rare, that's not really a big deal. If it's required in every case, it's a confusion point in every case.

In the interest of transparency, for comparison:

Kotlin appears to require an argument name, but by convention recommends using "value".
Swift makes it optional, with a default name of "newValue". (Same logic as in the RFC.)
C# ... as far as I can tell, doesn't support a custom name at all; it's always called "value", implicitly.
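
To make the implicit-versus-explicit $value trade-off concrete, here is a minimal sketch. It assumes PHP 8.4 or later, where hooks eventually shipped; details of the draft discussed in this thread may differ. The class names Percentage and Account are hypothetical and exist only for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical classes, for illustration only.
final class Percentage
{
    public int $amount {
        // Implicit form: the parameter is called $value and has the
        // same type as the property (int).
        set {
            if ($value < 0 || $value > 100) {
                throw new RangeException('Amount must be between 0 and 100');
            }
            $this->amount = $value;
        }
    }
}

final class Account
{
    public string $username {
        // Explicit form: accepting a wider type means spelling out
        // both the type and the parameter name.
        set(string|Stringable $value) {
            $this->username = (string) $value;
        }
    }
}

$p = new Percentage();
$p->amount = 42;

$a = new Account();
$a->username = new class implements Stringable {
    public function __toString(): string
    {
        return 'example';
    }
};

echo $p->amount, ' ', $a->username, PHP_EOL;
```

The implicit form covers the simple validation case with no parameter list at all; the explicit form is only needed when the accepted type is wider than the property type.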

Second: you don't have examples of defining BOTH get and set OTHER than
when using expressions for both accessors or a mix. I'm actually
unclear what the syntax is when both are defined. Is there supposed to
be a ; terminating each? Or a ,? Or just an empty line? Again,
this is one of the more common scenarios. It needs to be covered early,
and clearly.

... huh. I thought we had one in there somewhere. I will add one, thanks. Though to clarify, there's no separator character.

public string $foo {
    get {
        // ...
    }
    set {
        // ...
    }
}
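
A filled-in version of that skeleton might look like the following sketch. It assumes PHP 8.4 or later and uses $this->username for the backing value rather than $field; the User class and its validation rule are hypothetical, not taken from the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class User
{
    public string $username {
        // Read hook: normalize the stored value on the way out.
        get {
            return strtolower($this->username);
        }
        // Write hook: $value is the implicit parameter name.
        // Note that no separator is needed between the two hooks.
        set {
            $trimmed = trim($value);
            if ($trimmed === '') {
                throw new InvalidArgumentException('Username must not be empty');
            }
            $this->username = $trimmed;
        }
    }
}

$user = new User();
$user->username = '  Larry ';
echo $user->username, PHP_EOL;
```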

Third: the caveats around usage with arrays... give me pause. While I'm
personally trying to not use arrays as much as possible, a lot of code
I see or contribute to still does, and the fact that an array property
that uses a write accessor doesn't allow the same level of access as a
normal array property is something I see leading to a lot of confusion
and errors. I don't have a solution, but I worry that this one thing
alone could be enough to prevent the passage of the RFC.

We completely agree that it's a suboptimal situation. But as explained, it is the way it is because it's not possible (as far as we can tell) to fully support hooks on array properties. If you can think of one, please share, because we'd love to make this part better. I don't like it either, but we haven't found a way around it. And that caveat doesn't seem like a good enough reason to not support hooks everywhere we actually can.

Fourth: the syntax around inheritance is not intuitive, as it does not
work in the same way as the rest of the language. I'm talking about
this:
public int $x {
    get: 2 * parent::$x::get()
}
Why do we need to use the accessors here? Why wouldn't it just be parent::$x?

Almost. Ilija spent some time looking into this, and it's possible with caveats.

First, there's then no way to differentiate between "access parent hook" and "read the static property $x on the parent". Arguably that's not a common case, but it is a point of confusion.

The larger issue is that parent::$x can't be just a stand-in for the backing value in all cases. While supporting that for the = operator is straightforward enough, it wouldn't give us access to ++, --, <=, and the dozen or so other operators that could conceivably apply. In theory those could all be implemented manually, but Ilija described that as "hundreds to thousands of lines of code" to do, which... is not time or code well spent. :-) Especially as this is a very edge-case situation to begin with. (In practice, I expect auto-generated ORM proxy code to be the primary user of accessing a parent property from a child hook. I can think of very few other cases where I'd want to use it.)

So we have the choice between making $a = parent::$prop and parent::$prop = $a work, but nothing else, inexplicably (creating confusion) or the slightly longer syntax that wouldn't support those other operations anyway so there's no confusion.

We feel the current approach is the better trade off, but if the consensus generally is for the shorter-but-inconsistent syntax, that can be changed.
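
As a sketch of the longer syntax in use, assuming PHP 8.4 or later (where parent::$x::get() is the form that shipped), a child class can wrap the parent's read like this. The class names are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical classes, for illustration only.
class Point
{
    public int $x;
}

class DoubledPoint extends Point
{
    public int $x {
        // Explicitly invoke the parent's read behavior for $x,
        // then transform the result.
        get => 2 * parent::$x::get();
    }
}

$p = new DoubledPoint();
$p->x = 21;          // no set hook, so this is a plain write
echo $p->x, PHP_EOL; // read goes through the child's get hook
```

Because the call names the hook explicitly, there is no ambiguity with a static property read, and no expectation that operators such as ++ will work on it.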

I want to be clear: I really like the idea behind this feature, and
overall, I appreciate the design. From a user perspective, though, the
above are things that I found jarring as they vary quite a bit from our
current language design.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.

However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.

I personally would go with just "accessors".

This is a long, huge and comprehensive work, congratz to the authors.

It clearly shows that so much thought and work has been put into it
that it makes me cautious to even ask for further clarification.

Javascript have similar features via a different syntax, although that syntax would not be viable for PHP

Why not?

It feels quite a natural syntax for PHP and from someone oblivious to
the internal work, it appears to be a slight marginal change to the
existing RFC. Given the extensive work of this RFC, it seems pretty
obvious that this syntax will not work, I just don't know why.

I've added an FAQ section explaining why the Python/JS approach wouldn't really work. To be clear, Ilija and I spent 100+ hours doing research and design before we started implementation (back in mid-late 2022). We did seriously consider the JS-style syntax, but in the end we found it created more problems than it solves. For the type of language PHP is (explicit typed properties), doing it on the property itself is a much cleaner approach.

I apologize if this has already been covered:

There are two shorthand notations supported, beyond the optional argument to set.
First, if a hook's body is a single expression, then the { } and return statement may be omitted and replaced with =>, just like with arrow functions.

Does => do any special auto-capturing of variables like arrow
functions or is it just a shorthand?

No, there is nothing to capture. Inside the hook body you have $this the same as any other method.

Also, is this a meaningful
shorthand to the example a little further down:

public string $phone {
    set = $this->sanitizePhone(...);
}

or do we always have to write it out?

public string $phone {
    set => $field = $this->sanitizePhone($value);
}

Currently no, the set is always void; you have to write the value yourself. See the FAQ section on the topic for a detailed explanation.

However, I just had a long discussion with Ilija and there is one possibility we could consider: Use the return value only on the shorthand (arrow-function-like) syntax.

So you could do either of these, which would be equivalent:

set {
    $this->phone = $this->sanitizePhone($value);
}

set => $this->sanitizePhone($value);

This would have the advantage of offering a return-to-set mechanism, as well as even shorter syntax in the simple case (and no question of $field vs $this->propName). But it would have the disadvantage of being potentially inconsistent between the short and long version. It also would mean the short version is incompatible with virtual properties; using a short-set would create a backing value, so it's non-virtual. But since "simple validation, maybe in a promoted constructor property" is likely to be one of the main use cases, it would simplify that case.

Not sure if that's a trade up or not. Thoughts from the list?
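
For readers who want to try the two forms side by side, here is a minimal sketch. It assumes PHP 8.4 or later, where the short set form writes the expression's result to the backing value. The Contact class, the $altPhone property, and the digit-stripping body of sanitizePhone() are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class Contact
{
    public string $phone {
        // Short form: the expression's result becomes the stored value.
        set => $this->sanitizePhone($value);
    }

    public string $altPhone {
        // Long form: the hook writes the backing value itself.
        set {
            $this->altPhone = $this->sanitizePhone($value);
        }
    }

    // Illustrative sanitizer: keep digits only.
    private function sanitizePhone(string $raw): string
    {
        return preg_replace('/\D+/', '', $raw) ?? '';
    }
}

$c = new Contact();
$c->phone = '(555) 123-4567';
$c->altPhone = '(555) 123-4567';
echo $c->phone, ' ', $c->altPhone, PHP_EOL;
```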

Would __PROPERTY__ be set inside sanitizePhone() as well?

No. Like __CLASS__, it is materialized at compile time and has no meaning outside of its intended scope.

You mention several ways values are displayed (whether or not they use
the get-hook), but what does the default implementation of
__debugInfo() look like now (or is that out of scope or a silly
question?)

var_dump() shows the underlying backing value, bypassing get, since it's debugging the object state. If you implement __debugInfo(), that's a method like any other and you can do what you like, though you'll be reading through get hooks (just like using __serialize()).

For attributes, it would be nice to be able to target hooks
specifically with attributes instead of also all methods (e.g., a
Attribute::TARGET_GET_HOOK const). For example, if I were writing a
serialization library, I may want to specify #[UseRawValue] only on
getters to ensure that only the raw value is serialized instead of the
getter (which may be specific to the application logic, or
#[GetFromMethod] to tell the serialization library to get the value
from a completely different method. It wouldn't make sense to target
just any method with that attribute.

This feels very niche, honestly. Those would naturally have to be sub-cases of TARGET_METHOD anyway, so method-targeted attributes would need to be supported regardless. That makes hook-specific-targeting an easy and non-breaking add-on for a future RFC if it turns out to be useful in practice.

--Larry Garfield

1 year ago by Stephen Reay

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.

Once again in reading the proposal, the first thing I'm struck by are
the magic "$field" and "$value" variables inside accessors. The first
time they are used, they're used without explanation, and they're
jarring.

Additionally, once you start defining the behavior of accessors... you
don't start with the basics, but instead jump into some of the more
esoteric usage, which does nothing to help with the questions I have.

So, first:

Start with the most basic, most expected usage for each of reading
and writing properties.

I need a better argument for why the $field and $value variables
exist. Saying they're macros doesn't help those not deep into
internals. As a user, why do they exist?

For $field, it's not a requirement. It's mostly for copy-paste convenience. A number of people have struggled on this point so if the consensus is to leave out $field and just use $this->propName directly, we can accept that. They can be re-added if reusable hook packages are added in the future (as noted in Future Scope).

For $value, it's to avoid boilerplate. In the majority case, you'll just be operating on an individual value trivially: checking its range, or uppercasing it, or whatever. Requiring the developer to provide a name explicitly is just extra work; it's much the same as how PHP doesn't require you to pass $this as the first argument to a method explicitly, the way Python and Rust do. It's just understood that $this exists, and once you learn that it's obvious.

On the occasions where you do want to specify an alternate name for some reason, or more likely you're providing a wider type, you can. But in the typical case it would just be one more thing for the dev to have to type out. This is especially true in what I expect to be a common case, which is promoted constructor arguments with an extra validator set hook on them.

It also introduces some ambiguity. If I specify only the name, does that mean I'm widening the type to mixed? Or just that I'm omitting the name? If specifying the name is rare, that's not really a big deal. If it's required in every case, it's a confusion point in every case.

In the interest of transparency, for comparison:

Kotlin appears to require an argument name, but by convention recommends using "value".

Swift makes it optional, with a default name of "newValue". (Same logic as in the RFC.)

C# ... as far as I can tell, doesn't support a custom name at all; it's always called "value", implicitly.

Second: you don't have examples of defining BOTH get and set OTHER than
when using expressions for both accessors or a mix. I'm actually
unclear what the syntax is when both are defined. Is there supposed to
be a ; terminating each? Or a ,? Or just an empty line? Again,
this is one of the more common scenarios. It needs to be covered early,
and clearly.

... huh. I thought we had one in there somewhere. I will add one, thanks. Though to clarify, there's no separator character.

public string $foo {
    get {
        // ...
    }
    set {
        // ...
    }
}

Third: the caveats around usage with arrays... give me pause. While I'm
personally trying to not use arrays as much as possible, a lot of code
I see or contribute to still does, and the fact that an array property
that uses a write accessor doesn't allow the same level of access as a
normal array property is something I see leading to a lot of confusion
and errors. I don't have a solution, but I worry that this one thing
alone could be enough to prevent the passage of the RFC.

We completely agree that it's a suboptimal situation. But as explained, it is the way it is because it's not possible (as far as we can tell) to fully support hooks on array properties. If you can think of one, please share, because we'd love to make this part better. I don't like it either, but we haven't found a way around it. And that caveat doesn't seem like a good enough reason to not support hooks everywhere we actually can.

Fourth: the syntax around inheritance is not intuitive, as it does not
work in the same way as the rest of the language. I'm talking about
this:

public int $x {
    get: 2 * parent::$x::get()
}

Why do we need to use the accessors here? Why wouldn't it just be parent::$x?

Almost. Ilija spent some time looking into this, and it's possible with caveats.

First, there's then no way to differentiate between "access parent hook" and "read the static property $x on the parent". Arguably that's not a common case, but it is a point of confusion.

The larger issue is that parent::$x can't be just a stand-in for the backing value in all cases. While supporting that for the = operator is straightforward enough, it wouldn't give us access to ++, --, <=, and the dozen or so other operators that could conceivably apply. In theory those could all be implemented manually, but Ilija described that as "hundreds to thousands of lines of code" to do, which... is not time or code well spent. :-) Especially as this is a very edge-case situation to begin with. (In practice, I expect auto-generated ORM proxy code to be the primary user of accessing a parent property from a child hook. I can think of very few other cases where I'd want to use it.)

So we have the choice between making $a = parent::$prop and parent::$prop = $a work, but nothing else, inexplicably (creating confusion) or the slightly longer syntax that wouldn't support those other operations anyway so there's no confusion.

We feel the current approach is the better trade off, but if the consensus generally is for the shorter-but-inconsistent syntax, that can be changed.

I want to be clear: I really like the idea behind this feature, and
overall, I appreciate the design. From a user perspective, though, the
above are things that I found jarring as they vary quite a bit from our
current language design.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.

However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.

I personally would go with just "accessors".

This is a long, huge and comprehensive work, congratz to the authors.

It clearly shows that so much thought and work has been put into it
that it makes me cautious to even ask for further clarification.

Javascript have similar features via a different syntax, although that syntax would not be viable for PHP

Why not?

It feels quite a natural syntax for PHP and from someone oblivious to
the internal work, it appears to be a slight marginal change to the
existing RFC. Given the extensive work of this RFC, it seems pretty
obvious that this syntax will not work, I just don't know why.

I've added an FAQ section explaining why the Python/JS approach wouldn't really work. To be clear, Ilija and I spent 100+ hours doing research and design before we started implementation (back in mid-late 2022). We did seriously consider the JS-style syntax, but in the end we found it created more problems than it solves. For the type of language PHP is (explicit typed properties), doing it on the property itself is a much cleaner approach.

I apologize if this has already been covered:

There are two shorthand notations supported, beyond the optional argument to set.
First, if a hook's body is a single expression, then the { } and return statement may be omitted and replaced with =>, just like with arrow functions.

Does => do any special auto-capturing of variables like arrow
functions or is it just a shorthand?

No, there is nothing to capture. Inside the hook body you have $this the same as any other method.

Also, is this a meaningful
shorthand to the example a little further down:

public string $phone {
set = $this->sanitizePhone(...);
}

or do we always have to write it out?

public string $phone {
set => $field = $this->sanitizePhone($value);
}

Currently no, the set is always void; you have to write the value yourself. See the FAQ section on the topic for a detailed explanation.

However, I just had a long discussion with Ilija and there is one possibility we could consider: Use the return value only on the shorthand (arrow-function-like) syntax.

So you could do either of these, which would be equivalent:

set {
$this->phone = $this->sanitizePhone($value);
}

set => $this->sanitizePhone($value);

This would have the advantage of offering a return-to-set mechanism, as well as even shorter syntax in the simple case (and no question of $field vs $this->propName). But it would have the disadvantage of being potentially inconsistent between the short and long version. It also would mean the short version is incompatible with virtual properties; using a short-set would create a backing value, so it's non-virtual. But since "simple validation, maybe in a promoted constructor property" is likely to be one of the main use cases, it would simplify that case.

Not sure if that's a trade up or not. Thoughts from the list?

Would __PROPERTY__ be set inside sanitizePhone() as well?

No. Like __CLASS__, it is materialized at compile time and has no meaning outside of its intended scope.
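
As a small illustration of what "materialized at compile time" means for an existing magic constant, here is a sketch using only current PHP. The class names are hypothetical; the point is only that __CLASS__ is fixed by where the code is written, not by which object runs it, and the reply above says the proposed property constant would behave the same way.

```php
<?php
// Hypothetical classes, for illustration only.
class Account
{
    public function compiledName(): string
    {
        // Replaced with the string 'Account' when this class is compiled.
        return __CLASS__;
    }

    public function runtimeName(): string
    {
        // Resolved at call time, so it follows the actual object.
        return static::class;
    }
}

class SavingsAccount extends Account
{
}

$account = new SavingsAccount();
echo $account->compiledName(), "\n"; // Account
echo $account->runtimeName(), "\n";  // SavingsAccount
```

A helper method such as sanitizePhone() is compiled outside of any hook, so there would be no property name to substitute there.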

You mention several ways values are displayed (whether or not they use
the get-hook), but what does the default implementation of
__debugInfo() look like now (or is that out of scope or a silly
question?)

var_dump() shows the underlying backing value, bypassing get, since it's debugging the object state. If you implement __debugInfo(), that's a method like any other and you can do what you like, though you'll be reading through get hooks (just like using __serialize()).

For attributes, it would be nice to be able to target hooks
specifically with attributes instead of also all methods (e.g., a
Attribute::TARGET_GET_HOOK const). For example, if I were writing a
serialization library, I may want to specify #[UseRawValue] only on
getters to ensure that only the raw value is serialized instead of the
getter (which may be specific to the application logic, or
#[GetFromMethod] to tell the serialization library to get the value
from a completely different method. It wouldn't make sense to target
just any method with that attribute.

This feels very niche, honestly. Those would naturally have to be sub-cases of TARGET_METHOD anyway, so method-targeted attributes would need to be supported regardless. That makes hook-specific-targeting an easy and non-breaking add-on for a future RFC if it turns out to be useful in practice.

--Larry Garfield

Hi Larry,

It's good to see this idea still progressing.

I have to agree with the other comment(s) that the implicit $field/$value variables seem odd to me. I understand the desire for brevity and the potential future scope of reused hooks, but this concept seems to fly in the face of many years of PHP reducing "magic" like this.

To give one answer to your question about ambiguity if the $value parameter is required - I don't believe this is actually ambiguous, in the context of PHP:

method parameters in child classes don't implicitly 'inherit' the parent method parameter's type if they don't define one (they widen to mixed);
method return types have no implicit inheritance, they must declare a compatible return type;
typed class properties don't implicitly inherit the parent type when the type is left off a child property - they must declare the same type.

AFAIK there is no existing behaviour in PHP where omitting a type would mean "the type is implicitly inherited from X", it either means the same as mixed, or it's an error.
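
The first point in the list above can be checked on current PHP with reflection. This is a minimal sketch with hypothetical class names: the child method omits the parameter type, and reflection reports no type at all rather than the parent's string.

```php
<?php
// Hypothetical classes, for illustration only.
class Sanitizer
{
    public function clean(string $input): string
    {
        return trim($input);
    }
}

class LooseSanitizer extends Sanitizer
{
    // The parameter type is omitted: it is not inherited from the parent,
    // so the parameter accepts any value.
    public function clean($input): string
    {
        return trim((string) $input);
    }
}

$param = (new ReflectionMethod(LooseSanitizer::class, 'clean'))->getParameters()[0];
var_dump($param->hasType());                  // bool(false)
echo (new LooseSanitizer())->clean(42), "\n"; // 42
```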

The use of implicit variables like this also opens up a new potential for bugs, particularly in the set block, where the difference between setting a new local variable inside the method body and writing to the backing store is indistinguishable without knowing intent. Explicitly writing to $this->propertyName (or $this->{__PROPERTY__} in a hypothetical re-used/generated hook) is certainly longer, but it's also much less likely to suffer from a typo that is then missed by a test, especially given that dynamic properties are now deprecated, so writing to $this->porpertyName from a hook on $propertyName would likely trigger an error, while writing to $feild from a hook would produce no error.

Ultimately so long as the syntax allows for being explicit (which it seems to, and it provides the magic constant allowing for future dynamic use in reused hooks) I'm generally positive about this.

Given that these are both essentially deemed as "shortcuts" for convenience rather than actual differences in functionality, perhaps whether to include the implicit/magic $field/$value variables could be a secondary question(s) on the RFC?

Also, a small nitpick: The link to your attributeutils repo in the examples page is broken, and it would be nice to see a few examples showing the explicit version of the hooks.

Cheers

Stephen

1 year ago by Larry Garfield — view source

Hi Larry,

It's good to see this idea still progressing.

I have to agree with the other comment(s) that the implicit
$field/$value variables seem odd to me. I understand the desire for
brevity and the potential future scope of reused hooks, but this
concept seems to fly in the face of many years of PHP reducing "magic"
like this.

The "magic" that PHP has been removing is mostly weird and illogical type casting. As noted, neither of these variables are any more "magic" than $this.

However, since it seems no one likes $field, we have removed it from the RFC. Of note, to respond to your comment further down, $this->{__PROPERTY__} will not work. The virtual-property detection looks for the AST representation of $this->propName, and only that. Dynamic versions that turn into that at runtime cannot work, as it needs to be known at compile time.

For $value, however, we feel strongly that having the default there is a necessary part of the ergonomic picture. In particular, given your comments here:

To give one answer to your question about ambiguity if the $value
parameter is required - I don't believe this is actually ambiguous, in
the context of PHP:

method parameters in child classes don't implicitly 'inherit' the
parent method parameter's type if they don't define one (they widen to
mixed);

method return types have no implicit inheritance, they must declare a
compatible return type;

typed class properties don't implicitly inherit the parent type when
the type is left off a child property - they must declare the same type.

AFAIK there is no existing behaviour in PHP where omitting a type would
mean "the type is implicitly inherited from X", it either means the
same as mixed, or it's an error.

That to me suggests that IF a custom variable name is provided, we should also require specifying the type. But if we require the full argument signature everywhere, then the 95% case would need to double-specify the type, which is a hard no from an ergonomic standpoint.

Especially combined with the suggestion yesterday to allow return-to-set in the short-set version, that would mean comparing this:

public string $phone {
set(string $phone) => $this->phone = $this->sanitizePhone($phone);
}

To this:

public string $phone {
set => $this->sanitizePhone($value);
}

And to me, there's absolutely no contest. The latter has about 1/3 as many places for me to make a typo repeating the same information over again. Now imagine comparing the above in a property that's used with constructor promotion.

public function __construct(
public string $phone { set(string $phone) => $this->phone = $this->sanitizePhone($phone); }
public string $phone { set => $this->sanitizePhone($value); }
) {}

Again, it's absolutely no contest for me. I would detest writing the longer version every time.

If PHP has been moving away from weird and inexplicable magic, it's also been moving away from needless boilerplate. (Constructor promotion being the best example, but not the only; types themselves are a factor here, as are arrow functions.) As the whole point of this RFC is to make writing common code easier, requiring redundant boilerplate for it to work is actively counter-productive.

So what I'd suggest instead is "specify the full signature if you want a custom name OR wider type; or omit the param list entirely to get a same-type $value variable, which 99% of the time is all you need." We get that in return for documenting "$value is the default", which for someone who has already figured out $this, should be a very low effort to learn.

Also, a small nitpick: The link to your attributeutils repo in the
examples page, is broken, and it would be nice to see a few examples
showing the explicit version of the hooks.

Link fixed, thanks. What do you mean explicit version of the hooks?

--Larry Garfield

1 year ago by Frederik Bosch — view source

Hi Larry,

It's good to see this idea still progressing.

I have to agree with the other comment(s) that the implicit
$field/$value variables seem odd to me. I understand the desire for
brevity and the potential future scope of reused hooks, but this
concept seems to fly in the face of many years of PHP reducing "magic"
like this.

The "magic" that PHP has been removing is mostly weird and illogical type casting. As noted, neither of these variables are any more "magic" than $this.

However, since it seems no one likes $field, we have removed it from the RFC. Of note, to respond to your comment further down, $this->{__PROPERTY__} will not work. The virtual-property detection looks for the AST representation of $this->propName, and only that. Dynamic versions that turn into that at runtime cannot work, as it needs to be known at compile time.

For $value, however, we feel strongly that having the default there is a necessary part of the ergonomic picture. In particular, given your comments here:

To give one answer to your question about ambiguity if the $value
parameter is required - I don't believe this is actually ambiguous, in
the context of PHP:

method parameters in child classes don't implicitly 'inherit' the
parent method parameter's type if they don't define one (they widen to
mixed);

method return types have no implicit inheritance, they must declare a
compatible return type;

typed class properties don't implicitly inherit the parent type when
the type is left off a child property - they must declare the same type.

AFAIK there is no existing behaviour in PHP where omitting a type would
mean "the type is implicitly inherited from X", it either means the
same as mixed, or it's an error.

That to me suggests that IF a custom variable name is provided, we should also require specifying the type. But if we require the full argument signature everywhere, then the 95% case would need to double-specify the type, which is a hard no from an ergonomic standpoint.

Especially combined with the suggestion yesterday to allow return-to-set in the short-set version, that would mean comparing this:

public string $phone {
set(string $phone) => $this->phone = $this->sanitizePhone($phone);
}

To this:

public string $phone {
set => $this->sanitizePhone($value);
}

And to me, there's absolutely no contest. The latter has about 1/3 as many places for me to make a typo repeating the same information over again. Now imagine comparing the above in a property that's used with constructor promotion.

public function __construct(
public string $phone { set(string $phone) => $this->phone = $this->sanitizePhone($phone); }
public string $phone { set => $this->sanitizePhone($value); }
) {}

Again, it's absolutely no contest for me. I would detest writing the longer version every time.

If PHP has been moving away from weird and inexplicable magic, it's also been moving away from needless boilerplate. (Constructor promotion being the best example, but not the only; types themselves are a factor here, as are arrow functions.) As the whole point of this RFC is to make writing common code easier, requiring redundant boilerplate for it to work is actively counter-productive.

So what I'd suggest instead is "specify the full signature if you want a custom name OR wider type; or omit the param list entirely to get a same-type $value variable, which 99% of the time is all you need." We get that in return for documenting "$value is the default", which for someone who has already figured out $this, should be a very low effort to learn.

Also, a small nitpick: The link to your attributeutils repo in the
examples page, is broken, and it would be nice to see a few examples
showing the explicit version of the hooks.

Link fixed, thanks. What do you mean explicit version of the hooks?

--Larry Garfield

Hi Larry,

Great to see you pick up on this proposal again. Rather than "just use
$this->propName directly", you might want to consider yield to commit
the property value, as a solution to "to assign and confirm the property
write while still allowing code to run after it". But then return would
also be a logical consequence.

public string $propName {
    set($value) {
        yield $value;
        $this->doSomething();
        yield $value;
        return $this->doSomethingElse(); // final commit, not required
    }
}

Reading your remarks on why return is not used, I do not see why this
proposal wants to deviate from normal function syntax.

function test(string $name) {
return [$first, $last] = explode(' ', $name);
}

var_dump(test("Larry Garfield")); // returns ["Larry", "Garfield"]

That's how it always has been, no? So in your example, short code
abbreviated form would not work. One has to write a block.

public string $fullName {
    set => [$this->first, $this->last] = explode(' ', \ucfirst($value)); // error, $fullName is a string, returning array
}

public string $fullName {
    set {
        [$this->first, $this->last] = explode(' ', \ucfirst($value)); // no error, not returning
    }
}

Hope this RFC will develop into something that will pass. Good luck!

Regards,
Frederik

1 year ago by Rowan Tommins [IMSoP] — view source

That's how it always has been, no? So in your example, short code
abbreviated form would not work. One has to write a block.

public string $fullName {
    set => [$this->first, $this->last] = explode(' ', \ucfirst($value)); // error, $fullName is a string, returning array
}

public string $fullName {
    set {
        [$this->first, $this->last] = explode(' ', \ucfirst($value)); // no error, not returning
    }
}

I think the intention is that both the block and the arrow syntax would
have any return value ignored, as happens with constructors, for
example. Note that in PHP, there is actually no such thing as "a
function not returning a value", even a "void" function actually returns
null; so if the return value was treated as meaningful, your second
example would give an error "cannot assign null to property of type string".
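
The behaviour described above can be reproduced on current PHP without any hooks. This is a sketch with hypothetical names: a void function still evaluates to null when its result is used, and assigning that null to a string property is rejected.

```php
<?php
// Hypothetical function standing in for a block-style set hook body.
function splitName(string $value): void
{
    // Does its work and never returns a value.
    [$first, $last] = explode(' ', $value);
}

class Person
{
    public string $fullName = '';
}

// Using the result of a void function is legal at runtime; it is null.
$result = splitName('Larry Garfield');
var_dump($result); // NULL

$person = new Person();
$caught = false;
try {
    // What "treat the return value as the new property value" would amount to.
    $person->fullName = $result;
} catch (TypeError $e) {
    $caught = true;
    echo get_class($e), "\n"; // TypeError
}
```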

However, as noted in a previous message, I agree that the short form
meaning "the value returned is saved to the backing field" is both more
expected and more useful.

The "yield" idea is ... interesting. I think personally I find it a bit
too magic, and too cryptic to be more readable than an explicit
assignment. Opinions may vary, though.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Frederik Bosch — view source

Hi Rowan,

Thanks for clearing up that the return value will be ignored. I
understand better why that is (void = null). I do like the updated RFC
better than the one with the $field variable to write to. My yield
suggestion was an idea derived from that earlier $field write proposal,
but I wanted to share it anyhow.

I do note that $this->propName might suggest that the backing value is
accessible from other locations than only the property's own get/set
methods, because of $this usage. I would rather have a $field in get
referring to the current value, and a $value in set referring to the
passed value. So with the full name example:

public string $fullName {
    get ($field): string => $field;

    set ($value): string {
        [$this->first, $this->last] = explode(' ', $value);
        return $value;
    }
}

I think it would make absolutely clear that the backing value is only
accessible in the local get/set scope. Regarding returning void=null,
this is something that IDE and static analyzers already pick-up as an
error. I think being stricter on that in this RFC would actually make
sense, and treat void not as null. So with a slightly different full
name example:

public string $fullName {
    get ($field): string => $this->first . ' ' . $this->last;

    set ($value): void {
        [$this->first, $this->last] = explode(' ', $value); // real void, no value returned
    }
}

And why yield is magic, I do not get that. The word and the expression
actually expresses that something is, well, yielded. yield is a word
that is reserved in the language that serves the purpose of the problem.
I would use it. It is even explainable that the set function is treated
as a generator function.

Regards,
Frederik

1 year ago by Rowan Tommins [IMSoP] — view source

I do note that $this->propName might suggest that the backing value is
accessible from other locations than only the property's own get/set
methods, because of $this usage.

Yes, I actually stumbled over that confusion when I was writing some of
the examples in my lengthy e-mail in this thread. As I understand it,
this would work:

public string $foo {
get { $this->foo ??= 0; $this->foo++; return $this->foo; }
set { throw new Exception; }
}

Outside the hooks, trying to write to $this->foo would throw the
exception, because it refers to the hooked property as a whole; but
inside, the same name refers to something different, which isn't
accessible anywhere else.

Now that I've looked more at how Kotlin uses "field", I understand why
it makes sense - it's not an alias for the property itself, but the way
to access a "backing store" which has no other name.

Using $this->foo as the name is tempting if you think of hooks as
happening "on top of" the "real" property; but that would be a different
feature, like Swift's "property observers" (willSet and didSet). What's
really happening is that we're declaring two things at once, and giving
them the same name; almost as if we'd written this:

public string $foo {
get { static $_foo; $_foo ??= 0; $_foo++; return $_foo; }
set { throw new Exception; }
}

Kotlin's "field" is kind of the equivalent of that "static $_foo"

Regarding returning void=null, this is something that IDE and static
analyzers already pick-up as an error. I think being stricter on that
in this RFC would actually make sense, and treat void not as null.

What would happen if a setter contained both "return 42;" and "return;"?
The latter is explicitly allowed in "void" functions, but is also
allowed in a non-void function as meaning "return null;"

And why yield is magic, I do not get that. The word and the expression
actually expresses that something is, well, yielded.

But yielded to where? My mental model of "return to set" is that this:

public string $name { set($value) { $x = something($value); return $x + 1; } }

Is effectively:

private function _name_set($value) { $x = something($value); return $x + 1; }
plus:
$this->name = $this->_name_set($value);
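
That mental model can be emulated on current PHP with __set. This is only a sketch under stated assumptions: the class, the property name count and the method countSet() are hypothetical, and __set here plays the role the engine would play for a real hook.

```php
<?php
// Emulation only; all names are hypothetical.
class Counter
{
    private int $backing = 0;

    // Stands in for the hook body: computes and returns the value to store.
    private function countSet(int $value): int
    {
        $x = $value * 2;
        return $x + 1;
    }

    // Stands in for the engine: whatever the body returns is what gets written.
    public function __set(string $name, mixed $value): void
    {
        if ($name !== 'count') {
            throw new LogicException("Unknown property $name");
        }
        $this->backing = $this->countSet($value);
    }

    public function __get(string $name): mixed
    {
        if ($name !== 'count') {
            throw new LogicException("Unknown property $name");
        }
        return $this->backing;
    }
}

$counter = new Counter();
$counter->count = 5;
echo $counter->count, "\n"; // 11
```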

With "yield", I can't picture that simple translation; the "magic" is
whatever translates the "yield" keyword into "$this->name ="

I would file it with the type widening in the RFC: seems kind of cool,
but probably isn't worth the added complexity.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Frederik Bosch — view source

Hi Rowan,

I do note that $this->propName might suggest that the backing value
is accessible from other locations than only the property's own
get/set methods, because of $this usage.

Yes, I actually stumbled over that confusion when I was writing some
of the examples in my lengthy e-mail in this thread. As I understand
it, this would work:

public string $foo {
    get { $this->foo ??= 0; $this->foo++; return $this->foo; }
    set { throw new Exception; }
}

Outside the hooks, trying to write to $this->foo would throw the
exception, because it refers to the hooked property as a whole; but
inside, the same name refers to something different, which isn't
accessible anywhere else.

Now that I've looked more at how Kotlin uses "field", I understand why
it makes sense - it's not an alias for the property itself, but the
way to access a "backing store" which has no other name.

Using $this->foo as the name is tempting if you think of hooks as
happening "on top of" the "real" property; but that would be a
different feature, like Swift's "property observers" (willSet and
didSet). What's really happening is that we're declaring two things at
once, and giving them the same name; almost as if we'd written this:

public string $foo {
    get { static $_foo; $_foo ??= 0; $_foo++; return $_foo; }
    set { throw new Exception; }
}

Kotlin's "field" is kind of the equivalent of that "static $_foo"

And what happens in the following situation, how are multiple get calls
working together?

public string $fullName {
    get => $this->first . ' ' . $this->last; // is this accessing the backed value, or is it accessing via get
    set($value) => $this->fullName = $value;
}

public string $first {
    get => explode(' ', $this->fullName)[0]; // is this accessing the backed value, or is it accessing via get
    set($value) => $value;
}

Isn't it weird that $this->propName gives different results in one get
function compared to the other? I would say $this->prop should always
follow the same semantics as explained in the RFC (first __get/__set,
then the accessor).

Regarding returning void=null, this is something that IDE and static
analyzers already pick-up as an error. I think being stricter on that
in this RFC would actually make sense, and treat void not as null.

What would happen if a setter contained both "return 42;" and
"return;"? The latter is explicitly allowed in "void" functions, but
is also allowed in a non-void function as meaning "return null;"

return 42; // returns (int)42
return; // early return, void, same as no return
return null; // returns null

And why yield is magic, I do not get that. The word and the
expression actually expresses that something is, well, yielded.

But yielded to where? My mental model of "return to set" is that this:

public string $name { set($value) { $x = something($value); return $x + 1; } }

Is effectively:

private function _name_set($value) { $x = something($value); return $x + 1; }
plus:
$this->name = $this->_name_set($value);

With "yield", I can't picture that simple translation; the "magic" is
whatever translates the "yield" keyword into "$this->name ="

You would picture it by explaining how it works from the source side. A
set function that contains a yield turns the set function into a
directly consumed generator. Considering the following:

public string $first {
    set($value) => {
        yield 'First name';
        yield 'Given name';
        return 'My name';
    }
}

the pseudo-code from the PHP source side would look as follows.

$generator = setCall($class, 'first', $value);
foreach ($generator as $value) {
writeProperty($class, 'first', $value);
}
if ($generator->hasReturn()) {
writeProperty($class, 'first', $generator->getReturn());
}
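
The pseudo-code above can be approximated on current PHP to see what such a consumer would do. This is a sketch, not the proposal itself: firstSetHook() and writeFirst() are hypothetical stand-ins for the hook and for writeProperty(), and since Generator has no hasReturn() method, the sketch treats a null result from getReturn() as "nothing returned".

```php
<?php
// Hypothetical stand-in for a set hook written as a generator.
function firstSetHook(string $value): Generator
{
    yield 'First name';
    yield 'Given name';
    return 'My name';
}

class Person
{
    public string $first = '';

    /** @var string[] every value written, in order */
    public array $writes = [];

    // Stand-in for writeProperty() in the pseudo-code.
    public function writeFirst(string $value): void
    {
        $this->first = $value;
        $this->writes[] = $value;
    }
}

$person = new Person();
$generator = firstSetHook('ignored in this sketch');

// Each yielded value is committed to the property as it arrives.
foreach ($generator as $yielded) {
    $person->writeFirst($yielded);
}

// After the loop the generator has finished, so getReturn() is safe to call.
$final = $generator->getReturn();
if ($final !== null) {
    $person->writeFirst($final);
}

echo implode(', ', $person->writes), "\n"; // First name, Given name, My name
echo $person->first, "\n";                 // My name
```

Note that under this model the property is written three times for a single assignment.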

I would file it with the type widening in the RFC: seems kind of cool,
but probably isn't worth the added complexity.

Regards,

Frederik Bosch


1 year ago by Rob Landers — view source

Hi Rowan,

I do note that $this->propName might suggest that the backing value is accessible from other locations than only the property's own get/set methods, because of $this usage.

Yes, I actually stumbled over that confusion when I was writing some of the examples in my lengthy e-mail in this thread. As I understand it, this would work:

public string $foo {
get { $this->foo ??= 0; $this->foo++; return $this->foo; }
set { throw new Exception; }
}

Outside the hooks, trying to write to $this->foo would throw the exception, because it refers to the hooked property as a whole; but inside, the same name refers to something different, which isn't accessible anywhere else.

Now that I've looked more at how Kotlin uses "field", I understand why it makes sense - it's not an alias for the property itself, but the way to access a "backing store" which has no other name.

Using $this->foo as the name is tempting if you think of hooks as happening "on top of" the "real" property; but that would be a different feature, like Swift's "property observers" (willSet and didSet). What's really happening is that we're declaring two things at once, and giving them the same name; almost as if we'd written this:

public string $foo {
get { static $_foo; $_foo ??= 0; $_foo++; return $_foo; }
set { throw new Exception; }
}

Kotlin's "field" is kind of the equivalent of that "static $_foo"

And what happens in the following situation, how are multiple get calls working together?

public string $fullName {
get => $this->first . ' ' . $this->last; // is this accessing the backed value, or is it accessing via get
set($value) => $this->fullName = $value;
}

public string $first {
get => explode(' ', $this->fullName)[0]; // is this accessing the backed value, or is it accessing via get
set($value) => $value;
}

Isn't it weird that $this->propName gives different results from one get function, compared to the other. I would say $this->prop should always follow the same semantics as explained in the RFC (first __get/__set, then the accessor).

Regarding returning void=null, this is something that IDE and static analyzers already pick-up as an error. I think being stricter on that in this RFC would actually make sense, and treat void not as null.

What would happen if a setter contained both "return 42;" and "return;"? The latter is explicitly allowed in "void" functions, but is also allowed in a non-void function as meaning "return null;"

return 42; // returns (int)42
return; // early return, void, same as no return
return null; // returns null

And why yield is magic, I do not get that. The word and the expression actually expresses that something is, well, yielded.

But yielded to where? My mental model of "return to set" is that this:

public string $name { set($value) { $x = something($value); return $x + 1; } }

Is effectively:

private function _name_set($value) { $x = something($value); return $x + 1; }
plus:
$this->name = $this->_name_set($value);

With "yield", I can't picture that simple translation; the "magic" is whatever translates the "yield" keyword into "$this->name ="

You would picture it by explaining how it works from the source side. A set function that contains a yield turns the set function into a directly consumed generator. Considering the following:

public string $first {
set($value) => {
yield 'First name';
yield 'Given name';
return 'My name';
}
}

the pseudo-code from the PHP source side would look as follows.

$generator = setCall($class, 'first', $value);
foreach ($generator as $value) {
writeProperty($class, 'first', $value);
}
if ($generator->hasReturn()) {
writeProperty($class, 'first', $generator->getReturn());
}

The yield is much more intuitive than magic fields and $this->prop (which feels like an infinite loop). Yield is remarkably simple.

Looking at this, I'm still not sure what would happen here, though (maybe it is covered in the RFC, and I missed it) -- going to use yield here to try it out:

public string $name {
    set => {
        if (strlen($value) < 5) {
            yield 'invalid';
            yield $this->invalidName($value);
        }
        yield $value;
    }
}

public function invalidName($name) {
    return $this->name = str_pad($name, 5);
}

This is probably an infinite loop in this particular example, but more importantly, do setters allow reentry while executing?

I would file it with the type widening in the RFC: seems kind of cool, but probably isn't worth the added complexity.

Regards,
--
Frederik Bosch

— Rob

1 year ago by Rowan Tommins [IMSoP]

And what happens in the following situation, how are multiple get calls working together?

public string $fullName {
    get => $this->first . ' ' . $this->last; // is this accessing the backed value, or is it accessing via get
    set($value) => $this->fullName = $value;
}

public string $first {
    get => explode(' ', $this->fullName)[0]; // is this accessing the backed value, or is it accessing via get
    set($value) => $value;
}

I don't think it's that confusing - the rule is not "hooks vs methods", it's "special access inside the property's own hook". But as I say, I'm coming around to the idea that using a different name for that "backing field" / "raw value" might be sensible.

What would happen if a setter contained both "return 42;" and "return;"? The latter is explicitly allowed in "void" functions, but is also allowed in a non-void function as meaning "return null;"
return 42; // returns (int)42
return; // early return, void, same as no return
return null; // returns null

I'm not sure if you misunderstood my question, or just the context of why I asked it. I'm talking about a hook like this:

set($value) { if ($value) { return 42; } else { return; } }

Currently, the only definition of "void" in the language is that a void function must not contain an explicit return value. We could turn that check around, and deduce that a certain hook is void. This hook would not pass that check, so we would compile it to have an assignment, and the false case would assign null to the property. To avoid that, we would need some additional analysis to prove that in all possible paths, a return statement with a value is reached.

The alternative would be to run the code, and somehow observe that it "returned void". But "void" isn't a value we can represent at run-time; we would need to set the return value to some special value just for this specific case. We would have to turn that on just for hook bodies, as returning it from normal functions would be a huge BC break, and also not very useful - with union types, there would be plenty of better options for a function to indicate a return value that needs special handling.
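
To make the mixed-return case concrete, here is a small sketch using an ordinary function as a stand-in for the hook body; setHookBody() is a hypothetical name. It only shows what a caller observes today, not how hooks would be compiled.

```php
<?php
// Plain-function stand-in for the hook body discussed above.
function setHookBody($value)
{
    if ($value) {
        return 42;
    } else {
        return; // bare return: the caller sees null
    }
}

$truthy = setHookBody('x');
$falsy  = setHookBody('');

var_dump($truthy); // int(42)
var_dump($falsy);  // NULL
```

If the result were always assigned to the property, the falsy path would assign null, which is the problem described above.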

$generator = setCall($class, 'first', $value);
foreach ($generator as $value) {
    writeProperty($class, 'first', $value);
}
if ($generator->hasReturn()) {
    writeProperty($class, 'first', $generator->getReturn());
}

That's already an order of magnitude more complicated than "the return value is used on the right-hand side of an assignment", and it's missing at least one case: set($value) { return $value; } will not compile to a generator, so needs to skip and assign the value directly.
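
A rough sketch of the extra branch this implies, written with ordinary functions in place of hooks. plainHook(), yieldingHook() and valuesToWrite() are hypothetical names, and treating a null generator return as "nothing to write" is an assumption made only for this illustration.

```php
<?php
declare(strict_types=1);

// No yield: calling this returns the string directly.
function plainHook(string $value)
{
    return $value;
}

// Contains yield: calling this returns a Generator object instead.
function yieldingHook(string $value)
{
    yield $value;
}

// Hypothetical driver that has to handle both shapes.
function valuesToWrite(callable $hook, string $value): array
{
    $result = $hook($value);
    if (!$result instanceof Generator) {
        return [$result]; // plain return: assign the value directly
    }
    $writes = iterator_to_array($result, false);
    $returned = $result->getReturn();
    if ($returned !== null) {
        $writes[] = $returned;
    }
    return $writes;
}

$plain    = valuesToWrite('plainHook', 'Larry');
$yielding = valuesToWrite('yieldingHook', 'Larry');
var_dump($plain === $yielding); // bool(true)
```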

By "magic", what I meant was "hidden logic underneath that makes it work". Assign-by-return has a small amount of magic - you can express it in half a line of code; assign-by-yield has much more magic - a whole bunch of loops and conditionals to operate your coroutine.

The yield is much more intuitive than magic fields

I think we'll just have to differ in opinion on that one. Maybe you're just more used to working with coroutines than I am.

Note that yield also doesn't solve how to read the current backing value in a get hook (or a set hook that wants to compare before and after), so we still need some way to refer to it.

Regards,
Rowan Tommins
[IMSoP]

1 year ago by Frederik Bosch

Hi Rowan,

Our discussion sums up the pros and cons. Whether yield is
complicated/confusing or not, is maybe personal. The same applies to
getting $this->prop resulting in different calls. Larry has removed
$field from the RFC completely now, while I think it was a sensible
approach to read the current backing value. I think I have laid out
another alternative to writing with the yield/return suggestion. It's up
to the authors of the RFC to do something with it, or not. Thanks for
taking the suggestion seriously.

Regards,
Frederik

1 year ago by Larry Garfield

Ilija and I have discussed this, and we both agree that yield is not a viable option. There is no generator or generator-like behavior involved in hooks at all, and a syntax that implies there is would be very misleading. And adjusting the code to make it actually generator-based would make the code considerably more complex, and most likely slower.

It figures that people would start speaking up in favor of $field right after I removed it from the RFC text. :-P At the moment, we're comfortable either direction. (It hasn't been removed from the code yet.) The main question is whether the trade-off of an implicit variable name and the potential for confusion is outweighed by the clarity about what is happening and where. It sounds like most people are just really, really pissed off by an implicit variable, but that's based on the as-usual highly unscientific survey of "who replies to an email." I will probably start a poll shortly to help get a better sense of what the actual voting population thinks.

--Larry Garfield

1 year ago by Erick de Azevedo Lima

It sounds like most people are just really, really pissed off by an
implicit variable

I think that it could be good to follow the PHP way of marking the "magic"
stuff, which is putting leading underscores on it. It's not
pretty, but it's good because our eyes can detect the magic stuff in the
code almost instantaneously. I personally have no problems with magic
stuff, since it's well documented. But I admit that the leading underscores
help me find PHP's magic methods inside a class with no effort at all, and
that's darn good.

--
Erick Lima

1 year ago by Rowan Tommins [IMSoP]

It sounds like most people are just really, really pissed off by an
implicit variable

I think that it could be good to follow the PHP way to mark the
"magic" stuff, which is putting leading underscores on the magic stuff.

I think that might help; I also think that even if the RFC offers a
choice to the list, the final implementation should not offer choice to
users.

I think part of what put people off with the original wording was that
it implied $field was an alias for $this->propertyName, but the alias
was "preferred". The reality is that we have a new thing that we need a
name/syntax for, and $field or $this->propertyName are possible options.

To avoid another lengthy e-mail, I've put together some alternative RFC
wording. The main idea is to switch the framing from "hooks on top of
properties, which may be virtual" to "hooked properties which are
virtual by default, but may access a special backing field".

As noted in the introduction this is not intended as a
counter-proposal or critique, just somewhere to collate my thoughts and
suggestions: https://wiki.php.net/rfc/property-hooks/imsop-suggestion

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Rob Landers

Ilija and I have discussed this, and we both agree that yield is not a viable option. There is no generator or generator-like behavior involved in hooks at all, and a syntax that implies there is would be very misleading. And adjusting the code to make it actually generator-based would make the code considerably more complex, and most likely slower.

If it makes the code more complex, then it probably shouldn't be there. As for saying there isn't generator-like behavior, I would disagree. The value (in this case) is exactly like an iterator, and may have multiple values through the function lifetime. A normal function only exposes one value -- the return value -- unless it exports values out of its scope using references. Only a generator exposes multiple values over the course of its lifetime.

It figures that people would start speaking up in favor of $field right after I removed it from the RFC text. :-P At the moment, we're comfortable either direction. (It hasn't been removed from the code yet.) The main question is whether the trade-off of an implicit variable name and the potential for confusion is outweighed by the clarity about what is happening and where. It sounds like most people are just really, really pissed off by an implicit variable, but that's based on the as-usual highly unscientific survey of "who replies to an email." I will probably start a poll shortly to help get a better sense of what the actual voting population thinks.

I suspect that people who are for it might also happen to be Gmail users. Also, I don't feel particularly strongly either way, nor am I a voter, so I haven't said anything one way or the other.

--Larry Garfield

— Rob

1 year ago by Larry Garfield

That's how it always has been, no? So in your example, the abbreviated
short form would not work. One has to write a block.

public string $fullName {
    set => [$this->first, $this->last] = explode(' ', \ucfirst($value)); // error, $fullName is a string, returning array
}

public string $fullName {
    set {
        [$this->first, $this->last] = explode(' ', \ucfirst($value)); // no error, not returning
    }
}

I think the intention is that both the block and the arrow syntax would
have any return value ignored, as happens with constructors, for
example. Note that in PHP, there is actually no such thing as "a
function not returning a value", even a "void" function actually returns
null; so if the return value was treated as meaningful, your second
example would give an error "cannot assign null to property of type string".

This is correct. Given a function test():

$ret = test('Larry Garfield');

There's no way to tell if $ret is a possibly-null value we should do something with, or null by side-effect. The RFC right now takes the stance of "it's null by side effect, always, so we never do anything with the return so it's consistent."
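
A short sketch of why the call site cannot tell the two apart. voidTest() and nullTest() are hypothetical stand-ins for the test() function mentioned above.

```php
<?php
// Never returns a value.
function voidTest(string $name): void
{
}

// Deliberately returns null as a meaningful value.
function nullTest(string $name): ?string
{
    return null;
}

$fromVoid = voidTest('Larry Garfield');
$fromNull = nullTest('Larry Garfield');

// Both calls evaluate to null, so the caller sees no difference.
var_dump($fromVoid === $fromNull); // bool(true)
```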

However, as noted in a previous message, I agree that the short form
meaning "the value returned is saved to the backing field" is both more
expected and more useful.

You're the first person to comment on it, but I'm glad you agree. :-) I like it, but Ilija is still unsure about it.

The "yield" idea is ... interesting. I think personally I find it a bit
too magic, and too cryptic to be more readable than an explicit
assignment. Opinions may vary, though.

Regards,

Mixing in syntax only used for generators here seems like it's asking for trouble. It wouldn't actually be a coroutine, so using coroutine-like syntax would just be confusing. It's confusing to me whether this implies the hook becomes a generator or not, which means it's likely to confuse a lot of other people.

--Larry Garfield

1 year ago by Marc Bennewitz

Hi Larry,

first of all, thank you very much for this amazing work you two have
done :+1:.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.
Once again in reading the proposal, the first thing I'm struck by are
the magic "$field" and "$value" variables inside accessors. The first
time they are used, they're used without explanation, and they're
jarring.

Additionally, once you start defining the behavior of accessors... you
don't start with the basics, but instead jump into some of the more
esoteric usage, which does nothing to help with the questions I have.

So, first:

Start with the most basic, most expected usage for each of reading
and writing properties.

I need a better argument for why the $field and $value variables
exist. Saying they're macros doesn't help those not deep into
internals. As a user, why do they exist?
For $field, it's not a requirement. It's mostly for copy-paste convenience. A number of people have struggled on this point so if the consensus is to leave out $field and just use $this->propName directly, we can accept that. They can be re-added if reusable hook packages are added in the future (as noted in Future Scope).

I'm also feeling that introducing magic variables isn't the best design
choice.
I read your section about "Why do set hooks not return the value to
set?" and I don't really agree.

Let me explain ...

Virtual properties and technically all functions return a value.

I think it would be much less magic if property setters on virtual
properties declared a void return type.
This would make it very obvious that this is a virtual property, even
when reading complex setters.

it disallows any action /after/ the assignment happens

I actually think this would be a good thing!

An object property should only be set after everything has been done.
If you want to do more, it should either not be part of the setter or you
should use a temporary variable, which you have to do anyway to avoid
leaving your object in an incorrect state.

Let's say we have the following setter:

public string $name {
    set($value) {
        // do stuff
        $field = $value; // or $this->name = $value
        // do more stuff and (eventually) fail
        throw new Exception();
    }
}

try {
    $object->name = 'test';
} finally {
    $object->name; // what is name here ?
}
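
The same question can be tried out in current PHP with ordinary setter methods. This is only a sketch: the Example class and both method names are hypothetical, and it says nothing about how hooks themselves behave.

```php
<?php
declare(strict_types=1);

final class Example
{
    public string $name = 'initial';

    // Mirrors the setter above: assign first, then fail.
    public function setNameAssignFirst(string $value): void
    {
        $this->name = $value;
        throw new RuntimeException('failed after the write');
    }

    // Alternative: do all fallible work first, assign last.
    public function setNameAssignLast(string $value): void
    {
        $checked = trim($value);
        if ($checked === '') {
            throw new InvalidArgumentException('empty name');
        }
        $this->name = $checked;
    }
}

$a = new Example();
try {
    $a->setNameAssignFirst('test');
} catch (RuntimeException $e) {
    // swallowed for the demonstration
}
var_dump($a->name); // string(4) "test"

$b = new Example();
try {
    $b->setNameAssignLast('   ');
} catch (InvalidArgumentException $e) {
    // swallowed for the demonstration
}
var_dump($b->name); // string(7) "initial"
```

With assign-first, the object keeps the new value even though the setter failed; with assign-last, it keeps the old one.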

ambiguity

I actually feel that $field is ambiguous. What happens if you declare
set($field) {} ? Does such a construct let the engine set the property
value immediately, as the input value gets immediately assigned to the
property via $field?

Greetings,
Marc

1 year ago by Larry Garfield

Hi Larry,

first of all, thank you very much for this amazing work you two have
done :+1:.

I'm also feeling that introducing magic variables isn't the best design
choice.
I read your section about "Why do set hooks not return the value to
set?" and I don't really agree.

Let me explain ...

Virtual properties and technically all functions return a value.

I think it would be much less magic if property setters on virtual
properties declared a void return type.
This would make it very obvious that this is a virtual property, even
when reading complex setters.

Making everyone type ": void" after every set hook, when we already know that's going to be the case, seems like a really bad developer experience.

I talked with Ilija extensively about it, and there is no meaningful way to distinguish between "this method returned null" and "this method didn't return" from the call site in the engine. If we could, that would allow smarter detection of when it makes sense to use a return value. Hence my suggestion of allowing set-on-return only for the => form.

ambiguity

I actually feel that $field is ambiguous. What happens if you declare
set($field) {} ? Does such a construct let the engine set the property
value immediately, as the input value gets immediately assigned to the
property via $field?

$field has already been removed. See previous email.

--Larry Garfield

1 year ago by Rowan Tommins [IMSoP]

However, I just had a long discussion with Ilija and there is one
possibility we could consider: Use the return value only on the
shorthand (arrow-function-like) syntax.

So you could do either of these, which would be equivalent:

set {
    $this->phone = $this->sanitizePhone($value);
}

set => $this->sanitizePhone($value);

Regarding this point, I've realised that the current short-hand set syntax isn't actually any shorter:

set { $this->phone = $this->sanitizePhone($value); }
set => $this->phone = $this->sanitizePhone($value);

It also feels weird to say both "the right-hand side must be a valid expression" and "the value of the expression is ignored".

So I think making the short-hand be "expression to assign to the implicit backing field" makes a lot more sense.

Regards,

Rowan Tommins
[IMSoP]

1 year ago by Bruce Weirdan

Hi Larry and others

I've added an FAQ section explaining why the Python/JS approach wouldn't really work. To be clear, Ilija and I spent 100+ hours doing research and design before we started implementation (back in mid-late 2022). We did seriously consider the JS-style syntax, but in the end we found it created more problems than it solves. For the type of language PHP is (explicit typed properties), doing it on the property itself is a much cleaner approach.

The section you added [1] seems to focus on having both public string $firstName and public function get/set firstName():string, and how
it's hard to keep types and visibility in sync. But I'm not sure if
you considered making properties and accessors mutually exclusive. I
mean the following:

class Person
{
    public string $firstName;      // compile time error, setter/getter defined

    public function __construct(private string $first, private string $last) {}

    public function get firstName(): string       // compile time error, property defined
    {
        return $this->first . " " . $this->last;
    }

    public function set firstName(string $value): void     // compile time error, property defined
    {
        [$this->first, $this->last] = explode(' ', $value);
    }
}
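
The get/set method syntax above is hypothetical and does not run. As a rough point of comparison, a property-less virtual accessor can be approximated in current PHP with the __get/__set magic methods. This is only a sketch: the property is called fullName here purely for illustration, and the error handling is an assumption.

```php
<?php
declare(strict_types=1);

final class Person
{
    public function __construct(private string $first, private string $last) {}

    // Invoked for reads of inaccessible or undeclared properties.
    public function __get(string $prop): mixed
    {
        return match ($prop) {
            'fullName' => $this->first . ' ' . $this->last,
            default => throw new LogicException("No such property: $prop"),
        };
    }

    // Invoked for writes to inaccessible or undeclared properties.
    public function __set(string $prop, mixed $value): void
    {
        if ($prop !== 'fullName') {
            throw new LogicException("No such property: $prop");
        }
        [$this->first, $this->last] = explode(' ', $value, 2);
    }
}

$p = new Person('Larry', 'Garfield');
$p->fullName = 'Bruce Weirdan';
echo $p->fullName, "\n"; // Bruce Weirdan
```

Note that there is no declared fullName property here, so its type and visibility live only in the method bodies.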

This seems to address most of the counterpoints you listed, to some degree:

What is the property type of the $firstName property?

Well, you propose to allow wider write-types yourself, so the question
would apply there as well. But presumably, the property type is its
read type - so whatever getter returns.

but there's nothing inherent that forces public string $firstName, get firstName()'s return, and set firstName()'s parameter to be the same. Even if we could detect it as a compile error, it means one more thing that the developer has to keep track of and get right, in three different places.

With mutually exclusive accessors and properties it becomes just two
places. And yes, accessor consistency would need to be checked at
compile time. But the same can be said for the widening of write type
you proposed.

What about visibility? Do the get/set methods need to have the same visibility as the property?

When there's no property the question becomes moot.

If not, does that become a way to do asymmetric visibility?

Yes.

What about inconsistency between the method's visibility and the property visibility? How is that handled?

There's no inconsistency when there's no property. Accessor visibility
can be different - allowing the asymmetric visibility you wanted to
implement in your other RFC.

How do you differentiate between virtual and non-virtual properties?

This one is hard to answer without asking another question: why would
you need to? Does the requirement to know it stem from engine
implementation details, or do you need it as a person writing code in
PHP?

For non-virtual properties, if you need to triple-enter everything, we're back to constructors pre-promotion. Plus, the accessor methods could be anywhere in the class, potentially hundreds of lines away. That means just looking at the property declaration doesn't tell you what its logic is; the logic may be on line 960, which only makes keeping its type/visibility in sync with the property harder.

Forbidding property declaration reduces that to double. The rest is
mostly stylistic and can be said about traditional
(non-constructor-promoted) properties as well.

Now this approach naturally has some open questions, foremost about
inheritance. But we probably don't need to go into those details if
you already explored this way and found some technical obstacles. If
you did, it would probably make sense to list them in the FAQ section.

[1] https://wiki.php.net/rfc/property-hooks#why_not_pythonjavascript-style_accessor_methods

--
Best regards,
Bruce Weirdan mailto:weirdan@gmail.com

1 year ago by Bruce Weirdan

Resending this since I've never got a reply and it's quite possible
the message got lost due to mail list issues.

--
Best regards,
Bruce Weirdan mailto:weirdan@gmail.com

1 year ago by Larry Garfield

Hi Larry and others

I've added an FAQ section explaining why the Python/JS approach wouldn't really work. To be clear, Ilija and I spent 100+ hours doing research and design before we started implementation (back in mid-late 2022). We did seriously consider the JS-style syntax, but in the end we found it created more problems than it solves. For the type of language PHP is (explicit typed properties), doing it on the property itself is a much cleaner approach.

The section you added [1] seems to focus on having both public string $fistName and public function get/set firstName():string, and how
it's hard to keep types and visibility in sync. But I'm not sure if
you considered making properties and accessors mutually exclusive. I
mean the following:
class Person
{
    public string $firstName;      // compile time error, setter/getter defined

    public function __construct(private string $first, private string $last) {}

    public function get firstName(): string       // compile time
error, property defined
    {
        return $this->first . " " . $this->last;
    }

    public function set firstName(string $value): void     // compile
time error, property defined
    {
        [$this->first, $this->last] = explode(' ', $value);
    }
}
This seems to address most of the counterpoints you listed, to some degree:

What is the property type of the $firstName property?

Well, you propose to allow wider write-types yourself, so the question
would apply there as well. But presumably, the property type is its
read type - so whatever getter returns.

but there's nothing inherent that forces, public string $firstName, get firstName()s return and set firstName()s parameter to be the same. Even if we could detect it as a compile error, it means one more thing that the developer has to keep track of and get right, in three different places.

With mutually exclusive accessors and properties it becomes just two
places. And yes, accessor consistency would need to be checked at
compile time. But the same can be said for the widening of write type
you proposed.

What about visibility? Do the get/set methods need to have the same visibility as the property?

When there's no property the question becomes moot.

If not, does that become a way to do asymmetric visibility?

Yes.

What about inconsistency between the method's visibility and the property visibility? How is that handled?

There's no inconsistency when there's no property. Accessor visibility
can be different - allowing the asymmetric visibility you wanted to
implement in your other RFC.

How do you differentiate between virtual and non-virtual properties?

This one is hard to answer without asking another question: why would
you need to? Does the requirement to know it stem from engine
implementation details, or do you need it as a person writing code in
PHP?

For non-virtual properties, if you need to triple-enter everything, we're back to constructors pre-promotion. Plus, the accessor methods could be anywhere in the class, potentially hundreds of lines away. That means just looking at the property declaration doesn't tell you what its logic is; the logic may be on line 960, which only makes keeping its type/visibility in sync with the property harder.

Forbidding property declaration reduces that to double. The rest is
mostly stylistic and can be said about traditional
(non-constructor-promoted) properties as well.

Now this approach naturally has some open questions, foremost about
inheritance. But we probably don't need to go into those details if
you already explored this way and found some technical obstacles. If
you did, it would probably make sense to list them in the FAQ section.

[1] https://wiki.php.net/rfc/property-hooks#why_not_pythonjavascript-style_accessor_methods
Resending this since I've never got a reply and it's quite possible
the message got lost due to mail list issues.

What you suggest might be possible, but it still runs into the problem of backed properties. Plain methods work fine for virtual properties, which makes sense for Python and JS as they don't have pre-defined properties. Once the language has predefined properties, a method-centric approach leaves a lot more to be manual, which makes it more work to achieve the same end. And then what if you want hooks on a property used in constructor promotion? Or if you cannot declare a property and similarly named hook-method, how do you add hooks to a plain property in a child class, something generated code like ORMs will most likely want to do?

Even if it could be done, what would be the advantage? Other than "it looks like JS", I don't really see any benefit, just a lot of complexity we'd have to sort out and find the edge cases on all over again, and basically rewrite the whole RFC. Unless someone can show why the property-centric approach is fundamentally broken (and given that we have a successful implementation already I think that is unlikely), there's little reason to revisit the fundamentals of the RFC at this point.

(And, mind you, the above discussion does not demonstrate that a method-centric approach is in fact equally feasible. At best, there are probably ways to make most things work, but without actually doing it there's no way to be sure it wouldn't run into a brick wall somewhere.)
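
As a minimal sketch of the two cases raised above (a hook on a promoted constructor property, and a child class adding a hook to a plain inherited property), assuming the property-centric syntax as it later shipped in PHP 8.4. The class names Customer, Product, and TrackedProduct are hypothetical and only serve the illustration:

```php
<?php
declare(strict_types=1);

// Hypothetical classes for illustration only; requires PHP 8.4 or later.

final class Customer
{
    public function __construct(
        // A hook on a promoted, backed property: normalise the name on write.
        public string $name {
            set => ucfirst(trim($value));
        },
    ) {}
}

// A plain data class, such as one a developer would hand to an ORM.
class Product
{
    public string $sku;
}

// A (possibly generated) subclass adds a hook to the inherited plain property.
class TrackedProduct extends Product
{
    /** @var list<string> Names of properties written since construction. */
    public array $dirty = [];

    public string $sku {
        set {
            $this->dirty[] = 'sku';
            $this->sku = $value; // write to the backing value
        }
    }
}
```

Neither case needs a separately named accessor method, which is the point being made about the method-centric alternative requiring more manual work.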

--Larry Garfield

1 year ago by Deleu — view source

This is a long, huge and comprehensive work, congratz to the authors.

It clearly shows that so much thought and work has been put into it that it
makes me cautious to even ask for further clarification.

Javascript have similar features via a different syntax, although that
syntax would not be viable for PHP

Why not?

final class Foo
{
    public string $bar;

    public function get bar(): string
    {
        // custom getter
    }

    public function set bar(string $value): void
    {
        // custom setter
    }
}

It feels quite a natural syntax for PHP and from someone oblivious to the
internal work, it appears to be a slight marginal change to the existing
RFC. Given the extensive work of this RFC, it seems pretty obvious that
this syntax will not work, I just don't know why.

--
Marco Deleu

1 year ago by Robert Landers — view source

This is a reply to Marco (https://externals.io/message/122445#122449)
for which I didn't actually receive the email but got an email from
the list that I didn't receive the email -- seems like it would have
been simpler and more correct to resend the email. That is a bit
weird, but whatever.

Given the extensive work of this RFC, it seems pretty obvious that
this syntax will not work, I just don't know why.

I feel like the syntax is natural, coming from other languages with
this feature. However, I really do appreciate the example Marco gives
as it feels very idiomatically PHP with all the getters/setters we are
used to writing. It also means a pretty simple "just add a space and
downcase" to switch from traditional methods to getters/setters to
hooks, which could be really nice.

1 year ago by tag Knife — view source

I remember the previous RFC. I can't remember why it was declined, but I
hope this one passes. I know the PHP community has been asking for native
getters/setters instead of __get/__set for a while, since 7.0 was released
at least.

A few things I was interested to get the idea around.
Was it considered for set {} to return the value to set the property to,
instead of implicitly setting its own field?

public string $name {
    set {
        return ucfirst($value);
    }
}

Where the value returned in the set is what the property will be set to?

1 year ago by Deleu — view source

A few things I was interested to get the idea around.
Was it considered for set {} to return the value to set the property to,
instead of implicitly setting its own field?

public string $name {
    set {
        return ucfirst($value);
    }
}

Where the value returned in the set is what the property will be set to?

The answer to this question is best described on the RFC FAQ:
https://wiki.php.net/rfc/property-hooks#why_do_set_hooks_not_return_the_value_to_set

--
Marco Deleu

1 year ago by tim@bastelstu.be — view source

I believe there is an error in the "Final Hooks" section:

// But this is NOT allowed, because beforeSet is final in the parent.

I believe it should be s/beforeSet/set/.

In the same section this sentence is probably grammatically incorrect.

Declaring hooks final on a property that is declared final is
redundant will throw an error.

Regarding the same sentence I also don't see why that should be
disallowed, even if it is redundant. It's also legal to define 'final'
functions within a 'final' class. Which also brings me to the question:
Is defining 'final' properties and 'final' hooks within a 'final' class
equally disallowed?

Since the RFC was last discussed, the #[\Override] RFC was accepted
and implemented (https://wiki.php.net/rfc/marking_overriden_methods). A
short section on the interaction with #[\Override] would probably be
helpful.

Not sure if I've asked it before, but have you considered making the
parameter for ReflectionProperty::getHook() an enum?

There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.

However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.

Calling it hooks is fine and allows for future extension without
renaming, should the need arise.

Best regards
Tim Düsterhus

1 year ago by kontakt@beberlei.de — view source

Hey,

Thank you for this proposal. There are some points I'd like to add to
this discussion:

Thank you for the removal of $field, it was non-idiomatic from a PHP POV.

I would prefer that the short syntax $foo => null; be voted upon
separately. Personally I think it could be confusing and is too close to a
regular assignment for default value and I prefer not to have it and keep
the rest of the RFC.

The magic of detecting if a property is virtual or backed is - like Rowan
said - subtle. I would also prefer this to be managed via an explicit
mechanism, by for example keywording the property as "virtual", instead of
the implicit way with the parsing based detection.

As Doctrine project maintainer, we have had some troubles supporting read
only properties for proxies. I would have hoped that with hooks I can
overwrite a readonly property and "hook" into it instead by defining a
getter, but "no setter". From my read of the RFC this would not be allowed
and throws a compile time error. Could you maybe clarify why at least this
special case is not possible? I didn't immediately get it from the RFC
section on readonly as it only speaks about the problems in abstract.

greetings
Benjamin

1 year ago by Larry Garfield — view source

thank you for this proposal. there are some points i'd like to make
into this discussion:

Thank you for the removal of $field, it was non-idiomatic from a PHP POV.

I would prefer that the short syntax $foo => null; be voted upon
separately. Personally I think it could be confusing and is too close
to a regular assignment for default value and I prefer not to have it
and keep the rest of the RFC.

See my long reply to Rowan just now, where I go into this.

The magic of detecting if a property is virtual or backed is - like
Rowan said - subtle. I would also prefer this to be managed via an
explicit mechanism, by for example keywording the property as
"virtual", instead of the implicit way with the parsing based detection.

See my long reply to Rowan just now, where I go into this.

As Doctrine project maintainer, we have had some troubles supporting
read only properties for proxies. I would have hoped that with hooks I
can overwrite a readonly property and "hook" into it instead by
defining a getter, but "no setter". From my read of the RFC this would
not be allowed and throws a compile time error. Could you maybe clarify
why at least this special case is not possible? I didn't immediately
get it from the RFC section on readonly as it only speaks about the
problems in abstract.

Once again, readonly properties were designed sloppily without consideration of how they would interact with other features. Full asymmetric visibility would have been much better suited to what you describe, but that was rejected. (If hooks pass, we are considering taking a second attempt at aviz. Haven't decided yet.)

As for details, consider this technically legal (if ill-advised) code:

public DateTimeImmutable $now {
    get {
        return new DateTimeImmutable();
    }
}

If that property is non-readonly, then while that may be weird, it's not really unexpected. There's no guarantee a property won't change value from one access to the next. UNLESS that property is readonly. But if it's readonly, what the heck does the above code do? It would be returning a different value on every request, so it violates one of the constraints on readonly. The hook can do arbitrary logic, which means there's really no way for the engine to detect at compile time "wait, you're doing something dynamic." (That would be an extreme amount of work unless PHP was pure-by-default, which it is not.)
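
To make that concern concrete, here is a small sketch, assuming the hook syntax as it later shipped in PHP 8.4. The Ticker class is hypothetical, and a counter stands in for the clock so that the two reads are guaranteed to differ:

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only; requires PHP 8.4 or later.
final class Ticker
{
    private int $reads = 0;

    // Virtual property: the hook never refers to $this->tick itself,
    // and it yields a different value on every access.
    public int $tick {
        get => ++$this->reads;
    }
}

$ticker = new Ticker();
$first  = $ticker->tick;
$second = $ticker->tick;
// $first and $second differ, which is exactly what a readonly
// property is expected never to do.
```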

For inheritance, this is the same reasoning for why a property cannot have the readonly marker removed in a child class. It breaks one of the expected constraints of the property, even if not the type per se. (So it's kinda a Liskov issue.)

This is true regardless of whether the property is virtual or not. Virtual properties just have the extra complication of "when you try to set it, there's an if (is_uninitialized()) check in the engine. What does that even mean when there's no backing property??"

There's one carve out that might be supportable, which is just a set hook on a readonly property. That MIGHT be able to work around readonly's flaws by making sure the set hook calls parent::$foo::set($value), to ensure the actual write happens in the parent (since readonly is private-set, not protected-set). Ilija tells me that could be tricky to do, however, so we'd rather punt on that for now. Follow-up RFCs (even this version) could flesh out that edge case alone without impacting the rest of the design.

--Larry Garfield

1 year ago by drealecs@gmail.com — view source

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more
polished property access hooks/interface properties RFC.

I liked how the discussion and the RFC evolved so far and that we have less
magic now.

I didn't see one aspect discussed: automatic detection of whether a
property is virtual or not.
Why not use an explicit keyword for marking a property as virtual instead
of relying on detection?

Detection cannot be complete. For example, if the property is only used in
another method called from the hook, we might end up with errors after
extracting a few lines of code into a method.

Since in the reflection the "virtual" name is used, it would be natural and
simple to have it explicitly as part of the property definition.

Of course, a virtual property would require at least a hook type defined,
otherwise there wouldn't be any point for its existence.
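
For reference, a minimal sketch of what the implicit detection looks like in practice, assuming the behaviour as it later shipped in PHP 8.4 (including ReflectionProperty::isVirtual()). The Label class is hypothetical:

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only; requires PHP 8.4 or later.
final class Label
{
    // Backed: the get hook reads $this->text, so storage is kept for it.
    public string $text {
        get => strtoupper($this->text);
    }

    // Virtual: no hook refers to $this->shout, so it has no storage.
    public string $shout {
        get => $this->text . '!';
    }
}

$textIsVirtual  = (new ReflectionProperty(Label::class, 'text'))->isVirtual();
$shoutIsVirtual = (new ReflectionProperty(Label::class, 'shout'))->isVirtual();
```

The distinction depends only on whether a hook body mentions the property by name, which is the subtlety being questioned here.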

Alex

1 year ago by Larry Garfield — view source

Hi folks. Based on earlier discussions, we've made a number of changes to the RFC that should address some of the concerns people raised. We also had some very fruitful discussions off-list with several developers from the Foundation, which led to what we feel are some solid improvements.

https://wiki.php.net/rfc/property-hooks

Smaller changes:

We've added a PropertyHookType enum for use in reflection, which lets us get rid of a possible exception.
get_mangled_object_vars()'s behavior is now defined to be the same as an array, ie, skip hooks.
A final property with a final hook is no longer an error. It's just silently redundant, like for methods.
Made explicit what happens with recursive hook calls and method calls from a hook. The behavior here is the same as for __get/__set.
Made support for #[\Override] explicit.
Added an FAQ regarding the property-centric approach rather than method-centric approach.
Clarified that the parent::$foo::get() syntax works on a parent property regardless of whether it has hooks.
Clarified that untyped properties are supported. (Though, it's 2024, please don't use untyped properties.)
Clarified that interface properties cannot specify a wider write-type, for simplicity. That could be considered as a future add-on with no BC breaks.
Provided an explanation of how to interpret the parent-hook access syntax.
Added an FAQ item explaining why a 'virtual' keyword is not feasible.
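
Two of the items above, sketched together under the assumption that they behave as they later shipped in PHP 8.4: the parent::$foo::get() syntax used against a parent property that has no hooks, and the PropertyHookType enum passed to ReflectionProperty::getHook(). The class names are hypothetical:

```php
<?php
declare(strict_types=1);

// Hypothetical classes for illustration only; requires PHP 8.4 or later.
class Strings
{
    public string $val; // plain property, no hooks
}

class UpperStrings extends Strings
{
    public string $val {
        // Reads the parent's property even though it declares no get hook.
        get => strtoupper(parent::$val::get());
    }
}

$s = new UpperStrings();
$s->val = 'hooks';

$rProp   = new ReflectionProperty(UpperStrings::class, 'val');
$getHook = $rProp->getHook(PropertyHookType::Get); // a ReflectionMethod
$setHook = $rProp->getHook(PropertyHookType::Set); // null, no set hook declared
```

Because the hook type is an enum case rather than a string, an unknown hook name cannot be passed at all, which removes the need for an exception on that path.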

Larger changes:

As noted a while ago, $field has been removed.
I've added an FAQ question regarding the parent::$foo::get() syntax, and why it is.
The $foo => expression shorthand has been removed. The legal shorthands are now:

public string $foo {
    get => evaluates to a value;
    set => assigns this value;
}

The set shorthand (with => ) now means "write this value instead". The non-shorthand version (set { } ) always returns void, so you have to assign the value yourself, but you get more flexibility. Having updated the examples accordingly, I think this is actually a really nice and intuitive trade-off, as it makes the common transformation and validation cases (eg, in constructor promotion) even easier to follow with no redundancy.
On a set hook, the user may specify both a type and name, or neither. (That is, set {} or set (Foo $newFoo) {}.) If not specified, they default to the type of the property and $value, as before.
I restructured how the content in 2, 3, 4 is presented and moved some stuff around so that it flows more logically (I think).
The restrictions around references have been softened. Specifically, references are only disallowed if a backed property has a set hook. If it has only a get, or if it's a virtual property, references are allowed. We also added a future-scope section on a possible way to support assigning by reference in the future, if there is sufficient need.
Interfaces may now require a &get hook, or just 'get'. A class may use a &get hook on a 'get' interface declaration. This is the same logic as already exists for methods; we're just copying it.
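
The shorthand rules above can be sketched as follows, assuming the syntax as it later shipped in PHP 8.4. The Person class and the $newLast parameter name are hypothetical:

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only; requires PHP 8.4 or later.
final class Person
{
    // Short set: the expression's result is what gets stored.
    public string $first {
        set => ucfirst($value);
    }

    // Block set with an explicit parameter type and name: it returns
    // nothing, so the hook assigns to the backing value itself.
    public string $last {
        set(string $newLast) {
            if ($newLast === '') {
                throw new InvalidArgumentException('Last name must not be empty.');
            }
            $this->last = ucfirst($newLast);
        }
    }

    // Short get on a virtual property.
    public string $full {
        get => $this->first . ' ' . $this->last;
    }
}
```

The short form covers plain transformation; the block form is needed as soon as the hook has to validate or do anything besides produce the value to store.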

Hopefully the above changes should resolve most outstanding concerns. I do think the updated shorthand handling is better overall, so I'm happy with it.

There also seems to be little consensus so far on naming this thing hooks vs accessors. Absent a consensus, we'll probably stick with hooks to avoid the effort of renaming all the things.

Thank you everyone for the feedback so far, and if you still have some, please say so. (Even if it's just to say that you're happy with the RFC now so we feel more comfortable bringing it to a vote.)

--Larry Garfield

1 year ago by michal.brzuchalski@gmail.com — view source

Hi Larry,

pt., 8 mar 2024 o 16:55 Larry Garfield larry@garfieldtech.com napisał(a):

Hi folks. Based on earlier discussions, we've made a number of changes to
the RFC that should address some of the concerns people raised. We also
had some very fruitful discussions off-list with several developers from
the Foundation, which led to what we feel are some solid improvements.

https://wiki.php.net/rfc/property-hooks

This RFC looks awesome, thanks Larry and Ilija I love the functionality in
its current shape.

Thank you everyone for the feedback so far, and if you still have some,
please say so. (Even if it's just to say that you're happy with the RFC
now so we feel more comfortable bringing it to a vote.)

The only thing I don't like and can still be worked on is the reflection
mechanism changes.
The proposed methods isVirtual and getRawValue, setRawValue pair introduces
a need to catch exceptions which could be eliminated by subtyping
ReflectionProperty.
Having these methods on a separate subtype allows returning a valid value.
I realize this isn't trivial because for the last 2 days, I was thinking
about giving it a name and TBH cannot figure out anything feasible.
If this is not possible to put in understandable words then at least
mention it in FAQ and why not.

Cheers,
Michał Marcin Brzuchalski

1 year ago by Larry Garfield — view source

Hm, interesting. I'll have to check with Ilija on feasibility. My question is, how would it eliminate it?

Suppose we hypothetically have a "ReflectionPropertyWithHooks" reflection object. It has those three extra methods on it. But that means $rObject->getProperty() could return a ReflectionProperty or ReflectionPropertyWithHooks, and you don't know which it is. You'd have to do an instanceof check to know which type of property it is, and thus what methods are available. That doesn't seem any less clumsy (nor more, to be fair) than calling isVirtual().

$rProp = $rObject->getProperty('foo', $obj);
$rProp->getValue(); // works always.

if (!$rProp->isVirtual()) {
    $rProp->getRawValue(); // Works, may or may not be the same return as getValue()
}

vs.

if (!$rProp instanceof VirtualProperty) {
    $rProp->getRawValue(); // Works.
}

That doesn't seem to be an improvement. If you omit the conditional, you still need a catch one way or the other. It just changes what gets thrown (an Exception vs an Error). What type of hierarchy were you thinking of that would help here?
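
A runnable version of the isVirtual() guard, as a sketch assuming the reflection methods as they later shipped in PHP 8.4 (isVirtual() and getRawValue(object)). The Reading class and the rawValueOrNull() helper are hypothetical:

```php
<?php
declare(strict_types=1);

// Hypothetical class for illustration only; requires PHP 8.4 or later.
final class Reading
{
    // Backed property whose get hook rounds the stored value.
    public float $celsius {
        get => round($this->celsius, 1);
    }

    // Virtual property: computed, with no storage of its own.
    public float $fahrenheit {
        get => $this->celsius * 9 / 5 + 32;
    }
}

/** Returns the raw stored value, or null when the property has no storage. */
function rawValueOrNull(ReflectionProperty $prop, object $object): mixed
{
    // Guard first, so getRawValue() is never called on a virtual property.
    return $prop->isVirtual() ? null : $prop->getRawValue($object);
}

$reading = new Reading();
$reading->celsius = 21.456;

$celsius    = new ReflectionProperty(Reading::class, 'celsius');
$fahrenheit = new ReflectionProperty(Reading::class, 'fahrenheit');
```

Here getValue() goes through the get hook while getRawValue() bypasses it, so the two can legitimately return different results for the same backed property.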

--Larry Garfield

1 year ago by michal.brzuchalski@gmail.com — view source

pon., 11 mar 2024 o 15:30 Larry Garfield larry@garfieldtech.com
napisał(a):

Hm, interesting. I'll have to check with Ilija on feasibility. My
question is, how would it eliminate it?

Suppose we hypothetically have a "ReflectionPropertyWithHooks" reflection
object. It has those three extra methods on it. But that means
$rObject->getProperty() could return a ReflectionProperty or
ReflectionPropertyWithHooks, and you don't know which it is. You'd have to
do an instanceof check to know which type of property it is, and thus what
methods are available. That doesn't seem any less clumsy (nor more, to be
fair) than calling isVirtual().

It is similar when you work with ReflectionType or ReflectionEnum, you
always need to match against a certain type to ensure the code will behave
predictably.

$rProp = $rObject->getProperty('foo', $obj);
$rProp->getValue(); // works always.

if (!$rProp->isVirtual()) {
    $rProp->getRawValue(); // Works, may or may not be the same return as getValue()
}

vs.

if (!$rProp instanceof VirtualProperty) {
    $rProp->getRawValue(); // Works.
}

That doesn't seem to be an improvement. If you omit the conditional, you
still need a catch one way or the other. It just changes what gets thrown
(an Exception vs an Error). What type of hierarchy were you thinking of
that would help here?

My thinking was like:

if the property has hooks, only then calling getRawValue, setRawValue,
or isVirtual make sense,

if the property has no hooks and is static calling getRawValue, or
setRawValue always throws because of "On a static property, this method
will always throw an error."

if the property is "virtual", calling getRawValue, or setRawValue always
throws an error,

if the property is not "virtual", calling getValue, or setValue is safe
and never throws, otherwise it may throw under some conditions related to
certain hook presence.

In conclusion, I thought that the presence of hooks introduces 3 new
methods, but some of them will always throw because of incorrect usage.
Normally, I'd model it with subtypes to completely avoid try-catch blocks
at the cost of a simple instanceof check, which I consider much cleaner for
the reader than a bunch of try-catch blocks. Keep static analysis in mind
too: for each of these calls, it will suggest handling the possible
exceptions.

But as I wrote before, I don't know how to model it well.

Cheers,
Michał Marcin Brzuchalski

1 year ago by Larry Garfield — view source

Hm, interesting. I'll have to check with Ilija on feasibility. My question is, how would it eliminate it?

Suppose we hypothetically have a "ReflectionPropertyWithHooks" reflection object. It has those three extra methods on it. But that means $rObject->getProperty() could return a ReflectionProperty or ReflectionPropertyWithHooks, and you don't know which it is. You'd have to do an instanceof check to know which type of property it is, and thus what methods are available. That doesn't seem any less clumsy (nor more, to be fair) than calling isVirtual().

It is similar when you work with ReflectionType or ReflectionEnum, you
always need to match against a certain type to ensure the code will
behave predictably.

$rProp = $rObject->getProperty('foo', $obj);
$rProp->getValue(); // works always.

if (!$rProp->isVirtual()) {
    $rProp->getRawValue(); // Works, may or may not be the same return as getValue()
}

vs.

if (!$rProp instanceof VirtualProperty) {
    $rProp->getRawValue(); // Works.
}

That doesn't seem to be an improvement. If you omit the conditional, you still need a catch one way or the other. It just changes what gets thrown (an Exception vs an Error). What type of hierarchy were you thinking of that would help here?

My thinking was like:

if the property has hooks, only then calling getRawValue,
setRawValue, or isVirtual make sense,

if the property has no hooks and is static calling getRawValue, or
setRawValue always throws because of "On a static property, this method
will always throw an error."

if the property is "virtual", calling getRawValue, or setRawValue
always throws an error,

if the property is not "virtual", calling getValue, or setValue is
safe and never throws, otherwise it may throw under some conditions
related to certain hook presence.

In conclusion, I thought that the presence of hooks introduces 3 new
methods, but some of them will always throw when used incorrectly.
Normally, I'd model it with subtypes to avoid try-catch blocks
completely, at the cost of a simple instanceof check, which I consider
much cleaner for the reader than a bunch of try-catch blocks. Keep static
analysis in mind too: for each of these calls it will suggest handling
the possible exceptions.

But as I wrote before, I don't know how to model it well.

Agreed, though I don't know how to model it either. :-) If someone can figure out a way to do so before we go to a vote, we'll consider it.
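
For illustration only, here is one possible shape for the subtype idea discussed above. Every type in this sketch is hypothetical; none of them exist in PHP's Reflection API. As noted earlier, it does not remove the branch, it only moves it from an isVirtual() call to an instanceof check.

```php
<?php
// Purely hypothetical model: these types are NOT part of PHP's Reflection API.
interface PropertyInfo
{
    public function getValue(): mixed;
}

// A backed property has a raw value, so only this subtype exposes getRawValue().
final class BackedPropertyInfo implements PropertyInfo
{
    public function __construct(private mixed $raw) {}

    public function getValue(): mixed
    {
        return $this->raw;
    }

    public function getRawValue(): mixed
    {
        return $this->raw;
    }
}

// A virtual property has no backing value, so there is no getRawValue() to misuse.
final class VirtualPropertyInfo implements PropertyInfo
{
    public function __construct(private Closure $getter) {}

    public function getValue(): mixed
    {
        return ($this->getter)();
    }
}

function describe(PropertyInfo $info): string
{
    // The instanceof check replaces both isVirtual() and a try/catch.
    if ($info instanceof BackedPropertyInfo) {
        return 'raw=' . var_export($info->getRawValue(), true);
    }

    return 'virtual, value=' . var_export($info->getValue(), true);
}

$backed = describe(new BackedPropertyInfo(5));
$virtual = describe(new VirtualPropertyInfo(fn () => 7));
```

The trade-off is the one already raised in this thread: callers of a getProperty()-style factory would still not know which subtype they got back without checking.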

--Larry Garfield

1 year ago by Stephen Reay

Hi folks. Based on earlier discussions, we've made a number of changes to the RFC that should address some of the concerns people raised. We also had some very fruitful discussions off-list with several developers from the Foundation, which led to what we feel are some solid improvements.

https://wiki.php.net/rfc/property-hooks

Smaller changes:

We've added a PropertyHookType enum for use in reflection, which lets us get rid of a possible exception.

get_mangled_object_vars()'s behavior is now defined to be the same as an array, ie, skip hooks.

A final property with a final hook is no longer an error. It's just silently redundant, like for methods.

Made explicit what happens with recursive hook calls and method calls from a hook. The behavior here is the same as for __get/__set.

Made support for #[\Override] explicit.

Added an FAQ regarding the property-centric approach rather than method-centric approach.

Clarified that the parent::$foo::get() syntax works on a parent property regardless of whether it has hooks.

Clarified that untyped properties are supported. (Though, it's 2024, please don't use untyped properties.)

Clarified that interface properties cannot specify a wider write-type, for simplicity. That could be considered as a future add-on with no BC breaks.

Provided an explanation of how to interpret the parent-hook access syntax.

Added an FAQ item explaining why a 'virtual' keyword is not feasible.

Larger changes:

As noted a while ago, $field has been removed.

I've added an FAQ question regarding the parent::$foo::get() syntax, and why it is.

The $foo => expression shorthand has been removed. The legal shorthands are now:

public string $foo {
    get => evaluates to a value;
    set => assigns this value;
}

The set shorthand (with => ) now means "write this value instead". The non-shorthand version (set { } ) always returns void, so you have to assign the value yourself, but you get more flexibility. Having updated the examples accordingly, I think this is actually a really nice and intuitive trade-off, as it makes the common transformation and validation cases (eg, in constructor promotion) even easier to follow with no redundancy.

On a set hook, the user may specify both a type and name, or neither. (That is, set {} or set (Foo $newFoo).) If not specified, it defaults to the type of the property and $value, as before.

I restructured how the content in 2, 3, 4 is presented and moved some stuff around so that it flows more logically (I think).

The restrictions around references have been softened. Specifically, references are only disallowed if a backed property has a set hook. If it has only a get, or if it's a virtual property, references are allowed. We also added a future-scope section on a possible way to support assigning by reference in the future, if there is sufficient need.

Interfaces may now require a &get hook, or just 'get'. A class may use a &get hook on a 'get' interface declaration. This is the same logic as already exists for methods; we're just copying it.
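
As a concrete sketch of the two set forms described in the list above: the short form writes the value of the expression, while the block form returns void and assigns the value itself. This assumes a PHP build that includes property hooks (PHP 8.4 or later); the class and property names are made up for the example.

```php
<?php
// Assumes PHP 8.4+ (property hooks). User, $name and $age are hypothetical.
final class User
{
    // Short form: the result of the expression is written to the property.
    public string $name {
        set => ucfirst(trim($value));
    }

    // Block form: returns void, so the hook assigns the value itself.
    public int $age {
        set {
            if ($value < 0) {
                throw new InvalidArgumentException('age must not be negative');
            }
            $this->age = $value;
        }
    }
}

$user = new User();
$user->name = '  larry ';
$user->age = 42;

echo $user->name, ' ', $user->age, PHP_EOL; // prints "Larry 42"
```

The short form suits plain transformations; anything that needs to reject a value or do extra work fits the block form better.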

Hopefully the above changes should resolve most outstanding concerns. I do think the updated shorthand handling is better overall, so I'm happy with it.

There also seems to be little consensus so far on naming this thing hooks vs accessors. Absent a consensus, we'll probably stick with hooks to avoid the effort of renaming all the things.

Thank you everyone for the feedback so far, and if you still have some, please say so. (Even if it's just to say that you're happy with the RFC now so we feel more comfortable bringing it to a vote.)

--Larry Garfield

Hi Larry

Thanks again for both of your work on this, I'm really hopeful this passes.

Was there ever any further discussion/resolution/decision about the use of an explicit virtual keyword, and the related flag for creation of a backing store? I thought it was discussed by several people but I don't recall seeing any eventual consensus, and it looks to my eye that it hasn't changed from the original proposal: i.e. it's 'magic' and $this->{__PROPERTY__} won't work?

Is that correct?

Cheers

Stephen

1 year ago by Larry Garfield

Hi Larry

Thanks again for both of your work on this, I'm really hopeful this passes.

Was there ever any further discussion/resolution/decision about the use
of an explicit virtual keyword, and the related flag for creation of a
backing store? I thought it was discussed by several people but I don't
recall seeing any eventual consensus, and it looks to my eye that it
hasn't changed from the original proposal: i.e. it's 'magic' and
$this->{__PROPERTY__} won't work?

Is that correct?

We looked into virtual, and considered it, but determined that it wouldn't actually work because of inheritance. Details are in a new FAQ entry:

https://wiki.php.net/rfc/property-hooks#why_no_explicit_virtual_flag

--Larry Garfield

1 year ago by drealecs@gmail.com

On Tue, Mar 12, 2024 at 4:36 PM Larry Garfield larry@garfieldtech.com
wrote:

Hi Larry

Thanks again for both of your work on this, I'm really hopeful this
passes.

Was there ever any further discussion/resolution/decision about the use
of an explicit virtual keyword, and the related flag for creation of a
backing store? I thought it was discussed by several people but I don't
recall seeing any eventual consensus, and it looks to my eye that it
hasn't changed from the original proposal: i.e. it's 'magic' and
$this->{__PROPERTY__} won't work?

Is that correct?

We looked into virtual, and considered it, but determined that it
wouldn't actually work because of inheritance. Details are in a new FAQ
entry:

https://wiki.php.net/rfc/property-hooks#why_no_explicit_virtual_flag

Nice that you added the details there why explicit virtual is not a good
idea.

So, if the parent class changes between virtual and non-virtual, while the
child class is virtual, the only tiny BC break will be in the reflection as
we will see a change between virtual and non-virtual?

Alex

1 year ago by Larry Garfield

Hi Larry

Thanks again for both of your work on this, I'm really hopeful this passes.

Was there ever any further discussion/resolution/decision about the use
of an explicit virtual keyword, and the related flag for creation of a
backing store? I thought it was discussed by several people but I don't
recall seeing any eventual consensus, and it looks to my eye that it
hasn't changed from the original proposal: i.e. it's 'magic' and
$this->{__PROPERTY__} won't work?

Is that correct?

We looked into virtual, and considered it, but determined that it wouldn't actually work because of inheritance. Details are in a new FAQ entry:

https://wiki.php.net/rfc/property-hooks#why_no_explicit_virtual_flag

Nice that you added the details there why explicit virtual is not a good idea.

So, if the parent class changes between virtual and non-virtual, while
the child class is virtual, the only tiny BC break will be in the
reflection as we will see a change between virtual and non-virtual?

Alex

I believe that is correct, yes.

--Larry Garfield

1 year ago by Rowan Tommins [IMSoP]

Hi folks. Based on earlier discussions, we've made a number of changes
to the RFC that should address some of the concerns people raised. We
also had some very fruitful discussions off-list with several developers
from the Foundation, which led to what we feel are some solid
improvements.

https://wiki.php.net/rfc/property-hooks

Hi Larry,

Thanks again for the continuing hard work on this!

if a |get| hook for property |$foo| calls method |bar()|, then inside
that method |$this->foo| will refer to the raw property, both read and
write. If |bar()| is called from somewhere other than the hook, reading
from |$this->foo| will trigger the |get| hook. This behavior is
identical to that already used by |__get| and |__set| today.

I'm slightly confused by this.

If there is an actual property called $foo, then __get and __set will be
called only when it is out of visibility, regardless of the call stack -
e.g. a private property will always trigger __get from public scope, and
always access it directly from private scope: https://3v4l.org/R5Yos
That seems to differ from what's proposed, where even a private call to
bar() would trigger the hook.

The protection against recursion appears to only be relevant for
completely undefined properties. For __get, the direct access can never
do anything useful - there's nothing to access: https://3v4l.org/2nDZS
For __set, it is at least possible for the non-recursive write to
succeed, but only in the niche case of creating a dynamic property:
https://3v4l.org/dpYOj I'm not sure that there's any equivalent to this
scenario for property hooks, since they can never be undefined/dynamic.

There is one exception to the above: if a property is virtual, then
there is no presumed connection between the get and set operations.
[...] For that reason, |&get| by reference is allowed for virtual
properties, regardless of whether or not there is a |set| hook.

I don't agree with this, and the example immediately following it
demonstrates the exact opposite: the &get and set hooks are both
proxying to the same backing value, and have all the same problems as if
the property was non-virtual. I would imagine a lot of real-life virtual
properties would be doing something similar: converting to/from a
different type, proxying to another object, etc.

I think this exception is unnecessarily complicated: either trust users
to handle the implications of combining &get with set, or forbid it.

Additionally, |&get| hooks are allowed for arrays as well, provided
there is no |set| hook.

I mentioned in a previous e-mail the possibility of using the &get hook
for array writes. Has this been considered?

That is:

$c->arr['beep'] = 'boop';

Would be equivalent to:

$temp =& $c->arr;
$temp['beep'] = 'boop';
unset($temp);

Which would be valid if $arr had an &get hook defined.

A |set| hook on a typed property must declare a parameter type that
is the same as or contravariant (wider) from the type of the property.

Once a property has both a |get| and |set| operation, however, it is
no longer covariant or contravariant for further extension.

How do these two rules interact?

Could this:

public string $foo {
   get => $this->_foo;
   set(string|Stringable $value) {
       $this->_foo = (string)$value;
   }
}

be over-ridden by this, where the property's "main type" remains
invariant but its "settable type" is contravariant?

public string $foo {
   get => $this->_foo;
   set(string|Stringable|SomethingElse $value) {
       $this->_foo = $value instanceof SomethingElse
           ? $value->asString()
           : (string)$value;
   }
}

ReflectionProperty has several new methods to work with hooks.

There should be some way to reliably determine the "settable type" of a
property. At the moment, I think you would have to do something like this:

$setHook = $property->getHook(PropertyHookType::Set);
$writeType = $setHook === null ? $property->getType()
: $setHook->getParameters()[0]->getType();

Once again, I would like to make the case that asymmetric types are an
unnecessary complication that should be left to Future Scope.

The fact that none of the other languages referenced have such a feature
should also give us pause. There's nothing to stop us being the first to
innovate a feature, but we should be extra cautious when doing so, with
no previous experience to learn from. It also means there is no
expectation from users coming from other languages that this will be
possible.

If it genuinely seems useful, it can be added in a follow-up RFC, or
even a later version of PHP, with little impact on the rest of the
feature. But if we add it now and regret it, or some detail of its
implementation, we will be stuck with it forever.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Larry Garfield

if a get hook for property $foo calls method bar(), then inside that method $this->foo will refer to the raw property, both read and write. If bar() is called from somewhere other than the hook, reading from $this->foo will trigger the get hook. This behavior is identical to that already used by __get and __set today.

I'm slightly confused by this.

If there is an actual property called $foo, then __get and __set will
be called only when it is out of visibility, regardless of the call
stack - e.g. a private property will always trigger __get from public
scope, and always access it directly from private scope:
https://3v4l.org/R5Yos That seems to differ from what's proposed, where
even a private call to bar() would trigger the hook.

The protection against recursion appears to only be relevant for
completely undefined properties. For __get, the direct access can never
do anything useful - there's nothing to access: https://3v4l.org/2nDZS
For __set, it is at least possible for the non-recursive write to
succeed, but only in the niche case of creating a dynamic property:
https://3v4l.org/dpYOj I'm not sure that there's any equivalent to this
scenario for property hooks, since they can never be undefined/dynamic.

It's slightly different, yes. The point is that the special behavior of a hook is disabled if you are within the call stack of a hook, just like the special behavior of __get/__set is disabled if you are within the call stack of __get/__set. What happens when you hit an operation that would otherwise go into an infinite loop is a bit different, but the "disable to avoid an infinite loop" logic is the same.

So maybe "is the same" rather than "is identical", but otherwise it's the same concept.
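
For reference, the existing __get guard can be observed in current PHP without any hooks. This is a minimal sketch with hypothetical class and method names. It only demonstrates today's __get behavior for an undefined property, not the hook implementation itself.

```php
<?php
// Illustrative only: GuardDemo and readFoo() are hypothetical names.
final class GuardDemo
{
    /** @var list<string> Property names for which __get actually ran. */
    public array $magicCalls = [];

    public function __get(string $name): mixed
    {
        $this->magicCalls[] = $name;

        // We are now inside __get for this property, so the read performed
        // by readFoo() is guarded and does not re-enter __get.
        return $this->readFoo();
    }

    public function readFoo(): string
    {
        // Called from outside __get: this read triggers __get('foo').
        // Called from inside __get('foo'): the guard turns it into a plain
        // read of an undefined property, which yields null plus a warning
        // (suppressed here with @ to keep the demo quiet).
        $value = @$this->foo;

        return $value ?? 'nothing stored';
    }
}

$demo = new GuardDemo();
$viaProperty = $demo->foo;       // __get runs once
$viaMethod   = $demo->readFoo(); // __get runs once more, then is bypassed
```

The same line inside readFoo() takes a different path depending on whether __get('foo') is already on the call stack, which is the "disable to avoid an infinite loop" logic being compared here.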

There is one exception to the above: if a property is virtual, then there is no presumed connection between the get and set operations. [...] For that reason, &get by reference is allowed for virtual properties, regardless of whether or not there is a set hook.

I don't agree with this, and the example immediately following it
demonstrates the exact opposite: the &get and set hooks are both
proxying to the same backing value, and have all the same problems as
if the property was non-virtual. I would imagine a lot of real-life
virtual properties would be doing something similar: converting to/from
a different type, proxying to another object, etc.

I think this exception is unnecessarily complicated: either trust users
to handle the implications of combining &get with set, or forbid it.

The point is to give the user the option for full backwards compatibility when it makes sense. This requires jumping through some hoops, which is the point. This is essentially equivalent to creating a by-ref getter + a setter, exposing the underlying property. By creating a virtual property, we are "accepting" that the two are detached. While we could disallow this, we recognize that there may be valid use-cases that we'd like to enable. It also parallels __get/__set, where using &__get means you can write to something without going through __set.

In practice I expect virtual properties with both hooks to be very rare. Most virtual properties will, I expect, be lazy-computed get-only values.
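
To make the __get/__set parallel concrete, here is a minimal sketch in current PHP (no hooks needed) showing that a by-reference __get lets an array write bypass __set entirely. The class and member names are hypothetical.

```php
<?php
// Illustrative only: Bag, $items and $setCalls are hypothetical names.
final class Bag
{
    private array $items = [];
    public int $setCalls = 0;

    // Returning by reference lets the caller modify the array in place.
    public function &__get(string $name): mixed
    {
        return $this->items;
    }

    public function __set(string $name, mixed $value): void
    {
        $this->setCalls++;
        $this->items = $value;
    }

    public function all(): array
    {
        return $this->items;
    }
}

$bag = new Bag();
$bag->arr['beep'] = 'boop'; // fetches a reference through &__get, never calls __set
$bag->arr = ['x' => 1];     // a whole-value assignment does go through __set
```

After the first statement the private array already contains the new element while $setCalls is still 0, so the getter and setter really are detached from each other.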

Additionally, &get hooks are allowed for arrays as well, provided there is no set hook.

I mentioned in a previous e-mail the possibility of using the &get hook
for array writes. Has this been considered?

That is:

$c->arr['beep'] = 'boop';

Would be equivalent to:

$temp =& $c->arr;
$temp['beep'] = 'boop';
unset($temp);

Which would be valid if $arr had an &get hook defined.

With the change to allow &get in the absence of set, I believe that would already work.

cf: https://3v4l.org/3Gnti/rfc#vrfc.property-hooks

A set hook on a typed property must declare a parameter type that is the same as or contravariant (wider) from the type of the property.

Once a property has both a get and set operation, however, it is no longer covariant or contravariant for further extension.

How do these two rules interact?

Could this:

public string $foo {
    get => $this->_foo;
    set(string|Stringable $value) {
        $this->_foo = (string)$value;
    }
}

be over-ridden by this, where the property's "main type" remains
invariant but its "settable type" is contravariant?

public string $foo {
    get => $this->_foo;
    set(string|Stringable|SomethingElse $value) {
        $this->_foo = $value instanceof SomethingElse
            ? $value->asString()
            : (string)$value;
    }
}

That would be legal.

ReflectionProperty has several new methods to work with hooks.

There should be some way to reliably determine the "settable type" of a
property. At the moment, I think you would have to do something like
this:

$setHook = $property->getHook(PropertyHookType::Set);
$writeType = $setHook === null ? $property->getType() :
$setHook->getParameters()[0]->getType();

Hm. Good point here. We'll probably need to add another method to ReflectionProperty for that. Stay tuned.
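
Until such a method exists, the workaround quoted above can be wrapped in a small helper. This is a sketch that assumes a PHP build with property hooks and the PropertyHookType enum (PHP 8.4 or later). The settableType() function and the Demo class are hypothetical names, not part of the RFC.

```php
<?php
// Assumes PHP 8.4+ (property hooks, PropertyHookType). Demo is a hypothetical class.
final class Demo
{
    public int $plain = 0;

    public string $foo {
        set(string|Stringable $value) {
            $this->foo = (string) $value;
        }
    }
}

// Hypothetical helper: returns the type accepted on write.
function settableType(ReflectionProperty $property): ?ReflectionType
{
    $setHook = $property->getHook(PropertyHookType::Set);

    // Without a set hook, the write type is simply the property type.
    return $setHook === null
        ? $property->getType()
        : $setHook->getParameters()[0]->getType();
}

$plainType = settableType(new ReflectionProperty(Demo::class, 'plain'));
$fooType   = settableType(new ReflectionProperty(Demo::class, 'foo'));
```

For $plain the helper returns the declared int type. For $foo it returns the wider union declared on the set hook, which differs from what getType() reports for the property.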

--Larry Garfield

1 year ago by Rowan Tommins [IMSoP]

It's slightly different, yes. The point is that the special behavior of a hook is disabled if you are within the call stack of a hook, just like the special behavior of __get/__set is disabled if you are within the call stack of __get/__set. What happens when you hit an operation that would otherwise go into an infinite loop is a bit different, but the "disable to avoid an infinite loop" logic is the same.

I guess I'm looking at it more from the user's point of view: it's very
rare with __get and __set to have a method that sometimes accesses the
"real" property, and sometimes goes through the "hook". Either there is
no real property, or the property has private/protected scope, so any
method on the class sees the "real" property regardless of access
via the hook.

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only directly inside the hook
definitions, via the "field" keyword).

The point is to give the user the option for full backwards compatibility when it makes sense. This requires jumping through some hoops, which is the point. This is essentially equivalent to creating a by-ref getter + a setter, exposing the underlying property. By creating a virtual property, we are "accepting" that the two are detached. While we could disallow this, we recognize that there may be valid use-cases that we'd like to enable. It also parallels __get/__set, where using &__get means you can write to something without going through __set.

I get the impression that to you, it's a given that a "virtual property"
is something clearly distinct from a "property with hooks", and that
users will consciously decide between one and the other.

This isn't my expectation; based on what people are used to from
existing features, and other languages, I expect users to see this as an
obvious starting point for defining a hooked property:

private int $_foo;
public int $foo { get => $this->_foo; set { $this->_foo = $value; } }

And this as a convenient short-hand for exactly the same thing:

public int $foo { get => $this->foo; set { $this->foo = $value; } }

Choosing one or the other won't feel like "jumping through a hoop", and
the ability to use an &get hook with one and not the other will simply
seem like a weird oddity.

In practice I expect virtual properties with both hooks to be very rare. Most virtual properties will, I expect, be lazy-computed get-only values.

I don't think this is true. Both of these are, in the terms of the RFC,
"virtual properties":

public Something $proxied {
    get => $this->otherObject->thing;
    set { $this->otherObject->thing = $value; }
}

public Money $price;
public int $pricePence {
    get => $this->price->asPence();
    set { $this->price = Money::fromPence($value); }
}

I can also imagine generated classes with "virtual" properties which
call out to generic "getCached" and "setAndClearCache" methods doing the
job of this pair of __get and __set methods:
https://github.com/yiisoft/yii2/blob/master/framework/db/BaseActiveRecord.php#L274

With the change to allow &get in the absence of set, I believe that would already work.

cf: https://3v4l.org/3Gnti/rfc#vrfc.property-hooks

Awesome! The RFC should probably highlight this, as it gives a
significant extra option for array properties.

(Also, good to know 3v4l has a copy of the branch; I hadn't thought to
check.)

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Larry Garfield

It's slightly different, yes. The point is that the special behavior of a hook is disabled if you are within the call stack of a hook, just like the special behavior of __get/__set is disabled if you are within the call stack of __get/__set. What happens when you hit an operation that would otherwise go into an infinite loop is a bit different, but the "disable to avoid an infinite loop" logic is the same.

I guess I'm looking at it more from the user's point of view: it's very
rare with __get and __set to have a method that sometimes accesses the
"real" property, and sometimes goes through the "hook". Either there is
no real property, or the property has private/protected scope, so any
method on the class sees the "real" property regardless of access
via the hook.

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only directly inside the hook
definitions, via the "field" keyword).

I'm not sure I follow. The behavior we have currently is very close to how Kotlin works, from a user perspective. (The internal implementation is backwards from Kotlin, but that doesn't matter to the user.)

I've lost track of which specific issue you have an issue with or would want changed. The guards to prevent an infinite loop are necessary, for the same reasons as they are necessary for __get/__set. We couldn't use a backing field otherwise, without some other syntax. (This is where Kotlin uses 'field'.) So, I'm not really sure what we're discussing at this point. What specific changes are you suggesting?

With the change to allow &get in the absence of set, I believe that would already work.

cf: https://3v4l.org/3Gnti/rfc#vrfc.property-hooks

Awesome! The RFC should probably highlight this, as it gives a
significant extra option for array properties.

Updated. I may try to rewrite the array and references section this weekend, as with the changes in the design to be more permissive I'm not sure it's entirely clear anymore what the net result is.

--Larry Garfield

1 year ago by Rowan Tommins [IMSoP]

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only directly inside the hook
definitions, via the "field" keyword).

I'm not sure I follow. The behavior we have currently is very close to how Kotlin works, from a user perspective.

Unless I'm misunderstanding something, the backing field in Kotlin is accessible only inside the hooks, nowhere else. I don't know what would happen if a hook caused a recursive call to itself, but there's no mention in the docs of it bypassing the hooks, only this:

This backing field can be referenced in the accessors using the field identifier

and

The field identifier can only be used in the accessors of the property.

And then a section explaining that more complex hooks should use a separate backing property - which is the only option in C#, and roughly what people would do in PHP today with __get and __set.
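
As a point of comparison, this is roughly what the "separate backing property" approach looks like in PHP today with __get and __set. It is a minimal sketch: the Report class, its members and the 50-character rule are made up for the example.

```php
<?php
// Illustrative only: Report and its members are hypothetical.
final class Report
{
    // Explicit backing property, reachable by name from anywhere in the class.
    private ?string $_expensive = null;
    public int $computeCalls = 0;

    public function __get(string $name): mixed
    {
        if ($name !== 'expensive') {
            throw new LogicException("Unknown property: $name");
        }

        // Compute lazily on first read, then reuse the stored value.
        return $this->_expensive ??= $this->compute();
    }

    public function __set(string $name, mixed $value): void
    {
        if ($name !== 'expensive') {
            throw new LogicException("Unknown property: $name");
        }
        if (!is_string($value) || strlen($value) < 50) {
            throw new InvalidArgumentException('expensive needs at least 50 characters');
        }
        $this->_expensive = $value;
    }

    public function clear(): void
    {
        $this->_expensive = null; // raw access, no magic method involved
    }

    private function compute(): string
    {
        $this->computeCalls++;
        return str_repeat('x', 60);
    }
}

$report = new Report();
$first  = $report->expensive; // computes
$second = $report->expensive; // served from the backing property
```

Here there is never any ambiguity about whether an access goes through the accessor, because the public name and the backing name are different.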

Kotlin does have a special syntax for "delegating" hooks, but looking at the examples, they do not use the backing field at all, they have to provide their own storage.

I've lost track of which specific issue you have an issue with or would want changed. The guards to prevent an infinite loop are necessary, for the same reasons as they are necessary for __get/__set.

I understand that something needs to happen if a recursive call happens, but it could just be an error, like any other unbounded recursion.

I can also understand the temptation to make it something more useful than an error, and provide a way to access the "backing field" / "raw value" from outside the hook. But it does lead to something quite surprising: the same line of code does different things depending on how it is called.

I doubt many people have ever discovered that __get and __set work that way, since as far as I can see it's only possible to use deliberately if you're dynamically adding and unsetting properties inside your class.

So, I don't necessarily think hooks working that way is the wrong decision, I just think it's a decision we should make consciously, not one that's obvious.

Regards,
Rowan Tommins
[IMSoP]

1 year ago by Larry Garfield

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only directly inside the hook
definitions, via the "field" keyword).

I'm not sure I follow. The behavior we have currently is very close to how Kotlin works, from a user perspective.

Unless I'm misunderstanding something, the backing field in Kotlin is
accessible only inside the hooks, nowhere else. I don't know what would
happen if a hook caused a recursive call to itself, but there's no
mention in the docs of it bypassing the hooks, only this:

This backing field can be referenced in the accessors using the field identifier

and

The field identifier can only be used in the accessors of the property.

And then a section explaining that more complex hooks should use a
separate backing property - which is the only option in C#, and roughly
what people would do in PHP today with __get and __set.

Kotlin does have a special syntax for "delegating" hooks, but looking
at the examples, they do not use the backing field at all, they have to
provide their own storage.

I've lost track of which specific issue you have an issue with or would want changed. The guards to prevent an infinite loop are necessary, for the same reasons as they are necessary for __get/__set.

I understand that something needs to happen if a recursive call
happens, but it could just be an error, like any other unbounded
recursion.

I can also understand the temptation to make it something more useful
than an error, and provide a way to access the "backing field" / "raw
value" from outside the hook. But it does lead to something quite
surprising: the same line of code does different things depending on
how it is called.

I doubt many people have ever discovered that __get and __set work that
way, since as far as I can see it's only possible to use deliberately
if you're dynamically adding and unsetting properties inside your class.

So, I don't necessarily think hooks working that way is the wrong
decision, I just think it's a decision we should make consciously, not
one that's obvious.

Well, reading/writing from within a set/get hook is an obvious use case to support. We cannot do cached properties easily otherwise:

public string $expensive {
    get => $this->expensive ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $this->expensive = $value;
    }
}

So disabling the hooks from within the hooks seems like the only logical solution there. (Short of bringing back $field and making it mandatory, which is actually much harder than it sounds because of the ++ et al operators that would need to be supported.)

The other case then becomes:

class Foo {
    public string $expensive {
        get => $this->expensive ??= $this->compute();
        set {
            if (strlen($value) < 50) throw new Exception();
            $this->expensive = $value;
        }
    }

    public function compute() {
        $start = $this->expensive ?? 'a';
        return $start . 'b';
    }
}

Inside compute(), the logic requires reading from $this->expensive. If we have no guard, that would cause an infinite loop. If the guard extends down the call stack, then the loop is eliminated.

That does mean, as you note, that if you call $foo->compute(), the first call to $this->expensive will invoke the get hook, but subsequent calls to it will not. That may seem odd, but in practice, we do not see any other alternative that doesn't make infinite loops very easy to write. This is, of course, a highly contrived example. In practice, I don't expect it to come up much in the first place.

It was this logic that led us to the current implementation: Once you're within the callstack of a hook, you bypass that property's hooks. We then noted that __get/__set have essentially the same guard logic. It doesn't come up often because, again as you note, it's only relevant in even more contrived examples.

So yes, the current logic is very deliberate and based on a process of elimination to arrive at what we feel is the only logical design. Likely the same process of elimination that led to the behavior of the guards on __get/__set, which is why they are essentially the same. Being the same also makes the language more predictable, which is also a design goal for this RFC. (Hence why "this is the same logic as methods/__get/other very similar thing" is mentioned several times in the RFC. Consistency in expectations is generally a good thing.)

In theory we could also forbid accessing a property within the call stack of its hooks; in that case, the above code would simply error. However, that is a less-optimal solution because then compute() is only sometimes callable, depending on the callstack. That is no less weird than $this->expensive skipping hooks only sometimes. It would also be inconsistent with how __get/__set work, which makes hooks less capable and less consistent. That's why we don't think that's a good way of doing it.

Also, we've rewritten the references and arrays sections in the RFC. Nothing actually changed in the implementation, but it should be a lot clearer now that only a fairly small subset of impossible situations are blocked. Most things "just work," and when they don't, they (again) don't for a logical reason that parallels the behavior of the language elsewhere. There's even some nice summary tables. :-)

--Larry Garfield

1 year ago by Rowan Tommins [IMSoP] — view source

Well, reading/writing from within a set/get hook is an obvious use case to support. We cannot do cached properties easily otherwise:

public string $expensive {
    get => $this->expensive ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $this->expensive = $value;
    }
}

To play devil's advocate, in an implementation with only virtual properties, this is still perfectly possible, just one declaration longer:

private string $_expensive;
public string $expensive {
    get => $this->_expensive ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $this->_expensive = $value;
    }
}

Note that in this version there is an unambiguous way to refer to the raw value from anywhere else in the class, if you wanted a clearAll() method for instance.

I can't stress enough that this is where a lot of my thinking comes from: that backed properties are really the special case, not the default. Anything you can do with a backed property you can do with a virtual one, but the opposite will never be true.

The minimum version of backed properties is basically just sugar for that - the property is still essentially virtual, but the language declares the backing property for you, leading to:

public string $expensive {
    get => $field ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $field = $value;
    }
}

I realise now that this isn't actually how the current implementation works, but again I wanted to illustrate where I'm coming from: that backed properties are just a convenience, not a different type of property with its own rules.

Being the same also makes the language more predictable, which is also a design goal for this RFC. (Hence why "this is the same logic as methods/__get/other very similar thing" is mentioned several times in the RFC. Consistency in expectations is generally a good thing.)

I can only speak for myself, but my expectations were based on:

a) How __get and __set are used in practice. That generally involves reading and writing a private property, of either the same or different name from the public one; and that private property is visible everywhere equally, no special handling based on the call stack.

b) What happens if you accidentally cause infinite recursion in a normal function or method, which is that the language eventually hits a stack depth limit and throws an error.

So the assertion that the proposal was consistent with expectations surprised me. It feels to me like something that will seem surprising to people when they first encounter it, but useful once they understand the implications.

Regards,
Rowan Tommins
[IMSoP]

1 year ago by Ilija Tovilo — view source

Hi Rowan

On Sat, Mar 16, 2024 at 9:32 AM Rowan Tommins [IMSoP]
imsop.php@rwec.co.uk wrote:

Well, reading/writing from within a set/get hook is an obvious use case to support. We cannot do cached properties easily otherwise:

public string $expensive {
    get => $this->expensive ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $this->expensive = $value;
    }
}

To play devil's advocate, in an implementation with only virtual properties, this is still perfectly possible, just one declaration longer:

private string $_expensive;
public string $expensive {
    get => $this->_expensive ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $this->_expensive = $value;
    }
}

Note that in this version there is an unambiguous way to refer to the raw value from anywhere else in the class, if you wanted a clearAll() method for instance.

I can't stress enough that this is where a lot of my thinking comes from: that backed properties are really the special case, not the default. Anything you can do with a backed property you can do with a virtual one, but the opposite will never be true.

The minimum version of backed properties is basically just sugar for that - the property is still essentially virtual, but the language declares the backing property for you, leading to:

public string $expensive {
    get => $field ??= $this->compute();
    set {
        if (strlen($value) < 50) throw new Exception();
        $field = $value;
    }
}

I realise now that this isn't actually how the current implementation works, but again I wanted to illustrate where I'm coming from: that backed properties are just a convenience, not a different type of property with its own rules.

That's not really how we think about it. Our design decisions have
been guided by a few factors:

The RFC intentionally makes plain properties and properties with
hooks as fully compatible as possible.

A subclass can override a plain property by adding hooks to it. Many
other languages only allow doing so if the parent property already has
generated accessors ({ get; set; }). For many of them, switching
from a plain property to one with accessors is actually an ABI break.
One requires generating assembly/IR instructions that access a field
in some structure, the other one is a method call. This is not
relevant in our case.

In most languages, a consequence of { get; set; } is that such
properties cannot be passed by reference. This part is relevant to
PHP, because PHP makes heavy use of explicit by-reference passing for
arrays, but not much else. However, as outlined in the RFC, arrays are
not a good use-case for hooks to begin with. So instead of fragmenting
the entirety of all PHP code bases into plain and { get; set; }
properties where it doesn't actually make a semantic difference, and
then not even using them when it would matter (arrays), we have
decided to avoid generated hooks altogether.

The approach of making plain and hooked properties compatible also
immediately means that a property can have both a "backing value"
(inherited from the parent property) and hooks (from the child
property). This goes against your model that backed properties are
really just two properties, one for the backing value and a virtual
one for the hooks.

Our approach has the nice side effect of properties only containing
hooks when they actually do something. We don't need to deal with
optimizations like "the hook is auto-generated, revert to accessing
the property directly to make it faster", or even just having the
generated hook taking up unnecessary memory. You can think of our
properties this way:

class Property {
    // Data is a stand-in for the raw backing storage; null means "virtual".
    public ?Data $storage = null;
    // callable is not a valid property type, so the hooks are typed as Closure.
    public ?Closure $getHook = null;
    public ?Closure $setHook = null;

    public function get() {
        if ($hook = $this->getHook) {
            return $hook();
        } elseif ($this->storage) {
            return $this->storage->get();
        } else {
            throw new Error('Property is write-only');
        }
    }

    public function set($value) {
        if ($hook = $this->setHook) {
            $hook($value);
        } elseif ($this->storage) {
            $this->storage->set($value);
        } else {
            throw new Error('Property is read-only');
        }
    }
}

Properties can inherit both storage and hooks from their parent.
Hopefully, that helps with the mental model. Of course, in reality it
is a bit more complicated due to guards and references.

Although you say backed properties are just syntactic, they really
are not. For example, renaming a public property, making it private
and replacing it with a new passthrough virtual property breaks
serialization, as serialization works on the object's raw values. On
the other hand, adding a hook to an existing property doesn't
influence its backing value, so there is no impact on serialization.

Being the same also makes the language more predictable, which is also a design goal for this RFC. (Hence why "this is the same logic as methods/__get/other very similar thing" is mentioned several times in the RFC. Consistency in expectations is generally a good thing.)

I can only speak for myself, but my expectations were based on:

a) How __get and __set are used in practice. That generally involves reading and writing a private property, of either the same or different name from the public one; and that private property is visible everywhere equally, no special handling based on the call stack.

b) What happens if you accidentally cause infinite recursion in a normal function or method, which is that the language eventually hits a stack depth limit and throws an error.

So the assertion that the proposal was consistent with expectations surprised me. It feels to me like something that will seem surprising to people when they first encounter it, but useful once they understand the implications.

Guards are used for dynamic property creation within __get/__set:
https://3v4l.org/6u3SR#v8.3.4

When __get or __set are called, the object remembers that this
property is being accessed via magic method. When you're already
inside this magic method, another call will not be triggered, thus
falling back to accessing the actual property of the object. In this
case, this means adding a dynamic property.
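
For readers who would rather not follow the link, the guard behaviour described above can be sketched roughly as follows. This is a minimal illustration, assuming PHP 8.2 or later (the attribute only silences the dynamic-property deprecation); the class name Guarded and the $log property are made up for the example and are not taken from the linked snippet.

```php
<?php
// Hypothetical class, for illustration only.
#[AllowDynamicProperties]
class Guarded {
    // Records every property name that __set was invoked for.
    public array $log = [];

    public function __set(string $name, mixed $value): void {
        $this->log[] = $name;
        // The guard for $name is active here, so this write does not
        // re-enter __set; it creates a dynamic property instead.
        $this->$name = strtoupper($value);
    }
}

$o = new Guarded();
$o->foo = 'bar';   // undeclared property: __set runs once
$first = $o->foo;  // the dynamic property created inside __set
$o->foo = 'baz';   // the property now exists, so __set is not involved
```

After the first assignment the property exists on the object, which is why the second assignment is an ordinary write.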

Dynamic properties are not particularly relevant today. The point was
not to show how similar these two cases are, but to explain that
there's an existing mechanism in place that works very well for hooks.
We may invent some new mechanism to access the backing value, like
field = 'value', but for what reason? This would only make sense if
the syntax we use is useful for something else. However, given that
without guards it just leads to recursion, which I really can't see
any use for, I don't see the point.

Ilija

1 year ago by Rowan Tommins [IMSoP] — view source

Properties can inherit both storage and hooks from their parent.
Hopefully, that helps with the mental model. Of course, in reality it
is a bit more complicated due to guards and references.

That is a really helpful explanation, thanks; I hadn't thought about the
significance of inheritance between hooked and non-hooked properties.

I still think there will be a lot of users coming from other languages,
or from using __get and __set, who will look at virtual properties
first. Making things less surprising for those people seems worth some
effort, but I'm not asking for a complete redesign.

Dynamic properties are not particularly relevant today. The point was
not to show how similar these two cases are, but to explain that
there's an existing mechanism in place that works very well for hooks.
We may invent some new mechanism to access the backing value, like
field = 'value', but for what reason? This would only make sense if
the syntax we use is useful for something else. However, given that
without guards it just leads to recursion, which I really can't see
any use for, I don't see the point.

I can think of several reasons we could explore other syntax:

1) To make it clearer in code whether a particular line is accessing via
the hooks, or by-passing them
2) To make the code in the hooks shorter (e.g. $field is significantly
shorter than $this->someDescriptiveName)
3) To allow code to by-pass the hooks at will, rather than only when
called from the hooks (e.g. having a single method that resets the state
of several lazy-loaded properties)

Those reasons are probably not enough to rule out the current syntax;
but they show there are trade-offs being made.

To be honest, my biggest hesitation with the RFC remains asymmetric
types (the ability to specify types in the set hook). It's quite a
significant feature, with no precedent I know of, and I'm worried we'll
overlook something by including it immediately. For instance, what will
be the impact on people using reflection or static analysis to reason
about types? I would personally be more comfortable leaving that to a
follow-up RFC to consider the details more carefully.

Nobody else has raised that, beyond the syntax; I'm not sure if that's
because everyone is happy with it, or because the significance has been
overlooked.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Ilija Tovilo — view source

Hi Rowan

On Sat, Mar 16, 2024 at 8:23 PM Rowan Tommins [IMSoP]
imsop.php@rwec.co.uk wrote:

I still think there will be a lot of users coming from other languages, or from using __get and __set, who will look at virtual properties first. Making things less surprising for those people seems worth some effort, but I'm not asking for a complete redesign.

For clarity, you are asking for a way to make the "virtualness" of
properties more explicit, correct? We touch on a keyword and why we
think it's suboptimal in the FAQ section. Unfortunately, I cannot
think of many alternatives. The $field variable made it a bit more
obvious, but only marginally.

I do believe that, for the most part, the user should not have to
think much about whether the property is backed or virtual. The
behavioral differences are mostly intuitive. For example:

class Test {
    // This property has a set hook that writes to the backing value. Since
    // we're using the backing value, it makes sense for there to be a way to
    // retrieve it. Without that, it wouldn't be useful.
    public $prop {
        set {
            $this->prop = strtoupper($value);
        }
    }

    // Similarly, a property with only a get hook that accesses the backing
    // value would need a way to write to the property for the get to be useful.
    public $prop {
        get => strtoupper($this->prop);
    }

    // A property with a get hook that does not use the backing value does not
    // need an implicit set operation, as writing to the backing value would be
    // useless, given that nobody will read it.
    public $prop {
        get => 42;
    }

    // Similarly, in the esoteric write-only case that does not use the backing
    // value, having an implicit get operation would always lead to a
    // "uninitialized property" error, and is not useful as such.
    public $prop {
        set {
            echo "Prop set\n";
        }
    }
}

Furthermore, serialize, var_dump and the other functions operating
on raw property values will include the property only if it is backed.
This also seems intuitive to me: If you never use the backing value,
the backing value would always be uninitialized, so there's no reason
to include it.
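
A small sketch of that rule, assuming the behaviour matches what shipped in PHP 8.4; the class Label and its property names are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical class, for illustration only.
class Label {
    // Backed: the set hook writes to $this->text, so a backing value exists.
    public string $text = 'a' {
        set { $this->text = strtoupper($value); }
    }

    // Virtual: the hook never touches $this->loud, so nothing is stored for it.
    public string $loud {
        get => $this->text . '!';
    }
}

$label = new Label();
$label->text = 'hello';

// serialize() works on raw values, so only the backed property should appear.
$serialized = serialize($label);
```

The virtual property is still readable through its hook; it just has no raw value for serialize or var_dump to report.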

One case that is not completely obvious is lazy-initialized properties.

class Test {
    public $prop {
        get => $this->prop ??= expensiveOperation();
    }
}

It's not immediately obvious that there is a public set operation
here. The correct way to fix this would be with asymmetric visibility,
which was previously declined. Either way, I don't consider this case
alone enough to completely switch our approach. Please let me know if
you are aware of any other potentially non-intuitive cases.

I will admit that it is unfortunate that a user of the property has to
look through the hook implementation to understand whether a property
is writable. As you have previously suggested, one option might be to
add an explicit set; declaration. Maybe it's a bit more obvious now,
after my previous e-mail, why we are trying to avoid this.

Apart from the things already mentioned, it's unclear to me whether,
with such set; declarations, a get-only backed property should
even be legal. With the complete absence of a write operation, the
assignment within the set itself would fail. To make this work, the
absence of set; would need to mean something like "writable, but
only within another hook", which introduces yet another form of
asymmetric visibility.

Dynamic properties are not particularly relevant today. The point was
not to show how similar these two cases are, but to explain that
there's an existing mechanism in place that works very well for hooks.
We may invent some new mechanism to access the backing value, like
field = 'value', but for what reason? This would only make sense if
the syntax we use is useful for something else. However, given that
without guards it just leads to recursion, which I really can't see
any use for, I don't see the point.

I can think of several reasons we could explore other syntax:

1) To make it clearer in code whether a particular line is accessing via the hooks, or by-passing them
2) To make the code in the hooks shorter (e.g. $field is significantly shorter than $this->someDescriptiveName)
3) To allow code to by-pass the hooks at will, rather than only when called from the hooks (e.g. having a single method that resets the state of several lazy-loaded properties)

Those reasons are probably not enough to rule out the current syntax; but they show there are trade-offs being made.

Fair enough. 1 and 2 are reasons why we added the $field macro as an
alternative syntax in the original draft. I don't quite understand
point 3. In Kotlin, field is only usable within its associated hook.
Other languages I'm aware of do not provide a way to access the
backing value directly, neither inside nor outside the accessor.

To be honest, my biggest hesitation with the RFC remains asymmetric types (the ability to specify types in the set hook). It's quite a significant feature, with no precedent I know of, and I'm worried we'll overlook something by including it immediately. For instance, what will be the impact on people using reflection or static analysis to reason about types? I would personally be more comfortable leaving that to a follow-up RFC to consider the details more carefully.

I personally do not feel strongly about whether asymmetric types make
it into the initial implementation. Larry does, however, and I think
it is not fair to exclude them without providing any concrete reasons
not to. I will spend time in the following days cleaning up tests, and
I will do my best to break asymmetric types. If I (or anybody
else) can't find a way to do so, I don't see a reason to remove them.

Nobody else has raised that, beyond the syntax; I'm not sure if that's because everyone is happy with it, or because the significance has been overlooked.

Yes, unfortunately that's a classic problem in RFC discussions: Syntax
gets a disproportionate amount of attention.

Ilija

1 year ago by Rowan Tommins [IMSoP] — view source

For clarity, you are asking for a way to make the "virtualness" of
properties more explicit, correct?

Either more explicit, or less important: the less often the user needs
to know whether a property is virtual, the less it matters how easily
they can find out.

Please let me know if
you are aware of any other potentially non-intuitive cases.

I agree that while they may not be immediately obvious to the user, most
of the distinctions do make sense once you think about them.

The remaining difference I can see in the current RFC which seems to be
unnecessary is that combining &get with set is only allowed on virtual
properties. Although it may be "virtual" in the strict sense, any &get
hook must actually be referring to some value stored somewhere - that
might be a backed property, another field on the current class, a
property of some other object, etc:

public int $foo { &get => $this->foo; set { $this->foo = $value; } }

public int $bar { &get => $this->_bar; set { $this->_bar = $value; } }

public int $baz { &get => $this->delegatedObj->baz; set { $this->delegatedObj->baz = $value; } }

This sentence from the RFC applies equally to all three of these examples:

That is because any attempted modification of the value by reference
would bypass a |set| hook, if one is defined.

I suggest that we either trust the user to understand that that will
happen, and allow combining &get and set on any property; or we do not
trust them, and forbid it on any property.

Apart from the things already mentioned, it's unclear to me whether,
with such set; declarations, a get-only backed property should
even be legal. With the complete absence of a write operation, the
assignment within the set itself would fail. To make this work, the
absence of set; would need to mean something like "writable, but
only within another hook", which introduces yet another form of
asymmetric visibility.

Any write inside the get hook already by-passes the set hook and refers
to the underlying property, so there would be no need for any default
set behaviour other than throwing an error.

It's not likely to be a common scenario, but the below works with the
current implementation https://3v4l.org/t7qhR/rfc#vrfc.property-hooks

class Example {
    public int $nextNumber {
        get {
            $this->nextNumber ??= 0;
            return $this->nextNumber++;
        }
        // Mimic the current behaviour of a virtual property:
        // https://3v4l.org/cAfAI/rfc#vrfc.property-hooks
        set => throw new Error('Property Example::$nextNumber is read-only');
    }
}

Fair enough. 1 and 2 are reasons why we added the $field macro as an
alternative syntax in the original draft. I don't quite understand
point 3. In Kotlin, field is only usable within its associated hook.
Other languages I'm aware of do not provide a way to access the
backing value directly, neither inside nor outside the accessor.

We are already allowing more than Kotlin by letting hooks call out to a
method, and have that method refer back to the raw value.
Hypothetically, we could allow any method to access it, using some
syntax like $this->foo::raw. As a spectrum from least access to most access:

1) $field - accessible only in the lexical scope of the hook
2) $this->foo - accessible in the dynamic scope of the hook, e.g. a hook
calling $this->doSomething(PROPERTY);
3) $this->foo::raw - accessible anywhere in the class, e.g. a public
clearAll() method by-passing hooks

Whichever we provide for backed properties, option 3 is available for
virtual properties anyway, and common with __get/__set: store a value in
a private property, and have a public hooked property providing access
to it.

I understand now that option 2 fits most easily with the implementation,
and with decisions around inheritance and upgrade of existing code; but
the other options do have their advantages from a user's point of view.

I personally do not feel strongly about whether asymmetric types make
it into the initial implementation. Larry does, however, and I think
it is not fair to exclude them without providing any concrete reasons
not to. I will spend time in the following days cleaning up tests, and
I will do my best to break asymmetric types. If I (or anybody
else) can't find a way to do so, I don't see a reason to remove them.

My concern is more about the external impact of what is effectively a
change to the type system of the language: will IDEs give correct
feedback to users about which assignments are legal? will tools like
PhpStan and Psalm require complex changes to analyse code using such
properties? will we be prevented from adding some optimisation to
OpCache because these properties break some otherwise safe assumption?

Maybe I'm being over-cautious, but those are the kinds of questions I
would expect to come up if this feature had its own RFC.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Ilija Tovilo — view source

Hi Rowan

On Sun, Mar 17, 2024 at 3:41 PM Rowan Tommins [IMSoP]
imsop.php@rwec.co.uk wrote:

The remaining difference I can see in the current RFC which seems to be
unnecessary is that combining &get with set is only allowed on virtual
properties. Although it may be "virtual" in the strict sense, any &get
hook must actually be referring to some value stored somewhere - that
might be a backed property, another field on the current class, a
property of some other object, etc:

public int $foo { &get => $this->foo; set { $this->foo = $value; } }

public int $bar { &get => $this->_bar; set { $this->_bar = $value; } }

public int $baz { &get => $this->delegatedObj->baz; set { $this->delegatedObj->baz = $value; } }

This sentence from the RFC applies equally to all three of these examples:

That is because any attempted modification of the value by reference
would bypass a |set| hook, if one is defined.

I suggest that we either trust the user to understand that that will
happen, and allow combining &get and set on any property; or we do not
trust them, and forbid it on any property.

I'm indeed afraid that people will blindly make their array properties
by-reference, without understanding the implications. Allowing
by-reference behavior for virtual read/write properties is a tradeoff,
for cases where it may be necessary. Exposing private properties
by-reference is already possible outside of hooks
(https://3v4l.org/VNhf7), that's not something we can prevent for
secondary backing properties. However, we can at least make sure that
a reference to the backing value of a hooked property doesn't escape.

I realize this is somewhat inconsistent, but I believe it is
reasonable. If you want to expose the underlying property
by-reference, you need to jump through some additional hoops.

Apart from the things already mentioned, it's unclear to me whether,
with such set; declarations, a get-only backed property should
even be legal. With the complete absence of a write operation, the
assignment within the set itself would fail. To make this work, the
absence of set; would need to mean something like "writable, but
only within another hook", which introduces yet another form of
asymmetric visibility.

Any write inside the get hook already by-passes the set hook and refers
to the underlying property, so there would be no need for any default
set behaviour other than throwing an error.

It's not likely to be a common scenario, but the below works with the
current implementation https://3v4l.org/t7qhR/rfc#vrfc.property-hooks

class Example {
    public int $nextNumber {
        get {
            $this->nextNumber ??= 0;
            return $this->nextNumber++;
        }
        // Mimic the current behaviour of a virtual property:
        // https://3v4l.org/cAfAI/rfc#vrfc.property-hooks
        set => throw new Error('Property Example::$nextNumber is read-only');
    }
}

Again, it depends on how you think about it. As you have argued, for a
get-only property, the backing value should not be writable without an
explicit set; declaration. You can interpret set; as an
auto-generated hook, or as a marker that indicates that the backing
value is accessible without a hook. As mentioned in my previous
e-mail, auto-generated hooks is something we'd really like to avoid.
So, if the absence of set; means that the backing value is not
writable, the hook itself must be exempt from this rule.

Another thing to consider: The current implementation inherits the
backing value and all hooks from its parent. If the suggestion is to
add an explicit set; declaration to make it more obvious that the
property is writable, how does this help overridden properties?

class P {
    public $prop {
        get => strtolower($this->prop);
        set;
    }
}

class C extends P {
    public $prop {
        get => strtoupper(parent::$prop::get());
    }
}

Even though P::$prop signals that it is writable, there is no such
indication in C::$prop. You may suggest to also add set; to the
child, but then what if the parent adds a custom implementation for
set;?

class P {
    public $prop {
        get => strtolower($this->prop);
        set {
            echo $value, "\n";
            $this->prop = $value;
        }
    }
}

class C extends P {
    public $prop {
        get => strtoupper(parent::$prop::get());
        set;
    }
}

The meaning for set; is no longer clear. Does it mean that there's a
generated hook that accesses the backing field? Does it mean that the
backing field is accessible without a hook? Or does it mean that it
accesses the parent hook? The truth is, with inheritance there's no
way to look at the property declaration and fully understand what's
going on, unless all hooks must be spelled out for the sake of clarity
(e.g. get => parent::$prop::get()).

We are already allowing more than Kotlin by letting hooks call out to a
method, and have that method refer back to the raw value.
Hypothetically, we could allow any method to access it, using some
syntax like $this->foo::raw. As a spectrum from least access to most access:

$field - accessible only in the lexical scope of the hook

$this->foo - accessible in the dynamic scope of the hook, e.g. a hook
calling $this->doSomething(PROPERTY);

$this->foo::raw - accessible anywhere in the class, e.g. a public
clearAll() method by-passing hooks

Whichever we provide for backed properties, option 3 is available for
virtual properties anyway, and common with __get/__set: store a value in
a private property, and have a public hooked property providing access
to it.

I seriously doubt accessing the backing value outside of the current
hook is useful. The backing value is an implementation detail. If it
is absolutely needed, ReflectionProperty::setRawValue() offers a way
to do it. I understand the desire for a shorter alternative like
$field, but it doesn't seem like the majority shares this desire at
this point in time.
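
As a rough sketch of what the reflection route could look like for a clearAll()-style method: this assumes ReflectionProperty::setRawValue() behaves as it does in PHP 8.4, and the class Lazy, its compute() helper and the $computations counter are all hypothetical names invented for the example.

```php
<?php
// Hypothetical class, for illustration only.
class Lazy {
    // Counts how often the expensive computation actually ran.
    public int $computations = 0;

    public ?string $expensive = null {
        // Lazily fill the backing value on first read.
        get => $this->expensive ??= $this->compute();
        set {
            // Outside callers may not reset the cache by assigning null.
            if ($value === null) {
                throw new InvalidArgumentException('Cannot assign null');
            }
            $this->expensive = $value;
        }
    }

    private function compute(): string {
        $this->computations++;
        return 'computed';
    }

    public function clearAll(): void {
        // Write to the backing value directly, by-passing the set hook.
        $property = new ReflectionProperty($this, 'expensive');
        $property->setRawValue($this, null);
    }
}

$lazy = new Lazy();
$a = $lazy->expensive;  // triggers compute()
$b = $lazy->expensive;  // served from the backing value
$lazy->clearAll();      // resets the cache without going through set
$c = $lazy->expensive;  // triggers compute() again
```

It is noticeably wordier than a dedicated syntax would be, which is the trade-off under discussion.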

I understand now that option 2 fits most easily with the implementation,
and with decisions around inheritance and upgrade of existing code; but
the other options do have their advantages from a user's point of view.

A different syntax like $this->prop::raw comes with complexity issues
similar to those previously discussed for
parent::$prop/parent::$prop = 'prop'. All operations that
currently work out-of-the-box for the currently proposed syntax (read,
assign, assign by ref, assign with operation (+=, -=, etc.), inc/dec,
isset, send by-ref) would need new implementations. We could limit the
new syntax to read/assign/assign by ref, but that means more typing
for all other cases (e.g. $this->prop::raw = $this->prop::raw + 1
vs. $this->prop++). So, is that really better?

My concern is more about the external impact of what is effectively a
change to the type system of the language: will IDEs give correct
feedback to users about which assignments are legal? will tools like
PhpStan and Psalm require complex changes to analyse code using such
properties? will we be prevented from adding some optimisation to
OpCache because these properties break some otherwise safe assumption?

I don't consider Opcache a problem. An assignment on a non-final
property with mismatched types is no longer guaranteed to fail.
However, Opcache doesn't usually optimize error cases, e.g. replacing
them with direct exceptions. I can't speak for IDEs or static
analyzers, but I'm not sure what makes this case special. We can ask
some of their maintainers for feedback.

Maybe I'm being over-cautious, but those are the kinds of questions I
would expect to come up if this feature had its own RFC.

Of course, no worries. I'd rather hear it now than when the voting has
started. ;)

Ilija

1 year ago by Rowan Tommins [IMSoP] — view source

I realize this is somewhat inconsistent, but I believe it is
reasonable. If you want to expose the underlying property
by-reference, you need to jump through some additional hoops.

I disagree with this reasoning, because I foresee plenty of cases where
a virtual property is necessary anyway, so doesn't provide any
additional hoop to jump through.

But there's not much more to say on this point, so I guess we'll leave
it there.

Again, it depends on how you think about it. As you have argued, for a
get-only property, the backing value should not be writable without an
explicit set; declaration. You can interpret set; as an
auto-generated hook, or as a marker that indicates that the backing
value is accessible without a hook.

Regardless of which of these views you start with, it still seems
intuitive to me that accesses inside the get hook would bypass the
normal rules and write to the raw value.

Leaving aside the implementation, there are three things that can happen
when you write to a property:

a) the set hook is called
b) the raw property is written to
c) an error is thrown

Inside the dynamic scope of a hook, the behaviour is always (b), and I
don't see any reason for that to change. From anywhere else, backed
properties currently try (a) and fall back to (b); virtual properties
try (a) and fall back to (c).
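
As a rough illustration of the (b) and (c) fallbacks, here is a minimal sketch. It assumes the semantics described in the RFC (as later shipped in PHP 8.4); the class and property names are hypothetical and are not taken from the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class Example
{
    // Backed: the get hook reads $this->label, so storage is reserved.
    public string $label = 'draft' {
        get => strtoupper($this->label);
    }

    // Virtual: no hook mentions $this->summary, so there is no storage.
    public string $summary {
        get => 'Label is ' . $this->label;
    }
}

$e = new Example();

// Backed property without a set hook: falls back to (b), a raw write.
$e->label = 'final';
$labelAfterWrite = $e->label;

// Virtual property without a set hook: falls back to (c), an Error.
$virtualWriteFailed = false;
try {
    $e->summary = 'anything';
} catch (Error $error) {
    $virtualWriteFailed = true;
}

var_dump($labelAfterWrite, $virtualWriteFailed);
```

The only difference between the two declarations is whether a hook body mentions the property's own name, which is the compile-time detection discussed elsewhere in this thread.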

I do understand that falling back to (b) makes the implementation
simpler, and works well with inheritance and some use cases; but falling
back to (c) wouldn't necessarily need a "default hook", just a marker of
"has hooks".

It occurred to me you could implement it in reverse: auto-generate a
hook "set => throw new Error;" and then remove it if the user opts in
to the default set behaviour. That would keep the "write directly" case
optimised "for free"; but it would be awkward for inheritance, as you'd
have to somehow avoid calling the parent's hook.

The meaning for set; is no longer clear. Does it mean that there's a
generated hook that accesses the backing field? Does it mean that the
backing field is accessible without a hook? Or does it mean that it
accesses the parent hook? The truth is, with inheritance there's no
way to look at the property declaration and fully understand what's
going on, unless all hooks must be spelled out for the sake of clarity
(e.g. get => parent::$prop::get()).

Yes, I think this is probably a good argument against requiring "set;"

I think "be careful when inheriting only one hook" will always be a key
rule to teach anyway, because it's easy to mess up (e.g. assuming the
parent is backed and accessing $this->foo, rather than calling the
parent's hook implementation). But adding "set;" into the mix probably
just makes it worse.

I seriously doubt accessing the backing value outside of the current
hook is useful. The backing value is an implementation detail. If it
is absolutely needed, ReflectionProperty::setRawValue() offers a way
to do it. I understand the desire for a shorter alternative like
$field, but it doesn't seem like the majority shares this desire at
this point in time.

The example of clearAll() is a real use case, which people will
currently achieve with __get and __set (e.g. the Yii ActiveRecord
implementation I linked in one of my previous messages).

The alternative wouldn't be reflection, it would just be switching to a
virtual property with the value stored in a private field. I think
that's fine, it's just drawing the line of which use cases backed
properties cover: Kotlin covers more use cases than C#; PHP will cover
more than Kotlin (methods able to bypass a hook when called from that
hook); but it will draw the line here.

A different syntax like $this->prop::raw comes with complexity
issues similar to those previously discussed for
parent::$prop/parent::$prop = 'prop'.

Yeah, I can't even think of a nice syntax for it, let alone a nice
implementation. Let's leave it as a thought experiment, no further
action needed. :)

Regarding asymmetric types:

I can't speak for IDEs or static
analyzers, but I'm not sure what makes this case special. We can ask
some of their maintainers for feedback.

In order to reliably tell the user whether "$a->foo = $b->bar;" is a
type-safe operation, the analyser will need to track two types for every
property, the "gettable type" and the "settable type", and apply them in
the correct contexts.

I've honestly no idea whether that will be easy or hard; it will
probably vary between tools. In particular, I get the impression IDEs /
editor plugins sometimes have a base implementation used for multiple
programming languages, and PHP might be the only one that needed this
extra tracking.

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Derick Rethans

Hi folks. Based on earlier discussions, we've made a number of
changes to the RFC that should address some of the concerns people
raised. We also had some very fruitful discussions off-list with
several developers from the Foundation, which led to what we feel are
some solid improvements.

https://wiki.php.net/rfc/property-hooks

Some comments and questions:

Be aware, the detection logic works on $this->[propertyName] directly at 
compile time, not on dynamic forms of it like $prop = 'beep'; 
$this->$prop. That will not trigger a backing value.

How can that not cause issues?

 The set hook's return type is unspecified, and will silently be 
treated as void.

What happens if you do specify a return type? Will it Error?

Implicit ''set'' parameter

If the write-type of a property is the same as its defined type 
(this is the common case), then the argument may be omitted 
entirely. 

…

If the parameter is not specified, it defaults to $value.

I am not a fan of this "magical" behaviour. Do we need this short cut,
and the following "Short-set"?

With asymmetric visibility that was previously proposed, the 
example can be further simplified.

But it isn't here, so why is this example (and the next one) in the RFC?
:-)

Interaction with constructor property promotion

… In particular, the shorthand version of hook bodies and the 
ability to call out to private methods if they get complicated 
partially obviate the concern about syntactic complexity.

Although that is true, it does add more complexity in tools that need
to parse PHP, as there is now another piece of new syntax that needs to
be added (and tested with).

ReflectionProperty has several new methods to work with hooks.

getHooks(): array returns an array of \ReflectionMethod objects  
keyed by the hook they are for.

What will the name for the &get hook be? And shouldn't there be an enum
case for that as well?

cheers,
Derick

1 year ago by Larry Garfield

Hi folks. Based on earlier discussions, we've made a number of
changes to the RFC that should address some of the concerns people
raised. We also had some very fruitful discussions off-list with
several developers from the Foundation, which led to what we feel are
some solid improvements.

https://wiki.php.net/rfc/property-hooks

Some comments and questions:

Be aware, the detection logic works on $this->[propertyName] directly at
compile time, not on dynamic forms of it like $prop = 'beep';
$this->$prop. That will not trigger a backing value.

How can that not cause issues?

For most uses that should be fine. But it's more that trying to do anything else would be vastly more complicated and error prone for the implementation. And most people hated the alternate $field variable from Kotlin.

The set hook's return type is unspecified, and will silently be
treated as void.

What happens if you do specify a return type? Will it Error?

Yes, it's a parse error.

Implicit ''set'' parameter

If the write-type of a property is the same as its defined type
(this is the common case), then the argument may be omitted
entirely.

…

If the parameter is not specified, it defaults to $value.

I am not a fan of this "magical" behaviour. Do we need this short cut,
and the following "Short-set"?

We believe we do. One of the goals of the RFC is to reduce and avoid verbose boilerplate. The shorthands should be the common case in practice. The only reason one would have to specify a set parameter is if there was a good reason to change the variable name or widen the set type. Both of those are atypical situations.
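
To make the comparison concrete, here is a small sketch of the three spellings side by side. It assumes the syntax described in the RFC; the User class and its properties are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class User
{
    // Full form: the parameter is spelled out with its type and name.
    public string $first {
        set(string $value) {
            $this->first = trim($value);
        }
    }

    // Implicit parameter: $value is available without being declared.
    public string $middle {
        set {
            $this->middle = trim($value);
        }
    }

    // Short-set: the expression result is written to the backing value.
    public string $last {
        set => trim($value);
    }
}

$u = new User();
$u->first = '  Ada ';
$u->middle = ' King ';
$u->last = ' Lovelace  ';

var_dump([$u->first, $u->middle, $u->last]);
```

All three hooks are meant to behave the same way here; only the amount of ceremony differs.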

With asymmetric visibility that was previously proposed, the
example can be further simplified.

But it isn't here, so why is this example (and the next one) in the RFC?
:-)

Many/most languages with accessors also have asymmetric visibility. Nikita's original RFC also combined them all into one. The two RFCs are not mutually dependent, but are mutually-supportive, by design. So this comes under the "why didn't we include feature X" heading, with an answer "that's in a separate RFC."

Interaction with constructor property promotion

… In particular, the shorthand version of hook bodies and the
ability to call out to private methods if they get complicated
partially obviate the concern about syntactic complexity.

Although that is true, it does add more complexity in tools that need
to parse PHP, as there is now another piece of new syntax that needs to
be added (and tested with).

That's true, but it's also inherently true of hooks themselves. And any other syntax improvement to the language.

ReflectionProperty has several new methods to work with hooks.

getHooks(): array returns an array of \ReflectionMethod objects
keyed by the hook they are for.

What will the name for the &get hook be? And shouldn't there be an enum
case for that as well?

&get isn't its own hook. It's still a "get" hook, it just returns by reference. So it will be available with the "Get" enum value. If the return-by-ref status needs to be known for whatever reason, that is already available on the ReflectionMethod object that is returned.
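
A minimal sketch of that lookup, assuming the reflection API as described in the RFC (getHooks(), getHook() and the PropertyHookType enum); the Counter class is hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class Counter
{
    private array $data = [];

    // A by-reference get hook on a virtual property.
    public array $items {
        &get => $this->data;
    }
}

$property = new ReflectionProperty(Counter::class, 'items');

// Hooks are keyed by hook name; &get is still listed under "get".
$hookNames = array_keys($property->getHooks());

// The by-reference status lives on the returned ReflectionMethod.
$getHook = $property->getHook(PropertyHookType::Get);
$returnsByReference = $getHook?->returnsReference();

var_dump($hookNames, $returnsByReference);
```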

--Larry Garfield

1 year ago by Robert Landers

Hello,

I'm a bit confused on inheritance. In the following example of a
proxy, do I need to be aware of a parent's hook and handle it
specially?

class Loud
{
    public string $name {
        get {
            return strtoupper($this->name);
        }
    }
}

class LoudProxy extends Loud
{
    public string $name {
        get {
            // detected the parent has a hook? //
            $return = parent::$name::get();
            // do something with return //
            return $return;
        }
    }
}

What happens if the Loud class later removes its hook implementation
(e.g. moves it to the set hook)? Will my proxy now cause an error?

Would simply calling $this->name call the parent's hook?

Robert Landers
Software Engineer
Utrecht NL

1 year ago by Larry Garfield

Hello,

I'm a bit confused on inheritance. In the following example of a
proxy, do I need to be aware of a parent's hook and handle it
specially?

class Loud
{
    public string $name {
        get {
            return strtoupper($this->name);
        }
    }
}

class LoudProxy extends Loud
{
    public string $name {
        get {
            // detected the parent has a hook? //
            $return = parent::$name::get();
            // do something with return //
            return $return;
        }
    }
}

What happens if the Loud class later removes its hook implementation
(e.g. moves it to the set hook)? Will my proxy now cause an error?

Would simply calling $this->name call the parent's hook?

Per the RFC:

"If there is no hook on the parent property, its default get/set behavior will be used. "

so parent::$name::get() will "read the parent property", which will go through a hook if one is defined, and just read the raw value if not. So there is no detection logic needed, and the parent can add/remove a hook without affecting the child.

Calling $this->name in LoudProxy's get hook will access the backing property on LoudProxy itself, ignoring the parent entirely.
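
A runnable sketch of the proxy case, assuming the parent-access semantics described in the RFC. The constructor and the child's set hook are additions made here so the example can initialize the value; they are not part of the original question.

```php
<?php
declare(strict_types=1);

class Loud
{
    public string $name {
        get => strtoupper($this->name);
    }

    // Added for this sketch so the property can be initialized.
    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

class LoudProxy extends Loud
{
    public string $name {
        // Goes through Loud's get hook if it has one, else a plain read.
        get => '[' . parent::$name::get() . ']';
        // Loud has no set hook, so this uses its default write behavior.
        set {
            parent::$name::set($value);
        }
    }
}

$proxy = new LoudProxy('rowan');
$result = $proxy->name;

var_dump($result);
```

If Loud later dropped its get hook, the child code should not need to change; parent::$name::get() would simply read the stored value unmodified.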

--Larry Garfield

1 year ago by Robert Landers

Hello,

I'm a bit confused on inheritance. In the following example of a
proxy, do I need to be aware of a parent's hook and handle it
specially?

class Loud
{
    public string $name {
        get {
            return strtoupper($this->name);
        }
    }
}

class LoudProxy extends Loud
{
    public string $name {
        get {
            // detected the parent has a hook? //
            $return = parent::$name::get();
            // do something with return //
            return $return;
        }
    }
}

What happens if the Loud class later removes its hook implementation
(e.g. moves it to the set hook)? Will my proxy now cause an error?

Would simply calling $this->name call the parent's hook?

Per the RFC:

"If there is no hook on the parent property, its default get/set behavior will be used. "

so parent::$name::get() will "read the parent property", which will go through a hook if one is defined, and just read the raw value if not. So there is no detection logic needed, and the parent can add/remove a hook without affecting the child.

Calling $this->name in LoudProxy's get hook will access the backing property on LoudProxy itself, ignoring the parent entirely.

--Larry Garfield

Awesome! Thanks! I just wasn't sure what "default" meant in this context.

Robert Landers
Software Engineer
Utrecht NL

1 year ago by Lynn

On Wed, Feb 21, 2024 at 7:58 PM Larry Garfield larry@garfieldtech.com
wrote:

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more
polished property access hooks/interface properties RFC. It’s 99%
unchanged from last summer; the PR is now essentially complete and more
robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

It’s long, but that’s because we’re handling every edge case we could
think of. Properties involve dealing with both references and inheritance,
both of which have complex implications. We believe we’ve identified the
most logical handling for all cases, though.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask:
Originally, this RFC was called “property accessors,” which is the
terminology used by most languages. During early development, when we had
4 accessors like Swift, we changed the name to “hooks” to better indicate
that one was “hooking into” the property lifecycle. However, later
refinement brought it back down to 2 operations, get and set. That makes
the “hooks” name less applicable, and inconsistent with what other
languages call it.

However, changing it back at this point would be a non-small amount of
grunt work. There would be no functional changes from doing so, but it’s
lots of renaming things both in the PR and the RFC. We are willing to do so
if the consensus is that it would be beneficial, but want to ask before
putting in the effort.

--
Larry Garfield
larry@garfieldtech.com

In regards to arrays, what about additional operations next to get/set? I
doubt this solution will cover all the use-cases or perhaps even
over-complicate things, just throwing the idea out there.

class Test {
    private array $_myData = [];
    public array $myData {
        get => $this->_myData;
        append => $this->_myData[] = $value;
    }
}

Thinking about the other post about offset and containers
(https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md):

class Test {
    private array $_myData = [];
    public array $myData {
        get => $this->_myData;
        append => $this->_myData[] = $value;
        write_dimension => $this->_myData[$offset] = $value;
    }
}

Is this issue restricted to arrays only? From my understanding objects
functioning as arrays are already by reference and thus should not suffer
from this?

class Test {
    private ArrayObject $_myData;
    public ArrayObject $myData {
        get => $this->_myData;
    }

    public function __construct() {
        $this->_myData = new ArrayObject();
    }
}

// would this work without issues?
$obj = new Test();
$obj->myData[] = 'test';

1 year ago by Larry Garfield

In regards to arrays, what about additional operations next to get/set?
I doubt this solution will cover all the use-cases or perhaps even
over-complicate things, just throwing the idea out there.
class Test {
    private array $_myData = [];
    public array $myData {
        get => $this->_myData;
        append => $this->_myData[] = $value;
    }
}
Thinking about the other post about offset and containers
(https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md):
class Test {
    private array $_myData = [];
    public array $myData {
        get => $this->_myData;
        append => $this->_myData[] = $value;
        write_dimension => $this->_myData[$offset] = $value;
    }
}

Those hooks may be possible; we'd have to try it and see. However, they also wouldn't be able to fully emulate arrays. $foo->bar['baz'][] = 'beep' could get very weird. Those wouldn't cover unsetting. Array functions like array_splice() still wouldn't work. Basically, there will never be a way to make arrays 100% transparent with hooks. That's why the RFC recommends still using a method for array modification, as that's already a very well-understood and flexible approach.

Fortunately, as currently written (append and write_dimension are forbidden), those additional hooks could be considered and added in their own RFC in the future with no BC break. Whether or not they make sense or cover "enough" use cases is a question that could be answered in the future, after Gina's RFC passes.

So on this one, I think "punt" is the best option for now. It can be safely revisited in the future.

Is this issue restricted to arrays only? From my understanding objects
functioning as arrays are already by reference and thus should not
suffer from this?
class Test {
    private ArrayObject $_myData;
    public ArrayObject $myData {
        get => $this->_myData;
    }

    public function __construct() {
        $this->_myData = new ArrayObject();
    }
}

// would this work without issues?
$obj = new Test();
$obj->myData[] = 'test';

Mostly correct. Objects pass by handle, not by reference. (You can pass an object by reference instead, and it behaves subtly differently.) But the net effect is the same. Your sample code there would run fine.
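
To contrast the two cases, here is a small sketch assuming the hook semantics in the RFC. The Registry class is hypothetical. The array append is wrapped in a try/catch because the only claim made here is that the stored array is left untouched; the exact failure mode is not asserted.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class Registry
{
    private array $items = [];
    private ArrayObject $bag;

    // Returns the array by value, so callers only ever see a copy.
    public array $list {
        get => $this->items;
    }

    // Returns the object handle, so callers reach the same ArrayObject.
    public ArrayObject $objects {
        get => $this->bag;
    }

    public function __construct()
    {
        $this->bag = new ArrayObject();
    }
}

$r = new Registry();

// The append lands on the shared ArrayObject behind the handle.
$r->objects[] = 'test';
$objectCount = count($r->objects);

// The append cannot reach the private array through a by-value get hook.
try {
    $r->list[] = 'test';
} catch (Error $error) {
    // Rejected as an indirect modification.
}
$arrayCount = count($r->list);

var_dump($objectCount, $arrayCount);
```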

--Larry Garfield

1 year ago by Jakob Givoni

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more
polished property access hooks/interface properties RFC. It’s 99%
unchanged from last summer; the PR is now essentially complete and more
robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

It’s long, but that’s because we’re handling every edge case we could
think of. Properties involve dealing with both references and inheritance,
both of which have complex implications. We believe we’ve identified the
most logical handling for all cases, though.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask:
Originally, this RFC was called “property accessors,” which is the
terminology used by most languages. During early development, when we had
4 accessors like Swift, we changed the name to “hooks” to better indicate
that one was “hooking into” the property lifecycle. However, later
refinement brought it back down to 2 operations, get and set. That makes
the “hooks” name less applicable, and inconsistent with what other
languages call it.

However, changing it back at this point would be a non-small amount of
grunt work. There would be no functional changes from doing so, but it’s
lots of renaming things both in the PR and the RFC. We are willing to do so
if the consensus is that it would be beneficial, but want to ask before
putting in the effort.

--
Larry Garfield
larry@garfieldtech.com

Hi, thanks for the RFC and the effort put into trying to make it palatable
to skeptical minds!

After reading most of the discussion in this thread I believe that the RFC
in its current form can work and that I will get used to its
"peculiarities", but an idea occurred to me that may have some advantages,
so here goes:

Use the "set" keyword that you've already introduced to set the raw value
of a "backed" property:

public string $name {
    set {
        set strtoupper($value);
    }
}

Or when used in short form:

public string $name {
    set => set strtoupper($value);
}

Advantages in no particular order:

Shorter than $this->name
No magic $field
Short and long form works the same

Disadvantage: "Set" can only be used to set the raw value inside the hook
method itself. Or maybe that's a good thing too. To be honest, I don't love
that $this->name sometimes goes through the hook and sometimes not. I'd
prefer if the raw value could only be accessed inside the hooks or via a
special syntax like f.ex. $this->name:raw

If there are any use cases or technical details that I've missed that would
make this syntax unfavourable, I apologize.

Another observation (I apologize for being late to the game but it was a
long RFC and thread to read through):

What would happen if we stopped talking about virtual vs. backed
properties? Couldn't we just treat a property that was never set the same
as any other uninitialized property?
What I mean is, that if you try to access the raw value of a property with
a set hook that never sets its own raw value, you'd get either null or Typed
property [...] must not be accessed before initialization, just like you'd
expect if you're already used to modern php. Of course you'd just write
your code correctly so that that never happens. It's already the case that
uninitialized properties are omitted when serializing the object so there
would be no difference there either.

The advantage here would be that there's no need to detect the virtual or
backed nature of the property at compile time and the RFC would be a lot
shorter.

Thank you for your consideration!

Best,
Jakob

1 year ago by Larry Garfield

Hi, thanks for the RFC and the effort put into trying to make it
palatable to skeptical minds!

After reading most of the discussion in this thread I believe that the
RFC in its current form can work and that I will get used to its
"peculiarities", but an idea occurred to me that may have some
advantages, so here goes:

Use the "set" keyword that you've already introduced to set the raw
value of a "backed" property:

public string $name {
    set {
        set strtoupper($value);
    }
}

Or when used in short form:

public string $name {
    set => set strtoupper($value);
}

Advantages in no particular order:

Shorter than $this->name

No magic $field

Short and long form works the same

Disadvantage: "Set" can only be used to set the raw value inside the
hook method itself. Or maybe that's a good thing too. To be honest, I
don't love that $this->name sometimes goes through the hook and
sometimes not. I'd prefer if the raw value could only be accessed
inside the hooks or via a special syntax like f.ex. $this->name:raw

If there are any use cases or technical details that I've missed that
would make this syntax unfavourable, I apologize.

Interesting idea. Not being able to write the raw value except in the set hook isn't a bug, but an important feature, so that's not a downside. (Modulo reflection, which is a reasonable back-door.)

However, there are a few other disadvantages that probably make it not worth it.

1. set is not actually a keyword at the moment. It's contextually parsed in the lexer, so it doesn't preclude using set as a constant or function name the way a full keyword does. (PHP has many of these context-only keywords.) Making it a keyword inside the body of the hook would do that, however.
2. Like $field, it would be a syntax you just "have to know". Most people seem to hate that idea, right or wrong.
3. Like the considered syntaxes for parent-access, it wouldn't be possible to do anything but a direct write. So set => set++ wouldn't be possible, whereas with $this->prop all existing operations should "just work."
4. Would we then also want a get keyword in the get hook to be parallel? What does that even do there? It would have the same implications as point 3 in get, so we're back to $field by a different spelling.

So it's an interesting concept, but the knock-on effects would lead to a lot more complications.

Another observation (I apologize for being late to the game but it was
a long RFC and thread to read through):

What would happen if we stopped talking about virtual vs. backed
properties? Couldn't we just treat a property that was never set the
same as any other uninitialized property?
What I mean is, that if you try to access the raw value of a property
with a set hook that never sets its own raw value, you'd get either
null or Typed property [...] must not be accessed before
initialization, just like you'd expect if you're already used to modern
php. Of course you'd just write your code correctly so that that never
happens. It's already the case that uninitialized properties are
omitted when serializing the object so there would be no difference
there either.

The advantage here would be that there's no need to detect the virtual
or backed nature of the property at compile time and the RFC would be a
lot shorter.

Unfortunately the backed-vs-virtual distinction is quite important at an implementation level for a few reasons.

1. A backed property reserves memory space for that property. A virtual property does not. Making virtual properties "unused backed" properties would increase memory usage for values that would never be usable.
2. There would be no realistic way to differentiate between a get-only virtual property with no storage, and a backed property that just happens to have a get hook but no set hook. Meaning you would be able to write to an otherwise-inaccessible backing value of the property.
3. That would then appear in serialization, even though it's impossible to get to from code without using reflection. Which is just all kinds of confusing.

So for practical reasons, the distinction isn't just a user-facing difference but an important engine-level distinction we cannot avoid.

Cheers.

--Larry Garfield

1 year ago by Larry Garfield

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more
polished property access hooks/interface properties RFC. It’s 99%
unchanged from last summer; the PR is now essentially complete and more
robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

https://wiki.php.net/rfc/property-hooks

Courtesy heads up: Barring any major changes, we are planning to call the vote in a little over a week, on the 8th/9th of April. (Ilija still has some tests he wants to finalize, and I will be out of town next weekend.)

--Larry Garfield

1 year ago by Rowan Tommins [IMSoP]

[Including my full previous reply, since the list and gmail currently
aren't being friends. Apologies that this leads to rather a lot of
reading in one go...]

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC.

Hello, and a huge thanks to both you and Ilija for the continued work
on this. I'd really like to see this feature make it into PHP, and
agree with a lot of the RFC.

My main concern is the proliferation of things that look the same but
act differently, and things that look different but act the same:

var $a;
public $b;
public mixed $c;
public mixed $d = null;
public mixed $e => null;
public mixed $f { get => null; }
public mixed $g { get { return null; } }
public mixed $h { get => $field; }
public mixed $i { get => $this->i; }
public mixed $j { get => $this->_j; }
public mixed $k { &get => $this->k; }
public mixed $l { &get => $this->_l; }

As currently proposed:

a and b are both what we might call "traditional" properties, and
equivalent to each other; a uses legacy syntax which we haven't
removed for some reason

c allows the same values as b, but is a "typed property", which
changes its behaviour in various ways

e looks like d, but is actually equivalent to f and g

h and i are both "properties with hooks", but j is a "virtual
property", which brings additional changes in behaviour

l allows callers to assign by reference, but k is not allowed

To make this all less confusing, I suggest the following changes:

Remove the short-hand syntax in example e, so this sentence from the
RFC is always true: "For a property to use a hook, it must replace its
trailing ; with a code block denoted by { }."

Allow any property to define an "&get" hook in place of a "get" hook
(i.e. allow example k). It is up to the user to decide whether this
will cause problems.

Limit as much as possible the difference in behaviour between
"virtual" and "hooked" properties.

And, probably most controversially:

Hooks should always be on top of the normal property defined,
unless explicitly indicated to be "virtual". Example j would thus be:

public virtual mixed $j { get => $this->_j; }

This is slightly more verbose, but removes all the complexity for both
the implementation and users in determining which properties are
"virtual". I believe the time saved in reading the more explicit code
would outweigh the time spent typing the extra keyword.

Regarding the implicit $value on set hooks, I am unconvinced by the
comparison to $this, which acts more like a keyword - it is reserved
outside of methods, read-only inside them, and cannot be renamed. I
think a closer analogy would be "foreach ( $foo as $key => $value )"
or "catch ( SomeException $e )": naming $value is always required;
$key and $e can be omitted, but doing so makes the values unavailable,
it does not give them default names.

Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

As noted above, I think the user should be able to opt into this
facility for both virtual and non-virtual hooked properties, at their
own risk, for example:

class Example {
    // non-virtual property, using a get hook for additional
    // behaviour, not to reroute the value
    public array $foo {
        &get { $this->foo = $this->lazyLoad('foo'); return $this->foo; }
    }
    // ...
}
$a = new Example;
// will call $a->lazyLoad('foo') to populate the initial value,
// then append an item to it
$a->foo[] = 42;

The more I think about it, the more convinced I am this RFC is trying to
cram too many features into too small a space.

For instance, the ability to specify a type on the set hook. Taking the
example from the RFC:

 public UnicodeString $name {
     set(string|UnicodeString $value) {
         $this->name = $value instanceof UnicodeString ? $value : new UnicodeString($value);
     }
 }

What is the type of $name? The answer is "it depends if you're writing
to or reading from it". The same use case can be covered by this:

 public UnicodeString $name;
 public string $name_string {
     get => (string)$this->name;
     set => $this->name = new UnicodeString($value);
 }

Now we have two properties with clear types, without the complexity of
the conditional (which would be even worse if we wanted more than two
types). We can even swap the "real" and "virtual" properties transparently:

 public UnicodeString $name {
     get => new UnicodeString($this->name_string);
     set => $this->name_string = (string)$value;
 }
 public string $name_string;

This exotic "asymmetric typing" is then being used to justify other
decisions - if you can specify the setter's type, it's confusing if you
specify a name without a type; so we need to make the name optional as
well... Compare to C#, where "value" is not a default, it's an
unchangeable keyword; or Kotlin, where naming it is mandatory but
it doesn't have to mention the type.

I think my concerns about distinguishing "virtual properties" may stem
from a similar cause.

In C#, all "properties" are virtual - as soon as you have any
non-default "get", "set" or "init" definition, it's up to you to declare
a separate "field" to store the value in. Swift's "computed properties"
are similar: if you have a custom getter or setter, there is no backing
store; to add behaviour to a "stored property", you use the separate
"property observer" hooks.

Kotlin's approach is philosophically the opposite: there are no fields,
only properties, but properties can access a hidden "backing field" via
the special keyword "field". Importantly, omitting the setter doesn't
make the property read-only, it implies set(value) { field = value }

The current RFC attempts to combine all of these ideas into one syntax,
on top of everything the language already has. The result has some
odd-shaped corners. For instance, this won't work:

public string $name { set => throw new Exception('Read-only property ' . __PROPERTY__); }

But this will:

public string $name { set => throw new Exception('Read-only property ' . __PROPERTY__ . '; current value is: ' . $this->name); }

The first declares a virtual property, with no default getter, like in
C# or Swift. The second instead acts like Kotlin, and has a default
getter referencing the implicit backing field.

It would be clearer to choose one style or the other: explicitly enable
the defaults...

public string $name { get; set => throw new Exception('Read-only property ' .
__PROPERTY__); } // default getter and backing field requested
public string $name { get => $this->name ??= $this->generateName(); }
// setter disabled because it's not mentioned, even though backing field
// is used

...or explicitly disable them:

public string $name { set => throw new Exception('Read-only property ' .
__PROPERTY__); } // implied default getter and backing field
public virtual string $name { get => $this->firstName . ' ' .
$this->lastName; } // setter disabled because property is declared
// virtual

I think there's some really great functionality in the RFC, and would
love for it to succeed in some form, but I think it would benefit from
removing some of the "magic".

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Larry Garfield — view source

[Including my full previous reply, since the list and gmail currently
aren't being friends. Apologies that this leads to rather a lot of
reading in one go...]

Eh, I'd prefer a few big emails that come in slowly to lots of little emails that come in fast. :-)

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC.

Hello, and a huge thanks to both you and Ilija for the continued work
on this. I'd really like to see this feature make it into PHP, and
agree with a lot of the RFC.

My main concern is the proliferation of things that look the same but
act differently, and things that look different but act the same:

snip

a and b are both what we might call "traditional" properties, and
equivalent to each other; a uses legacy syntax which we haven't
removed for some reason

I don't know why we haven't removed var either. I can't recall the last time I saw it in real code. But that's out of scope here.

snip

I think there's some really great functionality in the RFC, and would
love for it to succeed in some form, but I think it would benefit from
removing some of the "magic".

Regards,

--
Rowan Tommins
[IMSoP]

I'm going to try and respond to a couple of different points together here, including from later in the thread, as it's just easier.

== Re, design philosophy:

In C#, all "properties" are virtual - as soon as you have any
non-default "get", "set" or "init" definition, it's up to you to declare
a separate "field" to store the value in. Swift's "computed properties"
are similar: if you have a custom getter or setter, there is no backing
store; to add behaviour to a "stored property", you use the separate
"property observer" hooks.

Kotlin's approach is philosophically the opposite: there are no fields,
only properties, but properties can access a hidden "backing field" via
the special keyword "field". Importantly, omitting the setter doesn't
make the property read-only, it implies set(value) { field = value }

A little history here to help clarify how we ended up where we are: The original RFC as we designed it was modeled very closely on Swift, with 4 hooks. Using get/set at all would create a virtual property and you were on your own, while the beforeSet/afterSet hooks would not. We ran that design by some PHP Foundation sponsors a year ago (I don't actually know who, Roman did it for us), and the general feedback was "we like the idea, but woof this is complicated with all these hooks and having to make my own backing property for all these little things. Couldn't this be simplified?" We thought a bit more, and I off-handedly suggested to Ilija "I mean, would it be possible to just detect if a get/set hook is using a backing store and make it automatically? Then we could get rid of the before/after hooks." He gave it a quick try and found that was straightforward, so we pivoted to that simplified version. We then realized that we had... mostly just recreated Kotlin's design, so shrugged happily and went on with life.

As noted in an earlier email, C#, Kotlin, and Swift all have different stances on the variable name for the incoming value. We originally modeled on Swift so had that model (optional newVal name), and also because we liked how compact it was. When we switched to the simplified, incidentally Kotlin-esque approach, we just kept the optional variable as it works.

I think where that ended up is pretty nice, personally, even if it is not a direct map of any particular other language.

== Re asymmetric typing:

This is a capability already present today if using a setter method.

class Person {
    private $name;

    public function setName(UnicodeString|string $name)
    {
        $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }
}

And widening the parameter type in a child class is also entirely legal. As the goal of the RFC is, essentially, "make most common getter/setter patterns easy to add to a property without making an API-breaking method, so people don't have to add redundant just-in-case getters and setters all the time," covering an easy-to-cover use case seems like a good thing to do.
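
To make the "widening in a child class" point concrete, here is a minimal sketch using plain methods, with no hooks involved. The UnicodeString class below is a hypothetical stand-in for the one named in the thread, and LenientPerson is an invented name; the only thing the sketch relies on is that PHP method parameters are contravariant.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the UnicodeString class mentioned in the thread.
final class UnicodeString
{
    public function __construct(private string $value) {}

    public function __toString(): string
    {
        return $this->value;
    }
}

class Person
{
    protected UnicodeString $name;

    // The base class accepts only the "real" type.
    public function setName(UnicodeString $name): void
    {
        $this->name = $name;
    }

    public function getName(): UnicodeString
    {
        return $this->name;
    }
}

// Hypothetical subclass: parameter types are contravariant,
// so the child may accept a wider type than the parent.
class LenientPerson extends Person
{
    public function setName(UnicodeString|string $name): void
    {
        $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }
}

$person = new LenientPerson();
$person->setName('Larry');
echo $person->getName(), "\n";
```

The getter still returns a single, narrow type while the setter accepts more than one, which is the same asymmetry the set hook's optional type is meant to allow.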

It also ties into the question of the explicit/implicit name, for the reason you mentioned earlier (unspecified means mixed), not by intent. More on that in another section.

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a virtual keyword. Ilija also suggested a backed keyword, which forces a backed property to exist even if it's not used in the hook itself.

Adding virtual adds more work for the developer, but more clarity. It would also mean $this->$propName or $this->{__PROPERTY__} would work "as expected", since there's no auto-detection for virtual-ness. On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected. If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation, which could be unexpected. (Mostly this would be for a virtual-get that accidentally has a default setter because you forgot to mark it virtual.)

Doing autodetection as now, but with an added "make a backing value anyway" flag would resolve the use case of "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem. It would also allow for $this->$propName to work if a property is explicitly backed. On the flipside, it's one more thing to think about, and the above example it solves would be trivially solved by having the method just return the value to set and letting the set hook do the actual write, which is arguably better and more reliable code anyway.

The status quo (auto-detection based on the presence of $this->propName). This has the advantage it "just works" in the 95% case, without having to think about anything extra. The downside is it does have some odd edge cases, like needing $this->propName to be explicitly used.

I don't think any is an obvious winner. My personal preference would be for status quo (auto-detect) or explicit-virtual always. I could probably live with either, though I think I'd personally favor status quo. Thoughts from others?

== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there. That is, it makes the validation unreliable. A major footgun. The question is, do we favor caveat-emptor flexibility or correct-by-construction safety? Personally I always lean toward the latter, though PHP in general is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's easier to enable that loophole in the future if we really want it than to get rid of it if it turns out to have been a bad idea.

There is one edge case that might make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook. I worry that is "one more quirky edge case", though, so as above it may be better to skip for now as it's easier to add later than remove. But if the consensus is to do that, we're open to it. (Question for everyone.)
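
The footgun can be reproduced today without hooks, using the existing magic methods as an analogy. This is only a sketch: the Article class and its "tags" key are invented for illustration, and __get/__set stand in for the get/set hooks rather than showing how the RFC itself behaves.

```php
<?php
declare(strict_types=1);

// Hypothetical class; magic methods stand in for get/set hooks.
final class Article
{
    private array $data = ['tags' => ['php']];

    // Returning by reference hands the caller a writable alias to the storage.
    public function &__get(string $name): mixed
    {
        return $this->data[$name];
    }

    // Validation lives here, but only direct assignments reach it.
    public function __set(string $name, mixed $value): void
    {
        foreach ($value as $tag) {
            if (!is_string($tag)) {
                throw new InvalidArgumentException('Tags must be strings');
            }
        }
        $this->data[$name] = $value;
    }
}

$article = new Article();

$rejected = false;
try {
    $article->tags = ['php', 42]; // direct write: validated and rejected
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

$article->tags[] = 42; // indexed write goes through &__get; __set never runs

var_dump($rejected);
var_dump($article->tags);
```

The direct assignment is rejected, yet the same invalid value ends up stored through the reference. That is the reliability problem a by-reference get on a backed property would reintroduce.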

== Re arrays

Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

That's already discussed in the RFC:

The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook. This would have a large and obscure performance penalty, and a set implementation would have no way of knowing which values had changed, leading to, for instance, code needing to revalidate all elements even if only one has changed.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.
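
The cost of the "copy, modify, pass to set" approach can be sketched with ordinary methods. Everything here is an assumption for illustration: the Scores class, the public counter, and the getter/setter pair emulate what an engine-level copy-then-set would have to do; they are not the RFC's implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical class emulating "copy the array, modify it, pass it to set".
final class Scores
{
    private array $values = [];

    // Counts per-element checks, purely to make the cost visible.
    public int $validations = 0;

    public function getValues(): array
    {
        return $this->values;
    }

    public function setValues(array $values): void
    {
        // The setter cannot know which element changed, so it checks them all.
        foreach ($values as $value) {
            $this->validations++;
            if (!is_int($value)) {
                throw new InvalidArgumentException('Scores must be integers');
            }
        }
        $this->values = $values;
    }
}

$scores = new Scores();
$scores->setValues(range(1, 1000)); // initial write: 1000 checks

// Emulating "$scores->values[0] = 99" as copy, modify, set:
$copy = $scores->getValues();
$copy[0] = 99;
$scores->setValues($copy);          // another 1000 checks for one changed element

echo $scores->validations, "\n";
```

One changed element costs a full pass over the array, and the caller has no syntactic hint that the indexed write was that expensive.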

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

Drop the "top level" shorthand, for get-only hooks.
Keep the => shorthand for the get hook itself.
For a set hook, the {} form has no return value; set the value yourself however you want.
For a set hook, the => form implies a backed value and will set the property to whatever value that evaluates to.

So these are equivalent:

public $foo { set { $this->foo = $value; } }
public $foo { set => $value; }

These are equivalent:

public string $foo {
get {
return strtoupper($this->foo);
}
}
public string $foo { get => strtoupper($this->foo); }

And this goes away:

public string $foo => strtoupper($this->foo);

That covers the common cases with an arrow-function-like syntax that behaves as you'd expect (it returns things), and allows a longer version with arbitrarily complex logic if desired. It also means that each syntax variant does mean something importantly different.

Would that be an acceptable compromise? (Question for everyone.)

== Re the $value variable in set

Honestly, Rowan's earlier point here is the strongest argument in favor for me of the current RFC approach. Anywhere else in PHP, something that looks like a parameter and has no type, like ($param), means its type is mixed. It would be weird and confusing to be different here. That's above and beyond the issue of forcing people to retype something obvious every time. (I cite again, recent PHP's trend toward removing needless boilerplate, which is very good.) Requiring that the type be specified, for consistency, makes little sense if the type is not allowed to vary. You're just repeating a string from earlier on the same line, for no particular benefit.

I genuinely don't understand the pushback on $value. It's something you learn once and never have to think about again. It's consistent.

Ilija jokingly suggested making it always $value, unconditionally, and allowing only the type to be specified if widening:

public int $foo { set(int|float) => floor($value); }

Though I suspect that won't go over well, either. :-)

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo { set (int $value) => $value + 1; }
public int $foo { set => $value + 1; }

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

The alternative that gives the most future-flexibility is to do neither: The variable is called $value, period, you can't change it, and you can't change the type, either. There is no () after set, ever. Punt both of those to a later follow-up. I'd prefer to include both now, but including neither now is the next-safer option.

Regarding $field

Sigh, now y'all like it. :-P Most of the feedback on this has been negative, so I'm inclined to leave it out at this point, unless there's a major swing in feedback to bring it back. But the RFC seems more likely to pass without it than with right now.

--Larry Garfield

1 year ago by Stephen Reay — view source

[Including my full previous reply, since the list and gmail currently
aren't being friends. Apologies that this leads to rather a lot of
reading in one go...]

Eh, I'd prefer a few big emails that come in slowly to lots of little emails that come in fast. :-)

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC.

Hello, and a huge thanks to both you and Ilija for the continued work
on this. I'd really like to see this feature make it into PHP, and
agree with a lot of the RFC.

My main concern is the proliferation of things that look the same but
act differently, and things that look different but act the same:

snip

a and b are both what we might call "traditional" properties, and
equivalent to each other; a uses legacy syntax which we haven't
removed for some reason

I don't know why we haven't removed var either. I can't recall the last time I saw it in real code. But that's out of scope here.

snip

I think there's some really great functionality in the RFC, and would
love for it to succeed in some form, but I think it would benefit from
removing some of the "magic".

Regards,

--
Rowan Tommins
[IMSoP]

I'm going to try and respond to a couple of different points together here, including from later in the thread, as it's just easier.

== Re, design philosophy:

In C#, all "properties" are virtual - as soon as you have any
non-default "get", "set" or "init" definition, it's up to you to declare
a separate "field" to store the value in. Swift's "computed properties"
are similar: if you have a custom getter or setter, there is no backing
store; to add behaviour to a "stored property", you use the separate
"property observer" hooks.

Kotlin's approach is philosophically the opposite: there are no fields,
only properties, but properties can access a hidden "backing field" via
the special keyword "field". Importantly, omitting the setter doesn't
make the property read-only, it implies set(value) { field = value }

A little history here to help clarify how we ended up where we are: The original RFC as we designed it was modeled very closely on Swift, with 4 hooks. Using get/set at all would create a virtual property and you were on your own, while the beforeSet/afterSet hooks would not. We ran that design by some PHP Foundation sponsors a year ago (I don't actually know who, Roman did it for us), and the general feedback was "we like the idea, but woof this is complicated with all these hooks and having to make my own backing property for all these little things. Couldn't this be simplified?" We thought a bit more, and I off-handedly suggested to Ilija "I mean, would it be possible to just detect if a get/set hook is using a backing store and make it automatically? Then we could get rid of the before/after hooks." He gave it a quick try and found that was straightforward, so we pivoted to that simplified version. We then realized that we had... mostly just recreated Kotlin's design, so shrugged happily and went on with life.

As noted in an earlier email, C#, Kotlin, and Swift all have different stances on the variable name for the incoming value. We originally modeled on Swift so had that model (optional newVal name), and also because we liked how compact it was. When we switched to the simplified, incidentally Kotlin-esque approach, we just kept the optional variable as it works.

I think where that ended up is pretty nice, personally, even if it is not a direct map of any particular other language.

== Re asymmetric typing:

This is a capability already present today if using a setter method.

class Person {
    private $name;

    public function setName(UnicodeString|string $name)
    {
        $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }
}

And widening the parameter type in a child class is also entirely legal. As the goal of the RFC is, essentially, "make most common getter/setter patterns easy to add to a property without making an API-breaking method, so people don't have to add redundant just-in-case getters and setters all the time," covering an easy-to-cover use case seems like a good thing to do.

It also ties into the question of the explicit/implicit name, for the reason you mentioned earlier (unspecified means mixed), not by intent. More on that in another section.

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a virtual keyword. Ilija also suggested a backed keyword, which forces a backed property to exist even if it's not used in the hook itself.

Adding virtual adds more work for the developer, but more clarity. It would also mean $this->$propName or $this->{__PROPERTY__} would work "as expected", since there's no auto-detection for virtual-ness. On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected. If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation, which could be unexpected. (Mostly this would be for a virtual-get that accidentally has a default setter because you forgot to mark it virtual.)

Doing autodetection as now, but with an added "make a backing value anyway" flag would resolve the use case of "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem. It would also allow for $this->$propName to work if a property is explicitly backed. On the flipside, it's one more thing to think about, and the above example it solves would be trivially solved by having the method just return the value to set and letting the set hook do the actual write, which is arguably better and more reliable code anyway.

The status quo (auto-detection based on the presence of $this->propName). This has the advantage it "just works" in the 95% case, without having to think about anything extra. The downside is it does have some odd edge cases, like needing $this->propName to be explicitly used.

I don't think any is an obvious winner. My personal preference would be for status quo (auto-detect) or explicit-virtual always. I could probably live with either, though I think I'd personally favor status quo. Thoughts from others?

I agree that a flag to make the field virtual (thus disabling the backing store) makes more sense than a flag to make it backed; it's also easier to understand when comparing hooked properties with regular properties (essentially, backed is the default, you have to opt-in to it being virtual). I don't think the edge cases of "auto" make it worthwhile just to not need "virtual".

== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there. That is, it makes the validation unreliable. A major footgun. The question is, do we favor caveat-emptor flexibility or correct-by-construction safety? Personally I always lean toward the latter, though PHP in general is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's easier to enable that loophole in the future if we really want it than to get rid of it if it turns out to have been a bad idea.

There is one edge case that might make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook. I worry that is "one more quirky edge case", though, so as above it may be better to skip for now as it's easier to add later than remove. But if the consensus is to do that, we're open to it. (Question for everyone.)

I don't have strong feelings about this, but in general I usually tend to prefer options that are consistent, and give power/options to the developer. If references are opt-in anyway, I see that as accepting the trade-offs. If a developer doesn't want to allow by-ref modifications of the property, why would they make it referenceable in the first place? This sounds a bit like disallowing regular public properties because they might be modified outside the class - that's kind of the point, surely.

== Re arrays

Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

That's already discussed in the RFC:

The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook. This would have a large and obscure performance penalty, and a set implementation would have no way of knowing which values had changed, leading to, for instance, code needing to revalidate all elements even if only one has changed.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

Drop the "top level" shorthand, for get-only hooks.

Keep the => shorthand for the get hook itself.

For a set hook, the {} form has no return value; set the value yourself however you want.

For a set hook, the => form implies a backed value and will set the property to whatever value that evaluates to.

So these are equivalent:

public $foo { set { $this->foo = $value; } }
public $foo { set => $value; }

These are equivalent:

public string $foo {
get {
return strtoupper($this->foo);
}
}
public string $foo { get => strtoupper($this->foo); }

And this goes away:

public string $foo => strtoupper($this->foo);

That covers the common cases with an arrow-function-like syntax that behaves as you'd expect (it returns things), and allows a longer version with arbitrarily complex logic if desired. It also means that each syntax variant does mean something importantly different.

Would that be an acceptable compromise? (Question for everyone.)

I think the examples given are clear, and the lack of the top-level short closure-esque version makes it more obvious. Forgive me, I must have missed some of the previous comments - is there a reason the 'full' setter can't return a value, for the sake of consistency? I understand that you don't want "return to set" to be the only option, for the sake of e.g. change/audit logging type functionality (i.e. set and then some action to record that the change was made), but it seems a little odd and inconsistent to me that the return value of a short closure would be used when the return value of the long version isn't. This isn't really a major issue, I'm just curious if there was some explanation about it?

== Re the $value variable in set

Honestly, Rowan's earlier point here is the strongest argument in favor for me of the current RFC approach. Anywhere else in PHP, something that looks like a parameter and has no type, like ($param), means its type is mixed. It would be weird and confusing to be different here. That's above and beyond the issue of forcing people to retype something obvious every time. (I cite again, recent PHP's trend toward removing needless boilerplate, which is very good.) Requiring that the type be specified, for consistency, makes little sense if the type is not allowed to vary. You're just repeating a string from earlier on the same line, for no particular benefit.

I genuinely don't understand the pushback on $value. It's something you learn once and never have to think about again. It's consistent.

Ilija jokingly suggested making it always $value, unconditionally, and allowing only the type to be specified if widening:

public int $foo { set(int|float) => floor($value); }

Though I suspect that won't go over well, either. :-)

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo { set (int $value) => $value + 1; }
public int $foo { set => $value + 1; }

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

My only question with this is the same as I had in an earlier reply (and I'm not sure it was ever answered directly?), and you allude to this yourself: everywhere else, ($var) means a parameter with type mixed. Why is the type required here, when you've specifically said you want to avoid boilerplate? If we're going to assume people can understand that (implicit property-type $value) is implicit, surely we can also assume that they will understand "specifying a parameter without a type" means the parameter has no type (i.e. is mixed).

Again, for myself I'd be likely to type it (or regular parameters, properties, etc) as mixed if that's what I want anyway, but the inconsistency here seems odd, unless there's some until-now unknown drive to deprecate type-less parameters/properties/etc.

The alternative that gives the most future-flexibility is to do neither: The variable is called $value, period, you can't change it, and you can't change the type, either. There is no () after set, ever. Punt both of those to a later follow-up. I'd prefer to include both now, but including neither now is the next-safer option.

Regarding $field

Sigh, now y'all like it. :-P Most of the feedback on this has been negative, so I'm inclined to leave it out at this point, unless there's a major swing in feedback to bring it back. But the RFC seems more likely to pass without it than with right now.

--Larry Garfield

Cheers

Stephen

1 year ago by Larry Garfield — view source

snip

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a virtual keyword. Ilija also suggested a backed keyword, which forces a backed property to exist even if it's not used in the hook itself.

Adding virtual adds more work for the developer, but more clarity. It would also mean $this->$propName or $this->{__PROPERTY__} would work "as expected", since there's no auto-detection for virtual-ness. On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected. If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation, which could be unexpected. (Mostly this would be for a virtual-get that accidentally has a default setter because you forgot to mark it virtual.)

Doing autodetection as now, but with an added "make a backing value anyway" flag would resolve the use case of "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem. It would also allow for $this->$propName to work if a property is explicitly backed. On the flipside, it's one more thing to think about, and the above example it solves would be trivially solved by having the method just return the value to set and letting the set hook do the actual write, which is arguably better and more reliable code anyway.

The status quo (auto-detection based on the presence of $this->propName). This has the advantage it "just works" in the 95% case, without having to think about anything extra. The downside is it does have some odd edge cases, like needing $this->propName to be explicitly used.

I don't think any is an obvious winner. My personal preference would be for status quo (auto-detect) or explicit-virtual always. I could probably live with either, though I think I'd personally favor status quo. Thoughts from others?

I agree that a flag to make the field virtual (thus disabling the
backing store) makes more sense than a flag to make it backed; it's
also easier to understand when comparing hooked properties with regular
properties (essentially, backed is the default, you have to opt-in to
it being virtual). I don't think the edge cases of "auto" make it
worthwhile just to not need "virtual".

Uh, I can't tell if you're saying "use status quo" or "use the virtual keyword." Please clarify. :-)

== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there. That is, it makes the validation unreliable. A major footgun. The question is, do we favor caveat-emptor flexibility or correct-by-construction safety? Personally I always lean toward the latter, though PHP in general is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's easier to enable that loophole in the future if we really want it than to get rid of it if it turns out to have been a bad idea.

There is one edge case that might make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook. I worry that is "one more quirky edge case", though, so as above it may be better to skip for now as it's easier to add later than remove. But if the consensus is to do that, we're open to it. (Question for everyone.)

I don't have strong feelings about this, but in general I usually tend
to prefer options that are consistent, and give power/options to the
developer. If references are opt-in anyway, I see that as accepting the
trade-offs. If a developer doesn't want to allow by-ref modifications
of the property, why would they make it referenceable in the first
place? This sounds a bit like disallowing regular public properties
because they might be modified outside the class - that's kind of the
point, surely.

== Re arrays

Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

That's already discussed in the RFC:

The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook. This would have a large and obscure performance penalty, and a set implementation would have no way of knowing which values had changed, leading to, for instance, code needing to revalidate all elements even if only one has changed.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

snip

I think the examples given are clear, and the lack of the top-level
short closure-esque version makes it more obvious. Forgive me, I must
have missed some of the previous comments - is there a reason the
'full' setter can't return a value, for the sake of consistency? I
understand that you don't want "return to set" to be the only option,
for the sake of e.g. change/audit logging type functionality (i.e. set
and then some action to record that the change was made), but it seems
a little odd and inconsistent to me that the return value of a short
closure would be used when the return value of the long version isn't.
This isn't really a major issue, I'm just curious if there was some
explanation about it?

Mainly because it introduces a lot more complexity. As far as I'm aware, determining at runtime whether or not to make use of the return value is impossible. (I.e., we cannot differentiate between "return", "no return statement", and "return null".) So it would require compile time branching to generate two different pathways after lexically detecting if there is a "return" token present somewhere in the hook body. That is probably doable, technically, but introduces more complexity to an already necessarily-large RFC. It's also something that is simple enough to add later as an option without any BC breaks or implications for other parts of the design, so it's safe to punt. (Some things are harder to punt on than others, as noted, but this one is easy/safe to skip for now.)
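
The runtime ambiguity is easy to see with plain functions. This is a small sketch using hypothetical function names; all three bodies hand the caller the same value, so the caller has nothing to branch on:

```php
<?php
declare(strict_types=1);

// Three hypothetical functions that differ only in how they "return".
function noReturn() { }
function bareReturn() { return; }
function returnsNull() { return null; }

// From the caller's side the three results are identical.
var_dump(noReturn() === null);
var_dump(bareReturn() === null);
var_dump(returnsNull() === null);
```

Each comparison is true, so the engine would have to decide at compile time whether a hook body "means" to return a value.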

== Re the $value variable in set

snip

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo { set (int $value) => $value + 1 }
public int $foo { set => $value + 1 }

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

My only question with this is the same as I had in an earlier reply
(and I'm not sure it was ever answered directly?), and you allude to
this yourself: everywhere else, ($var) means a parameter with type
mixed. Why is the type required here, when you've specifically said
you want to avoid boilerplate? If we're going to assume people can
understand that (implicit property-type $value) is implicit, surely we can also assume that they will understand "specifying a parameter without a type" means the parameter has no type (i.e. is mixed).

Again, for myself I'd be likely to type it (or regular parameters,
properties, etc) as mixed if that's what I want anyway, but the
inconsistency here seems odd, unless there's some until-now unknown
drive to deprecate type-less parameters/properties/etc.

If we went this route, then an untyped set param would likely imply "mixed", just like on methods. Which, since mixed is the super type of everything, would still technically work, but would weaken the type enforcement and thus static analysis potential. (Just as a method param can be widened in a child class all the way to mixed/omitted, and it would be unwise for all the same reasons.)

In the RFC as currently written, omitted means "derive from the property," which is a concept that doesn't exist in methods; the closest equivalent would be if omitting a type in a child method parameter meant "use the parent's type implicitly," which is not how that works right now.

--Larry Garfield

1 year ago by Stephen Reay — view source

snip

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a virtual keyword. Ilija also suggested a backed keyword, which forces a backed property to exist even if it's not used in the hook itself.

Adding virtual adds more work for the developer, but more clarity. It would also mean $this->$propName or $this->{PROPERTY} would work "as expected", since there's no auto-detection for virtual-ness. On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected. If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation, which could be unexpected. (Mostly this would be for a virtual-get that accidentally has a default setter because you forgot to mark it virtual.)

Doing autodetection as now, but with an added "make a backing value anyway" flag would resolve the use case of "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem. It would also allow for $this->$propName to work if a property is explicitly backed. On the flipside, it's one more thing to think about, and the above example it solves would be trivially solved by having the method just return the value to set and letting the set hook do the actual write, which is arguably better and more reliable code anyway.

The status quo (auto-detection based on the presence of $this->propName). This has the advantage it "just works" in the 95% case, without having to think about anything extra. The downside is it does have some odd edge cases, like needing $this->propName to be explicitly used.

I don't think any is an obvious winner. My personal preference would be for status quo (auto-detect) or explicit-virtual always. I could probably live with either, though I think I'd personally favor status quo. Thoughts from others?

I agree that a flag to make the field virtual (thus disabling the
backing store) makes more sense than a flag to make it backed; It's
also easier to understand when comparing hooked properties with regular
properties (essentially, backed is the default, you have to opt-in to
it being virtual). I don't think the edge cases of "auto" make it
worthwhile just to not need "virtual".

Uh, I can't tell if you're saying "use status quo" or "use the virtual keyword." Please clarify. :-)

I'm saying I think an explicit virtual keyword is the better option of the three (because the benefits of the 'auto' mode don't outweigh the edge cases it introduces).

== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there. That is, it makes the validation unreliable. A major footgun. The question is, do we favor caveat-emptor flexibility or correct-by-construction safety? Personally I always lean toward the latter, though PHP in general is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's easier to enable that loophole in the future if we really want it than to get rid of it if it turns out to have been a bad idea.

There is one edge case that might make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook. I worry that is "one more quirky edge case", though, so as above it may be better to skip for now as it's easier to add later than remove. But if the consensus is to do that, we're open to it. (Question for everyone.)

I don't have strong feeling about this, but in general I usually tend
to prefer options that are consistent, and give power/options to the
developer. If references are opt-in anyway, I see that as accepting the
trade-offs. If a developer doesn't want to allow by-ref modifications
of the property, why would they make it referenceable in the first
place? This sounds a bit like disallowing regular public properties
because they might be modified outside the class - that's kind of the
point, surely.

== Re arrays

Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

That's already discussed in the RFC:

The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook. This would have a large and obscure performance penalty, and a set implementation would have no way of knowing which values had changed, leading to, for instance, code needing to revalidate all elements even if only one has changed.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

snip

I think the examples given are clear, and the lack of the top-level
short closure-esque version makes it more obvious. Forgive me, I must
have missed some of the previous comments - is there a reason the
'full' setter can't return a value, for the sake of consistency? I
understand that you don't want "return to set" to be the only option,
for the sake of e.g. change/audit logging type functionality (i.e. set
and then some action to record that the change was made), but it seems
a little odd and inconsistent to me that the return value of a short
closure would be used when the return value of the long version isn't.
This isn't really a major issue, I'm just curious if there was some
explanation about it?

Mainly because it introduces a lot more complexity. As far as I'm aware, determining at runtime whether or not to make use of the return value is impossible. (I.e., we cannot differentiate between "return", "no return statement", and "return null".) So it would require compile time branching to generate two different pathways after lexically detecting if there is a "return" token present somewhere in the hook body. That is probably doable, technically, but introduces more complexity to an already necessarily-large RFC. It's also something that is simple enough to add later as an option without any BC breaks or implications for other parts of the design, so it's safe to punt. (Some things are harder to punt on than others, as noted, but this one is easy/safe to skip for now.)

Right, I hadn't considered that whole "everything implicitly returns, even if it's null" scenario. Makes sense. The inconsistency is still a bit jarring but I understand the reasoning now, thanks.

== Re the $value variable in set

snip

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo { set (int $value) => $value + 1 }
public int $foo { set => $value + 1 }

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

My only question with this is the same as I had in an earlier reply
(and I'm not sure it was ever answered directly?), and you allude to
this yourself: everywhere else, ($var) means a parameter with type
mixed. Why is the type required here, when you've specifically said
you want to avoid boilerplate? If we're going to assume people can
understand that (implicit property-type $value) is implicit, surely we can also assume that they will understand "specifying a parameter without a type" means the parameter has no type (i.e. is mixed).

Again, for myself I'd be likely to type it (or regular parameters,
properties, etc) as mixed if that's what I want anyway, but the
inconsistency here seems odd, unless there's some until-now unknown
drive to deprecate type-less parameters/properties/etc.

If we went this route, then an untyped set param would likely imply "mixed", just like on methods. Which, since mixed is the super type of everything, would still technically work, but would weaken the type enforcement and thus static analysis potential. (Just as a method param can be widened in a child class all the way to mixed/omitted, and it would be unwise for all the same reasons.)

Having a mixed param in the set hook shouldn't weaken the actual backing parameter though - when the hook writes to $this->prop, the parent type is still enforced, surely? If not, why not?

As for how static analysis tools handle this concept - I'd have thought it's too early to suggest what static analysis tools will or won't support given how much they already support based on less-formal syntax like docblocks. It's already possible to have a property that is reported (to static analysis tools/IDEs/etc) as a fixed type, but accepts a wider type on write, with the current mish-mash of typed properties, docblocks, magic getters and setters, and the bizarre unset behaviour. This is simply converting that into standardised syntax. The RFC itself proposes a scenario where a wider type is accepted in the set hook. I find it hard to believe that a static analysis tool can model "this property is Foo but it accepts string|Foo on write" but not "this property is Foo but it accepts mixed on write". Heck if that's such a problem what is said tool going to do when someone explicitly widens the parameter to mixed in a set hook?

I would argue that if the language is providing better support for typed properties by adding 'hooks' like this, the need for static analysis of those specific parts reduces greatly - if someone wants to accept mixed when storing as a string, and convert it in the hook, and the language can enforce those types at runtime, why should some hypothetical static analysis be a hangup for that?

In the RFC as currently written, omitted means "derive from the property," which is a concept that doesn't exist in methods; the closest equivalent would be if omitting a type in a child method parameter meant "use the parent's type implicitly," which is not how that works right now.

For the third time: I'm well aware of how parameter types work everywhere else, and that's why I'm asking why the same behaviour isn't being followed here?

You've said you want to avoid boilerplate; and
You've said you would expect most people to just use the implicit same-type $value parameter; and
You've acknowledged that the existing 'standard' is that a parameter without a type is considered mixed; and
You've acknowledged in your RFC that there is a use-case for wanting to accept a wider type than what a property stores internally.

So why then is it so unacceptable that the existing standard be followed, such that a set hook with an "untyped" parameter would be treated as mixed as it is everywhere else?

Yes, I know you said "widening to mixed is unwise". I don't seem to recall amongst all the type-related previous RFCs, any that suggested that child parameters widening to mixed (either explicitly or implicitly) should be deprecated, so I'm sorry but I don't see much value in that argument.

--Larry Garfield

Cheers

Stephen

1 year ago by Larry Garfield — view source

== Re the $value variable in set

snip

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo { set (int $value) => $value + 1 }
public int $foo { set => $value + 1 }

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

My only question with this is the same as I had in an earlier reply
(and I'm not sure it was ever answered directly?), and you allude to
this yourself: everywhere else, ($var) means a parameter with type
mixed. Why is the type required here, when you've specifically said
you want to avoid boilerplate? If we're going to assume people can
understand that (implicit property-type $value) is implicit, surely we can also assume that they will understand "specifying a parameter without a type" means the parameter has no type (i.e. is mixed).

Again, for myself I'd be likely to type it (or regular parameters,
properties, etc) as mixed if that's what I want anyway, but the
inconsistency here seems odd, unless there's some until-now unknown
drive to deprecate type-less parameters/properties/etc.

If we went this route, then an untyped set param would likely imply "mixed", just like on methods. Which, since mixed is the super type of everything, would still technically work, but would weaken the type enforcement and thus static analysis potential. (Just as a method param can be widened in a child class all the way to mixed/omitted, and it would be unwise for all the same reasons.)

Having a mixed param in the set hook shouldn't weaken the actual
backing parameter though - when the hook writes to $this->prop, the
parent type is still enforced, surely? If not, why not?

As for how static analysis tools handle this concept - I'd have thought
it's too early to suggest what static analysis tools will or won't
support given how much they already support based on less-formal syntax
like docblocks. It's already possible to have a property that is
reported (to static analysis tools/IDEs/etc) as a fixed type, but
accepts a wider type on write, with the current mish-mash of typed
properties, docblocks, magic getters and setters, and the bizarre
unset behaviour. This is simply converting that into standardised
syntax. The RFC itself proposes a scenario where a wider type is
accepted in the set hook. I find it hard to believe that a static
analysis tool can model "this property is Foo but it accepts
string|Foo on write" but not "this property is Foo but it accepts
mixed on write". Heck if that's such a problem what is said tool
going to do when someone explicitly widens the parameter to mixed
in a set hook?

I would argue that if the language is providing better support for
typed properties by adding 'hooks' like this, the need for static
analysis of those specific parts reduces greatly - if someone wants to
accept mixed when storing as a string, and convert it in the hook,
and the language can enforce those types at runtime, why should
some hypothetical static analysis be a hangup for that?

In the RFC as currently written, omitted means "derive from the property," which is a concept that doesn't exist in methods; the closest equivalent would be if omitting a type in a child method parameter meant "use the parent's type implicitly," which is not how that works right now.

For the third time: I'm well aware of how parameter types work
everywhere else, and that's why I'm asking why the same behaviour isn't
being followed here?

You've said you want to avoid boilerplate; and

You've said you would expect most people to just use the implicit
same-type $value parameter; and

You've acknowledged that the existing 'standard' is that a parameter
without a type is considered mixed; and

You've acknowledged in your RFC that there is a use-case for wanting
to accept a wider type than what a property stores internally.

So why then is it so unacceptable that the existing standard be
followed, such that a set hook with an "untyped" parameter would be
treated as mixed as it is everywhere else?

Yes, I know you said "widening to mixed is unwise". I don't seem to
recall amongst all the type-related previous RFCs, any that suggested
that child parameters widening to mixed (either explicitly or
implicitly) should be deprecated, so I'm sorry but I don't see much
value in that argument.

I think we're talking past each other here. I'm not saying that implicitly widening the type to mixed would break the world. Just that it's not a good idea in most cases.

Example:

public UnicodeString $s1;

public UnicodeString $s2 {
    set (UnicodeString|string $value) {
        $this->s2 = $value instanceof UnicodeString ? $value : new UnicodeString($value);
    }
}

public UnicodeString $s3 {
    set (mixed $value) {
        $this->s3 = $value instanceof UnicodeString ? $value : new UnicodeString($value);
    }
}

// These are all fine, of course.
$foo->s1 = new UnicodeString('beep');
$foo->s2 = new UnicodeString('beep');
$foo->s3 = new UnicodeString('beep');

// These would also be allowed by type widening:
$foo->s2 = 'beep';
$foo->s3 = 'beep';

// This would break with a type error. SA tools could detect it right on this line and flag it.
$foo->s2 = new User();

// This would break with a type error inside the set body, when calling new UnicodeString($value)
$foo->s3 = new User();

From an SA tool's point of view, the last example is valid, even though it will certainly break. In the worst case, it would type error when the set hook completes, so it has the same effect at runtime. But because the type information is weaker, SA tools can do less. The same is true for a method with a mixed param type: SA tools can't tell you at the call-site if it's going to be a problem or not, and where the error happens is less predictable.
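
The method analogue can be run on current PHP to see where each error surfaces. This is a sketch only: UnicodeString and User are hypothetical stand-ins for the classes named above, and setNarrow(), setMixed(), and errorFrom() are invented names for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the classes used in the example above.
final class UnicodeString
{
    public function __construct(public string $text) {}
}

final class User {}

final class Person
{
    private UnicodeString $name;

    // Narrow union: a bad argument is rejected at the method boundary.
    public function setNarrow(UnicodeString|string $value): void
    {
        $this->name = $value instanceof UnicodeString ? $value : new UnicodeString($value);
    }

    // mixed: a bad argument is accepted here and fails deeper in the body.
    public function setMixed(mixed $value): void
    {
        $this->name = $value instanceof UnicodeString ? $value : new UnicodeString($value);
    }
}

// Runs the callable and reports the TypeError message, if any.
function errorFrom(callable $call): string
{
    try {
        $call();
        return 'no error';
    } catch (TypeError $e) {
        return $e->getMessage();
    }
}

$person = new Person();
$narrow = errorFrom(fn () => $person->setNarrow(new User()));
$wide   = errorFrom(fn () => $person->setMixed(new User()));

echo $narrow, PHP_EOL; // Message names Person::setNarrow().
echo $wide, PHP_EOL;   // Message names UnicodeString::__construct().
```

Both calls fail, but only the narrow signature fails at the boundary the caller can see, which is the information an SA tool would need to flag the call site.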

Of course, if someone really wants to widen the set type to mixed and their code can handle it, go for it. I've no problem there. My issue is with designing a syntax such that the set type implicitly gets widened to mixed often.

public UnicodeString $s4 {
    set ($name) {
        $this->s4 = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }
}

It would be very unexpected to me to now need to handle non-string|UnicodeString values of $name, or for my IDE to not flag when I try to assign a User object to it. That's my concern. (A concern I didn't realize was there until this thread, so I'm glad it was pointed out.) On the flipside, it's probably very unexpected for someone else for it to not implicitly mean mixed, the way it does for method parameters. Which unexpected is right?

Neither and both, I think. Hence why I think "if you set anything, you have to set both, but you can also omit both" is the best solution: There's a very clear and self-evident default behavior for the vast-majority case, and if you need a non-default behavior, the resulting code is fully self-documenting and allows SA tools to flag issues at the callsite.

The next-best alternative is to punt on both widening and custom variable names entirely and go with the C# model: The variable is called $value, always, you can't change it, deal. That leaves the syntax open to re-add both in the future. I'd prefer to include both features now, but I'm OK with punting on both for now. I do think they have to come together, though, to avoid the unexpectedness described above.

--Larry Garfield

1 year ago by Gina P. Banyard — view source

For the third time: I'm well aware of how parameter types work everywhere else, and that's why I'm asking why the same behaviour isn't being followed here?

You've said you want to avoid boilerplate; and

You've said you would expect most people to just use the implicit same-type $value parameter; and

You've acknowledged that the existing 'standard' is that a parameter without a type is considered mixed; and

You've acknowledged in your RFC that there is a use-case for wanting to accept a wider type than what a property stores internally.

So why then is it so unacceptable that the existing standard be followed, such that a set hook with an "untyped" parameter would be treated as mixed as it is everywhere else?

Yes, I know you said "widening to mixed is unwise". I don't seem to recall amongst all the type-related previous RFCs, any that suggested that child parameters widening to mixed (either explicitly or implicitly) should be deprecated, so I'm sorry but I don't see much value in that argument.

Deprecating this behaviour would effectively mean that types are now required everywhere, which is an unreasonable proposition.
The "implicit" widening to mixed is something that got introduced after typing was introduced to PHP, and quite "late" even (in 7.2) considering typing objects/arrays has existed since PHP 5. [1]

I don't think it's unreasonable that, if property access hooks only work on typed properties, having no type means the type of the property, and if you want to widen the type you must specify a type.
In particular, if you want the backing type to be mixed, you are simply required to write a typed property with type mixed, the same as with readonly.

Best regards,

Gina P. Banyard

[1] https://wiki.php.net/rfc/parameter-no-type-variance

1 year ago by Robert Landers — view source

[Including my full previous reply, since the list and gmail currently
aren't being friends. Apologies that this leads to rather a lot of
reading in one go...]

Eh, I'd prefer a few big emails that come in slowly to lots of little emails that come in fast. :-)

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC.

Hello, and a huge thanks to both you and Ilija for the continued work
on this. I'd really like to see this feature make it into PHP, and
agree with a lot of the RFC.

My main concern is the proliferation of things that look the same but
act differently, and things that look different but act the same:

snip

a and b are both what we might call "traditional" properties, and
equivalent to each other; a uses legacy syntax which we haven't
removed for some reason

I don't know why we haven't removed var either. I can't recall the last time I saw it in real code. But that's out of scope here.

snip

I think there's some really great functionality in the RFC, and would
love for it to succeed in some form, but I think it would benefit from
removing some of the "magic".

Regards,

--
Rowan Tommins
[IMSoP]

I'm going to try and respond to a couple of different points together here, including from later in the thread, as it's just easier.

== Re, design philosophy:

In C#, all "properties" are virtual - as soon as you have any
non-default "get", "set" or "init" definition, it's up to you to declare
a separate "field" to store the value in. Swift's "computed properties"
are similar: if you have a custom getter or setter, there is no backing
store; to add behaviour to a "stored property", you use the separate
"property observer" hooks.

Kotlin's approach is philosophically the opposite: there are no fields,
only properties, but properties can access a hidden "backing field" via
the special keyword "field". Importantly, omitting the setter doesn't
make the property read-only, it implies set(value) { field = value }

A little history here to help clarify how we ended up where we are: The original RFC as we designed it modeled very closely on Swift, with 4 hooks. Using get/set at all would create a virtual property and you were on your own, while the beforeSet/afterSet hooks would not. We ran that design by some PHP Foundation sponsors a year ago (I don't actually know who, Roman did it for us), and the general feedback was "we like the idea, but woof this is complicated with all these hooks and having to make my own backing property for all these little things. Couldn't this be simplified?" We thought a bit more, and I off-handedly suggested to Ilija "I mean, would it be possible to just detect if a get/set hook is using a backing store and make it automatically? Then we could get rid of the before/after hooks." He gave it a quick try and found that was straightforward, so we pivoted to that simplified version. We then realized that we had... mostly just recreated Kotlin's design, so shrugged happily and went on with life.

As noted in an earlier email, C#, Kotlin, and Swift all have different stances on the variable name for the incoming value. We originally modeled on Swift so had that model (optional newVal name), and also because we liked how compact it was. When we switched to the simplified, incidentally Kotlin-esque approach, we just kept the optional variable as it works.

I think where that ended up is pretty nice, personally, even if it is not a direct map of any particular other language.

== Re asymmetric typing:

This is capability already present today if using a setter method.

class Person {
    private $name;

    public function setName(UnicodeString|string $name)
    {
        $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }
}

And widening the parameter type in a child class is also entirely legal. As the goal of the RFC is, essentially, "make most common getter/setter patterns easy to add to a property without making an API-breaking method, so people don't have to add redundant just-in-case getters and setters all the time," covering an easy-to-cover use case seems like a good thing to do.
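
For reference, parameter widening in a child class looks like this on current PHP. It is a minimal sketch with hypothetical class names (Base, Child), not code from the RFC:

```php
<?php
declare(strict_types=1);

// Hypothetical classes: the child accepts a wider parameter type than the parent.
class Base
{
    public function setName(string $name): string
    {
        return $name;
    }
}

class Child extends Base
{
    // Parameter types are contravariant, so widening string to string|int is legal.
    public function setName(string|int $name): string
    {
        return (string) $name;
    }
}

var_dump((new Child())->setName(7));
```

Anything that type-checks against Base still works against Child, which is the property the wider set type on a hook is meant to mirror.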

It also ties into the question of the explicit/implicit name, for the reason you mentioned earlier (unspecified means mixed), not by intent. More on that in another section.

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a virtual keyword. Ilija also suggested a backed keyword, which forces a backed property to exist even if it's not used in the hook itself.

Adding virtual adds more work for the developer, but more clarity. It would also mean $this->$propName or $this->{PROPERTY} would work "as expected", since there's no auto-detection for virtual-ness. On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected. If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation, which could be unexpected. (Mostly this would be for a virtual-get that accidentally has a default setter because you forgot to mark it virtual.)

Doing autodetection as now, but with an added "make a backing value anyway" flag would resolve the use case of "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem. It would also allow for $this->$propName to work if a property is explicitly backed. On the flipside, it's one more thing to think about, and the above example it solves would be trivially solved by having the method just return the value to set and letting the set hook do the actual write, which is arguably better and more reliable code anyway.

The status quo (auto-detection based on the presence of $this->propName). This has the advantage it "just works" in the 95% case, without having to think about anything extra. The downside is it does have some odd edge cases, like needing $this->propName to be explicitly used.

I don't think any is an obvious winner. My personal preference would be for status quo (auto-detect) or explicit-virtual always. I could probably live with either, though I think I'd personally favor status quo. Thoughts from others?

I agree that a flag to make the field virtual (thus disabling the backing store) makes more sense than a flag to make it backed; it's also easier to understand when comparing hooked properties with regular properties (essentially, backed is the default, you have to opt in to it being virtual). I don't think the edge cases of "auto" make it worthwhile just to not need "virtual".

== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there. That is, it makes the validation unreliable. A major footgun. The question is, do we favor caveat-emptor flexibility or correct-by-construction safety? Personally I always lean toward the latter, though PHP in general is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's easier to enable that loophole in the future if we really want it than to get rid of it if it turns out to have been a bad idea.

There is one edge case that might make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook. I worry that is "one more quirky edge case", though, so as above it may be better to skip for now as it's easier to add later than remove. But if the consensus is to do that, we're open to it. (Question for everyone.)

I don't have strong feelings about this, but in general I usually tend to prefer options that are consistent, and give power/options to the developer. If references are opt-in anyway, I see that as accepting the trade-offs. If a developer doesn't want to allow by-ref modifications of the property, why would they make it referenceable in the first place? This sounds a bit like disallowing regular public properties because they might be modified outside the class - that's kind of the point, surely.

== Re arrays

Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

That's already discussed in the RFC:

The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook. This would have a large and obscure performance penalty, and a set implementation would have no way of knowing which values had changed, leading to, for instance, code needing to revalidate all elements even if only one has changed.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

Drop the "top level" shorthand, for get-only hooks.

Keep the => shorthand for the get hook itself.

For a set hook, the {} form has no return value; set the value yourself however you want.

For a set hook, the => form implies a backed value and will set the property to whatever value that evaluates to.

So these are equivalent:

public $foo { set { $this->foo = $value; } }
public $foo { set => $value; }

These are equivalent:

public string $foo {
get {
return strtoupper($this->foo);
}
}
public string $foo { get => strtoupper($this->foo); }

And this goes away:

public string $foo => strtoupper($this->foo);

That covers the common cases with an arrow-function-like syntax that behaves as you'd expect (it returns things), and allows a longer version with arbitrarily complex logic if desired. It also means that each syntax variant does mean something importantly different.

Would that be an acceptable compromise? (Question for everyone.)

I think the examples given are clear, and the lack of the top-level short closure-esque version makes it more obvious. Forgive me, I must have missed some of the previous comments - is there a reason the 'full' setter can't return a value, for the sake of consistency? I understand that you don't want "return to set" to be the only option, for the sake of e.g. change/audit logging type functionality (i.e. set and then some action to record that the change was made), but it seems a little odd and inconsistent to me that the return value of a short closure would be used when the return value of the long version isn't. This isn't really a major issue, I'm just curious if there was some explanation about it?

== Re the $value variable in set

Honestly, Rowan's earlier point here is the strongest argument in favor for me of the current RFC approach. Anywhere else in PHP, something that looks like a parameter and has no type, like ($param), means its type is mixed. It would be weird and confusing to be different here. That's above and beyond the issue of forcing people to retype something obvious every time. (I cite again, recent PHP's trend toward removing needless boilerplate, which is very good.) Requiring that the type be specified, for consistency, makes little sense if the type is not allowed to vary. You're just repeating a string from earlier on the same line, for no particular benefit.

I genuinely don't understand the pushback on $value. It's something you learn once and never have to think about again. It's consistent.

Ilija jokingly suggested making it always $value, unconditionally, and allowing only the type to be specified if widening:

public int $foo { set(int|float) => floor($value); }

Though I suspect that won't go over well, either. :-)

So what makes the most sense to me is to keep $value optional, but IF you specify an alternate name, you must also specify a type (which may be wider). So these are equivalent:

public int $foo { set (int $value) => $value + 1; }
public int $foo { set => $value + 1; }

And only those forms are legal. But you could also do this, if the situation called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives the most flexibility where needed while still having the least redundancy when not needed, and when a name/type is provided, its behavior is the same as for a method being inherited.

Does that sound acceptable? (Again, question for everyone.)

My only question with this is the same as I had in an earlier reply (and I'm not sure it was ever answered directly?), and you allude to this yourself: everywhere else, ($var) means a parameter with type mixed. Why is the type required here, when you've specifically said you want to avoid boilerplate? If we're going to assume people can understand that (implicit property-type $value) is implicit, surely we can also assume that they will understand "specifying a parameter without a type" means the parameter has no type (i.e. is mixed).

Again, for myself I'd be likely to type it (or regular parameters, properties, etc) as mixed if that's what I want anyway, but the inconsistency here seems odd, unless there's some until-now unknown drive to deprecate type-less parameters/properties/etc.

The alternative that gives the most future-flexibility is to do neither: The variable is called $value, period, you can't change it, and you can't change the type, either. There is no () after set, ever. Punt both of those to a later follow-up. I'd prefer to include both now, but including neither now is the next-safer option.

Regarding $field

Sigh, now y'all like it. :-P Most of the feedback on this has been negative, so I'm inclined to leave it out at this point, unless there's a major swing in feedback to bring it back. But the RFC seems more likely to pass without it than with right now.

--Larry Garfield

Cheers

Stephen

I would think that simply using return-to-set would be the simplest
solution, if you need to run something after it's set, you can use the
regular way of running code after a return:

try {
return $value + 100;
} finally {
// this runs after returning
}

Robert Landers
Software Engineer
Utrecht NL

1 year ago by Larry Garfield

I would think that simply using return-to-set would be the simplest
solution, if you need to run something after it's set, you can use the
regular way of running code after a return:

try {
return $value + 100;
} finally {
// this runs after returning
}

That would not work. Fun fact, the finally block runs before return. (This is a common point of confusion.) So this would still not allow for statements to run after the assignment itself (the return) happens.
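The ordering can be checked with plain PHP, no hooks required. The following is a minimal sketch: the `Account` class and its method names are invented for illustration, and `computeNewBalance()` merely stands in for a hypothetical "return-to-set" hook body whose return value the caller then stores.

```php
<?php
final class Account
{
    /** @var string[] Order in which things happened. */
    public array $log = [];
    private int $balance = 0;

    // Stand-in for a "return-to-set" hook body: the caller stores what this returns.
    private function computeNewBalance(int $value): int
    {
        try {
            return $value + 100;
        } finally {
            // Runs before control goes back to the caller, so nothing is stored yet.
            $this->log[] = 'finally: balance=' . $this->balance;
        }
    }

    public function setBalance(int $value): void
    {
        $this->balance = $this->computeNewBalance($value);
        $this->log[] = 'after assignment: balance=' . $this->balance;
    }
}

$account = new Account();
$account->setBalance(5);
print_r($account->log);
```

The `finally` entry is logged first and still sees the old balance of 0; the new value of 105 is only visible after the assignment in `setBalance()` completes.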

--Larry Garfield

1 year ago by Gina P. Banyard

I genuinely don't understand the pushback on $value. It's something you learn once and never have to think about again. It's consistent.

Ilija jokingly suggested making it always $value, unconditionally, and allowing only the type to be specified if widening:

public int $foo { set(int|float) => floor($value); }

Though I suspect that won't go over well, either. :-)

Same, I don't understand the pushback on $value, for me this seems clearer and less confusing than the existing $this variable when I was first learning about OOP.
And I would be in favour of just making it always $value, if people feel that strongly about being able to name it with a custom name they could propose this as a follow-up RFC.

One thing the RFC should note, which I would expect to be the same as the array cast, is how backed/virtual properties interact with get_mangled_object_vars().
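For reference, this is what get_mangled_object_vars() returns for ordinary, non-hooked properties today, and how it lines up with the array cast. It is only a baseline sketch: the `Point` class is made up for the example, and it says nothing about how hooked properties end up behaving.

```php
<?php
class Point
{
    public int $x = 1;
    protected int $y = 2;
    private int $z = 3;
}

$point = new Point();
$mangled = get_mangled_object_vars($point);
$cast = (array) $point;

// Keys carry visibility markers: "x", "\0*\0y" (protected), "\0Point\0z" (private).
foreach (array_keys($mangled) as $key) {
    echo str_replace("\0", '\\0', $key), PHP_EOL;
}
```

For a plain class like this one, both calls produce the same array, which is the baseline the question above is comparing against.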

Best regards,

Gina P. Banyard

1 year ago by Rowan Tommins [IMSoP]

A little history here to help clarify how we ended up where we are: The original RFC as we designed it modeled very closely on Swift, with 4 hooks. Using get/set at all would create a virtual property and you were on your own, while the beforeSet/afterSet hooks would not.

This is interesting to hear, because the current RFC comes across - or at least did to me, on first reading - as "these are normal properties, and then you're adding some magic on top of them".

The Eureka! moment for me was realising that the whole thing makes more sense if you start with "virtual" properties, and then add some magic to avoid declaring a second property to store a backing value.

This "virtual-first" view is how my current thinking is framed, which I tried to demonstrate here: https://wiki.php.net/rfc/property-hooks/imsop-suggestion

== Re asymmetric typing:

This is capability already present today if using a setter method.

class Person {
private $name;
public function setName(UnicodeString|string $name)
{
    $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
}
}

I find this unconvincing. Just because this method name starts with "set", doesn't mean everything it does should be possible in a property setter.

You could also have a method that took two arguments:

public function setName(string $first, string $last)
{
$this->name = $first . ' ' . $last;
}

Or a mandatory argument and an optional flag:

public function setName(string $name, bool $normalise=false)
{
$this->name = $normalise ? ucwords($name) : $name;
}

Neither of those are going to be translatable to property set hooks, and that's totally fine.

covering an easy-to-cover use case seems like a good thing to do.

This is the crux: I don't think asymmetric types are easy.

Firstly, they mean that every single piece of static analysis or reflection which wants to ask "what is the type of this property" has to take into account the new concept that the "settable type" might be different from the "gettable type".

Secondly, they mean that a user has to understand that this code might result in the variable magically changing type:

$me = 'Rowan';
$object->name = $me;
$me = $object->name;
// $me is now magically an object!
// how do I get my string back?

Thirdly, there's a simpler alternative, providing a separate virtual property, which can then be readable as well as writeable:

$me = 'Rowan';
$object->nameString = $me;
$me = $object->nameUnicode;
// easily visible that we're reading a different property, with a different type
$object->nameUnicode = $me;
$me = $object->nameString;
// fully reversible, and no need to know how the object implements it
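The two shapes can be compared side by side using ordinary methods, which is a rough approximation rather than hook syntax. In this sketch `UnicodeString` is a hypothetical stand-in for the class named in the thread, the accessor names are invented, and union types assume PHP 8.0 or later.

```php
<?php
// Hypothetical stand-in for the UnicodeString class named in the thread.
final class UnicodeString
{
    private string $bytes;

    public function __construct(string $bytes)
    {
        $this->bytes = $bytes;
    }

    public function toString(): string
    {
        return $this->bytes;
    }
}

final class Person
{
    private UnicodeString $name;

    public function __construct()
    {
        $this->name = new UnicodeString('');
    }

    // Asymmetric pair: accepts string|UnicodeString, always hands back UnicodeString.
    public function setName(UnicodeString|string $name): void
    {
        $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
    }

    public function getName(): UnicodeString
    {
        return $this->name;
    }

    // Symmetric pair: string in, string out, over the same stored value.
    public function setNameString(string $name): void
    {
        $this->name = new UnicodeString($name);
    }

    public function getNameString(): string
    {
        return $this->name->toString();
    }
}

$person = new Person();
$me = 'Rowan';
$person->setName($me);
$roundTripped = $person->getName();    // a UnicodeString, not the string that went in
$asString = $person->getNameString();  // the original string again
```

With the asymmetric pair the caller writes a string and reads back an object; with the symmetric pair each accessor has one type in both directions, at the cost of a second name.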

It also ties into the question of the explicit/implicit name, for the reason you mentioned earlier (unspecified means mixed)

Another reason to dislike it as complicating the proposal, IMHO.

== Re virtual properties:

On the downside, if you have a could-be-virtual property but never actually use the backing value, you have an extra backing value hanging around in memory that is inaccessible normally, but will still show up in some serialization formats, which could be unexpected.

This is a reasonable argument in favour of automatic detection.

If you omit one of the hooks and forget to mark it virtual, you'll still get the default of the other operation

My immediate question here is "why?" Why does the set hook magically come into existence, just because you defined the get hook a particular way?

Consider this example, using a virtual property:

private int $_nextId;
public int $nextId {
get { $this->_nextId ??= 0; return $this->_nextId++; }
}

This might not be the most common thing to do, but (as far as I know) it will work. There is no direct write access to the property, but we need somewhere to store the incrementing value.

Then we say "woof this is complicated having to make my own backing property for all these little things; can this be simplified?" and write this:

public int $nextId {
get { $this->nextId ??= 0; return $this->nextId++; }
}

Great! We've got rid of the explicit backing field, and the code still works... but wait! Suddenly, a setter has appeared, even though we never asked for one!

I suppose the reasoning is that it's quite common to want to implement the default version of one or other hook; but note that with the currently proposed short-hand the defaults can be written as:

get => $this->someName;
set($value) => $value;

If that's still too long, we can borrow from C#, where they can be written as:

get;
set;

That gives the choice of whether the default is implemented back to the user, and removes a foot-gun inconsistency between virtual and backed properties.

Doing autodetection as now, but with an added "make a backing value anyway" flag would resolve the use case of "My set hook just calls a method, and that method sets the property, but since the hook doesn't mention the property it doesn't get created" problem.

Unless I'm missing something, no it wouldn't. There's still no way for a method to refer to that backing value. If the method sets the property, it will just end up recursively calling the set hook. The backing value remains visible only inside the hooks.

Unless I've misunderstood, and the implementation somehow chooses the meaning of $this->foo based on whether the set hook is somewhere in the call stack, in which case ... yikes!

== Re reference-get

Allowing backed properties to have a reference return creates a situation where any writes would then bypass the set hook, and thus any validation implemented there.

As Stephen Reay says, isn't that up to the user to decide?

Again, the backed property isn't doing anything that a virtual property plus an explicit backing property couldn't:

private string $_foo;
public string $foo {
&get { return $this->_foo; }
set { /* whatever */ }
}

Either we're willing to trust the user with that power, and should let them do the same thing with a magic backing field; or we're not willing to trust them, and should not allow &get at all.
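The trade-off already exists in current PHP with magic methods, which makes for a rough analogy. In the sketch below, a by-reference `__get` plays the role of an `&get` hook and `__set` plays the role of a validating set hook; the `Celsius` class and its `degrees` property are invented for illustration and this is not the RFC's syntax.

```php
<?php
final class Celsius
{
    private array $values = ['degrees' => 0];

    // By-reference getter, playing the role of an &get hook.
    public function &__get(string $name)
    {
        return $this->values[$name];
    }

    // Validating setter, playing the role of a set hook.
    public function __set(string $name, $value): void
    {
        if (!is_int($value)) {
            throw new InvalidArgumentException('degrees must be an int');
        }
        $this->values[$name] = $value;
    }
}

$temp = new Celsius();

// A direct write goes through the validation and is rejected.
$rejected = false;
try {
    $temp->degrees = 'warm';
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

// A write through the reference never reaches __set.
$ref = &$temp->degrees;
$ref = 'warm';
unset($ref);
$stored = $temp->degrees;
```

After this runs, `$rejected` is true but `$stored` holds the string anyway, which is the bypass both sides of the argument are describing.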

There is one edge case that might make sense: If there is no set hook defined, then there's no set hook to worry about bypassing. So it may be safe to allow &get on backed properties IFF there is no set hook.

Given my above argument that the set hook should never be added automatically / implicitly, this could be simplified to "allow an &get hook only if no other hook is defined" (i.e. you can't have both "get" and "&get", and you can't have both "&get" and "set").

== Re arrays

The simplest approach would be to copy the array, modify it accordingly, and pass it to set hook.

This isn't what I was suggesting.

I was suggesting that modifying the array called the &get hook, and modified whatever array that returned.

Unless we were OK with that bypassing the set hook entirely if defined, which, as noted above, means any safety guarantees provided by a set hook are bypassed, leading to untrustworthy code.

Again, already possible as soon as you have any &get hooks at all:

$temp =& $foo->bar;
$temp[42] = true;
var_dump($foo->bar);
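The array-index case behaves the same way with today's magic methods, which gives a feel for the semantics being suggested. This is an analogy only: the `Settings` class and its call counter are made up, and the by-reference `__get` stands in for an `&get` hook.

```php
<?php
final class Settings
{
    private array $props = ['options' => []];

    /** Number of times the validating setter was invoked. */
    public int $setCalls = 0;

    // By-reference getter, standing in for an &get hook.
    public function &__get(string $name)
    {
        return $this->props[$name];
    }

    // Setter, standing in for a set hook.
    public function __set(string $name, $value): void
    {
        $this->setCalls++;
        $this->props[$name] = $value;
    }
}

$settings = new Settings();

// Index write: fetches the array by reference via __get and modifies it in place.
$settings->options['bar'] = 42;
$viaIndex = $settings->options;
$callsAfterIndexWrite = $settings->setCalls;

// Whole-value write: goes through __set.
$settings->options = ['baz' => 1];
$callsAfterFullWrite = $settings->setCalls;
```

The index write lands in the stored array without the setter ever being called, while replacing the whole array does call it.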

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a little. :-) Here's our counter-proposal:

Drop the "top level" shorthand, for get-only hooks.

Keep the => shorthand for the get hook itself.

For a set hook, the {} form has no return value; set the value yourself however you want.

For a set hook, the => form implies a backed value and will set the property to whatever value that evaluates to.

I'm 100% behind this.

I genuinely don't understand the pushback on $value. It's something you learn once and never have to think about again. It's consistent.

For me, the problem is in having both a special name and the ability to choose the name. There's nowhere else in the language where that happens.

I also can't think of any reason someone would choose a name other than $value. I can well imagine coding standards mandating that it always be called that, making it boring boilerplate.

Ilija jokingly suggested making it always $value, unconditionally, and allowing only the type to be specified if widening:

public int $foo { set(int|float) => floor($value); }

Honestly, if you think asymmetric types are a good idea (which I don't), that makes a lot more sense.

Specifying the writeable type has nothing to do with the name of the value, it's a special case that will be rarely used, and should draw attention to its key feature: the type.

The alternative that gives the most future-flexibility is to do neither: The variable is called $value, period, you can't change it, and you can't change the type, either. There is no () after set, ever. Punt both of those to a later follow-up. I'd prefer to include both now, but including neither now is the next-safer option.

This is by far my preferred option. Asymmetric types are too much magic, and choosing the variable name is just one more case to consider.

Regarding $field

Sigh, now y'all like it. :-P

As I said at the top, the Eureka! moment for me was thinking "virtual first".

In the original RFC, it was implied (or, it seemed to me) that $field was just an alias for referencing the "real" property. That's a really tempting interpretation, but it's not what's happening.

What's really happening is that the property itself is virtual: every single access to it goes through the hooks. But, within the hooks, we have provided a magic variable, stored on the object but accessible only there, where the hooks can store a value of the same type as the virtual property.

Once I came to that interpretation, it became much more intuitive to call that magic variable by a magic name like $field than to re-use the syntax that would normally refer to the property, and make it sometimes reference this new thing instead.

To re-iterate an earlier point, though, I think the language should choose. There should be exactly one way to refer to the backing field, whether that's $this->foo, $field, or get_backing_field(). Don't leave users reading each other's code and not being sure if it's doing the same thing.

Regards,

Rowan Tommins
[IMSoP]