Hey,
As announced, we are starting the vote on typed properties today.
The voting period is two weeks, until sometime in the evening on Tuesday 25-09-2018.
Please find the RFC at https://wiki.php.net/rfc/typed_properties_v2.
Bob and Nikita
For the record, I still think we will come to regret allowing non-nullable
type hints without any constraint on the author of the class to initialise
them correctly. As proposed, the invalid state may only be detected when it
causes a runtime error in completely unrelated code.
I gather that the authors of C# are currently going through a painful
development phase to introduce better support for non-nullable types, and
having to include many compromises and handle many edge-cases because it
was not done earlier. I realise this case is not completely comparable, but
we have an opportunity to get this right first time, and not just take the
easy option.
I am therefore going to make one last plea: if we don't yet know how to
assert that complex types are initialised, do not allow them to be
non-nullable in the first version of this feature.
That is, allow
class Foo { public ?Foo $foo = null; }
but not
class Foo { public Foo $foo; }
This would still be a huge improvement to the language, but leaves us free
to design additional features to prevent Uninitialized Property Errors
becoming as hated as Null Pointer Exceptions in Java or C#.
Regards,
Rowan Collins
[IMSoP]
I posit that this code:
class Foo {
public Foo $foo;
}
Is superior to this code:
class Foo {
public ?Foo $foo = null;
}
If after "initialization" that $foo is guaranteed to always contain
an object of type Foo. The reason is simple: for the actual lifetime
of the object the former correctly states that it will never be null,
while the latter opens up possibilities of it. That is, the former
provides better support for non-nullable types.
To prevent all forms of initialization errors we would have to do
analysis at the initialization site and prevent dynamic behaviors in
that region. For now I believe such things are better left to static
analysis tools than the engine.
I posit that this code:
class Foo { public Foo $foo; }
Is superior to this code:
class Foo { public ?Foo $foo = null; }
If after "initialization" that $foo is guaranteed to always contain
an object of type Foo.
Yes, IF it is guaranteed; but the current proposal offers no such guarantee.
Whichever of those definitions is used, the following code may fail:
function expectsFoo(Foo $foo) {
assert($foo->foo instanceof Foo);
}
The reason is simple: for the actual lifetime
of the object the former correctly states that it will never be null,
while the latter opens up possibilities of it.
It correctly states that it will never be null, but it incorrectly implies
that it will always be an instance of Foo.
It's not even true that once initialised the property will always contain
a Foo, because (if I understand it correctly) it is also allowed to call
unset() on a non-nullable property at any time.
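To make the two failure modes above concrete, here is a minimal sketch, assuming the semantics PHP 7.4 eventually shipped for this RFC. The helper readFailure() is hypothetical and exists only for illustration. It catches Error, which is the parent class of TypeError, so it does not matter which of the two the engine raises.

```php
<?php
declare(strict_types=1);

// Mirrors the class under discussion: non-nullable and without a default.
class Foo
{
    public Foo $foo;
}

// Hypothetical helper: returns the class name of whatever Error is thrown
// when reading $foo->foo, or null if the read succeeds.
function readFailure(Foo $foo): ?string
{
    try {
        $value = $foo->foo;
        return null;
    } catch (Error $e) {
        return get_class($e);
    }
}

$a = new Foo();
$beforeInit = readFailure($a); // read before any assignment

$a->foo = new Foo();
$afterInit = readFailure($a);  // read after a valid assignment

unset($a->foo);                // permitted even though the type is non-nullable
$afterUnset = readFailure($a); // the property is uninitialized again
```

Under those assumptions the first and third reads fail and only the second succeeds, so the declared type on its own does not tell a consumer which state the object is in.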
That is, the former
provides better support for non-nullable types.
The property is only "non-nullable" because we have invented a new state,
very similar to null, and called it something other than "null".
To prevent all forms of initialization errors we would have to do
analysis at the initialization site and prevent dynamic behaviors in
that region. For now I believe such things are better left to static
analysis tools than the engine.
I agree that this is a hard problem, but I don't agree that this decision
is being made "for now". If we allow "non-nullable but uninitialized"
properties now, it will be extremely hard to change their behaviour in
future.
My request is emphatically not to reject the entire RFC until this is
solved. It is to say that "for now", all typed properties must be
initialised inline to a valid value; and by implication, that a
non-nullable object hint cannot be used.
Regards,
Rowan Collins
[IMSoP]
I agree that this is a hard problem, but I don't agree that this decision
is being made "for now". If we allow "non-nullable but uninitialized"
properties now, it will be extremely hard to change their behaviour in
future.
I'm with Rowan on this one.
This concept of "uninitialized" frankly seems like an allowance for
people who insist on writing poor code.
Nulls are bad, and "uninitialized" is just another kind of "null" with
a built-in run-time type-check that executes on read - too late.
The first example given is just bad:
class Point {
public float $x, $y;
private function __construct() {}
public static function fromEuclidean(float $x, float $y) {
$point = new Point;
$point->x = $x;
$point->y = $y;
return $point;
}
}
You define two invariants: $x and $y must be floats - and then proceed
to break those constraints in the constructor?
Wrong. The RFC itself accurately states that "this code can be
rewritten to indirect through __construct() instead" - as shown in the
previous example.
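For reference, the rewrite the RFC alludes to could look roughly like the sketch below. It is only one possible shape: the private constructor takes both coordinates, so the named constructor never hands out a partially initialized Point.

```php
<?php
declare(strict_types=1);

class Point
{
    public float $x;
    public float $y;

    // Private, so the named constructors stay the only public entry points.
    private function __construct(float $x, float $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    public static function fromEuclidean(float $x, float $y): Point
    {
        // Both properties are assigned before the object leaves this class.
        return new Point($x, $y);
    }
}

$point = Point::fromEuclidean(1.5, -2.0);
```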
Now why would you deliberately open the fences and knowingly invite
people to write poor code like this?
As for the second example:
class Point {
public float $x, $y;
public function __construct(float $x, float $y) {
$this->doSomething();
$this->x = $x;
$this->y = $y;
}
}
If doSomething() attempts to read an uninitialized property while the
constructor is still executing, throwing a helpful "uninitialized"
error is fine.
But, in my opinion, once the constructor has executed, the invariants
as declared by the class itself must be satisfied.
If there's one meaningful use-case for allowing objects in a
partially-initialized state, it's during
hydration/unserialization/reflection scenarios, maybe - but in those
cases, you're willfully bypassing the constructor; it's not the
everyday 95% use-case and some risk is acceptable here, you'll get
around it with tests. But nobody wants to write tests all day to see
if any classes contain "uninitialized" properties - that misses half
the whole point of being able to declare those types in the first
place, e.g. makes type-hinted private/protected properties totally
unreliable.
Once this is in a release, it'll be unfixable, and in my opinion will
likely go down in history as another one of those little things we
wish we could go back in time and fix :-/
PHP permits skipping constructors. The code may not work if you do so,
but it's the state of how things are. Validating after a constructor
call will not catch all issues while requiring a constructor, whereas
I think this code should be allowed:
class User {
public int $id;
public string $preferred_name;
public string $username;
}
I doubt we will come to a resolution -- these points were already
pointed out in the discussion phase.
I think this code should be allowed:
class User { public int $id; public string $preferred_name; public string $username; }
Why? What contract is being enforced by that class that is not enforced
by this class?
class User {
public ?int $id=null;
public ?string $preferred_name=null;
public ?string $username=null;
}
Both require the consumer of the class to trust that someone, somewhere, has initialised the fields (and not subsequently unset them).
Or have I misunderstood what you intended with that example?
Regards,
--
Rowan Collins
[IMSoP]
At least the approach without nullable properties will lead to a Throwable
when a read is attempted on an uninitialized object, which is still better
than nullability checks all over the place.
And yes, constructing an object without going through its constructor is
quite common anyway - this was indeed part of the upfront discussion, and
is also part of why I gave a +1 to the current RFC.
Marco Pivetta
At least the approach without nullable properties will lead to a
Throwable when a read is attempted on an uninitialized object, which
is still better than nullability checks all over the place.
Is it? Doesn't it just mean writing this:
try {
someFunction($object->propertyThatClaimsToBeNonNullable);
} catch ( TypeError $e ) {
...
}
Instead of this:
if ( ! is_null($object->propertyThatClaimsToBeNonNullable) ) {
someFunction($object->propertyThatClaimsToBeNonNullable);
} else {
...
}
For that matter, all I need to do is define someFunction as taking a
non-nullable parameter, and I get the TypeError either way.
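A small sketch of that last point, using a hypothetical Order class and a stand-in body for someFunction(): a nullable property that was never filled in still produces a TypeError as soon as it is passed to a parameter declared as non-nullable.

```php
<?php
declare(strict_types=1);

// Hypothetical class using the nullable-with-default style.
class Order
{
    public ?string $reference = null;
}

// Stand-in for someFunction(); the parameter is declared non-nullable.
function someFunction(string $value): string
{
    return strtoupper($value);
}

$caught = false;
try {
    someFunction((new Order())->reference); // passes null
} catch (TypeError $e) {
    // User-defined functions never coerce null to string.
    $caught = true;
}
```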
Surely the point of a non-nullable property shouldn't be "it gives a
slightly different error if it's not set", it should be "you don't have
to worry about this not being set, because the language will enforce
that somewhere". (And to cover your last point, that somewhere doesn't
need to be the constructor, if requiring that is really such a big problem.)
Regards,
--
Rowan Collins
[IMSoP]
That's what static analysis is for (see a bit above).
Sadly, static analysis doesn't really fit the engine out of the box, as PHP
is a bit too dynamic for that, and it doesn't really consider cross-file
declared symbols anyway.
Still, tools like PHPStan or Psalm can easily aid with that.
Also, there are scenarios (discussed in typed properties v1) that make the
uninitialized state actually favorable (every serializer ever, like every
one, really).
Marco Pivetta
Also, there are scenarios (discussed in typed properties v1) that make the
uninitialized state actually favorable (every serializer ever, like every
one, really).
I still find a problem with this idea that everything must be
initialized. If I am working with business logic built into the database
then 'NULL' is very much a valid state and when I create an object
encapsulating a record its values should be NULL until it is actually
downloaded, or more normally until a new record has SOME of its fields
populated. It is simply not sensible to be nailing down every variable
that is being passed and it is certainly not 'bad coding' to be working
with uninitialized data - it's handling just what needs to be
initialized that is the job of the code. And that is unlikely to be done
in the constructor!
--
Lester Caine - G8HFL
Contact - https://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - https://lsces.co.uk
EnquirySolve - https://enquirysolve.com/
Model Engineers Digital Workshop - https://medw.co.uk
Rainbow Digital Media - https://rainbowdigitalmedia.co.uk
I think this code should be allowed:
class User {
public int $id;
public string $preferred_name;
public string $username;
}
This code is broken - by making the properties non-nullable, you're
literally saying "these properties will be initialized", then
proceeding to not initialize them. That's just incomplete code.
Note that we're talking about two different things here - I was
talking about bypassing declared constructors for technical reasons.
You're talking about omitting constructors from the declaration - but
all classes have a constructor. Not declaring it just implies an empty
constructor. You can even invoke it using reflection.
Your use-case, assuming you insist on writing bad code, is more
accurately expressed like this:
class User {
public ?int $id;
public ?string $preferred_name;
public ?string $username;
}
These properties are valid without a constructor. You can safely infer
them as null, rather than as "uninitialized".
Non-nullable properties aren't valid without initialization. Not in
any language I've ever heard of.
Maybe PHP has to be the first to prove every other language right?
We have enough half-baked features and inconsistencies as it is.
We'll regret this forever. Like how JavaScript developers regret
having null and undefined on a daily basis.
Uninitialized is the new null - it's the PHP equivalent of undefined
in JavaScript.
Please no.
This will be my last reply to this thread. Fundamentally:
class User {
public ?int $id;
public ?string $preferred_name;
public ?string $username;
}
^ This permits null properties at all times. This is acceptable
behavior if null is valid for the domain. It is not valid for this
domain -- all 3 are required.
class User {
public int $id;
public string $preferred_name;
public string $username;
}
^ This never permits null properties, and using them without
initializing them is an error, and you get notified by the runtime
that such a thing happened. This is good and desirable behavior.
This will be my last reply to this thread.
This will be my first and, God willing, only reply to this thread.
Fundamentally:
class User {
public int $id;
public string $preferred_name;
public string $username;
}
^ This never permits null properties, and using them without
initializing them is an error, and you get notified by the runtime
that such a thing happened. This is good and desirable behavior.
This is bad and undesirable behavior.
Deferring the error to on-read makes the properties magic and
unknowable. This is broken by design. The RFC got my vote because
broken never stood in PHP's way, and at the very least, as a library
author, I am empowered to aggressively initialize my properties even
if the runtime gives me insufficient protections from coding errors.
A static analysis engine can pick up the slack where the engine falls
short, so that's my yes vote, but it's not an endorsement of the
fundamentally broken design.
-Sara
This will be my last reply to this thread. Fundamentally:
class User {
public ?int $id;
public ?string $preferred_name;
public ?string $username;
}
^ This permits null properties at all times. This is acceptable
behavior if null is valid for the domain. It is not valid for this
domain -- all 3 are required.
If all three are required, your constructor needs to reflect that.
The reason constructors even exist in the first place is to initialize
the object's data members and establish the invariant of the class.
Constructors are supposed to prepare objects for use.
Your (empty, implicit) constructor isn't doing that.
Your reasoning about this is circular - you want your properties to
be optionally initialized and non-null at the same time. You get
around this by coming up with a new term "undefined" for a
special kind of null that triggers run-time exceptions on read.
The whole idea is unnecessarily complex, and will certainly lead
to silent bugs that allow partially-initialized domain objects to
pass through many layers of a system without getting caught
until some code tries to read an "uninitialized" property.
function make_foo(): Foo {
return new Foo(); // missing a property
}
function x() { ... }
function y() { ... }
function z() { ... }
x(y(z(make_foo())));
The instance travels through layers and layers of calls and
fails somewhere in the x() or y() function because of a missing
value ... why was the value missing? why wasn't it initialized?
where was it supposed to have been initialized? where did
this bad instance of Foo even come from?
All that information is lost in a call-stack that's long gone by
the time you invoke x() and you'll have to backtrack through
layers and layers of code to try to figure out where this bad
instance came from.
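The scenario above can be reproduced with a short sketch, assuming the read-time check that PHP 7.4 eventually shipped. The bodies of x(), y() and z() are hypothetical pass-through layers; only x() actually reads the property.

```php
<?php
declare(strict_types=1);

class Foo
{
    public int $bar; // never assigned by make_foo()
}

function make_foo(): Foo
{
    return new Foo(); // the actual bug: $bar is left uninitialized
}

// Hypothetical layers; z() and y() only pass the instance along.
function z(Foo $foo): Foo { return $foo; }
function y(Foo $foo): Foo { return $foo; }
function x(Foo $foo): int { return $foo->bar + 1; } // first read of $bar

$failedIn = null;
try {
    x(y(z(make_foo())));
} catch (Error $e) {
    // The innermost frame of the trace is where the read happened,
    // not where the object was built.
    $failedIn = $e->getTrace()[0]['function'];
}
```

By the time the error is raised, make_foo() has long since returned, so it does not appear in the stack trace at all.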
This will be an everyday thing with code like that.
To say that we don't need to enforce invariants at the time of
construction is to say we don't need to enforce them at all -
enforcing them "maybe later" is the same as not enforcing
them. That is, when they trigger an error, it's no different
from a null-value triggering a similar error.
No matter how you twist it, uninitialized is the new null.
I'm fine with uninitialized as an implementation detail that
ensures you can't read from properties while the constructor
is busy establishing the invariant.
I'm not at all fine with uninitialized as a new language feature.
Ignoring the debate on uninitialized/null ... not all objects ARE
invariant and there are very good reasons for not setting values for
everything, but it seems that these types of object are deemed to be
'bad coding' where in fact they simply have elements that have yet to be
'initialized', if at all, for this instance of the object. The constructor
simply creates those elements that need to exist and does nothing with
the holders for elements that have yet to be populated ... they stay null.
--
Lester Caine - G8HFL
Ignoring the debate on uninitialized/null ... not all objects ARE
invariant
hence nullable types.
and there are very good reasons for not setting values for
everything, but it seems that these types of object are deemed to be
'bad coding' where in fact they simply have elements that have yet to be
'initialized', if at all, for this instance of the object.
that's what nullable types are for.
The constructor
simply creates those elements that need to exist and does nothing with
the holders for elements that have yet to be populated ... they stay null.
hence nullable types.
the point of using type-hints on the properties of a class is to describe
the invariant state of that class - if the state of an instance of that class
is not what the class itself prescribed/promised at the time when you
return from the constructor, what's the point?
the only difference between using nullable vs non-nullable property
type-hints then, is whether you can set them back to null after setting
them to a value - but it's okay for them to return from the constructor in
a "null-like" state?
this doesn't provide any additional guarantees:
class Foo {
public int $bar;
}
$foo = new Foo(); // invalid state allowed
$foo->bar = 123; // valid state
$foo->bar = null; // invalid state NOT allowed?!
Have the effects on the null coalesce operator even been considered?
$bar = $foo->bar ?? "what";
Is uninitialized "null-like enough" or will this trigger an error?
Extremely confusing.
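The thread itself does not answer this question. As a sketch of how it turned out, assuming the behaviour PHP 7.4 eventually shipped: isset() reports an uninitialized typed property as not set, and ?? uses the same check, so the fallback is taken without an error.

```php
<?php
declare(strict_types=1);

class Foo
{
    public int $bar;
}

$foo = new Foo();

// isset() treats "uninitialized" the same way as "not set".
$isSet = isset($foo->bar);

// ?? relies on the same check, so the right-hand side is used.
$bar = $foo->bar ?? "what";

$foo->bar = 123;
$barAfter = $foo->bar ?? "what"; // property now holds a value
```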
Type-annotations are essentially assertions about the state of a program -
if you can't count on those assertions to be fulfilled (which you can't if
they're not checked at the time of initialization) then they're not useful.
The bottom line for me is that property type-hints were supposed to make
my programs more predictable, make the language more reliable.
Instead, we've introduced a whole new kind of uncertainty, new ways to
write unpredictable code, a new kind of null that makes the language even
more unreliable.
In terms of reliability, it's actually sub-par to hand-written accessors.
Shorter syntax, reflection support, great - but ultimately at the cost of
reliability and predictable state.
For one, what good is reflection, if it tells me I can expect a property to be
an integer, and it turns out it doesn't have a value after all?
If you want horrible code with no guarantees, there are plenty of ways to
do that already - you don't need property type-hints for anything more than
mere convenience in the first place.
this doesn't provide any additional guarantees:
class Foo {
public int $bar;
}
$foo = new Foo(); // invalid state allowed
$foo->bar = 123; // valid state
$foo->bar = null; // invalid state NOT allowed?!
[snip]
In terms of reliability, it's actually sub-par to hand-written accessors.
You have no guarantee that a declared public property even exists
(regardless of the Typed Properties 2.0 RFC), because it might have been
unset. And no, the property is not really NULL in this case, but rather
undefined.
--
Christoph M. Becker
Rasmus, would the "checkpoint validity checks" I suggested in another branch
of this thread ameliorate your concerns, at least to some degree? I think
they're as good as we'd be able to get, given the nature of PHP, but should at
least cover the most common cases.
--Larry Garfield
I am sympathetic to the wish of reporting initialization errors during
construction rather than at the use-site. However, there is no clear
proposal on how this will be reconciled with other requirements that we
have, such as support of uninitialized properties for lazy initialization,
etc.
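As an illustration of the lazy-initialization requirement mentioned above, here is a rough sketch of the unset()-plus-__get() pattern, assuming PHP 7.4 semantics. The Report class, its $loads counter and the loaded data are all hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical class; names and data are illustrative only.
class Report
{
    public array $rows;
    public int $loads = 0;

    public function __construct()
    {
        // Explicitly unsetting a declared property lets __get() intercept
        // the first read of it.
        unset($this->rows);
    }

    public function __get(string $name)
    {
        if ($name !== 'rows') {
            throw new LogicException("Unknown property: $name");
        }
        $this->loads++;
        $this->rows = ['expensive', 'result']; // stand-in for a costly load
        return $this->rows;
    }
}

$report = new Report();
$first  = $report->rows; // goes through __get() and initializes the property
$second = $report->rows; // ordinary property read from here on
```

A rule that every typed property must hold a valid value when the constructor returns would have to make room for this kind of code, which is the reconciliation problem the paragraph above refers to.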
I do not consider it advisable to require a null initialization as a "first
iteration" of the proposal. Regardless of what our intention might be, the
effect of such a restriction will not be "I'm not going to type this
property for now, because non-nullable types are not supported", it's going
to be "I'll give this a nullable type even though it isn't, because that's
better than no type at all." Use of nullable types where they are not
necessary would be a disastrous outcome of this proposal.
To move this discussion forward in a productive direction, we need a
concrete, detailed proposal of how enforcement of initialization should
work, while being compatible with secondary requirements. I believe it goes
without saying that we cannot change the typed properties RFC in such a
substantive way at this point in time. However, if you or someone else can
bring forward an RFC that specifies the precise semantics of initialization
checks as you envision them and which is endorsed by maintainers of
libraries with special initialization requirements, then we still have a
lot of time to discuss and incorporate such a proposal before typed
properties ship in PHP 7.4.
Thanks,
Nikita
I do not consider it advisable to require a null initialization as a
"first iteration" of the proposal. Regardless of what our intention might
be, the effect of such a restriction will not be "I'm not going to type
this property for now, because non-nullable types are not supported", it's
going to be "I'll give this a nullable type even though it isn't, because
that's better than no type at all." Use of nullable types where they are
not necessary would be a disastrous outcome of this proposal.
This, ultimately, is where we disagree. To me, the "non-nullable types"
this proposal provides are non-nullable in name only, and using them does
little more than adding a comment saying "I promise this won't be null".
Encouraging people to use them gives them a false guarantee, and allowing
them to do so prevents us adding a stricter version of the feature later.
To move this discussion forward in a productive direction, we need a
concrete, detailed proposal of how enforcement of initialization should
work, while being compatible with secondary requirements.
It feels to me that many of the "secondary requirements" are actually
separate problems which should be seen as pre-requisites, rather than
constraints. The lazy initialization pattern seems to be a hack around lack
of better support for property accessors. The mention of serializers
needing custom methods of initialisation reminded me of the occasional
discussion of better replacements for Serializable and JsonSerializable.
And so on.
Fixing that list of pre-requisites would obviously take time, which is why
I wanted to buy that time by releasing initialised-only type hints first,
and working towards a greater goal.
However, I've probably beaten this drum long enough. I will hope to be
wrong, and that making things stricter will still be possible later.
Regards,
Rowan Collins
[IMSoP]
On Thu, Sep 20, 2018 at 12:50 PM Rowan Collins rowan.collins@gmail.com
wrote:
Encouraging people to use them gives them a false guarantee, and allowing
them to do so prevents us adding a stricter version of the feature later.
What exactly would prevent us from enforcing it in the future? The way I
see it, whenever that would be possible, we could get rid of the
uninitialized state.
Regards,
Pedro
If I may...
I think the distinction here is that one group is arguing for "state of the
data assertions" while the RFC as implemented is "setter assertion shorthand".
That is, it doesn't assert that a value IS a given type, but that it can only
be SET TO a given type.
That naturally has some coverage holes. The question then being "is that
enough?"
I don't think a complete IS enforcement is possible given PHP's nature. We
don't enforce at compile time that you cannot call a function with an
incorrect type; that's enforced at runtime, or by a static analyzer. That is:
function test(int $a) {}
test($_GET['a']);
Cannot possibly be verified at compile time. This isn't a compile time check
either:
$b = 'foo';
test($b);
It's a runtime TypeError, not compile time.
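As a minimal sketch of that runtime behaviour (the try_call() helper is hypothetical and only wraps the test() function from the example above), the failure can be observed as a catchable TypeError raised at the call itself:

```php
<?php
// test() mirrors the function in the example above; try_call() is a
// hypothetical helper added only to make the outcome observable.
function test(int $a): int
{
    return $a;
}

function try_call($value): string
{
    try {
        test($value);
        return 'ok';
    } catch (TypeError $e) {
        // Raised when the call is made, not when the file is compiled.
        return 'TypeError';
    }
}
```

Calling try_call('foo') takes the catch branch, while try_call(42) does not; nothing is reported until the call actually runs.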
That hasn't brought about the end of the world, so I'm not super worried about
imperfect property checks destroying everything.
That said, I totally appreciate the desire as a consumer of an object to know
that it's read-type-safe. I also get that there's many different pathways by
which an object could get initialized so there's no one clear validation
chokepoint.
But... could we have multiple?
To wit, could we add an engine check to scan an object and make sure its
objects are all type valid right-now (viz, nothing is uninitialized), and then
call it on selected actions automatically and allow users to call it at
arbitrary times if they are doing more esoteric things?
I'm thinking:
- on __construct() exit.
- on __wakeup() exit.
- Possibly other similar checkpoints.
- When a user calls is_fully_initialized($obj); (or something)
is_fully_initialized() would return bool. The other checkpoints would throw a
TypeError.
That would layer on top of the current RFC cleanly, would give us the read-
assurance that we'd like in the 95% case, wouldn't require any new syntax
beyond a single function, and still lets serialization libraries do their
thing.
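A rough userland approximation of such a check, assuming the reflection API that later shipped with typed properties in PHP 7.4 (ReflectionProperty::isInitialized()), might look like the following. The proposal above is for an engine-level check; the function name is only the placeholder used in this message, and the Point class is hypothetical.

```php
<?php
// Userland sketch only: the proposal is for the engine to do this, and
// is_fully_initialized() is just the placeholder name from the message.
function is_fully_initialized(object $obj): bool
{
    $class = new ReflectionObject($obj);
    do {
        foreach ($class->getProperties() as $property) {
            if ($property->isStatic()) {
                continue;
            }
            if (PHP_VERSION_ID < 80100) {
                // Before PHP 8.1, non-public properties must be made accessible.
                $property->setAccessible(true);
            }
            if (!$property->isInitialized($obj)) {
                return false;
            }
        }
        // Walk up so private properties declared in parent classes are covered.
        $class = $class->getParentClass();
    } while ($class !== false);

    return true;
}

// Hypothetical class whose constructor forgets one property.
final class Point
{
    public int $x;
    public int $y;

    public function __construct(int $x)
    {
        $this->x = $x; // $y is never assigned
    }
}
```

With this sketch, is_fully_initialized(new Point(1)) reports false until $y is assigned. An engine check on constructor exit would presumably throw at that point instead of returning a bool.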
If needed, perhaps there's a way we can let a serialization tool disable that
check as an opt-out, and then the developer is just on-their-honor to get it
right. If it blows up later, well, file a bug with the serialization library.
In a dynamic runtime language I don't think we'll ever be able to do better
than that.
Would that be viable as a follow-up to the current RFC?
--Larry Garfield
I think the distinction here is that one group is arguing for "state of the
data assertions" while the RFC as implemented is "setter assertion shorthand".
That is, it doesn't assert that a value IS a given type, but that it can only
be SET TO a given type.
I don't think a complete IS enforcement is possible given PHP's nature.
...
That hasn't brought about the end of the world, so I'm not super worried about
imperfect property checks destroying everything.
That said, I totally appreciate the desire as a consumer of an object to know
that it's read-type-safe.
This is a useful distinction, thank you for putting it so clearly.
Perhaps it depends whether you are looking at the feature as a library
author, or a library consumer: as an author, you want to protect "your"
objects from invalid assignments, inside and outside the library; as a
consumer, you want assurance that you can use the object in a particular
way. The current implementation provides a good tool for the author, but
falls short for the consumer.
To wit, could we add an engine check to scan an object and make sure its
objects are all type valid right-now (viz, nothing is uninitialized), and then
call it on selected actions automatically and allow users to call it at
arbitrary times if they are doing more esoteric things?
I'm thinking:
- on __construct() exit.
- on __wakeup() exit.
- Possibly other similar checkpoints.
- When a user calls is_fully_initialized($obj); (or something)
is_fully_initialized() would return bool. The other checkpoints would throw a
TypeError.
That would layer on top of the current RFC cleanly, would give us the
read-assurance that we'd like in the 95% case
I think this is a really sensible approach: as you say, it may never be
possible to assert the invariant in every case, so checking the most common
scenarios may be the pragmatic thing to do.
Conveniently, treating it as an occasional assertion, rather than a strict
invariant, means we can keep the unset() lazy-initialization hack; it's
still an odd feature IMO, but probably not all that likely to be triggered
by mistake.
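For readers unfamiliar with the unset() pattern mentioned here, a minimal sketch follows, assuming the semantics that shipped in PHP 7.4: a typed property that has been explicitly unset() falls through to __get() on the next read. The Report class and its members are hypothetical.

```php
<?php
// Hypothetical class illustrating lazy initialization via unset().
final class Report
{
    public array $rows;
    public int $loads = 0;

    public function __construct()
    {
        // Returning the property to the uninitialized state makes the first
        // read fall through to __get().
        unset($this->rows);
    }

    public function __get(string $name)
    {
        if ($name !== 'rows') {
            throw new LogicException("Unknown property $name");
        }
        $this->loads++;
        // Assigning here initializes the real property, so later reads
        // no longer reach __get().
        $this->rows = ['loaded on first access'];
        return $this->rows;
    }
}
```

Note that such an object deliberately leaves its constructor with an uninitialized property, so a strict constructor-exit check would need to tolerate or exempt this case.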
Regards,
Rowan Collins
[IMSoP]
Perhaps another disconnect here is that, in practice, the consumer of an
object property is, in my experience, almost always "me". I almost never have
public properties on my objects. On the rare occasion I do, it's for a
"struct object" I'm using internally as a more self-documenting and memory
efficient alternative to a nested associative array.
In either case, if I fail to initialize a variable the only code that would be
impacted is... my own, usually in the same class (or occasionally in a
subclass). Finding the bug then is pretty straightforward, and it's my own
damned fault, but therefore easy to fix.
In my experience at least, most of the modern code I see in the wild is the
same way. That means the potential impact of stray undefined properties
roaming around the code base is really small. Even vaguely reasonably
structured code already avoids this problem. I can't think of anything I've
written in the last few years that wouldn't work this way, even with a
constructor-exit-validation check, without any modification at all.
Naturally if someone is using a lot of public properties in their code the
potential for "undefined" bugs increases. That seems rather uncommon in my
world, though.
Another benefit, since it would, I think, boil down to a series of isset()
calls implemented in C land the performance impact is probably not measurable.
(Obviously we'd need to measure that... :-) )
--Larry Garfield
Perhaps another disconnect here is that, in practice, the consumer of an
object property is, in my experience, almost always "me". I almost never have
public properties on my objects. On the rare occasion I do, it's for a
"struct object" I'm using internally as a more self-documenting and memory
efficient alternative to a nested associative array.
My impression is that people will be encouraged by this feature to
implement more value objects with public properties, where they currently
have getters and setters doing nothing but type checks; and thus more code
will be exposed to uninitialized properties if there is a bug in a
constructor.
Indeed, it's arguable that if all properties were private, there would be
no need to enforce type hints at all, since analysis of where they were set
would be trivial.
Regards,
Rowan Collins
[IMSoP]
I apologize for the long posts, but Larry asked me to comment on this.
I think the distinction here is that one group is arguing for "state of the
data assertions" while the RFC as implemented is "setter assertion shorthand".
The point of the setter assertions is to provide guarantees about the state of
the data - there is literally no difference.
That is, it doesn't assert that a value IS a given type, but that it can only
be SET TO a given type.
That is literally the same thing - if it can only be set to a given
type, it can only BE a given type.
The problem here is you want to make exemptions for null as type - as though
nulls aren't types (which they are) and as though nullable types aren't distinct
from the types they're derived from. (again, they are.)
I don't think a complete IS enforcement is possible given PHP's nature.
I don't think that, and I don't expect that - I'm not suggesting we enforce
anything statically, I'm merely suggesting that a constructor needs to satisfy
the constraints specified by the class.
If you've type-hinted a property as int, you don't allow strings - that would be
pointless, right?
But that is precisely the problem we're talking about - if you've type-hinted a
property as non-nullable int, it shouldn't allow null... but that is literally
the exemption you want to allow for - you just want to annotate the property
with an additional meta-data property "uninitialized", which is literally the
same as adding a boolean "is set" for every property in your model.
This isn't a compile time check either:
$b = 'foo';
test($b);
It's a runtime TypeError, not compile time.
This actually illustrates my point nicely - the variable $b in this case has
nothing to do with the state of the parameter var inside the function body,
until you pass it via a function call. That's perfectly fine. You'll get an
error at the point where the parameter type-constraint was violated - you can
look at a stack-trace and debug that easily.
Now take your example and call a constructor instead of a function:
class Test {
    public string $b;
}
$b = 'foo';
$test = new Test();
Your constructor call generates no run-time error here, and the program
continues to execute with $test in a state that, according to specification of
the class itself, isn't valid.
Since the constructor is how a valid instance of Test gets generated in the
first place, you've missed your only chance to enforce those constraints -
meaning, you've missed the only opportunity you had to verify that the
constructor does indeed generate an instance of Test that lives up to
Test's own specification of what a Test instance is.
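To make the timing concrete, here is a small sketch of the behaviour as the feature eventually shipped in PHP 7.4. TestObject is a renamed copy of the Test class above and read_b() is a hypothetical helper. Construction succeeds silently, and the error only appears at whichever later point first reads the property.

```php
<?php
// Renamed copy of the Test class from the example above.
final class TestObject
{
    public string $b;
}

// Hypothetical helper standing in for "completely unrelated code" that
// happens to be the first to read the property.
function read_b(TestObject $t): string
{
    try {
        return $t->b;
    } catch (Error $e) {
        // Catching Error also covers TypeError, which extends it.
        return $e->getMessage();
    }
}
```

Here read_b(new TestObject()) returns the engine's "must not be accessed before initialization" message, while the new TestObject() expression itself raises nothing.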
To wit, could we add an engine check to scan an object and make sure its
objects are all type valid right-now (viz, nothing is uninitialized), and then
call it on selected actions automatically and allow users to call it at
arbitrary times if they are doing more esoteric things?
In my opinion, this is a solution to the problem we created when we decided
every property should internally be annotated with an "initialized" boolean
property - you have all this extra state that wasn't part of the specification
of the class itself, and now you have to deal with that state.
In my opinion, tracking this extra state during the constructor call is
acceptable and necessary in a scripting language like PHP - I never
asked for static type-checking, I'm merely asking for this check to be
built-into the language and performed at run-time when you exit the
constructor, e.g. at the last moment where you can feasibly perform
this check.
If you don't perform that check, you have no guarantees at all.
Consider this example:
function login(int $user_id) {
// ...
}
Inside the body of this function, you can safely assume you have an integer,
right? A basic guarantee provided by the language.
Now consider this alternative:
function login(User $user) {
// ...
}
Inside the body of this function, you'd assume you have a valid instance
of User, right? The same basic guarantee provided by the language.
Wrong. Something might be "uninitialized". This might in fact not be a valid
instance of User, it might be only partially initialized.
This is a problem, because I want my type User to be a reliable type - as
reliable as any other type.
Integers, for example, are nice, because I know I'm going to be able to add,
subtract, multiply, divide, and so on - if you've type-hinted a parameter as
int, you know it has those abilities, you know the + operator isn't going to
fail because there's a special kind of int that doesn't work with the
- operator.
We want those same guarantees from our own types, that's all.
In the case where a type doesn't provide such a guarantee, it can explicitly
state that something is nullable - not guaranteed to be set. You have that
option already, you don't need uninitialized properties for that.
If there really is a use-case for "optional values that must not
be removed again once they've been set" - and I can't think of one - but
if there is, you can handle that rare case with logic in getters/setters.
The normal everyday use-case is you want to know if a property is going
to be set or not - hence nullable property-types, which are part of the RFC.
Userland work-arounds for basic guarantees that the language should be
providing, in my opinion, are totally unacceptable - in a nutshell, because
basic guarantees provided for other types won't apply to objects.
It's incoherent with the workings of type-hints in the language, and will
make those substantially less useful.
That is, it doesn't assert that a value IS a given type, but that it can only
be SET TO a given type.
That is literally the same thing - if it can only be set to a given
type, it can only BE a given type.
Consider this analogy: if I build a house with a standard, human-sized
door, I cannot walk an elephant through that door; however, if the
elephant was already there, I could build the house around it. So, the
assertion that "no elephant can get in through that door" is not the
same as the assertion "there is no elephant inside that house". If
you're interested in protecting your house from roaming elephants, the
small door is probably sufficient for your needs; if you want to know
that when you walk into a house, there is no elephant inside, you need
other assurances.
What Larry is pointing out is that Levi and Marco are happy to make the
door elephant-proof (prevent nulls being assigned to the properties);
whereas you and I are looking for a stronger guarantee that there will
never be an elephant inside (that the object will not be in an invalid
state).
I don't think that, and I don't expect that - I'm not suggesting we enforce
anything statically, I'm merely suggesting that a constructor needs to satisfy
the constraints specified by the class.
If you read carefully, that's exactly what Larry's proposal below requires.
To wit, could we add an engine check to scan an object and make sure its
objects are all type valid right-now (viz, nothing is uninitialized), and then
call it on selected actions automatically and allow users to call it at
arbitrary times if they are doing more esoteric things?
In my opinion, this is a solution to the problem we created when we decided
every property should internally be annotated with an "initialized" boolean
property - you have all this extra state that wasn't part of the specification
of the class itself, and now you have to deal with that state.
In my opinion, tracking this extra state during the constructor call is
acceptable and necessary in a scripting language like PHP - I never
asked for static type-checking, I'm merely asking for this check to be
built-into the language and performed at run-time when you exit the
constructor, e.g. at the last moment where you can feasibly perform
this check.
I'm not sure if you've misunderstood Larry's proposal, or are just
agreeing with it: the sentence you quote says "add an engine check" and
"call it on selected actions automatically"; and you ask for it to be
"built-into the language and performed at run-time"; you mention it
happening "when you exit the constructor", and just below the part you
quote, so does he:
- on __construct() exit.
- on __wakeup() exit.
- Possibly other similar checkpoints.
The key compromise, however, is that this still doesn't guarantee that
there is no elephant in the house: there will be ways to create an
object that don't go through the constructor, so won't trigger this
check; and ways to manipulate an object that will put it into an invalid
state. Not to mention that inside the constructor, $this will be usable
as a normal object, unlike in a true 2-phase initialisation system like
Swift's.
In short, we will still need runtime errors for attempting to read an
uninitialized property, but they will be much less likely to happen
accidentally and show up a long way from their cause.
I would be interested to hear if there are any use cases that this would
break - a static factory method can defer as much initialisation as
needed to the constructor, and cases where the constructor is bypassed
will also bypass the check.
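One existing way to create an object without running its constructor is ReflectionClass::newInstanceWithoutConstructor(), which hydrators and serializers commonly rely on. The sketch below uses a hypothetical Account class and helper, and assumes the PHP 7.4 reflection API. It shows why a check attached to constructor exit would never fire for such objects.

```php
<?php
// Hypothetical class: the constructor is the only place $owner is assigned.
final class Account
{
    public string $owner;
    public bool $constructed = false;

    public function __construct(string $owner)
    {
        $this->owner = $owner;
        $this->constructed = true;
    }
}

// Hypothetical helper: builds an instance the way a hydrator might,
// skipping __construct() entirely.
function create_without_constructor(string $class): object
{
    return (new ReflectionClass($class))->newInstanceWithoutConstructor();
}

function owner_is_initialized(Account $account): bool
{
    return (new ReflectionProperty(Account::class, 'owner'))->isInitialized($account);
}
```

An Account obtained from create_without_constructor() has $constructed still false and $owner uninitialized, so the read-time error remains the only safety net on that path.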
Regards,
--
Rowan Collins
[IMSoP]
Rasmus: Please be careful when you refer to "you" as wanting or doing
something. I had no part in the patch design or implementation beyond saying
"yay!" a few times on the list/twitter. "I" don't want a particular engine
pseudo-type (uninitialized); I honestly don't care about that level of
implementation detail. I am only trying to offer suggestions for how to
tighten the guarantees further without violating the basic facts of how PHP
works.
LOL. Rowan, I love that analogy. It expresses the problem space very well.
Yes, Rowan has it exactly. To continue the analogy, I'm proposing that, once
the house is built, you take a moment to look inside and say "Holy crap,
there's an elephant in here!"
In more code terms, this is an extremely common pattern in PHP today (using
the post-RFC syntax):
class Doer {
    protected Processor $processor;
    public function __construct(Processor $p) {
        $this->processor = $p;
    }
    // ...
}
This code is obviously valid, because while there is a moment at which the
Doer::$processor value is null/uninitialized/not-a-Processor-object, it's a
short-lived moment and by the time anyone other than the class author cares
marked as nullable would, in fact, be a worse guarantee.
the constructor it would only blow up in my face, not someone else's.
class DoerOne {
    public Processor $processor;
}
$d1 = new DoerOne();
$d1->processor;

class DoerTwo {
    public function processor() : Processor { return null; }
}
$d2 = new DoerTwo();
$d2->processor();
Both $d1->processor and $d2->processor() will fail with a TypeError. In both
cases it's very clearly and obviously the fault of the Doer class's author. In
both cases the TypeError happens on access, and I can work back from there to
figure out where it should have been set. In both cases my course of action
It can be done one better, though, by adding an is_there_an_elephant() check
at key points. Effectively, it becomes logically equivalent to:
if (is_there_an_elephant($this)) {
    throw new TypeError('Bad, Elephant, no cookie!');
}
Serializable::unserialize(), or whatever else. (By analogy, also checking
that an elephant didn't get in while the window was open.)
public Address $streetAddress;
}
Because you would at least need a constructor to set $streetAddress to an
empty Address value. Which... I am OK with and consider a good trade-off.
PS: It was so, SO hard to write elephant and not elePHPant throughout this
post... :-)
Larry,
this wasn't aimed at "you" personally, I'm just using the proverbial "you",
as in "not me".
if you read my last post (especially the last part) carefully, you'll see
why this elephant analogy is incomplete.
the issue is not whether or not something gets in - it's much more far
reaching than that.
the issue is, once something gets in, can you even be sure that that
something is what it claims to be?
at the moment you can't, and that's a serious issue - type hints appear to
provide some guarantees that in fact aren't provided at all. it's
confusing, and the explanations are more complex than they need to be.
(and I guess that's generally an issue in very dynamic languages, but type
hints are supposed to provide a means to improve on these problems where it
makes sense - not make the problems worse.)
Rasmus: Please be careful when you refer to "you" as wanting or doing
something. I had no part in the patch design or implementation beyond
saying
"yay!" a few times on the list/twitter. "I" don't want a particular
engine
pseudo-type (uninitialized); I honestly don't care about that level of
implementation detail. I am only trying to offer suggestions for how to
tighten the guarantees further without violating the basic facts of how
PHP
works.That is, it doesn't assert that a value IS a given type, but that it
can
only be SET TO a given type.That is literally the same thing - if it can only be set to a given
type, it can only BE a given type.Consider this analogy: if I build a house with a standard, human-sized
door, I cannot walk an elephant through that door; however, if the
elephant was already there, I could build the house around it. So, the
assertion that "no elephant can get in through that door" is not the
same as the assertion "there is no elephant inside that house". If
you're interested in protecting your house from roaming elephants, the
small door is probably sufficient for your needs; if you want to know
that when you walk into a house, there is no elephant inside, you need
other assurances.What Larry is pointing out is that Levi and Marco are happy to make the
door elephant-proof (prevent nulls being assigned to the properties);
whereas you and I are looking for a stronger guarantee that there will
never be an elephant inside (that the object will not be in an invalid
state).LOL. Rowan, I love that analogy. It expresses the problem space very
well.I don't think that, and I don't expect that - I'm not suggesting we
enforce
anything statically, I'm merely suggesting that a constructor needs to
satisfy the constraints specified by the class.If you read carefully, that's exactly what Larry's proposal below
requires.To wit, could we add an engine check to scan an object and make sure
its
objects are all type valid right-now (viz, nothing is unitialized),
and
then call it on selected actions automatically and allow users to call
it at arbitrary times if they are doing more esoteric things?In my opinion, this is a solution to the problem we created when we
decided
every property should internally be annotated with an "initialized"
boolean
property - you have all this extra state that wasn't part of the
specification of the class itself, and now you have to deal with that
state.In my opinion, tracking this extra state during the constructor call
is
acceptable and necessary in a scripting language like PHP - I never
asked for static type-checking, I'm merely asking for this check to be
built-into the language and performed at run-time when you exit the
constructor, e.g. at the last moment where you can feasibly perform
this check.I'm not sure if you've misunderstood Larry's proposal, or are just
agreeing with it: the sentence you quote says "add an engine check" and
"call it on selected actions automatically"; and you ask for it to be
"built-into the language and performed at run-time"; you mention it
happening "when you exit the constructor", and just below the part youquote, so does he:
- on __construct() exit.
- on __wakeup() exit.
- Possibly other similar checkpoints.
The key compromise, however, is that this still doesn't guarantee that
there is no elephant in the house: there will be ways to create an
object that don't go through the constructor, so won't trigger this
check; and ways to manipulate an object that will put it into an invalid
state. Not to mention that inside the constructor, $this will be usable
as a normal object, unlike in a true 2-phase initialisation system like
Swift's.In short, we will still need runtime errors for attempting to read an
uninitialized property, but they will be much less likely to happen
accidentally and show up a long way from their cause.I would be interested to hear if there are any use cases that this would
break - a static factory method can defer as much initialisation as
needed to the constructor, and cases where the constructor is bypassed
will also bypass the check.Regards,
Yes, Rowan has it exactly. To continue the analogy, I'm proposing that, once
the house is built, you take a moment to look inside and say "Holy crap,
there's an elephant in here!"

In more code terms, this is an extremely common pattern in PHP today (using
the post-RFC syntax):

class Doer {
    protected Processor $processor;
    public function __construct(Processor $p) {
        $this->processor = $p;
    }
    // ...
}

This code is obviously valid, because while there is a moment at which the
Doer::$processor value is null/uninitialized/not-a-Processor-object, it's a
short-lived moment and by the time anyone other than the class author cares
marked as nullable would, in fact, be a worse guarantee.
the constructor it would only blow up in my face, not someone else's.
class DoerOne {
    public Processor $processor;
}
$d1 = new DoerOne();
$d1->processor;

class DoerTwo {
    public function processor() : Processor { return null; }
}
$d2 = new DoerTwo();
$d2->processor();

Both $d1->processor and $d2->processor() will fail with a TypeError. In both
cases it's very clearly and obviously the fault of the Doer class's author. In
both cases the TypeError happens on access, and I can work back from there to
figure out where it should have been set. In both cases my course of action
It can be done one better, though, by adding an is_there_an_elephant() check
at key points. Effectively, it becomes logically equivalent to:
if (is_there_an_elephant($this)) {
throw new TypeError('Bad, Elephant, no cookie!');
}
Serializable::unserialize(), or whatever else. (By analogy, also checking
that an elephant didn't get in while the window was open.)
public Address $streetAddress;
}
Because you would at least need a constructor to set $streetAddress to an
empty Address value. Which... I am OK with and consider a good trade-off.
PS: It was so, SO hard to write elephant and not elePHPant throughout this
post... :-)
Larry,
this wasn't aimed at "you" personally, I'm just using the proverbial "you",
as in "not me".

if you read my last post (especially the last part) carefully, you'll see
why this elephant analogy is incomplete.

the issue is not whether or not something gets in - it's much more far
reaching than that.

the issue is, once something gets in, can you even be sure that that
something is what it claims to be?
.. Yes?
at the moment you can't, and that's a serious issue - type hints appear to
provide some guarantees that in fact aren't provided at all. it's
confusing, and the explanations are more complex than they need to be.

(and I guess that's generally an issue in very dynamic languages, but type
hints are supposed to provide a means to improve on these problems where it
makes sense - not make the problems worse.)
Do you have a code sample to explain what you mean? At the moment I really
have a hard time envisioning what the pitfall is, especially if we were to add
checkpoint validity checks.
What's the code that would appear safe but really isn't? Please show us,
because I don't know what it is.
--Larry Garfield
if you read my last post (especially the last part) carefully, you'll see
why this elephant analogy is incomplete.

the issue is not whether or not something gets in - it's much more far
reaching than that.

the issue is, once something gets in, can you even be sure that that
something is what it claims to be?

That is the entire point of the elephant analogy: that knowing what can get in doesn't necessarily mean knowing what is already inside - BUT, knowing what can get in may still be useful in itself.
The positions being expressed are therefore, roughly:
a) That it's not the job of this feature to prove what is inside, only to guard the door. (That is, that the current implementation is sufficient.)
b) That it is vital to always know what is inside, regardless of how it got there. (That is, that we must prevent all mechanisms where the value is uninitialised.)
c) That there will always be some ways for the wrong thing to end up inside, but that we can add checks at key moments to see if that's happened. (That is, that we should detect uninitialised values automatically at the end of the constructor, and in similar places, but that there will be other ways that uninitialised values can come about.)
In your last message, you seemed to be accepting position c - that not all scenarios could be prevented, but that the common case of the constructor should be checked. That is the same position Larry is suggesting, so I'm not sure why you seem keen to disagree with him.
Regards,
--
Rowan Collins
[IMSoP]
That is the entire point of the elephant analogy: that knowing what can
get in doesn't necessarily mean knowing what is already inside - BUT,
knowing what can get in may still be useful in itself.
I understood that, and I disagree - just knowing what can get in is not
useful or interesting in itself. It's not even a new language feature - you
can already achieve the same with a simple setter. It's just new syntax.
The only reason it's interesting to control what can get inside with type
hints, is to enable the language to do something new - something it
couldn't already do, which is to guarantee that the type enforces its own
state specification.
I understood the analogy, and I don't agree.
You previously listed this as an example of problematic code:
class Test {
public string $b;
}
$b = 'foo';
$test = new Test();
And I agree, that's problematic, just not as show-stopping as you seem to feel
it is. Which is why I suggested adding a post-constructor-validation-check,
in which case the above code would fail with a TypeError on the last line (new
Test()), since it reached the end of the constructor without fulfilling its
property validation requirements. Because that is, in the end, the best we
can do without radical changes to how the language works.
Basically, we have 3 options:
1. Object properties may not ever be type hinted except as nullable, which is
   kinda useless.
2. Object properties may be type hinted but a sloppy class author may still
   leave them undefined/null if they're careless, which is the RFC today. (How
   much of a problem you think this is depends on how sloppy you expect class
   authors to be.)
3. Object properties may be type hinted and the class author has until the end
   of the constructor to make sure they're fulfilled, otherwise TypeError on the
   spot (what I'm proposing).
If option 3 is insufficient, please explain why. You have yet to comment on
that, just say the current RFC is awful. If adding the extra checks is
insufficient, offer an alternative. I cannot think of one that doesn't
involve adopting Rust's type system. (Which would be kinda cool, but also
horribly impractical to do within the next decade.)
--Larry Garfield
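
The end-of-constructor checkpoint proposed above can be approximated in userland, which may help when judging what it would and would not catch. The sketch below is only an illustration: assertFullyInitialized() and create() are hypothetical helpers, not engine features, and for simplicity only public properties are inspected. It uses ReflectionProperty::isInitialized(), which exists as of PHP 7.4.

```php
<?php
// Hypothetical userland stand-in for an engine-level checkpoint.
function assertFullyInitialized(object $object): void
{
    $class = new \ReflectionClass($object);
    // Assumption: only public properties are checked, to keep the sketch short.
    foreach ($class->getProperties(\ReflectionProperty::IS_PUBLIC) as $property) {
        if ($property->isStatic()) {
            continue;
        }
        if (!$property->isInitialized($object)) {
            throw new \TypeError(sprintf(
                '%s::$%s was not initialised by the constructor',
                $class->getName(),
                $property->getName()
            ));
        }
    }
}

// Hypothetical factory: runs the constructor, then the checkpoint.
function create(string $className, ...$args): object
{
    $object = new $className(...$args);
    assertFullyInitialized($object);
    return $object;
}

// The problematic class from the message above.
class Test {
    public string $b;
}

$error = null;
try {
    create(Test::class);
} catch (\TypeError $e) {
    $error = $e->getMessage();
}

echo $error, PHP_EOL;
```

As with the proposal itself, this only covers objects that go through the checkpoint. Anything that bypasses create(), or that unsets a property afterwards, is still only caught on access.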
On Sun, 23 Sep 2018 at 13:09, Larry Garfield larry@garfieldtech.com wrote:
- Object properties may be type hinted and the class author has until the end
of the constructor to make sure they're fulfilled, otherwise TypeError on the
spot (what I'm proposing).
Item 3 might not be enough, for instance:
class Example {
    public /* string */ $string;
    public function __construct() { static::stringParameter($this->string); }
    public static function stringParameter(string $string) {}
}
new Example;
It will fail because __construct() uses an uninitialized non-nullable
property.
Maybe it should throw a new Exception like UninitializedPropertyException
when you try to read an uninitialized property, as in this case.
--
David Rodrigues
Yes, there are various cases where checking at the end of the
constructor is not enough. The general consensus is that we can't catch
all of them without major changes to the language, but can catch some of
the more common scenarios earlier than the current implementation does.
Maybe it should throw a new Exception like UninitializedPropertyException
when you try to read an uninitialized property, as in this case.
That's exactly what the current implementation will do, and what Larry's
proposed addition would continue doing in cases where it can't be
spotted at a "checkpoint" like the end of the constructor.
Regards,
--
Rowan Collins
[IMSoP]
- Object properties may be type hinted and the class author has until the end
of the constructor to make sure they're fulfilled, otherwise TypeError on the
spot (what I'm proposing).
Just to be sure you don’t miss the herd that this elephant is concealing:
In addition, you must forbid unset() on those properties...
—Claude
We "must" forbid this IF we aim to guarantee that the object never has
uninitialised properties; but the current consensus is that we can't
make such a guarantee without changing a lot of other parts of the language.
There are strong feelings that unset should be available for use in
lazy-initialisation hacks, so this is likely to remain one of the
back-doors which will let elephants in, unless and until someone comes
up with a replacement for that hack.
Regards,
--
Rowan Collins
[IMSoP]
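
For readers unfamiliar with the lazy-initialisation hack mentioned above, here is a minimal sketch of how it behaves with typed properties in PHP 7.4 and later. The Report class and its contents are hypothetical. The trick is that a declared property which has been explicitly unset() falls through to __get() on the next read.

```php
<?php
// Hypothetical class, for illustration only.
class Report {
    public array $rows;
    public int $loads = 0;

    public function __construct() {
        // Explicit unset: the next read of $rows is routed to __get().
        unset($this->rows);
    }

    public function __get(string $name) {
        if ($name === 'rows') {
            $this->loads++;
            // Assigning the declared property makes later reads direct again.
            $this->rows = ['expensive', 'data'];
            return $this->rows;
        }
        throw new \LogicException("Attempt to access non-existent property $name");
    }
}

$report = new Report();
$first  = $report->rows; // goes through __get() and loads the data
$second = $report->rows; // plain property read, __get() is not called again

echo $report->loads, PHP_EOL;
```

Forbidding unset() on declared properties would break this pattern, which is the trade-off being weighed in this part of the thread.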
In my opinion, explicitly declared properties should not be
unsettable. We don't allow undefining constants, functions, classes
etc. either. Adding and removing other properties could still be allowed.
--
Christoph M. Becker
While I agree, I think that's a somewhat separate discussion: right now,
you can unset() a declared property, and that is handled differently
from setting it to null. Weird though it is, there is widely-used code
which actively relies on this fact, so it's not something we can just "fix".
The solution might be some way of marking a class as "completely
specified" - no dynamic properties may be added, no declared properties
may be unset; but like I say, that's a separate discussion.
Regards,
--
Rowan Collins
[IMSoP]
There might be a compromise here, which is to only perform a ctor
initialization check and forbid explicit unset()s if the class does not use
property accessors (i.e. does not define __get). This allows the lazy
initialization pattern but is stricter for everything else. (Possibly __get
in subclasses would also count.)
Nikita
Hm, that's an interesting suggestion, but I think it might get a bit
complicated and confusing.
There's also the paradoxical fact that you can implement __get() to make
your classes stricter: public function __get($prop) { throw new
\LogicException("Attempt to access non-existent property $prop"); }
I think I'd prefer to leave the unset() behaviour in, and then:
a) come up with a better mechanism for lazy-initialisation
and b) come up with a way for classes to opt out of having any dynamic
property manipulation (no unset, no reads or writes to undeclared
properties)
I might start a thread to bikeshed an appropriate mechanism for (b) because
it's something I've wanted for a long time.
Regards,
Rowan Collins
[IMSoP]
Just to be sure you don’t miss the herd that this elephant is concealing:
In addition, you must forbid unset() on those properties...
Shouldn't we delegate the whole problem to object type resolving and make it
more strict? Right now properties are not guaranteed at all - you can have a
Foo class with a $bar property, but it does not mean that an instance of Foo
will actually have this property. The current implementation of typed
properties seems to be pretty consistent with this. Type check gives you
nothing as far as the object's properties are concerned.
But if $foo instanceof Foo or function (Foo $foo) will test that $foo:
- is an instance of Foo,
- has all properties defined in Foo,
- all typehinted properties are initialized,
then the problem will basically disappear - you can protect yourself from
propagating an uninitialized object by typehints or type checks (which you
would probably do anyway). And you can still create an uninitialized object
for lazy initialization or whatever you want.
--
Regards,
Robert Korulczyk
Wrong reply button :(
In other words it's yet another bodge to add strict typing without
properly addressing the real problem. Adding 'types' to properties only
covers the properties that are supplied via that interface? I think that
is what is being proposed here? Internally the object may have other
variables that depend on the supplied properties, or be populated from
another interface such as a database in my case. THOSE variables need to
be managed in the same way as properties, and magic code like setters or
PROPER handling of variables is needed to ensure that the variables are
suitable TO initialize a variable. All these little patches to making
PHP strict STILL need all of the old code to validate that the data is
actually usable!
PLEASE can we get back to making variables manage their own VALIDITY
rather than just some arbitrary concept of type!
--
Lester Caine - G8HFL
Contact - https://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - https://lsces.co.uk
EnquirySolve - https://enquirysolve.com/
Model Engineers Digital Workshop - https://medw.co.uk
Rainbow Digital Media - https://rainbowdigitalmedia.co.uk
I'm pleased to announce that the typed properties RFC has been accepted
with 70 votes in favor and one vote against. We will work to finalize and
merge the implementation in the next few days.
Regards,
Bob and Nikita
\o/
This is great news!
At AFUP (French PHP User Group) we all wanted to +1.
Thanks for your work on this
--
Mathieu GIRARD, AFUP - French UG
http://php-internals.afup.org/
On 26.09.2018 at 15:46, Nikita Popov wrote:
I'm pleased to announce that the typed properties RFC has been accepted
with 70 votes in favor and one vote against. We will work to finalize and
merge the implementation in the next few days.
Any update on when this will be merged? Thanks!
On Mon, Oct 22, 2018 at 1:58 PM Sebastian Bergmann sebastian@php.net
wrote:
On 26.09.2018 at 15:46, Nikita Popov wrote:
I'm pleased to announce that the typed properties RFC has been accepted
with 70 votes in favor and one vote against. We will work to finalize and
merge the implementation in the next few days.
Any update on when this will be merged? Thanks!
Sorry for the long delay here. I've spent the last week fixing the
remaining issues and cleaning up the patch, and...
...Typed properties are now merged into master! [1]
It would be great if people could start experimenting with typed properties
(e.g. by migrating over existing @var annotations and seeing what happens).
The earlier we find issues in the implementation or semantics, the better.
Thanks to everyone who worked on this RFC and participated in discussions.
Special thanks to Bob Weinand who did most of the initial implementation
work on this RFC.
Regards,
Nikita
[1]: https://github.com/php/php-src/commit/e219ec144ef6682b71e135fd18654ee1bb4676b4
Hi Nikita,
Great news! Just in time to ruin the weekend with some OSS work :D
Marco Pivetta
Thank you all!
Is the Performance section of the RFC
(https://wiki.php.net/rfc/typed_properties_v2#performance) up to date? It
would be great to know, now that the patch is final and merged.
Hi Nikita,
While playing with typed properties, something that annoys me a lot, in the
context of a data mapper, is not to be able to use isset() to check whether
an object's property is initialized. The issue was already there with
properties that have been explicitly unset(), but will be exacerbated by
typed properties, which will be unset by default.
Sure, we have ReflectionProperty::isInitialized(), but reflection is slow
compared to direct property access, when reading a lot of objects.
For example, check these 3 ways of reading initialized public/protected
object properties by name, along with their timings for 100,000 iterations
(benchmark here:
https://gist.github.com/BenMorel/9a920538862e4df0d7041f8812f069e5#file-reflection-vs-array-cast-benchmark-php
):

Using reflection: (890 ms, down to 384 ms using pre-instantiated
ReflectionProperty objects)

$r = new \ReflectionClass($class);
foreach ($props as $prop) {
    $p = $r->getProperty($prop);
    $p->setAccessible(true);
    if ($p->isInitialized($object)) {
        $values[$p->getName()] = $p->getValue($object);
    }
}

Using array cast: (193 ms)

foreach ((array) $object as $key => $value) {
    // Remove the "\0*\0" in front of protected properties
    $pos = strrpos($key, "\0");
    if ($pos !== false) {
        $key = substr($key, $pos + 1);
    }
    $values[$key] = $value;
}

Using a bound closure and isset(): (145 ms)

(function() use ($props, & $values) {
    foreach ($props as $prop) {
        if (isset($this->{$prop})) { // skips `NULL` values as well :-(
            $values[$prop] = $this->{$prop};
        }
    }
})->bindTo($object, $object)();

Unfortunately, while the last approach is the fastest, and IMO the cleanest
one, it is currently unusable because there is no way to differentiate
between an uninitialized property and a NULL property.

Would it be possible to introduce another isset() operator, that would
return true for NULL values?

Note that this would also be useful when checking if an array key exists,
instead of having to switch from isset() to array_key_exists() when the
array may contain NULL values.
Ben
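
One possible way to tell a NULL property from an uninitialized one without reflection is get_object_vars(), which leaves uninitialized typed properties out of its result but keeps properties whose value is NULL. The sketch below is an assumption-laden illustration rather than a benchmarked alternative: the Customer class is hypothetical, and no timing is claimed relative to the three approaches above.

```php
<?php
// Hypothetical entity, for illustration only.
class Customer {
    public ?string $nickname = null; // initialized, value is NULL
    public int $id;                  // typed, never assigned: uninitialized
    protected string $name = 'Ada';  // non-public, visible from a bound closure
}

$object = new Customer();

// Run get_object_vars() in the object's own scope so that protected and
// private properties are included as well.
$values = (function () {
    return get_object_vars($this);
})->call($object);

var_dump(array_key_exists('nickname', $values)); // NULL value is kept
var_dump(array_key_exists('id', $values));       // uninitialized is omitted
```

Unlike the isset() variant, the NULL-valued property survives here, because the presence of the array key is checked rather than the value.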
Heya,
Reflection is not slow: it is only slow if you instantiate it continuously.
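
A rough sketch of what "not instantiating it continuously" could look like: the ReflectionProperty objects are created once per class and property, then reused for every object read. The initializedValues() helper, its static cache and the Point class are all hypothetical, and only public properties are used so that no setAccessible() call is needed on any PHP version from 7.4 onwards.

```php
<?php
// Hypothetical helper: reads initialized properties via cached reflection.
function initializedValues(object $object, array $props): array
{
    // Cache layout: class name => [property name => ReflectionProperty]
    static $cache = [];

    $class  = get_class($object);
    $values = [];
    foreach ($props as $prop) {
        // Create the ReflectionProperty only on first use, then reuse it.
        $p = $cache[$class][$prop] ??= new \ReflectionProperty($class, $prop);
        if ($p->isInitialized($object)) {
            $values[$prop] = $p->getValue($object);
        }
    }
    return $values;
}

// Hypothetical class, for illustration only.
class Point {
    public int $x = 1;
    public ?int $y = null; // NULL but initialized
    public int $z;         // uninitialized
}

$values = initializedValues(new Point(), ['x', 'y', 'z']);
var_dump($values);
```

Whether this closes the gap with direct property access would need to be measured; the figures quoted in this thread already show reflection with pre-instantiated objects as slower than the bound-closure approach.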
On Wed, Feb 6, 2019 at 1:38 PM Benjamin Morel benjamin.morel@gmail.com
wrote:
Hi Nikita,
While playing with typed properties, something that annoys me a lot, in
the context of a data mapper, is not to be able to use isset() to check
whether an object's property is initialized. The issue was already there
with properties that have been explicitly unset(), but will be
exacerbated by typed properties, which will be unset by default.Sure, we have ReflectionProperty::isInitialized(), but reflection is slow
compared to direct property access, when reading a lot of objects.For example, check these 3 ways of reading initialized public/protected
object properties by name, along with their timings for 100,000 iterations
(benchmark here
https://gist.github.com/BenMorel/9a920538862e4df0d7041f8812f069e5#file-reflection-vs-array-cast-benchmark-php
):Using reflection: (890 ms, down to 384 ms using pre-instantiated
ReflectionProperty objects)$r = new \ReflectionClass($class);
foreach ($props as $prop) { $p = $r->getProperty($prop); $p->setAccessible(true); if ($p->isInitialized($object)) { $values[$p->getName()] = $p->getValue($object); } }
Using array cast: (193 ms)
foreach ((array) $object as $key => $value) {
// Remove the "\0*\0" in front of protected properties $pos = strrpos($key, "\0"); if ($pos !== false) { $key = substr($key, $pos + 1); } $values[$key] = $value; }
Using a bound closure and isset(): (145 ms)
(function() use ($props, & $values) { foreach ($props as $prop) { if (isset($this->{$prop})) { // skips `NULL` values as well :-( $values[$prop] = $this->{$prop}; } } })->bindTo($object, $object)();
Unfortunately, while the last approach is the fastest, and IMO the
cleanest one, it is currently unusable because there is no way to
differentiate between an uninitialized property and a NULL property.
Would it be possible to introduce another isset() operator, that would
return true for NULL values?
Note that this would also be useful when checking if an array key exists,
instead of having to switch from isset() to array_key_exists() when the
array may contain NULL values.
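To make the ambiguity concrete, here is a minimal sketch, assuming PHP 7.4 or later. The class Row and its properties are made up for illustration and are not taken from the benchmark. It shows isset() reporting false both for an uninitialized typed property and for one holding NULL, while ReflectionProperty::isInitialized() tells them apart; the last lines show the same ambiguity for array keys.

```php
<?php
// Hypothetical class, for illustration only.
class Row
{
    public int $id;              // typed, no default: starts uninitialized
    public ?string $name = null; // typed, initialized to NULL
}

$row = new Row();

// isset() cannot tell the two states apart: both checks report false.
$issetId   = isset($row->id);
$issetName = isset($row->name);

// ReflectionProperty::isInitialized() distinguishes them.
$initId   = (new ReflectionProperty(Row::class, 'id'))->isInitialized($row);
$initName = (new ReflectionProperty(Row::class, 'name'))->isInitialized($row);

// The same ambiguity exists for arrays that may contain NULL.
$data      = ['key' => null];
$issetKey  = isset($data['key']);
$existsKey = array_key_exists('key', $data);

var_dump($issetId, $issetName, $initId, $initName, $issetKey, $existsKey);
```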
It's possible and someone even worked on an implementation at some point (
https://github.com/php/php-src/pull/1530), but I don't believe this ever
proceeded to the RFC stage. (I'm rather doubtful about introducing
something like this, due to the quite significant increase in conceptual
language complexity for little benefit. I definitely have no plans to
pursue such a feature myself.)
Nikita
On Mon, Oct 22, 2018 at 1:58 PM Sebastian Bergmann sebastian@php.net
wrote:
On 26.09.2018 at 15:46, Nikita Popov wrote:
I'm pleased to announce that the typed properties RFC has been accepted
with 70 votes in favor and one vote against. We will work to finalize and
merge the implementation in the next few days.
Any update on when this will be merged? Thanks!
Sorry for the long delay here. I've spent the last week fixing the
remaining issues and cleaning up the patch, and...
...Typed properties are now merged into master! [1]
It would be great if people could start experimenting with typed properties
(e.g. by migrating over existing @var annotations and seeing what happens).
The earlier we find issues in the implementation or semantics, the better.
Thanks to everyone who worked on this RFC and participated in discussions.
Special thanks to Bob Weinand who did most of the initial implementation
work on this RFC.
Regards,
Nikita
[1]:
https://github.com/php/php-src/commit/e219ec144ef6682b71e135fd18654ee1bb4676b4
On Wed, 6 Feb 2019 at 12:38, Benjamin Morel benjamin.morel@gmail.com
wrote:
While playing with typed properties, something that annoys me a lot, in the
context of a data mapper, is not to be able to use isset() to check whether
an object's property is initialized.
This sounds like a reasonable problem to raise...
Would it be possible to introduce another isset() operator, that would
return true for NULL values?
...but this sounds like the wrong solution.
The distinction you need has nothing to do with nulls, it has to do with
the very specific case of "this property currently violates its contract
and I want to treat that as something other than an error". I personally
still think that unset() should raise an error for typed properties, but
apparently there are cunning hacks that it makes possible; as such, 99% of
PHP developers should never need to detect this state.
It therefore makes more sense to me to have this as a specific API relating
to typed properties; if it can be done faster outside the reflection class,
it could be in the form of a new function like
property_is_initialized($object, $propertyName); this would be an extension
of property_exists, that handles the edge case of "declared with a type and
then explicitly unset".
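As a rough illustration of what such a function could look like, here is a userland sketch built on ReflectionProperty, assuming PHP 7.4 or later. property_is_initialized() is not an existing built-in; the name comes from the suggestion above, and the Account class is hypothetical. The sketch only handles public properties; on PHP versions before 8.1, non-public properties would additionally need setAccessible(true).

```php
<?php
// Hypothetical userland stand-in; PHP has no built-in of this name.
// Throws ReflectionException if the property does not exist at all.
function property_is_initialized(object $object, string $propertyName): bool
{
    $property = new ReflectionProperty($object, $propertyName);

    return $property->isInitialized($object);
}

// Hypothetical class, for illustration only.
class Account
{
    public int $balance; // typed, no default: starts uninitialized
}

$account = new Account();
$beforeAssignment = property_is_initialized($account, 'balance');

$account->balance = 10;
$afterAssignment = property_is_initialized($account, 'balance');

// "Declared with a type and then explicitly unset".
unset($account->balance);
$afterUnset = property_is_initialized($account, 'balance');

// property_exists() only looks at the declaration, so it stays true.
$stillDeclared = property_exists($account, 'balance');

var_dump($beforeAssignment, $afterAssignment, $afterUnset, $stillDeclared);
```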
Regards,
Rowan Collins
[IMSoP]
it could be in the form of a new function like
property_is_initialized($object, $propertyName); this would be an extension
of property_exists, that handles the edge case of "declared with a type and
then explicitly unset".
PS: I realise that it's also possible to accidentally leave a property in this state with a badly written constructor. I am drafting a proposal to make this much less likely, and contend that it should not be something users should be detecting, other than catching the resulting error.
Regards,
--
Rowan Collins
[IMSoP]
Reflection is not slow: it is only slow if you instantiate it continuously.
Even when pre-instantiating the ReflectionProperty objects, it's still 2.5x
slower than a closure-based approach, according to my benchmarks (linked
below), so if possible, I'd like to avoid using Reflection for this sole
reason.
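For reference, the "pre-instantiated" variant could be sketched roughly like this, assuming PHP 7.4 or later. The helper readInitialized() and the Point class are hypothetical, the sketch only covers public properties, and it makes no claim about timings; it just shows each ReflectionProperty being created once per class and property and then reused.

```php
<?php
// Hypothetical helper: reads initialized public properties by name,
// creating each ReflectionProperty once and reusing it on later calls.
function readInitialized(object $object, array $props): array
{
    static $cache = [];

    $class  = get_class($object);
    $values = [];

    foreach ($props as $prop) {
        // Create the ReflectionProperty only on the first request.
        $p = $cache[$class][$prop] ??= new ReflectionProperty($class, $prop);

        if ($p->isInitialized($object)) {
            $values[$prop] = $p->getValue($object);
        }
    }

    return $values;
}

// Hypothetical class, for illustration only.
class Point
{
    public int $x;          // uninitialized, so it is skipped
    public ?int $y = null;  // NULL is kept, unlike with isset()
    public int $z = 3;
}

$values = readInitialized(new Point(), ['x', 'y', 'z']);
var_dump($values);
```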
It's possible and someone even worked on an implementation at some point (
https://github.com/php/php-src/pull/1530), but I don't believe this ever
proceeded to the RFC stage. (I'm rather doubtful about introducing
something like this, due to the quite significant increase in conceptual
language complexity for little benefit. I definitely have no plans to
pursue such a feature myself.)
Thanks for the pointer! I can see from this thread that another PR has been
merged just over a month ago, to implement a ZEND_ARRAY_KEY_EXISTS opcode
to overcome the performance penalty of array_key_exists():
https://github.com/php/php-src/pull/3360
Couldn't we broaden the scope of this opcode, or add a new one, to support
object properties? Pardon my ignorance, but what is exactly increasing the
language complexity? Do you mean the addition of a language construct to
the parser?
The distinction you need has nothing to do with nulls, it has to do with
the very specific case of "this property currently violates its contract
and I want to treat that as something other than an error". I personally
still think that unset() should raise an error for typed properties, but
apparently there are cunning hacks that it makes possible; as such, 99% of
PHP developers should never need to detect this state.
You're right, in most of the cases properties SHOULD be set by the
constructor. I do find the current behaviour interesting, however, from a
data mapper point of view: you might want to retrieve a partial object from
the database, and get an Error if you accidentally access an uninitialized
property, as opposed to the pre-typed properties era, where you would get
null and the "error" would be silenced.
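A small sketch of the difference I mean, assuming PHP 7.4 or later (both class names are made up for illustration): reading the untyped property quietly yields null, while reading the uninitialized typed property throws an Error.

```php
<?php
// Hypothetical classes, for illustration only.
class UntypedUser
{
    public $name;        // untyped: implicitly NULL
}

class TypedUser
{
    public string $name; // typed, no default: starts uninitialized
}

// Pre-typed-properties behaviour: the missing value is silently NULL.
$untypedName = (new UntypedUser())->name;

// With a typed property, the same read throws an Error.
$caught = false;
try {
    $typedName = (new TypedUser())->name;
} catch (Error $e) {
    $caught = true;
}

var_dump($untypedName, $caught);
```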
It therefore makes more sense to me to have this as a specific API relating
to typed properties; if it can be done faster outside the reflection class,
it could be in the form of a new function like
property_is_initialized($object, $propertyName); this would be an extension
of property_exists, that handles the edge case of "declared with a type and
then explicitly unset".
I don't mind if, instead of another language construct, it is a function;
to avoid the same performance penalty as array_key_exists() however, this
function should have a companion opcode (just like ZEND_ARRAY_KEY_EXISTS).
Would that solve what you call increasing language complexity, Nikita?
Ben
On Wed, 6 Feb 2019 at 15:21, Benjamin Morel benjamin.morel@gmail.com
wrote:
Thanks for the pointer! I can see from this thread that another PR has
been merged just over a month ago, to implement a ZEND_ARRAY_KEY_EXISTS
opcode to overcome the performance penalty of array_key_exists():
https://github.com/php/php-src/pull/3360
Couldn't we broaden the scope of this opcode, or add a new one, to support
object properties?
We already have a function property_exists(). I don't know if it covers the
case you want, though.
Regards,
Rowan Collins
[IMSoP]
We already have a function property_exists(). I don't know if it covers
the case you want, though.
property_exists() returns whether the property is defined (as opposed to
initialized) in the given class / instance:
class A {
private int $id;
}
$a = new A;
var_export(property_exists($a, 'id')); // true
So this would require a new function.
On Wed, 6 Feb 2019 at 15:21, Benjamin Morel benjamin.morel@gmail.com
wrote:
You're right, in most of the cases properties SHOULD be set by the
constructor. I do find the current behaviour interesting, however, from a
data mapper point of view: you might want to retrieve a partial object from
the database, and get an Error if you accidentally access an uninitialized
property, as opposed to the pre-typed properties era, where you would get
null and the "error" would be silenced.
The problem with that is, the code doing the partial loading might be
thousands of lines from the code "accidentally" accessing the property. If,
for example, a User class supported partial loading of this sort, any
function that takes a User would have to handle the possibility that what
was actually passed was a partial User. If an error was raised, it would
require tracing back through the code to work out where the conversion from
partial to full object should have happened.
A cleaner implementation would have PartialUser as a different type, which
could not be passed to a function expecting a User; it would then be clear
that specific code was needed to perform a conversion to fetch the rest of
the data, e.g. function getUserFromPartial(PartialUser $partial): User
In other words, if the definition of User includes a non-nullable property
$name, then any object that doesn't have a value for $name is not actually
a User object.
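A minimal sketch of what I have in mind, assuming PHP 7.4 or later. The class bodies, the greet() function and the hard-coded lookup standing in for the database are all hypothetical; only the names PartialUser, User and getUserFromPartial come from the paragraph above.

```php
<?php
// Only the fields that a partial load guarantees.
final class PartialUser
{
    public int $id;

    public function __construct(int $id)
    {
        $this->id = $id;
    }
}

// A User always has a name; the constructor enforces that.
final class User
{
    public int $id;
    public string $name;

    public function __construct(int $id, string $name)
    {
        $this->id   = $id;
        $this->name = $name;
    }
}

// The one place where a partial object becomes a full one.
function getUserFromPartial(PartialUser $partial): User
{
    // Stand-in for fetching the remaining columns from the database.
    $names = [1 => 'Alice'];

    return new User($partial->id, $names[$partial->id]);
}

// Code that needs a full User says so in its signature.
function greet(User $user): string
{
    return 'Hello, ' . $user->name;
}

$partial  = new PartialUser(1);
$greeting = greet(getUserFromPartial($partial));

// Passing the partial object directly fails at the call site.
$rejected = false;
try {
    greet($partial);
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($greeting, $rejected);
```

With this shape the failure shows up where the wrong object is passed, not thousands of lines later where the missing property is finally read.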
Regards,
Rowan Collins
[IMSoP]