Consensus Gathering: is_initialized

4 years ago by Marco Pivetta — view source

unread

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

Not absolutely convinced that it's a good idea, I asked Nikita to review,
and he's unconvinced also and suggested a discussion should be started.

You can read both of our thoughts about it on the PR.

What I'm looking for now, is reasons that we are wrong - good use cases
that aren't born from code that is doing bad things ...

Equally if you think this is really bad even if the reason has already been
mentioned, please make noise.

What's the problem with good 'ol (new ReflectionProperty(Foo::class, 'bar'))->isInitialized($baz) ?

If it's instantiation time/performance, I'm fairly sure that typical
reflection instantiation caching works just fine (and a new API would need
a lot of meaningful benchmarks to be warranted).

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/

4 years ago by Joe Watkins — view source

unread

Hi Marco,

What do you mean by "meaningful benchmarks" ?

There's no question that is_initialized will be faster than the reflector
version, even with reflection caching.

Here's a naive bench:
https://gist.github.com/krakjoe/cef6452281624bdf1b46788f52a01521

krakjoe@Fiji:/opt/src/php-src$ sapi/cli/php initialized.php
is_initialized: 0.04558 seconds
reflection: 0.07186 seconds

However is_initialized has a narrower scope than the reflector version, as
mentioned in the PR, it specifically can only work on typed (accessible
from calling scope) properties.

I dunno if any of this helps you to decide if it's good or bad, but I'm
particularly interested in your concrete answer to the question of badness.

Cheers
Joe

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

Not absolutely convinced that it's a good idea, I asked Nikita to review,
and he's unconvinced also and suggested a discussion should be started.

You can read both of our thoughts about it on the PR.

What I'm looking for now, is reasons that we are wrong - good use cases
that aren't born from code that is doing bad things ...

Equally if you think this is really bad even if the reason has already
been
mentioned, please make noise.

What's the problem with good 'ol (new ReflectionProperty(Foo::class, 'bar'))->isInitialized($baz) ?

If it's instantiation time/performance, I'm fairly sure that typical
reflection instantiation caching works just fine (and a new API would need
a lot of meaningful benchmarks to be warranted).

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/

4 years ago by Marco Pivetta — view source

unread

Hey Joe,
http://ocramius.github.com/

Hi Marco,

What do you mean by "meaningful benchmarks" ?

There's no question that is_initialized will be faster than the reflector
version, even with reflection caching.

Ack, but real-world impact seems irrelevant? I've worked on stuff that
heavily relies on checking initialized state, and most of it is mostly
unaffected.
"meaningful benchmark" means "here's a use-case that really needs
is_initialized(), and absolutely cannot run into the penalties of
ReflectionProperty#isInitialized()

Here's a naive bench:
https://gist.github.com/krakjoe/cef6452281624bdf1b46788f52a01521

krakjoe@Fiji:/opt/src/php-src$ sapi/cli/php initialized.php
is_initialized: 0.04558 seconds
reflection: 0.07186 seconds

However is_initialized has a narrower scope than the reflector version, as
mentioned in the PR, it specifically can only work on typed (accessible
from calling scope) properties.

Sounds weird? Why would a narrower version be needed at all?

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/

4 years ago by michal.brzuchalski@gmail.com — view source

unread

śr., 26 maj 2021 o 12:14 Joe Watkins krakjoe@php.net napisał(a):

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

Not absolutely convinced that it's a good idea, I asked Nikita to review,
and he's unconvinced also and suggested a discussion should be started.

You can read both of our thoughts about it on the PR.

What I'm looking for now, is reasons that we are wrong - good use cases
that aren't born from code that is doing bad things ...

IMHO we don't need it, working with the typed properties since the
beginning
I thought only for a short period that I'd need this kind of feature but
shortly realized
that I'm doing it wrong if my code can reach such state,
given that all paths now lead to a known state,
object construction in all cases specifies all required properties
without having any doubts like uninitialized properties so this is doable.

If one really needs to check if a property is initialized or not I consider
it a dark magic
and such can reach up to the reflection mechanism to just check it.
IMHO it's not common (or at least should not be) to have such a need.

Cheers,
Michał Marcin Brzuchalski

4 years ago by Joe Watkins — view source

unread

Thanks Michal for input ...

Ack, but real-world impact seems irrelevant? I've worked on stuff that
heavily relies on checking initialized state, and most of it is mostly
unaffected.
"meaningful benchmark" means "here's a use-case that really needs
is_initialized(), and absolutely cannot run into the penalties of
ReflectionProperty#isInitialized()

So what I hear you saying is that perf is actually irrelevant regardless of
what kind of benchmark you want to do ...

Sounds weird? Why would a narrower version be needed at all?

The pattern ReflectionProperty#isInitialized() doesn't allow you to access
inaccessible properties without first setting them accessible, so that part
is not really different.

The scope could be extended to include untyped properties, but extending as
far as dynamic properties makes no sense because there's not going to be a
significant advantage - it would be doing the same work in the same way as
Reflector::IsInitialized.

I thought keeping scope as narrow as possible was better, but I can have my
mind changed on untyped properties, if that would make the difference
between you saying it's good or bad.

FWIW I've pretty much talked myself out of it at this point already ... I'm
hearing what I expected to hear.

Cheers
Joe

On Wed, 26 May 2021 at 14:57, Michał Marcin Brzuchalski <
michal.brzuchalski@gmail.com> wrote:

śr., 26 maj 2021 o 12:14 Joe Watkins krakjoe@php.net napisał(a):

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

Not absolutely convinced that it's a good idea, I asked Nikita to review,
and he's unconvinced also and suggested a discussion should be started.

You can read both of our thoughts about it on the PR.

What I'm looking for now, is reasons that we are wrong - good use cases
that aren't born from code that is doing bad things ...

IMHO we don't need it, working with the typed properties since the
beginning
I thought only for a short period that I'd need this kind of feature but
shortly realized
that I'm doing it wrong if my code can reach such state,
given that all paths now lead to a known state,
object construction in all cases specifies all required properties
without having any doubts like uninitialized properties so this is doable.

If one really needs to check if a property is initialized or not I
consider it a dark magic
and such can reach up to the reflection mechanism to just check it.
IMHO it's not common (or at least should not be) to have such a need.

Cheers,
Michał Marcin Brzuchalski

4 years ago by Rowan Tommins — view source

unread

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

My general feeling remains that the "uninitialized" state is an awkward
hack that we should be working to eliminate with better constructor
semantics. A variable that remains uninitialised after the constructor
almost always indicates a bug in the constructor, not a state that the
rest of the application should care about.

On the other hand, we have already allowed unset() to produce this
special state, with defined behaviour for __get, so possibly the genie's
out of the bottle, and we have to embrace it as a new "type".

On the gripping hand, this is likely to lead to even more confusion
about what exactly the different states are: properties declared with a
type, properties declared without a type, properties not declared at
all; all of which can be null, a non-null value, or completely unset.
People already complain that isset() returns false for null variables (I
disagree, but...); this function could end up even harder to explain.

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by Larry Garfield — view source

unread

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

My general feeling remains that the "uninitialized" state is an awkward
hack that we should be working to eliminate with better constructor
semantics. A variable that remains uninitialised after the constructor
almost always indicates a bug in the constructor, not a state that the
rest of the application should care about.

I am inclined to agree here. What I don't know about is the cases noted in the bug, such as GraphQL or other serialization cases where "null" and "absent" are not quite the same thing. That is probably sufficiently edge-case to not deal with directly, especially when the more verbose alternative still exists, but that's the only reason I'd even consider making uninitialized something other than "your constructor is bad and you should feel bad."

--Larry Garfield

4 years ago by Guilliam Xavier — view source

unread

On Wed, May 26, 2021 at 4:09 PM Larry Garfield larry@garfieldtech.com
wrote:

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

My general feeling remains that the "uninitialized" state is an awkward
hack that we should be working to eliminate with better constructor
semantics. A variable that remains uninitialised after the constructor
almost always indicates a bug in the constructor, not a state that the
rest of the application should care about.

I am inclined to agree here. What I don't know about is the cases noted
in the bug, such as GraphQL or other serialization cases where "null" and
"absent" are not quite the same thing. That is probably sufficiently
edge-case to not deal with directly, especially when the more verbose
alternative still exists, but that's the only reason I'd even consider
making uninitialized something other than "your constructor is bad and you
should feel bad."

I think you said the word: serialization. And especially deserialization,
e.g. from a JSON payload into a typed DTO without calling the constructor
(then the DTO is passed through validation, which must handle uninitialized
typed properties "gracefully").

--
Guilliam Xavier

4 years ago by Rowan Tommins — view source

unread

I think you said the word: serialization. And especiallydeserialization,
e.g. from a JSON payload into a typed DTOwithout calling the constructor
(then the DTO is passed through validation, which must handle uninitialized
typed properties "gracefully").

Would is_initialized() actually be all that useful there though?

If you're iterating the properties dynamically, you'll probably be
using the reflection API already
If you're listing them manually, you can validate the input rather
than the output, which you'd have to do if there's any mapping of names
or types anyway

It's also worth noting that If the property is not nullable, you can use
isset() to tell if it's been initialised; and if it is nullable, you
probably want it to default to null anyway.

A separate function is only needed if you've decided to use the
uninitialised state as an extra "value", distinct from null.

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by michal.brzuchalski@gmail.com — view source

unread

śr., 26 maj 2021 o 18:20 Guilliam Xavier guilliam.xavier@gmail.com
napisał(a):

On Wed, May 26, 2021 at 4:09 PM Larry Garfield larry@garfieldtech.com
wrote:

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

My general feeling remains that the "uninitialized" state is an awkward
hack that we should be working to eliminate with better constructor
semantics. A variable that remains uninitialised after the constructor
almost always indicates a bug in the constructor, not a state that the
rest of the application should care about.

I am inclined to agree here. What I don't know about is the cases noted
in the bug, such as GraphQL or other serialization cases where "null" and
"absent" are not quite the same thing. That is probably sufficiently
edge-case to not deal with directly, especially when the more verbose
alternative still exists, but that's the only reason I'd even consider
making uninitialized something other than "your constructor is bad and
you
should feel bad."

I think you said the word: serialization. And especially deserialization,
e.g. from a JSON payload into a typed DTO without calling the constructor
(then the DTO is passed through validation, which must handle uninitialized
typed properties "gracefully").

I don't think nowadays anyone does that without a kind of deserializer
which
reads the metadata of desired DTO and like Symfony's Serializer or JMS
which just deal with such tasks!?

Not using ctor even for manual construction would be just a waste of time
writing validation etc.
Consider JSON decoded into:

$arr = ['name' => 'main_window','width' => 500,'height' => 500];
class Window {
public function __construct(
public string $title,
public string $name,
public int $width,
public int $height,
public ?bool $unknown = null,
...$additional,
) {}
}
var_dump(new Window(...$arr));

That example errors, since the title property is not given in decoded
payload.
The behaviour is what we expect here since we require it.
The solution is to pass a valid payload with the title.
Additionally, you can see "unknown" was also missing but it was initialized
with default null value since we don't require that.

If you add nullability into the title as follows:

class Window {
public function __construct(
public ?string $title = null,
public string $name,
public int $width,
public int $height,
public ?bool $unknown = null,
...$additional,
) {}
}

Then is passing built-in language sort of validation.

If you wanna allow passing more values than expected, add not used variadic
argument at the end.
Variadic argument solves the issue with missing named arguments which might
appear in the future.

These examples show a very simple solution to container classes such as DTO.
All of them avoid dealing with an uninitialized state.

IMHO there really is no need to deal with uninitialized properties in DTO!

Cheers,
Michał Marcin Brzuchalski

4 years ago by Pierre — view source

unread

Le 26/05/2021 à 21:24, Michał Marcin Brzuchalski a écrit :

I don't think nowadays anyone does that without a kind of deserializer
which
reads the metadata of desired DTO and like Symfony's Serializer or JMS
which just deal with such tasks!?

Hello,

Just for your information, we do have our own serialization mechanism in
some projects, when writing workaround around existing serializers bugs
or behaviors become too difficult, or when we want drastic speed-ups, or
want to avoid dependencies. Sometime even just because the target
production environment is not suitable for those well-known libraries
(even in the containerized dockerized, vmized world, this still happen,
sometime).

Don't assume that everyone use X or Y library, there are some mad guys
everywhere around the globe writing their own proprietary very complex
APIs for pretty much anything when they have critical needs or legit
constraints. For example we do maintain our own database access layer
and SQL query builder, and we are pretty much anyone, just like anyone
(why wouldn't we use Doctrine or PDO ? Yet we don't).

For the record, I'm not opposed to have the is_initialized() method in
PHP, I probably wouldn't use it personally but if it makes sense for
some, why not having it ?

Regards,

--

Pierre

4 years ago by Marco Pivetta — view source

unread

why not having it ?

More API, similar-but-not-exactly-like ReflectionProperty#isInitialized()

Minimalims should really be valued more :D

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/

4 years ago by Pierre — view source

unread

Le 27/05/2021 à 13:42, Marco Pivetta a écrit :

why not having it ?

More API, similar-but-not-exactly-like ReflectionProperty#isInitialized()

Minimalims should really be valued more :D

I agree and disagree at the same time. I agree because my absolute
opinion is the same as yours, but I disagree because my pragmatism is
not all about being an idealist, having an is_initialized() could
probably allow the compiler to have a specialized opcode, and make this
call much much faster than using the reflection in userland (I'm only
guessing, but I think it probably would allow that).

So, if it's going to be used and abused in those magic APIs we all use
daily without thinking about it (let's say, hydrators or serializers,
dependency injection containers and all that stuff, and you do maintain
some if I recall) and unlock some performance optimizations in there,
I'd say go for it.

Regards,

--

Pierre

4 years ago by Rowan Tommins — view source

unread

So, if it's going to be used and abused in those magic APIs we all use
daily without thinking about it (let's say, hydrators or serializers,
dependency injection containers and all that stuff, and you do
maintain some if I recall) and unlock some performance optimizations
in there, I'd say go for it.

I think "abused" is the right word here. The uninitialized state exists
in the first place because we don't have a way to enforce initialization
in constructors, not because we really wanted a new special value that
is "like null, but even more so".

The combination of union types and enums actually gives a much more
expressive way of representing "valid value or explicit null or special
default". To use the ORM lazy-loading example:

enum ORMState {
case Unloaded;
}

class Example {
private int|ORMState|null $foo = ORMState::Unloaded;

    public function getFoo(): ?int {
        if ( $this->foo === ORMState::Unloaded ) {
            $this->foo = $this->loadFromDatabase('foo');
        }
        return $this->foo;
    }

    private function loadFromDatabase($fieldName): ?int {
        echo "Fetching '$fieldName' from database...\n";
        return 42;
    }
}

// Create object; note there's no constructor, but the property has a
default so is never uninitialized
$obj = new Example;
// On first call to method, the default value is detected and the
property lazy-loaded
var_dump($obj->getFoo());
// Subsequent accesses return the loaded value directly, at the cost of
a single strict comparison, no magic function needed
var_dump($obj->getFoo());

Unlike abusing "uninitialized" as a third state (next to int and null),
this allows any number of states to be recorded, and treated
appropriately. For instance, an auto-increment key might default to
ORMState::NotYetCalculated, and trigger an error rather than a lazy-load.

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by Larry Garfield — view source

unread

The combination of union types and enums actually gives a much more
expressive way of representing "valid value or explicit null or special
default". To use the ORM lazy-loading example:

enum ORMState {
    case Unloaded;
}

class Example {
    private int|ORMState|null $foo = ORMState::Unloaded;

    public function getFoo(): ?int {
        if ( $this->foo === ORMState::Unloaded ) {
            $this->foo = $this->loadFromDatabase('foo');
        }
        return $this->foo;
    }

    private function loadFromDatabase($fieldName): ?int {
        echo "Fetching '$fieldName' from database...\n";
        return 42;
    }
}

// Create object; note there's no constructor, but the property has a
default so is never uninitialized
$obj = new Example;
// On first call to method, the default value is detected and the
property lazy-loaded
var_dump($obj->getFoo());
// Subsequent accesses return the loaded value directly, at the cost of
a single strict comparison, no magic function needed
var_dump($obj->getFoo());

I had not thought of this use of Enums, but I love it. It's not as elegant as a tagged enum for a monad, but it gets you 80% of the way there. Especially if you combine it with match().

enum Result {
case None;
}

function find_user(int $id): User|Result { ... }

$user = match($u = find_user($id)) {
Result::None => some error handling,
default: $u,
};

--Larry Garfield

4 years ago by Hossein Baghayi — view source

unread

Out of curiosity,
Couldn't PHP raise an error when there is an uninitialized state left
unhandled after calling the constructor?

It might make the developer think of either providing a default value or
initializing it in the constructor.
Hence no uninitialized state.

4 years ago by michal.brzuchalski@gmail.com — view source

unread

czw., 27 maj 2021 o 13:29 Pierre pierre-php@processus.org napisał(a):

Le 26/05/2021 à 21:24, Michał Marcin Brzuchalski a écrit :

I don't think nowadays anyone does that without a kind of deserializer
which
reads the metadata of desired DTO and like Symfony's Serializer or JMS
which just deal with such tasks!?

Hello,

Just for your information, we do have our own serialization mechanism in
some projects, when writing workaround around existing serializers bugs
or behaviors become too difficult, or when we want drastic speed-ups, or
want to avoid dependencies. Sometime even just because the target
production environment is not suitable for those well-known libraries
(even in the containerized dockerized, vmized world, this still happen,
sometime).

Don't assume that everyone use X or Y library, there are some mad guys
everywhere around the globe writing their own proprietary very complex
APIs for pretty much anything when they have critical needs or legit
constraints. For example we do maintain our own database access layer
and SQL query builder, and we are pretty much anyone, just like anyone
(why wouldn't we use Doctrine or PDO ? Yet we don't).

Please take no offense. I just think it's not that common.
While I do work with many integration projects most of the time already
could forget about writing my own stuff for this kind of thing where I have
learned to rely on community-developed components
which just speeds up development and just gets the things done.

I think described cases are rarely needing such function especially
if it's easy to write one using reflection.

My personal preference is to reduce the amount of standard library shipped
than add.

Cheers,
Michał Marcin Brzuchalski

4 years ago by Pierre — view source

unread

Le 27/05/2021 à 14:18, Michał Marcin Brzuchalski a écrit :

Please take no offense. I just think it's not that common.
While I do work with many integration projects most of the time already
could forget about writing my own stuff for this kind of thing where I have
learned to rely on community-developed components
which just speeds up development and just gets the things done.

I think described cases are rarely needing such function especially
if it's easy to write one using reflection.

My personal preference is to reduce the amount of standard library shipped
than add.

Cheers,
Michał Marcin Brzuchalski

Don't worry, no offense taken ! I'm sorry if my message sounded otherwise.

Regards,

--

Pierre

4 years ago by David Gebler — view source

unread

Hi internals,

In response to: https://bugs.php.net/bug.php?id=78480

I implemented: https://github.com/php/php-src/pull/7029

Not absolutely convinced that it's a good idea, I asked Nikita to review,
and he's unconvinced also and suggested a discussion should be started.

I would just like to second (or third, or fourth) the notion that the very
possibility of accessing uninitialized state is a bug in the user's code,
not a problem for which the language should provide a workaround. If a
property is required (i.e. not nullable) it should absolutely and always be
initialized in the constructor.

I do appreciate PHP being how it is, there might be "valid" use cases for
is_initialized() but I also think it will encourage bad practice.

Regards
Dave

4 years ago by Andreas Leathley — view source

unread

Not absolutely convinced that it's a good idea, I asked Nikita to review,
and he's unconvinced also and suggested a discussion should be started.

You can read both of our thoughts about it on the PR.

What I'm looking for now, is reasons that we are wrong - good use cases
that aren't born from code that is doing bad things ...

Equally if you think this is really bad even if the reason has already been
mentioned, please make noise.

I like it because it is very clear and takes into account the current
scope. The alternatives are more messy:

When using "isset" on class properties, it is not obvious if you are
checking for initialized state or for null, as it does both
When using reflection, scope does not matter (and you have to
explicitely set a property to accessible for private properties)

While it is not common that I need to check if a property is
initialized, I do think there should be a straightforward way to do it.
Setting properties through a constructor is not the only way of
initializing parts of a class - some use cases were already brought up,
like caching or lazy loading. Also, is_initialized cannot be done in
userland, as passing an unitialized property to a function already leads
to an error, so every way of checking now is a workaround that cannot be
made clearer for the reader of the code. I would prefer it if it also
worked for untyped properties, just to be consistent.

Static analyzers would also have a much easier time understanding the
code compared to now. When using isset on a non-nullable property a
static analyzer would rightfully complain that the property is not
nullable, as it assumes you are checking for null, not for uninitialized

isset in this case is abused to check for uninitialized, hiding the
actual intent. Not sure how good static analyzers are at understanding
the reflection code, but is_initialized would make things clearer for
both humans and analyzers.

4 years ago by Rowan Tommins — view source

unread

When using isset on a non-nullable property a
static analyzer would rightfully complain that the property is not
nullable

Precisely, so don't mark a property as non-nullable and then leave it unset.

isset in this case is abused to check for uninitialized, hiding the
actual intent

On the contrary, the uninitialized state is what is being abused. If you
want to make the intent clear, use a clear flag of "needs lazy-loading",
or whatever you're actually using this magic "not null but not really
anything else either" state for.

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by Joe Watkins — view source

unread

Thanks for all the input everyone, very successful consensus gathering
exercise.

Since the response was pretty overwhelmingly negative, I think we can just
stop here.

I've closed the PR and wrapped up the FR.

Anyone is of course free to pursue the RFC that would be required, however
ill advised.

Forget I said anything, and carry on with your lives.

Cheers
Joe

When using isset on a non-nullable property a
static analyzer would rightfully complain that the property is not
nullable

Precisely, so don't mark a property as non-nullable and then leave it
unset.

isset in this case is abused to check for uninitialized, hiding the
actual intent

On the contrary, the uninitialized state is what is being abused. If you
want to make the intent clear, use a clear flag of "needs lazy-loading",
or whatever you're actually using this magic "not null but not really
anything else either" state for.

Regards,

--
Rowan Tommins
[IMSoP]

--

To unsubscribe, visit: https://www.php.net/unsub.php

4 years ago by Mike Schinkel — view source

unread

When using isset on a non-nullable property a
static analyzer would rightfully complain that the property is not
nullable

Precisely, so don't mark a property as non-nullable and then leave it unset.

isset in this case is abused to check for uninitialized, hiding the
actual intent

On the contrary, the uninitialized state is what is being abused. If you want to make the intent clear, use a clear flag of "needs lazy-loading", or whatever you're actually using this magic "not null but not really anything else either" state for.

If in fact that uninitialized state is "being abused" as asserted, maybe the real abuse is the fact that PHP allows such a state to exist in the first place?

Since the argument against is_initialized() seems to almost universally be that we should not write code that lets type properties be uninitialized past the constructor then it would follow that PHP is in the wrong to even allow uninitialized state past the constructor?

If a property is defined as ?int or null|int why should PHP allow an uninitialized state to begin with? Why shouldn't PHP initialize any of those properties to null to begin with?

And for non-nullable, why shouldn't PHP default int to 0, string to "", array to [], etc.?

But if the answer is that PHP should have an uninitialized state past the constructor then it makes good sense to have a performant is_initialized() function to allow userland to discern the difference when applicable use-cases occur.

Maybe the proper solution would be for PHP to set a default value to any uninitialized properties at the end of the constructor, null for properties that are nullable or objects, and a known default for any scalar types? Then post-constructor it would not be possible to have uninitialized and thus no need for the function?

-Mike

4 years ago by Rowan Tommins — view source

unread

Since the argument against is_initialized() seems to almost universally be that we should not write code that lets type properties be uninitialized past the constructor then it would follow that PHP is in the wrong to even allow uninitialized state past the constructor?

Yes, that is my view. The problem is that there is currently no easy way
to enforce this, because constructors are just methods that run after
the object is created, and there are a handful of ways to create objects
without calling them at all which would need to be accounted for as well.

If a property is defined as ?int or null|int why should PHP allow an uninitialized state to begin with? Why shouldn't PHP initialize any of those properties to null to begin with?

And for non-nullable, why shouldn't PHP default int to 0, string to "", array to [], etc.?

Implicitly assigning a default from some magic list would be messy, but
requiring that the user specified one would be trivial (or would have
been, when typed properties were added): ?Foo $foo=null; int $bar=0; etc

The problem is that for non-scalar, non-nullable properties, there is no
way to specify a default. One solution would have been to simply ban
such declarations (i.e. require ?Foo $foo=null rather than Foo $foo) but
that was considered too restrictive, so marking the properties
"uninitialized" is the compromise.

We're probably stuck with that compromise for now, but continuing to
treat it primarily as an error state hopefully leaves us room to do
something cleaner in future if we can work out how.

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by Mike Schinkel — view source

unread

Since the argument against is_initialized() seems to almost universally be that we should not write code that lets type properties be uninitialized past the constructor then it would follow that PHP is in the wrong to even allow uninitialized state past the constructor?

Yes, that is my view. The problem is that there is currently no easy way to enforce this, because constructors are just methods that run after the object is created, and there are a handful of ways to create objects without calling them at all which would need to be accounted for as well.

If a property is defined as ?int or null|int why should PHP allow an uninitialized state to begin with? Why shouldn't PHP initialize any of those properties to null to begin with?

And for non-nullable, why shouldn't PHP default int to 0, string to "", array to [], etc.?

Implicitly assigning a default from some magic list would be messy, but requiring that the user specified one would be trivial (or would have been, when typed properties were added): ?Foo $foo=null; int $bar=0; etc

The problem is that for non-scalar, non-nullable properties, there is no way to specify a default. One solution would have been to simply ban such declarations (i.e. require ?Foo $foo=null rather than Foo $foo) but that was considered too restrictive, so marking the properties "uninitialized" is the compromise.

We're probably stuck with that compromise for now, but continuing to treat it primarily as an error state hopefully leaves us room to do something cleaner in future if we can work out how.

So we are full circle.

Since we cannot guarantee that properties are not in said "error" state it begs the question as to why developers shouldn't be able to introspect this error state in a performant manner.

-Mike

4 years ago by Andreas Leathley — view source

unread

isset in this case is abused to check for uninitialized, hiding the
actual intent

On the contrary, the uninitialized state is what is being abused. If
you want to make the intent clear, use a clear flag of "needs
lazy-loading", or whatever you're actually using this magic "not null
but not really anything else either" state for.

How is "uninitialized" magic? It is a state of a class property in PHP
that is currently being exposed via reflection, and I have seen quite a
few places where it occurs - for example with setter injection in
frameworks, as an alternative to setting a property in the constructor.

I also prefer properties with default values or values defined in the
constructor, yet making it easy to check on something that already
exists in the language only seems sensible to me - and I don't see a way
of getting rid of that state anytime soon.

4 years ago by Rowan Tommins — view source

unread

How is "uninitialized" magic? It is a state of a class property in PHP
that is currently being exposed via reflection

Apologies if my last message was a bit too hastily written, and may have
come across as rude.

What I meant by it being "magic" is that it can't be represented by any
value or type. A lot of the examples in this thread attempt to use
"uninitialized" like it was an extra value for the variable, as though
the type of the property was "int|null|uninitialized". But that can't be
declared in the property's type, and I don't know if there's even a
docblock convention for "note that this property might be deliberately
uninitialized, when it is, that means this ...".

Since we now have union types, you can actually be a lot more explicit
about the extra state, and use something like "int|false|null"; in 8.1,
you'll also be able to use "int|SomeEnum|null". Since those are real
types, with real values, no extra functions need to be added to the
language, and everything feels a lot less "magic".

I have seen quite a
few places where it occurs - for example with setter injection in
frameworks, as an alternative to setting a property in the constructor.

Just to reiterate, the new function would only be useful if you were
trying to distinguish the uninitialized state from a null value.

If you have a property which you know might not be provided, you
already have "null" to represent that. For instance, a property that
might hold an instance of LoggerInterface, but only if the user calls a
setLogger method, can be declared as "?LoggerInterface $logger=null",
and tested with "isset($this->logger)".

PS: A much sillier reason why I hate "uninitialized" is that I
consistently misspell it as "unitialized" :P

--
Rowan Tommins
[IMSoP]

4 years ago by Guilliam Xavier — view source

unread

Also, is_initialized cannot be done in
userland, as passing an unitialized property to a function already leads
to an error

The feature request was indeed is_initialized($foo->bar) (language
construct, akin to isset); but the PR was only proposing
is_initialized($foo, 'bar') (regular function, akin to property_exists).

Anyway, that's a "Wont fix".

(PS about previous messages: I actually don't write custom serializers
myself)

--
Guilliam Xavier