[RFC] Lazy Objects

1 year ago by Valentin Udaltsov

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for
over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

Hi, Nicolas and Arnaud!

I like the idea, thank you for the RFC!

Here are some initial thoughts and questions:

- It doesn't seem right that calling ReflectionLazyObject::makeLazyGhost
  has an implicit side effect on $instance and returns reflection. It does
  two things and thus breaks the SRP. Having something like
  $lazyGhost = (new ReflectionClass(MyClass::class))->newLazyGhost($initializer)
  and/or ReflectionLazyObject::makeLazy($object, $initializer): void seems
  better.
- If ReflectionLazyObject extends ReflectionObject, then how will
  new ReflectionLazyObject($object) work for non-lazy objects? Will it
  throw?
- Is extending ReflectionObject really necessary? What about creating
  ReflectionLazyObject as a standalone class without abusing inheritance?
  Or simply adding methods to ReflectionObject / ReflectionClass?
- The RFC says that Virtual state-proxies are necessary because of
  circular references. It's difficult to accept this reasoning, because
  using circular references is a bad practice and the given example is
  something I try to avoid by all means in my code.

--
Best regards,
Valentin

1 year ago by Nicolas Grekas

Hello Valentin,

Thanks for having a look.

It doesn't seem right that calling
ReflectionLazyObject::makeLazyGhost has an implicit side effect on
$instance and returns reflection. It does 2 things and thus breaks the
SRP. Having smth like $lazyGhost = new ReflectionClass(MyClass)->newLazyGhost($initializer) and/or
ReflectionLazyObject::makeLazy($object, $initializer): void seems better.

About "new ReflectionClass(MyClass)->newLazyGhost", we could add this but
it would provide a duplicate way to achieve what can be done with
"makeLazy($object)" variants. As you might have read in the RFC, being able
to make a pre-existing instance lazy is needed to cover all use cases.

Then about "ReflectionLazyObject::makeLazy($object, $initializer): void",
is the difference only to return void and not a ReflectionLazyObject
instance? Then this might provide an underperforming API: creating a lazy
object and immediately after setting some of its properties is a use case
that can happen on the hot path (of e.g. Doctrine entities creation steps).
Returning "void" would force an extra call to
ReflectionLazyObject::fromInstance() that the proposed API prevents.

If ReflectionLazyObject extends ReflectionObject, then how new ReflectionLazyObject($object) will work for non-lazy objects? Will it
throw?

The constructor is private so this is not allowed (I should add this to the
RFC).

Is extending ReflectionObject really necessary? What about creating
ReflectionLazyObject as a standalone class without abusing inheritance?
Or simply adding methods to ReflectionObject / ReflectionClass?

I don't think extending ReflectionObject is necessary. I don't know if
doing so is "abusing" inheritance. It might make sense either way. For the
use cases I identified, it wouldn't harm to not extend anything. Does
anyone else see a reason to go in one or the other direction? To me it just
makes sense to have ReflectionLazyObject extend ReflectionObject.

About adding methods to ReflectionObject / ReflectionClass, you mention
SRP in your message; ReflectionClass/ReflectionObject is already crowded,
and this lazy object topic is better separated to me.

The RFC says that Virtual state-proxies are necessary because of
circular references. It's difficult to accept this reasoning, because using
circular references is a bad practice and the given example is something I
try to avoid by all means in my code.

Yet circular references happen all the time in any non-trivial app, so this
has to be supported.

Nicolas

1 year ago by tim@bastelstu.be

Hi

I've given the RFC three or four passes and I'm not quite sure I
follow everything. Here's a list of some questions / remarks that came
to mind, roughly ordered by the order of things appearing in the RFC.

"been tested successfully on the Doctrine and on the Symfony projects"

Is there a PoC patch showcasing how the code would change / be
simplified for those pre-existing codebases?

int $options = 0

Not a fan of flag parameters that take a bitset, those provide for a
terrible DX due to magic numbers. Perhaps make this a regular (named)
parameter, or a list of enum LazyObjectOptions { case
SkipInitOnSerialize; }?

skipProperty()

Not a fan of the method name, because it doesn't really say what it
does, without consulting the docs. Perhaps skipInitializationFor() or
similar?

setProperty()

Not a fan of the method name, because it is not a direct counterpart to
getProperty(). Unfortunately I don't have a better suggestion.

The examples should be expanded and clarified, especially the one for
makeLazyProxy():

My understanding is that the $object that is passed to the first
parameter of makeLazyProxy() is completely replaced. Is this
understanding correct? What does that mean for spl_object_hash(),
spl_object_id()? What does this mean for WeakMap and WeakReference? What
does this mean for objects that are only referenced from within $object?

Consider this example:

 class Foo {
   public function __destruct() { echo __METHOD__; }
 }

 class Bar {
   public string $s;
   public ?Foo $foo;

   public function __destruct() { echo __METHOD__; }
 }

 $bar = new Bar();
 $bar->foo = new Foo();

 ReflectionLazyObject::makeLazyProxy($bar, function (Bar $bar) {
   $result = new Bar();
   $result->foo = null;
   $result->s = 'init';
   return $result;
 });

 var_dump($bar->s);

My understanding is that this will dump string(4) "init". Will the
destructor of Foo be called? Will the destructor of Bar be called?

What happens if I make an object lazy that already has all properties
initialized? Will that be a noop? Will that throw? Will that create a
lazy object that will never automatically be initialized?

Cloning, unless __clone() is implemented and accesses a property.

The semantics of cloning a lazy object should be explicitly spelled out
in the RFC, ideally with an example of the various edge cases (should
any exist).

Before calling the initializer, properties that were not initialized
with ReflectionLazyObject::skipProperty(),
ReflectionLazyObject::setProperty(),
ReflectionLazyObject::setRawProperty() are initialized to their default
value.

Should skipProperty() also skip the initialization to the default value?
My understanding is that it allows skipping the initialization on
access, but when initialization actually happens it should probably be
set to a well-defined value, no?

Am I also correct in my understanding that this should read "initialized
to their default value (if any)", meaning that properties without a
default value are left uninitialized?

If an exception is thrown while calling the initializer, the object is
reverted to its pre-initialization state and is still considered lazy.

Does this mean that the initializer will be called once again when
accessing another property? Will the "revert to its pre-initialization"
work properly when you have nested lazy objects? An example would
probably help.

The initializer is called with the object as first parameter.

What is the behavior of accessing the object properties, while the
initializer is active? Based on the examples, I assume it will not be
recursively called, similarly to how the hooks work?

The object is marked as non-lazy and the initializer is released.

What does it mean for the initializer to be released? Consider the
following example:

 ReflectionLazyObject::makeLazyGhost($o, $init = function ($o) use (&$init) {
   $o->init = $init;
 });

The return value of the initializer has to be an instance of a parent
or a child class of the lazy-object and it must have the same properties.

Would returning a parent class not violate the LSP? Consider the
following example:

 class A { public string $s; }
 class B extends A { public function foo() { } }

 $o = new B();
 ReflectionLazyObject::makeLazyProxy($o, function (B $o) {
   return new A();
 });

 $o->foo(); // works
 $o->s = 'init';
 $o->foo(); // breaks

The destructor of lazy non-initialized objects is not called.

That sounds unsafe. Consider the following example:

 class Mutex {
   public string $s;
   public function __construct() {
     // take lock
   }

   public function __destruct() {
     // release lock
   }
 }

 $m = new Mutex();
 ReflectionLazyObject::makeLazyGhost($m, function ($m) {
 });
 unset($m); // will not release the lock.

Using the Manager::createManager() factory is not compatible with
ghost objects because their initializer requires initializing the ghost
object in place,

I don't understand that example, because it doesn't actually use lazy
objects and thus I don't understand if Manager, Dispatcher, or both are
intended to be lazily initialized. It would help to rewrite the example
to use makeLazyGhost() and indicate with a comment in which cases the
problem would arise.

Backward Incompatible Changes

There is a technicality: The ReflectionLazyObject class name will no
longer be available to userland.

Best regards
Tim Düsterhus

1 year ago by Robert Landers

As someone who has had to maintain these proxies/ghosts before, this
looks quite useful and powerful. I feel it has rather wonky syntax,
but it is clearly better than the alternative of implementing it
yourself. I'm also a huge fan that the syntax allows for
usage/creation far and away from the definition/class itself.

Good luck, and I hope it passes.

Robert Landers
Software Engineer
Utrecht NL

1 year ago by Arnaud Le Blanc

Hi Tim,

That's a lot of interesting feedback. I will try to answer some of
your points, and Nicolas will follow with other points.

int $options = 0

Not a fan of flag parameters that take a bitset, those provide for a
terrible DX due to magic numbers. Perhaps make this a regular (named)
parameter, or a list of enum LazyObjectOptions { case
SkipInitOnSerialize; }?

The primary reason for choosing to represent $options as a bitset is
that it's consistent with the rest of the Reflection API (e.g.
ReflectionClass::getProperties() uses a bitset for the $filter
parameter).

I don't get your point about magic numbers since we are using
constants to abstract them.
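
To make the two option styles under discussion concrete, here is a minimal sketch. The bitmask half uses the existing ReflectionClass::getProperties() filter constants. The enum half is purely hypothetical: LazyObjectOption and hasOption() are illustrative names modelled on the suggestion quoted above, not part of the RFC, and enums require PHP 8.1 or later.

```php
<?php
declare(strict_types=1);

class Account
{
    public string $owner = 'n/a';
    protected int $balance = 0;
    private string $secret = '';
}

// Hypothetical enum-based options (illustrative only, not in the RFC).
enum LazyObjectOption
{
    case SkipInitOnSerialize;
    case SkipDestructor;
}

// Hypothetical helper: checks whether an option is present in a list.
function hasOption(array $options, LazyObjectOption $wanted): bool
{
    return in_array($wanted, $options, true);
}

// Existing Reflection convention: named constants combined into a bitmask.
$filter = ReflectionProperty::IS_PUBLIC | ReflectionProperty::IS_PROTECTED;
$names = array_map(
    static fn (ReflectionProperty $p): string => $p->getName(),
    (new ReflectionClass(Account::class))->getProperties($filter)
);
sort($names);

var_dump($names === ['balance', 'owner']); // bool(true): the private property is filtered out
var_dump(hasOption([LazyObjectOption::SkipDestructor], LazyObjectOption::SkipDestructor)); // bool(true)
```

Either way the caller writes a name rather than a number; the difference is mainly whether the type system can reject an unrelated integer.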

skipProperty()

Not a fan of the method name, because it doesn't really say what it
does, without consulting the docs. Perhaps skipInitializationFor() or
similar?

We have opted for skipInitializerForProperty().

setProperty()

Not a fan of the method name, because it is not a direct counterpart to
getProperty(). Unfortunately I don't have a better suggestion.

We have renamed setRawProperty() to setRawPropertyValue() in the RFC.
We are open to other suggestions.

We have also removed setProperty(), as we believe that there is no
use-case for it.

The examples should be expanded and clarified, especially the one for
makeLazyProxy():

Agreed. We will add examples and clarify some behaviors.

My understanding is that the $object that is passed to the first
parameter of makeLazyProxy() is completely replaced. Is this
understanding correct? What does that mean for spl_object_hash(),
spl_object_id()? What does this mean for WeakMap and WeakReference? What
does this mean for objects that are only referenced from within $object?

The object is updated in-place, and retains its identity. It is not
replaced. What makeLazyGhost() and makeLazyProxy() do is equivalent to
calling unset() on all properties, and setting a flag on the object
internally. Apart from setting the internal flag, this is achievable
in userland by iterating on all properties via the Reflection API, and
using unset() in the right scope with a Closure.

spl_object_id(), spl_object_hash(), SplObjectStorage, WeakMap,
WeakReference, strict equality, etc are not affected by makeLazy*().

The intended use of makeLazyGhost() and makeLazyProxy() is to call
them either on an object created with
ReflectionClass::newInstanceWithoutConstructor(), or on $this in a
constructor. The latter is the reason why these APIs take an existing
object.

The proposed patch integrates into the object handlers fallback code
path used to manage accesses to undefined properties. We implement
lazy initialization by hooking into undefined property accesses,
without impacting the fast path.
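
The userland equivalent mentioned above (unsetting every property from the right scope) could look roughly like the following sketch. The classes Base and User and the helper unsetDeclaredProperties() are made up for illustration; the sketch does not set the internal lazy flag, does not install an initializer, and ignores readonly properties.

```php
<?php
declare(strict_types=1);

class Base
{
    private int $id = 7;
}

final class User extends Base
{
    public string $name = 'anon';
    protected array $roles = ['guest'];
}

/**
 * Unset every declared, non-static property of $object, class by class,
 * so that private properties are unset from their declaring scope.
 * Readonly properties are not handled in this sketch.
 */
function unsetDeclaredProperties(object $object): void
{
    $class = new ReflectionClass($object);
    while ($class !== false) {
        $scope = $class->getName();
        $names = [];
        foreach ($class->getProperties() as $property) {
            if (!$property->isStatic() && $property->getDeclaringClass()->getName() === $scope) {
                $names[] = $property->getName();
            }
        }
        // Bind a closure to the declaring class so private properties are reachable.
        $unset = Closure::bind(static function (object $o, array $names): void {
            foreach ($names as $name) {
                unset($o->$name);
            }
        }, null, $scope);
        $unset($object, $names);
        $class = $class->getParentClass();
    }
}

$user = new User();
$id = spl_object_id($user);
unsetDeclaredProperties($user);

var_dump(spl_object_id($user) === $id); // bool(true): identity is unchanged
var_dump((array) $user === []);         // bool(true): no initialized properties remain
```

Because the object is modified in place, anything keyed on its identity keeps pointing at the same instance.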

Consider this example:

 class Foo {
   public function __destruct() { echo __METHOD__; }
 }

 class Bar {
   public string $s;
   public ?Foo $foo;

   public function __destruct() { echo __METHOD__; }
 }

 $bar = new Bar();
 $bar->foo = new Foo();

 ReflectionLazyObject::makeLazyProxy($bar, function (Bar $bar) {
   $result = new Bar();
   $result->foo = null;
   $result->s = 'init';
   return $result;
 });

 var_dump($bar->s);

My understanding is that this will dump string(4) "init". Will the
destructor of Foo be called? Will the destructor of Bar be called?

This will print:

Foo::__destruct (during makeLazyProxy())
string(4) "init" (during `var_dump()`)

and eventually

Bar::__destruct (when $bar is released)

What happens if I make an object lazy that already has all properties
initialized? Will that be a noop? Will that throw? Will that create a
lazy object that will never automatically be initialized?

All properties are unset as described earlier, and the object is
flagged as lazy. The object will automatically initialize when trying
to observe its properties.

However, making a fully initialized object lazy is not the intended use-case.

Cloning, unless __clone() is implemented and accesses a property.

The semantics of cloning a lazy object should be explicitly spelled out
in the RFC, ideally with an example of the various edge cases (should
any exist).

Agreed. We are working on expanding the RFC about this.

Before calling the initializer, properties that were not initialized
with ReflectionLazyObject::skipProperty(),
ReflectionLazyObject::setProperty(),
ReflectionLazyObject::setRawProperty() are initialized to their default
value.

Should skipProperty() also skip the initialization to the default value?
My understanding is that it allows skipping the initialization on
access, but when initialization actually happens it should probably be
set to a well-defined value, no?

Am I also correct in my understanding that this should read "initialized
to their default value (if any)", meaning that properties without a
default value are left uninitialized?

The primary effect of skipProperty() is to mark a property as
non-lazy, so that accessing it does not trigger the initialization of
the entire object. It also sets the property to its default value if
any, otherwise it is left as undef.

Accessing this property afterwards has exactly the same effect as
doing so on an object created with
ReflectionClass::newInstanceWithoutConstructor() (including triggering
errors when reading an uninitialized property).

If an exception is thrown while calling the initializer, the object is
reverted to its pre-initialization state and is still considered lazy.

Does this mean that the initializer will be called once again when
accessing another property?

Yes. The goal is to prevent transition from lazy to initialized when
an unexpected error occurred in the initializer.

Will the "revert to its pre-initialization"
work properly when you have nested lazy objects? An example would
probably help.

Only the effects on the object itself are reverted. External side
effects are not reverted.
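
The retry-after-failure semantics can be illustrated with a small userland stand-in. LazyState is a hypothetical class that keeps its state in an array; it only mimics the behavior described here (revert on exception, stay lazy, release the initializer on success) and is not how the proposed engine feature is implemented.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland stand-in for a lazy object: state lives in an
 * array and is filled by the initializer on first read.
 */
final class LazyState
{
    /** @var array<string, mixed> */
    private array $values = [];
    private ?Closure $initializer;

    public function __construct(Closure $initializer)
    {
        $this->initializer = $initializer;
    }

    public function isLazy(): bool
    {
        return $this->initializer !== null;
    }

    /** @return mixed */
    public function read(string $name)
    {
        if ($this->initializer !== null) {
            $snapshot = $this->values;      // remember the pre-initialization state
            try {
                ($this->initializer)($this);
            } catch (Throwable $e) {
                $this->values = $snapshot;  // revert; the object stays lazy
                throw $e;
            }
            $this->initializer = null;      // success: release the initializer
        }

        return $this->values[$name] ?? null;
    }

    /** @param mixed $value */
    public function write(string $name, $value): void
    {
        $this->values[$name] = $value;
    }
}
```

An initializer that fails once and then succeeds is invoked twice; a counter kept outside the object would show both calls, since external side effects are not rolled back.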

The initializer is called with the object as first parameter.

What is the behavior of accessing the object properties, while the
initializer is active? Based on the examples, I assume it will not be
recursively called, similarly to how the hooks work?

For ghost objects, the initializer is supposed to initialize the
object itself like a constructor would do. During initializer
execution, the object has exactly the same state and behavior as it
would have in its constructor during new:

- The object is not lazy.
- Properties have their default value (if any) and are accessed
  without triggering a nested initialization.
- If setRawPropertyValue() was used, some properties may have a value
  different from their default.

For virtual proxies, the initializer is supposed to return another
object. Accessing the proxy object during initialization does not
trigger recursive initializations. Some properties will have a value
if setRawPropertyValue() or skipInitializerForProperty() was used.

The object is marked as non-lazy and the initializer is released.

What does it mean for the initializer to be released? Consider the
following example:
 ReflectionLazyObject::makeLazyGhost($o, $init = function ($o) use (&$init) {
   $o->init = $init;
 });

"released" here means that the initializer is not referenced anymore
by this object, and may be freed if it is not referenced anywhere
else.

The return value of the initializer has to be an instance of a parent
or a child class of the lazy-object and it must have the same properties.

Would returning a parent class not violate the LSP? Consider the
following example:
 class A { public string $s; }
 class B extends A { public function foo() { } }

 $o = new B();
 ReflectionLazyObject::makeLazyProxy($o, function (B $o) {
   return new A();
 });

 $o->foo(); // works
 $o->s = 'init';
 $o->foo(); // breaks

$o->foo() calls B::foo() in both cases here, as $o is always the proxy
object. We need to double check, but we believe that this rule doesn't
break LSP.

The destructor of lazy non-initialized objects is not called.

That sounds unsafe. Consider the following example:

 class Mutex {
   public string $s;
   public function __construct() {
     // take lock
   }

   public function __destruct() {
     // release lock
   }
 }

 $m = new Mutex();
 ReflectionLazyObject::makeLazyGhost($m, function ($m) {
 });
 unset($m); // will not release the lock.

Good point. The Mutex constructor is called during "new Mutex()", but
the object is made lazy after that, and the destructor is never
called.

We have made the following changes to the RFC:

- makeLazyGhost / makeLazyProxy will call the object destructor
- A new option flag is added, ReflectionLazyObject::SKIP_DESTRUCTOR,
  that disables this behavior

This is not ideal since the intended use of these methods is to call
them on objects created with newInstanceWithoutConstructor(), or
directly in a constructor, and both of these will need this flag, but
at least it's safe by default.

Thanks again for the feedback.

Best Regards,
Arnaud

1 year ago by Larry Garfield

Let me make sure I am following, since I'm still having a hard time with the explanation in the RFC (as I discussed with Nicolas off list).

The two use cases intended here are, essentially, "delay invoking the constructor until first use" (ghost) and "sneak a factory object in the way that is called on first use" (virtual). Right?

The ghost initializer callback is basically an alternate constructor (which will probably just call the real constructor in the typical case), and the virtual initializer callback is basically the body of the factory object.

So the most common ghost implementation for a service would be something like:

$c = the_container();

$object = new ReflectionClass(Foo::class)->newInstanceWithoutConstructor();

$initializer = static fn (Foo $foo) => $foo->__construct($c->dep1, $c->dep2, $c->dep3);

$lazyReflector = ReflectionLazyObject::makeLazyGhost($object, $initializer, ReflectionLazyObject::SKIP_DESTRUCTOR);

Am I following? Because if so, the proposed API is extremely clunky. For one thing, using an input/output parameter ($object) is a code smell 99% of the time. It's changing an un-constructed object into a ghost object, and returning... um, I'm not sure what. It also means I need to use both reflection classes in different ways to achieve the result.

It seems a much cleaner API would be something like:

$object = new ReflectionClass(Foo::class)->newInstanceWithLazyConstructor($initializer);

In which case it becomes a lot more obvious that we are, essentially, "swapping out" the constructor for a lazy one. It also suggests that perhaps the function should be using $this, not $foo, as it's running within the context of the object (I presume? Can it call private methods? I assume so since it can set private properties.)

That would also suggest this API for the other approach:

$initializer = static fn () => new Foo($c->dep1, $c->dep2, $c->dep3);

$object = new ReflectionClass(Foo::class)->newInstanceWithLazyFactory($initializer);

In which case $object is the proxy, and gets "swapped out" for the return value of the $initializer on first use.

Am I understanding all this correctly? Because if so, I think the above simplified API would make it much more obvious what is going on, much easier to work with, easier to document/explain, and simple enough that it could conceivably be used in cases outside of DI or ORMs, too.

If I'm way off and don't understand what you're doing, then please explain as I'm clearly very confused. :-)

--Larry Garfield

1 year ago by Arnaud Le Blanc

Hi Larry,

Thank you for the feedback.

I think you got the two strategies right. However, there is a use-case
in which an object manages its own laziness by making itself lazy:

class C {
    public function __construct() {
        ReflectionLazyObject::makeLazyGhost($this, $this->init(...));
    }
}

This one can not be addressed by a newInstance*() method since the
object to be made lazy already exists.

The makeLazyGhost() / makeLazyProxy() methods are the minimal methods
necessary to address all use-cases, but the methods you are suggesting
are a better API most of the time, so we are adding approximately this
to the proposal [1]. We are keeping them in a separate class to not
pollute ReflectionClass.

It also suggests that perhaps the function should be using $this, not $foo, as it's running within the context of the object (I presume? Can it call private methods? I assume so since it can set private properties.)

The function is not running in the context of the object. It can only
access private members via Reflection or if the closure was bound to
the right scope by the user. This should not be an issue when the
initializer just calls a public constructor.
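
The scoping rule can be shown with plain closures, independent of the lazy-object API. In this sketch Connection and its $dsn property are made-up names: an unbound initializer cannot write a private property, while the same closure bound to the class scope with Closure::bind() can.

```php
<?php
declare(strict_types=1);

final class Connection
{
    private string $dsn = '';

    public function dsn(): string
    {
        return $this->dsn;
    }
}

// Declared outside the class, so it has no class scope: calling it
// throws an Error because $dsn is private.
$initializer = static function (Connection $conn): void {
    $conn->dsn = 'sqlite::memory:';
};

// Binding the closure to the class scope grants access to private members.
$scoped = Closure::bind($initializer, null, Connection::class);

$conn = new Connection();
$scoped($conn);
var_dump($conn->dsn() === 'sqlite::memory:'); // bool(true)
```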

In which case $object is the proxy, and gets "swapped out" for the return value of the $initializer on first use.

Just to be sure: $object continues to be the proxy instance after the
initializer is called, but it forwards all property accesses to the
return value of the $initializer.
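
As a rough userland analogy for this forwarding behavior, consider the sketch below. ForwardingProxy and Service are hypothetical names, and the magic-method wrapper is only an approximation: the proposal does this natively on an instance of the real class rather than through a separate wrapper class.

```php
<?php
declare(strict_types=1);

final class Service
{
    public string $name = 'real';
}

/** Hypothetical userland proxy that forwards property access to a lazily built target. */
final class ForwardingProxy
{
    private ?object $target = null;
    private Closure $factory;

    public function __construct(Closure $factory)
    {
        $this->factory = $factory;
    }

    private function target(): object
    {
        // The factory runs once, on first access.
        return $this->target ??= ($this->factory)();
    }

    /** @return mixed */
    public function __get(string $name)
    {
        return $this->target()->$name;
    }

    /** @param mixed $value */
    public function __set(string $name, $value): void
    {
        $this->target()->$name = $value;
    }
}
```

The proxy variable keeps its own identity before and after the factory runs; only the reads and writes are redirected to the object the factory returned.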

[1] https://news-web.php.net/php.internals/123518

Best Regards,
Arnaud

1 year ago by tim@bastelstu.be

Hi

Thank you Larry for your email. Your suggested API is basically what I
also had in mind after Arnaud's clarification.

I think you got the two strategies right. However, there is a use-case
in which an object manages its own laziness by making itself lazy:

What is the corresponding real-world use case for that? When would I
want to create an object that makes itself lazy?

FWIW: If I understand it right, this use case would be supported by
Larry's proposed API by writing a named constructor:

 class C {
   private function __construct($foo, $bar) { }
   public static function lazyNew(...$args) {
     $r = new ReflectionClass(self::class);

     return $r->newInstanceWithLazyConstructor(
       fn ($o) => $o->__construct(...$args)
     );
   }
 }

It also suggests that perhaps the function should be using $this, not $foo, as it's running within the context of the object (I presume? Can it call private methods? I assume so since it can set private properties.)

The function is not running in the context of the object. It can only
access private members via Reflection or if the closure was bound to
the right scope by the user. This should not be an issue when the
initializer just calls a public constructor.

Please clarify the interaction with visibility in the RFC.

Best regards
Tim Düsterhus

1 year ago by michal.brzuchalski@gmail.com

Hi Arnaud,

On Wed, Jun 5, 2024 at 20:08, Arnaud Le Blanc arnaud.lb@gmail.com wrote:

Hi Larry,

Thank you for the feedback.

I think you got the two strategies right. However, there is a use-case
in which an object manages its own laziness by making itself lazy:

class C {
    public function __construct() {
        ReflectionLazyObject::makeLazyGhost($this, $this->init(...));
    }
}

This one can not be addressed by a newInstance*() method since the
object to be made lazy already exists.

Did you consider implementing it using some attribute?

On constructor like:

class C {
    #[LazyInitialization]
    public function __construct(private readonly string $foo) {
        // ... init executes after first use, but all promoted properties
        // are already initialized
    }
}

or on specialized initializer:

class C {
    public function __construct(private readonly string $foo) {
        // do something light
    }

    #[LazyInitialization]
    private function initi(): void
    {
        // do something heavy
    }
}

I don't know whether this is a good example of doing the same thing, or
whether it limits functionality, but to me it is much cleaner and easier
to understand.

Cheers,
Michał Marcin Brzuchalski

1 year ago by Arnaud Le Blanc

Hi Michał, Chris,

On Thu, Jun 6, 2024 at 8:53 AM Michał Marcin Brzuchalski
michal.brzuchalski@gmail.com wrote:

Did you consider implementing it using some attribute?

I'm wondering why this has been attached to the existing reflection API instead of being a new thing in and of itself? It doesn't seem strictly related to reflection other than currently the solutions for this rely on reflection to work.

Thank you for your feedback.

Currently the lazy objects feature is designed as a low-level
technical building block that libraries and frameworks can use. FFI
and Fibers are examples of such features that most users do not use
directly, but can benefit greatly from within libraries they use.

We have considered the proposed syntaxes, such as using annotations,
but these do not match the core use case of creating a lazy instance
of a class without requiring cooperation from the class itself.
Furthermore we believe that designing a higher-level API in the
language would be considerably more difficult and could put the RFC at
risk.

However, it is possible to introduce a higher-level way to create lazy
objects in a future RFC.

Best Regards,
Arnaud

1 year ago by tim@bastelstu.be

Hi

Working through your reply in order of the email, without any
backtracking, because the complexity of this topic makes it hard to keep
the entire email in mind. This might mean that I am asking follow-up
questions that you already answered further down. My apologies if
that is the case :-)

One general note: Please include the answers to my questions in the RFC
text as appropriate for reference of other readers and so that all my
questions are answered when re-reading the RFC without needing to refer
back to your email.

Not a fan of flag parameters that take a bitset, those provide for a
terrible DX due to magic numbers. Perhaps make this a regular (named)
parameter, or a list of enum LazyObjectOptions { case
SkipInitOnSerialize; }?

The primary reason for choosing to represent $options as a bitset is
that it's consistent with the rest of the Reflection API (e.g.
ReflectionClass::getProperties() uses a bitset for the $filter
parameter).

I don't get your point about magic numbers since we are using
constants to abstract them.

It's not a magic number in the classic sense, but when trying to observe
it, e.g. by means of a debugger and the $options have been passed
through some other functions that wrap the lazy object API, it will
effectively be an opaque number that one will need to decode manually,
whereas a list of enums is immediately clear.
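
To make the debugging argument concrete, here is a minimal sketch contrasting the two representations. The constant names and the `LazyObjectOption` enum are hypothetical and only loosely follow the names floated in this thread; they are not the RFC's API.

```php
<?php
// Hypothetical names, for illustration only.
const SKIP_INIT_ON_SERIALIZE = 1 << 0;
const SKIP_DESTRUCTOR = 1 << 1;

enum LazyObjectOption
{
    case SkipInitOnSerialize;
    case SkipDestructor;
}

// Bitset form: once passed through a few wrappers, a debugger only shows
// an integer that has to be decoded by hand.
$bitset = SKIP_INIT_ON_SERIALIZE | SKIP_DESTRUCTOR;

// Enum-list form: each element carries its own name.
$list = [LazyObjectOption::SkipInitOnSerialize, LazyObjectOption::SkipDestructor];
$names = array_map(fn (LazyObjectOption $o) => $o->name, $list);
```

The trade-off raised in the reply still applies: the bitset form is the one already used by the rest of the Reflection API.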

My understanding is that the $object that is passed to the first
parameter of makeLazyProxy() is completely replaced. Is this
understanding correct? What does that mean for spl_object_hash(),
spl_object_id()? What does this mean for WeakMap and WeakReference? What
does this mean for objects that are only referenced from within $object?

The object is updated in-place, and retains its identity. It is not
replaced. What makeLazyGhost() and makeLazyProxy() do is equivalent to
calling unset() on all properties, and setting a flag on the object
internally. Apart from setting the internal flag, this is achievable

Oh. It was not clear at all to me that all existing properties will be
unset. Did I miss it or is that not written down in the RFC?

Is there any reason to call the makeLazyX() methods on an object that
was not just freshly created with ->newInstanceWithoutConstructor()
then? Anything I do with the object before the call to makeLazyX() will
effectively be reverted, no?

An example showcasing the intended usage, e.g. a simplified ORM example,
would really be helpful here.

in userland by iterating on all properties via the Reflection API, and
using unset() in the right scope with a Closure.

spl_object_id(), spl_object_hash(), SplObjectStorage, WeakMap,
WeakReference, strict equality, etc are not affected by makeLazy*().
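
The userland equivalent mentioned above (iterate over the properties via Reflection, then unset() each one from the right scope) might look roughly like the sketch below. The `User` class and the `unsetAllProperties()` helper are hypothetical, and the sketch does not set the internal lazy flag, so nothing is initialized on access afterwards.

```php
<?php
// Hypothetical class, for illustration only.
class User
{
    public string $name = 'anonymous';
    private int $visits = 0;
}

// Hypothetical helper: unsets every non-static property in place.
function unsetAllProperties(object $object): void
{
    $class = new ReflectionClass($object);
    do {
        foreach ($class->getProperties() as $property) {
            if ($property->isStatic()) {
                continue;
            }
            $name = $property->getName();
            $unset = function () use ($name): void {
                unset($this->$name);
            };
            // Bind to the declaring class so private properties are reachable.
            Closure::bind($unset, $object, $property->getDeclaringClass()->getName())();
        }
    } while ($class = $class->getParentClass());
}

$user = new User();
$idBefore = spl_object_id($user);
$weak = WeakReference::create($user);

unsetAllProperties($user);

// The object is modified in place, so its identity is unchanged.
$idAfter = spl_object_id($user);
$nameInitialized = (new ReflectionProperty(User::class, 'name'))->isInitialized($user);
```

Here the object id and the weak reference still point at the same object, while `name` is reported as uninitialized.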

That is true for both makeLazyGhost(), and makeLazyProxy()?

What would the following example output?

 $object = new MyObject();
 var_dump(spl_object_id($object));
 $r = ReflectionLazyObject::makeLazyGhost($object, function (MyObject $object) {
     $object2 = new MyObject();
     var_dump(spl_object_id($object2));
     return $object2;
 });
 var_dump(spl_object_id($object));
 $r->initialize();
 var_dump(spl_object_id($object));

What would happen if I would expose the inner $object2 to the outer
world by means of the super globals or by means of use (&$out) + $out = $object2?

The intended use of makeLazyGhost() and makeLazyProxy() is to call
them either on an object created with
ReflectionClass::newInstanceWithoutConstructor(), or on $this in a
constructor. The latter is the reason why these APIs take an existing
object.

Okay, that answers the question above. Technically being capable of
calling it on an object that was not just freshly created sounds like a
footgun, though. What is the interaction with readonly objects? My
understanding is that it would allow a readonly object with initialized
properties to change after-the-fact?

Consider this example:

  class Foo {
    public function __destruct() { echo __METHOD__; }
  }

  class Bar {
    public string $s;
    public ?Foo $foo;

    public function __destruct() { echo __METHOD__; }
  }

  $bar = new Bar();
  $bar->foo = new Foo();

  ReflectionLazyObject::makeLazyProxy($bar, function (Bar $bar) {
    $result = new Bar();
    $result->foo = null;
    $result->s = 'init';
    return $result;
  });

  var_dump($bar->s);

My understanding is that this will dump string(4) "init". Will the
destructor of Foo be called? Will the destructor of Bar be called?

This will print:

 Foo::__destruct (during makeLazyProxy())
 string(4) "init" (during `var_dump()`)

and eventually

 Bar::__destruct (when $bar is released)

Okay, so only one Bar::__destruct(), despite two Bar objects being
created. I assume it's the destructor of the second Bar, i.e. if I would
dump $this->foo within the destructor, it would dump null?

What happens if I make an object lazy that already has all properties
initialized? Will that be a noop? Will that throw? Will that create a
lazy object that will never automatically be initialized?

All properties are unset as described earlier, and the object is
flagged as lazy. The object will automatically initialize when trying
to observe its properties.

However, making a fully initialized object lazy is not the intended use-case.

Understood. See above with my follow-up question then.

Before calling the initializer, properties that were not initialized
with ReflectionLazyObject::skipProperty(),
ReflectionLazyObject::setProperty(),
ReflectionLazyObject::setRawProperty() are initialized to their default
value.

Should skipProperty() also skip the initialization to the default value?
My understanding is that it allows skipping the initialization on
access, but when initialization actually happens it should probably be
set to a well-defined value, no?

Am I also correct in my understanding that this should read "initialized
to their default value (if any)", meaning that properties without a
default value are left uninitialized?

The primary effect of skipProperty() is to mark a property as
non-lazy, so that accessing it does not trigger the initialization of
the entire object. It also sets the property to its default value if
any, otherwise it is left as undef.

Accessing this property afterwards has exactly the same effect as
doing so on an object created with
ReflectionClass::newInstanceWithoutConstructor() (including triggering
errors when reading an uninitialized property).
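
For reference, the existing behaviour being compared against can be observed today. A small sketch with a hypothetical `Point` class: a property with a default value is readable right away, while a typed property without one throws on read.

```php
<?php
// Hypothetical class, for illustration only.
class Point
{
    public int $x;      // no default value: stays uninitialized
    public int $y = 0;  // default value: set even without the constructor

    public function __construct()
    {
        $this->x = 1;
    }
}

$p = (new ReflectionClass(Point::class))->newInstanceWithoutConstructor();

$y = $p->y;

try {
    $x = $p->x;
    $message = null;
} catch (Error $e) {
    // Reading an uninitialized typed property throws an Error.
    $message = $e->getMessage();
}
```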

I'm rereading my own question and can't make sense of it any more. I
probably forgot that skipProperty() is defined to set the default value
in the PHPDoc when I got down to the bit that I quoted.

Please just insert the 'if any' after 'default value' for clarity.

Will the "revert to its pre-initialization"
work properly when you have nested lazy objects? An example would
probably help.

Only the effects on the object itself are reverted. External side
effects are not reverted.

Yes, it's obvious that external side effects are not reverted. I was
thinking about a situation like:

 $a = new A();
 $b = new B();
 ReflectionLazyObject::makeLazyGhost($b, function ($b) {
     throw new \Exception('xxx');
 });
 ReflectionLazyObject::makeLazyGhost($a, function ($a) use ($b) {
     $a->b = $b->somevalue;
 });
 $a->init = 'please';

The initialization of $a will implicitly attempt to initialize $b, which
will fail. Am I correct in my understanding that both $a and $b will be
reverted back to a lazy object afterwards? If so, adding that example to
the RFC would help to make possible edge cases clear.

The return value of the initializer has to be an instance of a parent
or a child class of the lazy-object and it must have the same properties.

Would returning a parent class not violate the LSP? Consider the
following example:
  class A { public string $s; }
  class B extends A { public function foo() { } }

  $o = new B();
  ReflectionLazyObject::makeLazyProxy($o, function (B $o) {
    return new A();
  });

  $o->foo(); // works
  $o->s = 'init';
  $o->foo(); // breaks

$o->foo() calls B::foo() in both cases here, as $o is always the proxy
object. We need to double check, but we believe that this rule doesn't
break LSP.

I don't understand what happens with the 'A' object then, but perhaps
this will become clearer once you add the requested examples.

Best regards
Tim Düsterhus

1 year ago by Arnaud Le Blanc

Hi Tim,

We have updated the RFC to address your feedback. Please find
additional answers below.

Is there any reason to call the makeLazyX() methods on an object that
was not just freshly created with ->newInstanceWithoutConstructor()
then?

There are not many reasons to do that. The only intended use-case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:

class C {
    public function __construct() {
        new ReflectionLazyObjectFactory(C::class)->makeInstanceLazyGhost($this, $this->init(...));
    }
}

When drafting this API we figured that makeLazy addressed all
use-cases, and this allowed us to keep the API minimal. However,
following different feedback we have now introduced two separate
newLazyGhostInstance() and newLazyProxyInstance() methods.

Anything I do with the object before the call to makeLazyX() will
effectively be reverted, no?

Yes, after calling makeLazy() the object is in the same state as if it
was made lazy immediately after instantiation.

An example showcasing the intended usage, e.g. a simplified ORM example,
would really be helpful here.

Agreed

in userland by iterating on all properties via the Reflection API, and
using unset() in the right scope with a Closure.

spl_object_id(), spl_object_hash(), SplObjectStorage, WeakMap,
WeakReference, strict equality, etc are not affected by makeLazy*().

That is true for both makeLazyGhost(), and makeLazyProxy()?

What would the following example output?

 $object = new MyObject();
 var_dump(spl_object_id($object));
 $r = ReflectionLazyObject::makeLazyGhost($object, function (MyObject $object) {
     $object2 = new MyObject();
     var_dump(spl_object_id($object2));
     return $object2;
 });
 var_dump(spl_object_id($object));
 $r->initialize();
 var_dump(spl_object_id($object));

What would happen if I would expose the inner $object2 to the outer
world by means of the super globals or by means of use (&$out) + $out = $object2?

We have clarified the RFC on these points

The intended use of makeLazyGhost() and makeLazyProxy() is to call
them either on an object created with
ReflectionClass::newInstanceWithoutConstructor(), or on $this in a
constructor. The latter is the reason why these APIs take an existing
object.

Okay, that answers the question above. Technically being capable of
calling it on an object that was not just freshly created sounds like a
footgun, though. What is the interaction with readonly objects? My
understanding is that it would allow an readonly object with initialized
properties to change after-the-fact?

Good point about readonly, this is something we had overlooked. We
have updated the RFC to address this.

Consider this example:
  class Foo {
    public function __destruct() { echo __METHOD__; }
  }

  class Bar {
    public string $s;
    public ?Foo $foo;

   public function __destruct() { echo __METHOD__; }
  }

  $bar = new Bar();
  $bar->foo = new Foo();

  ReflectionLazyObject::makeLazyProxy($bar, function (Bar $bar) {
    $result = new Bar();
    $result->foo = null;
    $result->s = 'init';
    return $result;
  });

  var_dump($bar->s);

My understanding is that this will dump string(4) "init". Will the
destructor of Foo be called? Will the destructor of Bar be called?

This will print:

 Foo::__destruct (during makeLazyProxy())
 string(4) "init" (during `var_dump()`)

and eventually

 Bar::__destruct (when $bar is released)

Okay, so only one Bar::__destruct(), despite two Bar objects being
created. I assume it's the destructor of the second Bar, i.e. if I would
dump $this->foo within the destructor, it would dump null?

Following your feedback we have updated the RFC to include calling
destructors in makeLazy by default. In the updated version,
Bar::__destruct is called twice: Once on the first instance during the
call to makeLazyProxy(), and another time on the second instance (the
one created in the closure) when it's released. It is not called when
the first instance is released, because it's now a proxy.

Will the "revert to its pre-initialization"
work properly when you have nested lazy objects? An example would
probably help.

Only the effects on the object itself are reverted. External side
effects are not reverted.

Yes, it's obvious that external side effects are not reverted. I was
thinking about a situation like:
 $a = new A();
 $b = new B();
 ReflectionLazyObject::makeLazyGhost($b, function ($b) {
     throw new \Exception('xxx');
 });
 ReflectionLazyObject::makeLazyGhost($a, function ($a) use ($b) {
     $a->b = $b->somevalue;
 });
 $a->init = 'please';

The initialization of $a will implicitly attempt to initialize $b, which
will fail. Am I correct in my understanding that both $a and $b will be
reverted back to a lazy object afterwards? If so, adding that example to
the RFC would help to make possible edge cases clear.

We have added an example showing that.

The return value of the initializer has to be an instance of a parent
or a child class of the lazy-object and it must have the same properties.

Would returning a parent class not violate the LSP? Consider the
following example:
  class A { public string $s; }
  class B extends A { public function foo() { } }

  $o = new B();
  ReflectionLazyObject::makeLazyProxy($o, function (B $o) {
    return new A();
  });

  $o->foo(); // works
  $o->s = 'init';
  $o->foo(); // breaks

$o->foo() calls B::foo() in both cases here, as $o is always the proxy
object. We need to double check, but we believe that this rule doesn't
break LSP.

I don't understand what happens with the 'A' object then, but perhaps
this will become clearer once you add the requested examples.

The 'A' object is what is called the "actual instance" in the RFC. $o
acts as a proxy to the actual instance: Any property access on $o is
forwarded to the actual instance A.

Best Regards,
Arnaud

1 year ago by tim@bastelstu.be

Hi

We have updated the RFC to address your feedback. Please find
additional answers below.

Some preliminary feedback: I've given the RFC another quick read and
it already reads much better, thank you. The two examples for the two
strategies were pretty illustrative.

I noticed a small typo "makeInstanceLazyHost".
The ORM example uses 'setPropertyValue', I believe it should read
'setRawPropertyValue'.
The LazyConnection example uses 'ReflectionLazyObject', should that
read 'ReflectionLazyObjectFactory'?

I plan to give more detailed feedback at a later point, when I have the
time to think through your reply and the semantics of the updated RFC.

Best regards
Tim Düsterhus

PS: Thanks, I hate the example of how readonly properties can already
change in existing versions :-(

1 year ago by Larry Garfield

Hi Tim,

We have updated the RFC to address your feedback. Please find
additional answers below.

The updated RFC looks much better, thank you. Though I still have some thoughts, in no particular order.

The actual instance is allowed to escape the proxy and to create direct references to itself.

How? Is this a "return $this" type of situation? This could use more fleshing out and examples.

The terms "virtual" and "proxy" seem to be used interchangeably in different places, including in the API. Please just use one, and purge the other. It's confusing as is. :-) (I'd favor "proxy", as it seems more accurate to what is happening.) For that matter, I'd be very tempted to remove the word "lazy" from the API calls. newGhostInstance() and newProxyInstance() are plenty understandable, and shorter/easier to read. (Similarly, makeGhostInstance() and makeProxyInstance(). Although since those are more about modifying a provided object to be lazy, perhaps "make" isn't the right verb to use as that often means "create".)

Under Common Behavior, you have an example of calling the constructor directly, using the reflection API, but not of binding the callable, which the text says is also available. Please include an example of that so we can evaluate how clumsy (or not) it would be.

After calling newLazyGhostInstance(), the behavior of the object is the same as an object created by newLazyGhostInstance().

I think the first is supposed to be a make* call?

When making an existing object lazy, the makeInstanceLazy*() methods call the destructor unless the SKIP_DESTRUCTOR flag is given.

I don't quite get why this is. Admittedly destructors are rarely used, but why does it need to call the destructor?

I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:

class Service {
    public function __construct(private ServiceA $a, private ServiceB $b) {}
}

$c = some_container();

$init = fn() => $this->__construct($c->get(ServiceA::class), $c->get(ServiceB::class));

$service = new ReflectionLazyObjectFactory(Service::class, $init);

(Most likely in generated code that can dynamically sort out the container calls to inline.)

Am I missing something?

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :-) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor(...$args)
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

As I said, definitely improved, and I like where it's going. I think it can improve further, though.

--Larry Garfield

1 year ago by Larry Garfield

And of course I got the code sample wrong. It should be:

class Service {
    public function __construct(private ServiceA $a, private ServiceB $b) {}
}

$c = some_container();

$init = fn() => $this->__construct($c->get(ServiceA::class), $c->get(ServiceB::class));

$service = new ReflectionLazyObjectFactory(Service::class)->newGhostInstance($init);

Sorry about that.

--Larry Garfield

1 year ago by Arnaud Le Blanc

Hi Larry,

The actual instance is allowed to escape the proxy and to create direct references to itself.

How? Is this a "return $this" type of situation? This could use more fleshing out and examples.

"return $this" will return the proxy object, but it is possible to
create references to the actual instance during initialization, either
directly in the initializer function, or in methods called by the
initializer. The "About Proxies" section discusses this a bit. I've
added an example.

The terms "virtual" and "proxy" seem to be used interchangeably in different places, including in the API. Please just use one, and purge the other. It's confusing as is. :-) (I'd favor "proxy", as it seems more accurate to what is happening.)

Agreed

Under Common Behavior, you have an example of calling the constructor directly, using the reflection API, but not of binding the callable, which the text says is also available. Please include an example of that so we can evaluate how clumsy (or not) it would be.

I've clarified that binding can be achieved with Closure::bind(). In
practice I expect there will be two kinds of ghost initializers:

Those that just call one public method of the object, such as the constructor

Those that initialize everything with ReflectionProperty::setValue()
as in the Doctrine example in the "About Lazy-Loading strategies"
section

After calling newLazyGhostInstance(), the behavior of the object is the same as an object created by newLazyGhostInstance().

I think the first is supposed to be a make* call?

Thank you!

When making an existing object lazy, the makeInstanceLazy*() methods call the destructor unless the SKIP_DESTRUCTOR flag is given.

I don't quite get why this is. Admittedly destructors are rarely used, but why does it need to call the destructor?

The rationale is that unless specified otherwise, we must assume that
the constructor has been called on the object. Therefore we must call
the destructor before resetting the object's state entirely. See also
the Mutex example given by Tim. I've added the rationale and an
example.

In practice we expect that makeInstanceLazy*() methods will not be
used on fully initialized objects, and that the flag will be set most
of the time, but as it is the API is safe by default.

I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:

class Service {
public function __construct(private ServiceA $a, private ServiceB $b) {}
}

$c = some_container();

$init = fn() => $this->__construct($c->get(ServiceA::class), $c->get(ServiceB::class));

$service = new ReflectionLazyObjectFactory(Service::class, $init);

(Most likely in generated code that can dynamically sort out the container calls to inline.)

Am I missing something?

No, you are right, but they must fall back to the proxy strategy when
the user provides a factory.

E.g. this will use the ghost strategy because the DIC
instantiates/initializes the service itself:

my_service:
  class: MyClass
  arguments: ['@service_a', '@service_b']
  lazy: true

But this will use the proxy strategy because the DIC doesn't
instantiate/initialize the service itself:

my_service:
  class: MyClass
  arguments: ['@service_a', '@service_b']
  factory: ['@my_service_factory', 'createService']
  lazy: true

The RFC didn't make it clear enough that the example was about the
factory case specifically.

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :-) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor(...$args)
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

Thank you for the suggestion. We will check if this fits the
use-cases. Moving some methods on ReflectionObject may have negative
performance implications as it requires creating a dedicated instance
for each object. Some use-cases rely on caching the reflectors for
performance.

Best Regards,
Arnaud

1 year ago by Larry Garfield

Hi Larry,

Under Common Behavior, you have an example of calling the constructor directly, using the reflection API, but not of binding the callable, which the text says is also available. Please include an example of that so we can evaluate how clumsy (or not) it would be.

I've clarified that binding can be achieved with Closure::bind(). In
practice I expect there will be two kinds of ghost initializers:

Those that just call one public method of the object, such as the constructor

Those that initialize everything with ReflectionProperty::setValue()
as in the Doctrine example in the "About Lazy-Loading strategies"
section

I'm still missing an example with ::bind(). Actually, I tried to write a version of what I think the intent is and couldn't figure out how. :-)

$init = function() use ($c) {
    $this->a = $c->get(ServiceA::class);
    $this->b = $c->get(ServiceB::class);
};

$service = new ReflectionLazyObjectFactory(Service::class, $init);

// We need to bind $init to $service now, but we can't because $init is already registered as the initializer for $service, and binding creates a new closure object, not modifying the existing one. So, how does this even work?

In practice we expect that makeInstanceLazy*() methods will not be
used on fully initialized objects, and that the flag will be set most
of the time, but as it is the API is safe by default.

In the case an object does not have a destructor, it won't make a difference either way, correct?

I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:

class Service {
public function __construct(private ServiceA $a, private ServiceB $b) {}
}

$c = some_container();

$init = fn() => $this->__construct($c->get(ServiceA::class), $c->get(ServiceB::class));

$service = new ReflectionLazyObjectFactory(Service::class, $init);

(Most likely in generated code that can dynamically sort out the container calls to inline.)

Am I missing something?

No, you are right, but they must fall back to the proxy strategy when
the user provides a factory.

E.g. this will use the ghost strategy because the DIC
instantiates/initializes the service itself:

my_service:
  class: MyClass
  arguments: ['@service_a', '@service_b']
  lazy: true

But this will use the proxy strategy because the DIC doesn't
instantiate/initialize the service itself:

my_service:
  class: MyClass
  arguments: ['@service_a', '@service_b']
  factory: ['@my_service_factory', 'createService']
  lazy: true

The RFC didn't make it clear enough that the example was about the
factory case specifically.

Ah, got it. That makes more sense.

Which makes me ask if the $initializer of a proxy should actually be called $factory? Since that's basically what it's doing, and I'm unclear what it would do with the proxy object itself that's passed in.

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :-) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor(...$args)
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

Thank you for the suggestion. We will check if this fits the
use-cases. Moving some methods on ReflectionObject may have negative
performance implications as it requires creating a dedicated instance
for each object. Some use-cases rely on caching the reflectors for
performance.

Best Regards,
Arnaud

I'm not clear why there's a performance difference, but I haven't looked at the reflection implementation in, well, ever. :-)

If it has to be a separate object, please don't make it extend ReflectionClass but still give it useful dynamic methods rather than static methods. Or perhaps even do something like

$ghost = new ReflectionGhostInstance(SomeClass::class, $init);
$proxy = new ReflectionProxyInstance(SomeClass::class, $init);

And be done with it. (I'm just spitballing here. As I said, I like the feature, I just want to ensure the ergonomics are as good as possible.)

--Larry Garfield

1 year ago by Arnaud Le Blanc

In practice I expect there will be two kinds of ghost initializers:

Those that just call one public method of the object, such as the constructor

Those that initialize everything with ReflectionProperty::setValue()
as in the Doctrine example in the "About Lazy-Loading strategies"
section

I'm still missing an example with ::bind(). Actually, I tried to write a version of what I think the intent is and couldn't figure out how. :-)

$init = function() use ($c) {
    $this->a = $c->get(ServiceA::class);
    $this->b = $c->get(ServiceB::class);
};

$service = new ReflectionLazyObjectFactory(Service::class, $init);

// We need to bind $init to $service now, but we can't because $init is already registered as the initializer for $service, and binding creates a new closure object, not modifying the existing one. So, how does this even work?

Oh I see. Yes you will not be able to bind $this in a simple way here,
but you could bind the scope. This modified example will work:

$init = function($object) use ($c) {
  $object->a = $c->get(ServiceA::class);
  $object->b = $c->get(ServiceB::class);
};
$service = new ReflectionLazyObjectFactory(Service::class,
    $init->bindTo(null, Service::class));

If you really want to bind $this you could achieve it in a more convoluted way:

$init = function($object) use ($c) {
  (function () use ($c) {
    $this->a = $c->get(ServiceA::class);
    $this->b = $c->get(ServiceB::class);
  })->bindTo($object, Service::class)(); // bind $this and the class scope, so private properties are accessible
};
$service = new ReflectionLazyObjectFactory(Service::class, $init);

This is inconvenient, but the need or use-case is not clear to me.
Could you describe some use-cases where you would hand-write
initializers like this? Do you feel that the proposal should provide
an easier way to change $this and/or the scope?

In practice we expect that makeInstanceLazy*() methods will not be
used on fully initialized objects, and that the flag will be set most
of the time, but as it is the API is safe by default.

In the case an object does not have a destructor, it won't make a difference either way, correct?

Yes

I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:
The RFC didn't make it clear enough that the example was about the
factory case specifically.

Ah, got it. That makes more sense.

Which makes me ask if the $initializer of a proxy should actually be called $factory? Since that's basically what it's doing,

Good point, $factory would be a good name for this parameter.

and I'm unclear what it would do with the proxy object itself that's passed in.

Passing the proxy object itself as an argument to the factory could be
used to make decisions based on the value of some initialized field, or
on the class of the
object, or on its identity. I think Nicolas had a real use-case where
he detects clones based on the identity of the object:

$init = function ($object) use (&$originalObject) {
    if ($object !== $originalObject) {
        // we are initializing a clone
    }
};
$originalObject = $reflector->newProxyInstance($init);

This was on ghosts, but I think it's also a valid use-case example for proxies.

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :-) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor()
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

Thank you for the suggestion. We will check if this fits the
use-cases. Moving some methods on ReflectionObject may have negative
performance implications as it requires creating a dedicated instance
for each object. Some use-cases rely on caching the reflectors for
performance.

Best Regards,
Arnaud

I'm not clear why there's a performance difference, but I haven't looked at the reflection implementation in, well, ever. :-)

What I meant is that creating an instance (not necessarily of
ReflectionObject, but of any class) is more expensive than just doing
nothing. The first two loops below would be fine, but the last one
would be slower. This can make an important difference in a hot path.

foreach ($objects as $object) {
    ReflectionLazyObject::isInitialized($object);
}

$reflector = new ReflectionClass(SomeClass::class);
foreach ($objects as $object) {
    $reflector->isInitialized($object);
}

foreach ($objects as $object) {
    $reflector = new ReflectionObject($object);
    $reflector->isInitialized($object);
}

If it has to be a separate object, please don't make it extend ReflectionClass but still give it useful dynamic methods rather than static methods. Or perhaps even do something like

$ghost = new ReflectionGhostInstance(SomeClass::class, $init);
$proxy = new ReflectionProxyInstance(SomeClass::class, $init);

And be done with it. (I'm just spitballing here. As I said, I like the feature, I just want to ensure the ergonomics are as good as possible.)

Thank you for your help. We will think about a better API.

1 year ago by Larry Garfield

In practice I expect there will be two kinds of ghost initializers:

- Those that just call one public method of the object, such as the constructor
- Those that initialize everything with ReflectionProperty::setValue()
  as in the Doctrine example in the "About Lazy-Loading strategies"
  section

I'm still missing an example with ::bind(). Actually, I tried to write a version of what I think the intent is and couldn't figure out how. :-)

$init = function() use ($c) {
  $this->a = $c->get(ServiceA::class);
  $this->b = $c->get(ServiceB::class);
};

$service = new ReflectionLazyObjectFactory(Service::class, $init);

// We need to bind $init to $service now, but we can't because $init is already registered as the initializer for $service, and binding creates a new closure object, not modifying the existing one. So, how does this even work?

Oh I see. Yes you will not be able to bind $this in a simple way here,
but you could bind the scope. This modified example will work:

$init = function($object) use ($c) {
  $object->a = $c->get(ServiceA::class);
  $object->b = $c->get(ServiceB::class);
};
$service = new ReflectionLazyObjectFactory(
  Service::class,
  $init->bindTo(null, Service::class)
);

If you really want to bind $this you could achieve it in a more convoluted way:

$init = function($object) use ($c) {
  // Pass the scope as well: binding $this alone does not give access
  // to private properties.
  (function () use ($c) {
    $this->a = $c->get(ServiceA::class);
    $this->b = $c->get(ServiceB::class);
  })->bindTo($object, Service::class)();
};
$service = new ReflectionLazyObjectFactory(Service::class, $init);

This is inconvenient, but the need or use-case is not clear to me.
Could you describe some use-cases where you would hand-write
initializers like this? Do you feel that the proposal should provide
an easier way to change $this and/or the scope?

Primarily I was just reacting to this line:

However, for more complex use-cases where the initializer wishes to access non-public properties, it is required to bind the initializer function to the right scope (with Closure::bind()), or to access properties with ReflectionProperty.

And asking "OK, um, how?" Because what I was coming up with didn't make any sense. :-)

It's debatable if the use case of wanting to assign to private properties in the initializer without using Reflection is common enough to warrant more. I'm not sure at this point. If we wanted to, I could see an extra flag that would tell the system to bind the closure to the object before calling it. I don't know how common a need that will be, though, so I won't insist it be included. But I would like to see that line clarified with one of the above examples, because as is, I would expect to be able to bind to the object as I was trying to do and it (obviously) didn't work.

and I'm unclear what it would do with the proxy object itself that's passed in.

Passing the object itself as an argument could be used to make decisions
based on the value of some initialized field, or on the class of the
object, or on its identity. I think Nicolas had a real use-case where
he detects clones based on the identity of the object:

$init = function ($object) use (&$originalObject) {
    if ($object !== $originalObject) {
        // we are initializing a clone
    }
};
$originalObject = $reflector->newProxyInstance($init);

This was on ghosts, but I think it's also a valid use-case example for proxies.

Hm, interesting. Please include this sort of example in the RFC, so we know what the use is (and when it won't matter, which seems like it would be the more common case).

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :-) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor()
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

Thank you for the suggestion. We will check if this fits the
use-cases. Moving some methods on ReflectionObject may have negative
performance implications as it requires creating a dedicated instance
for each object. Some use-cases rely on caching the reflectors for
performance.

Best Regards,
Arnaud

I'm not clear why there's a performance difference, but I haven't looked at the reflection implementation in, well, ever. :-)

What I meant is that creating an instance (not necessarily of
ReflectionObject, but of any class) is more expensive than just doing
nothing. The first two loops below would be fine, but the last one
would be slower. This can make an important difference in a hot path.

foreach ($objects as $object) {
    ReflectionLazyObject::isInitialized($object);
}

$reflector = new ReflectionClass(SomeClass::class);
foreach ($objects as $object) {
    $reflector->isInitialized($object);
}

foreach ($objects as $object) {
    $reflector = new ReflectionObject($object);
    $reflector->isInitialized($object);
}

Ah, now I see what you mean. Interesting. Including that reasoning in the RFC would be good. Though, I don't know how often I'd be calling isInitialized() on a larger set of objects, hot path or no.

--Larry Garfield

1 year ago by Arnaud Le Blanc

Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
    public function newLazyProxy(callable $factory, int $options): object {}

    public function newLazyGhost(callable $initializer, int $options): object {}

    public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}

    public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}

    public function initialize(object $object): object {}

    public function isInitialized(object $object): bool {}

    // existing methods
}

class ReflectionProperty
{
    public function setRawValueWithoutInitialization(object $object, mixed $value): void {}

    public function skipInitialization(object $object): void {}

    // existing methods
}

Comments / rationale:

- Adding methods on ReflectionClass instead of ReflectionObject is
  better from a performance point of view, as mentioned earlier.
- Keeping the word "Lazy" in method names is clearer, especially for
  "newLazyProxy", as the "Proxy" pattern has many use cases that are
  not related to laziness. However, we removed the word "Instance" to
  make the names shorter.
- We have renamed the "make" methods to "reset", following your feedback
  about the word "make". It should better convey the behavior of these
  methods, and clarify that they modify the object in place as well
  as resetting its state.
- setRawValueWithoutInitialization() has the same behavior as
  setRawValue() (from the hooks RFC), except that it doesn't trigger
  initialization.
- Renamed $initializer to $factory for proxy methods.
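
As a rough illustration of how the amended API might be used, the sketch below creates a lazy ghost and shows that the initializer only runs on first property access. It assumes a PHP build that actually provides ReflectionClass::newLazyGhost() with the signature listed in this message (hence the method_exists() guard); the Connection class and the DSN are made up for the example.

```php
<?php
// Hypothetical class used only for this illustration.
final class Connection
{
    public string $dsn = '';

    public function __construct(string $dsn)
    {
        $this->dsn = $dsn;
    }
}

$initializerCalls = 0;
$callsBeforeAccess = null;
$dsn = null;

// Guard: only run where the engine provides the proposed method.
if (method_exists(ReflectionClass::class, 'newLazyGhost')) {
    $reflector = new ReflectionClass(Connection::class);

    $ghost = $reflector->newLazyGhost(
        function (Connection $object) use (&$initializerCalls): void {
            $initializerCalls++;
            $object->__construct('sqlite::memory:');
        },
        0 // no options
    );

    $callsBeforeAccess = $initializerCalls; // initializer has not run yet
    $dsn = $ghost->dsn;                     // first access triggers it

    echo "calls before: $callsBeforeAccess, after: $initializerCalls, dsn: $dsn\n";
} else {
    echo "newLazyGhost() is not available in this PHP build\n";
}
```

Calling the constructor from the initializer corresponds to the first kind of ghost initializer mentioned earlier in the thread.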

WDYT?

Best Regards,
Arnaud


1 year ago by Larry Garfield


Oh, that looks so much more self-explanatory and readable. I love it. Thanks! (Looks like the RFC text hasn't been updated yet.)

--Larry Garfield

1 year ago by Nicolas Grekas

On Tue, 18 Jun 2024 at 22:59, Larry Garfield larry@garfieldtech.com
wrote:


Oh, that looks so much more self-explanatory and readable. I love it.
Thanks! (Looks like the RFC text hasn't been updated yet.)

Happy you like it so much! The text of the RFC is now up to date. Note that
we renamed ReflectionProperty::skipInitialization() and
setRawValueWithoutInitialization() to skipLazyInitialization() and
setRawValueWithoutLazyInitialization() after we realized that
ReflectionProperty already has an isInitialized() method for something
quite different.
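
For context, the existing ReflectionProperty::isInitialized() reports whether a typed property has been assigned a value yet, which has nothing to do with laziness. A small sketch of that existing behavior (the User class is made up for illustration):

```php
<?php
// Hypothetical class used only for this illustration.
final class User
{
    public int $id;               // typed, no default: starts uninitialized
    public ?string $name = null;  // has a default: initialized from the start
}

$user = new User();
$idProperty = new ReflectionProperty(User::class, 'id');
$nameProperty = new ReflectionProperty(User::class, 'name');

$idBefore = $idProperty->isInitialized($user);     // nothing assigned to $id yet
$nameBefore = $nameProperty->isInitialized($user); // default value counts

$user->id = 42;
$idAfter = $idProperty->isInitialized($user);

var_dump($idBefore, $nameBefore, $idAfter);
```

Reusing the bare word "initialization" for the lazy-object methods on the same class would have mixed these two notions, hence the "Lazy" infix.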

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?

Cheers,
Nicolas

1 year ago by Marco Pivetta

Hey Nicolas,

On Thu, 20 Jun 2024 at 10:50, Nicolas Grekas nicolas.grekas+php@gmail.com
wrote:

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?

Due to IRL events, I haven't checked the text thoroughly yet, but I think I
should, given the previous experience here.

I'd only be able to do so this weekend, though.

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

1 year ago by kontakt@beberlei.de

On Thu, Jun 20, 2024 at 10:52 AM Nicolas Grekas <
nicolas.grekas+php@gmail.com> wrote:

On Tue, 18 Jun 2024 at 22:59, Larry Garfield larry@garfieldtech.com
wrote:

Oh, that looks so much more self-explanatory and readable. I love it.
Thanks! (Looks like the RFC text hasn't been updated yet.)

Happy you like it so much! The text of the RFC is now up to date. Note
that we renamed ReflectionProperty::skipInitialization() and
setRawValueWithoutInitialization() to skipLazyInitialization() and
setRawValueWithoutLazyInitialization() after we realized that
ReflectionProperty already has an isInitialized() method for something
quite different.

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?

Thank you for updating the API, the RFC is now much easier to grasp.

My few comments on the updated RFC:

1.) The ReflectionClass API is already very large; new methods should be
named carefully so that users identify them as belonging to a particular
sub-feature (lazy objects), so I would prefer we rename some
of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy /
lazyInitialize - other methods in this RFC are already very outspoken, so I
don’t mind being very specific here as well.

The reason is „initialized“ is such a generic word, best not have API users
make assumptions about what this relates to (readonly, lazy, …)

2.) I am 100% behind the implementation of lazy ghosts, it's really great
work with all the behaviors. Speaking with my Doctrine ORM core developer
hat on, this has my full support.

3.) The lazy proxies have me worried that we are opening up a can of worms
by having the two objects and the magic of using only the properties of one
and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a
compelling argument for having it, but it is a leaky abstraction that solves
the identity issue only on one side: the factory code might not know
it's used for a proxy and make all sorts of decisions based on identity that
lead to problems.

Correct me if I am wrong or missing something, but if the factory does not
know about proxying, then it would also be fine to build a lazy ghost and
copy over all state after using the factory. This creates a similar amount
of problems with identity, but is less magic while doing so. All of the
"inheritance hierarchy and properties must exist" logic can also be
implemented in the userland initializer, passing the responsibility for the
mess over to userland ;)

class Container
{
    public function getClientService(): Client
    {
        $reflector = new ReflectionClass(Client::class);

        // $this is bound automatically inside the closure, so no use() is needed
        $client = $reflector->newLazyGhost(function (Client $ghost) {
            $clientFactory = $this->get('client_factory');
            $client = $clientFactory->createClient();

            // not sure this is 100% right, the idea is to copy all state over
            // (keys of non-public properties are mangled, so plain assignment
            // only works as-is for public ones)
            $vars = get_mangled_object_vars($client);
            foreach ($vars as $k => $v) { $ghost->$k = $v; }
        });

        return $client;
    }
}

This would also allow making „initialize“ return void and simplify this
part of the API.

4.) I am wondering, do we need the resetAs* methods? You can already
implement lazy proxies in userland code by manually writing the code, we
don’t need engine support for that. Not having these two methods would
reduce the surface of the RFC / API considerably. And given the „real
world“ example is not really real world, only the Doctrine
(createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) are,
this shows maybe it's not needed.

5.) The RFC does not spell it out, but I assume this does not have any
effect on stacktraces, i.e. since properties are proxied, there are no
„magic“ frames appearing in the stacktraces?

Cheers,
Nicolas

1 year ago by Nicolas Grekas

Hi Ben,

Oh, that looks so much more self-explanatory and readable. I love
it. Thanks! (Looks like the RFC text hasn't been updated yet.)

Happy you like it so much! The text of the RFC is now up to date. Note
that we renamed ReflectionProperty::skipInitialization() and
setRawValueWithoutInitialization() to skipLazyInitialization() and
setRawValueWithoutLazyInitialization() after we realized that
ReflectionProperty already has an isInitialized() method for something
quite different.

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?

Thank you for updating the API, the RFC is now much easier to grasp.

My few comments on the updated RFC:

1.) The ReflectionClass API is already very large; new methods should be
named carefully so that users identify them as belonging to a particular
sub-feature (lazy objects), so I would prefer we rename some
of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy /
lazyInitialize - other methods in this RFC are already very outspoken, so I
don’t mind being very specific here as well.

The reason is „initialized“ is such a generic word, best not have API
users make assumptions about what this relates to (readonly, lazy, …)

I get this aspect, I'm fine with either option, dunno if anyone has a
strong preference?
Under this argument, mine is isLazyObject + initializeLazyObject.

2.) I am 100% behind the implementation of lazy ghosts, it's really great
work with all the behaviors. Speaking with my Doctrine ORM core developer
hat on, this has my full support.

\o/

3.) The lazy proxies have me worried that we are opening up a can of
worms by having the two objects and the magic of using only the properties
of one and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a
compelling argument for having it, but it is a leaky abstraction that solves
the identity issue only on one side: the factory code might not know
it's used for a proxy and make all sorts of decisions based on identity that
lead to problems.

Correct me if I am wrong or missing something, but if the factory does not
know about proxying, then it would also be fine to build a lazy ghost and
copy over all state after using the factory.

Unfortunately no, copying doesn't work in the generic case: when the
object's dependencies involve a circular reference with the object itself,
the copying strategy can lead to a sort of "brain split" situation where we
have two objects (the proxy and the real object) which still coexist but
can have diverging states.

This is what virtual state proxies solve, by making sure that while we have
two objects, we're sure by design that they have synchronized state.
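
A minimal sketch of the "brain split" described above, using plain objects only. The Client and Listener classes and the createClient() factory are hypothetical; the copy created with newInstanceWithoutConstructor() merely stands in for a ghost whose initializer copies state from the factory's result.

```php
<?php
// Hypothetical dependency that keeps a reference back to its client.
final class Listener
{
    public ?Client $client = null;
}

final class Client
{
    public int $requests = 0;
    public Listener $listener;

    public function __construct(Listener $listener)
    {
        $this->listener = $listener;
        $listener->client = $this; // circular reference back to the client
    }
}

// Hypothetical factory that knows nothing about laziness.
function createClient(Listener $listener): Client
{
    return new Client($listener);
}

$listener = new Listener();
$real = createClient($listener);

// Stand-in for the ghost: same class, state copied over from the real object.
$copy = (new ReflectionClass(Client::class))->newInstanceWithoutConstructor();
foreach (get_object_vars($real) as $name => $value) {
    $copy->$name = $value;
}

// The application only ever sees $copy and mutates it...
$copy->requests++;

// ...but the dependency still holds the object the factory created.
$sameObject = $listener->client === $copy;
$seenByListener = $listener->client->requests;
$seenByApplication = $copy->requests;

var_dump($sameObject, $seenByListener, $seenByApplication);
```

After the copy, the listener and the application each hold a different Client instance, and their counters no longer agree.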

Yes, $this can leak with proxies, but this is reduced to the strict minimum
in the state-proxy design. Compared to the "brain split" I mentioned, this
is a minor concern.

State-synchronization is costly currently since it relies on magic methods
on every single property access.
From this angle, state-proxies are the ones that benefit the most from
being in the engine.

4.) I am wondering, do we need the resetAs* methods? You can already
implement lazy proxies in userland code by manually writing the code, we
don’t need engine support for that. Not having these two methods would
reduce the surface of the RFC / API considerably. And given the „real
world“ example is not really real world, only the Doctrine
(createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) are,
this shows maybe it's not needed.

Yes, this use case of making an object lazy after it's been created is
quite useful. It makes it straightforward to turn a class lazy using
inheritance for example (LazyClass extends NonLazyClass), without having to
write or maintain any decorating logic. From a technical point of view, this
is just a different flavor of the same code infrastructure, so it is well
aligned with the rest of the proposed API.

5.) The RFC does not spell it out, but I assume this does not have any
effect on stacktraces, i.e. since properties are proxied, there are no
„magic“ frames appearing in the stacktraces?

Nothing special in this area indeed, there are no added frames (unlike
inheritance proxies since they'd decorate methods).

As a general note, an important design criterion for the RFC has been to
make it a superset of what we can achieve in userland already. Ghost
objects, state proxies, capabilities of resetAsLazy* methods, etc are all
possible today. Making the RFC a subset of those existing capabilities
would defeat the purpose of this proposal, since it would mean we'd have to
keep maintaining the existing code to support the use cases it enables,
with all the associated drawbacks for the PHP community at large.

Nicolas

1 year ago by kontakt@beberlei.de — view source

On Fri, Jun 21, 2024 at 12:24 PM Nicolas Grekas <
nicolas.grekas+php@gmail.com> wrote:

Hi Ben,
Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
    public function newLazyProxy(callable $factory, int $options): object {}
    public function newLazyGhost(callable $initializer, int $options): object {}
    public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}
    public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}
    public function initialize(object $object): object {}
    public function isInitialized(object $object): bool {}

    // existing methods
}

class ReflectionProperty
{
    public function setRawValueWithoutInitialization(object $object, mixed $value): void {}
    public function skipInitialization(object $object): void {}

    // existing methods
}

Comments / rationale:
- Adding methods on ReflectionClass instead of ReflectionObject is
better from a performance point of view, as mentioned earlier
- Keeping the word "Lazy" in method names is clearer, especially for
"newLazyProxy" as the "Proxy" pattern has many use cases that are
not related to laziness. However we removed the word "Instance" to
make the names shorter.
- We have renamed "make" methods to "reset", following your feedback
about the word "make". It should better convey the behavior of these
methods, and clarify that it's modifying the object in-place as well
as resetting its state
- setRawValueWithoutInitialization() has the same behavior as
setRawValue() (from the hooks RFC), except it doesn't trigger
initialization
- Renamed $initializer to $factory for proxy methods

WDYT?

Best Regards,
Arnaud
Oh, that looks so much more self-explanatory and readable. I love
it. Thanks! (Looks like the RFC text hasn't been updated yet.)
Happy you like it so much! The text of the RFC is now up to date. Note
that we renamed ReflectionProperty::skipInitialization() and
setRawValueWithoutInitialization() to skipLazyInitialization() and
setRawValueWithoutLazyInitialization() after we realized that
ReflectionProperty already has an isInitialized() method for something
quite different.
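
To make the two renamed ReflectionProperty methods more concrete, here is a minimal sketch of setting a property without triggering initialization. It assumes PHP 8.4 or later and the method names as they eventually shipped (isUninitializedLazyObject() in particular is not named in this thread); the User class and its values are hypothetical.

```php
<?php
// Sketch only: requires PHP 8.4+, where lazy objects are available.
final class User
{
    public int $id;
    public string $name;
}

$initializerCalls = 0;
$reflector = new ReflectionClass(User::class);

// The initializer stands in for an expensive load (hypothetical data).
$user = $reflector->newLazyGhost(function (User $user) use (&$initializerCalls): void {
    $initializerCalls++;
    $user->name = 'Alice';
});

// Set the identifier up front; this must not run the initializer.
$reflector->getProperty('id')->setRawValueWithoutLazyInitialization($user, 42);

// Reading a property that was set this way does not initialize the object.
$idBeforeInit = $user->id;
$stillLazy = $reflector->isUninitializedLazyObject($user);

// The first access to a property that is still lazy runs the initializer.
$name = $user->name;

var_dump($idBeforeInit, $stillLazy, $name, $initializerCalls);
```

This is the pattern an ORM would use for identifiers: the key is known before the row is loaded, so reading it should not cost a query.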

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?
Thank you for updating the API, the RFC is now much easier to grasp.

My few comments on the updated RFC:

1.) The ReflectionClass API is already very large, so new methods should be
named carefully to make sure that users identify them as belonging to a
sub-feature (lazy objects) in particular. I would prefer we rename some
of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy /
lazyInitialize - other methods in this RFC are already very outspoken, so I
don’t mind being very specific here as well.

The reason is „initialized“ is such a generic word, best not have API
users make assumptions about what this relates to (readonly, lazy, …)
I get this aspect, I'm fine with either option, dunno if anyone has a
strong preference?
Under this argument, mine is isLazyObject + initializeLazyObject.

2.) I am 100% behind the implementation of lazy ghosts, it's really great
work with all the behaviors. Speaking with my Doctrine ORM core developer
hat, this has my full support.

\o/

3.) the lazy proxies have me worried that we are opening up a can of
worms by having the two objects and the magic of using only the properties
of one and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a
compelling argument for having it, but it is a leaky abstraction solving
the identity issue only on one side: the factory code might not know
it's used for a proxy and make all sorts of decisions based on identity that
lead to problems.

Correct me if I am wrong or missing something, but if the factory does
not know about proxying, then it would also be fine to build a lazy ghost
and copy over all state after using the factory.

Unfortunately no, copying doesn't work in the generic case: when the
object's dependencies involve a circular reference with the object itself,
the copying strategy can lead to a sort of "brain split" situation where we
have two objects (the proxy and the real object) which still coexist but
can have diverging states.

This is what virtual state proxies solve, by making sure that while we
have two objects, we're sure by design that they have synchronized state.
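
The "brain split" can be reproduced without any lazy-object API. The following is a minimal sketch in plain PHP, with hypothetical Registry and Service classes: the factory's object registers itself with a dependency (a circular reference), and the state is then copied onto a placeholder that callers already hold.

```php
<?php
final class Registry
{
    /** @var list<Service> */
    public array $services = [];
}

final class Service
{
    public int $hits = 0;

    public function __construct(Registry $registry)
    {
        // Circular reference: a dependency keeps a handle on the new instance.
        $registry->services[] = $this;
    }
}

$registry = new Registry();

// Placeholder standing in for the object handed out before initialization.
$placeholder = (new ReflectionClass(Service::class))->newInstanceWithoutConstructor();

// "Initialization" by copying: run the factory, then copy its state over.
$real = new Service($registry);
foreach (get_object_vars($real) as $name => $value) {
    $placeholder->$name = $value;
}

// Callers mutate the placeholder, but the registry still points at $real.
$placeholder->hits++;

var_dump($registry->services[0] === $placeholder); // bool(false)
var_dump($placeholder->hits, $real->hits);         // int(1) int(0)
```

Two objects now exist with diverging state, which is the situation a state proxy avoids by forwarding every property access to the single real instance.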

Yes, $this can leak with proxies, but this is reduced to the strict
minimum in the state-proxy design. Compared to the "brain split" I
mentioned, this is a minor concern.

State-synchronization is costly currently since it relies on magic methods
on every single property access.
From this angle, state-proxies are the ones that benefit the most from
being in the engine.

Makes sense to me.

4.) I am wondering, do we need the resetAs* methods? You can already
implement lazy proxies in userland code by manually writing the code, we
don’t need engine support for that. Not having these two methods would
reduce the surface of the RFC / API considerably. And given the „real
world“ example is not really real world, only the Doctrine
(createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) are,
this shows maybe it's not needed.

Yes, this use case of making an object lazy after it's been created is
quite useful. It makes it straightforward to turn a class lazy using
inheritance for example (LazyClass extends NonLazyClass), without having to
write nor maintain any decorating logic. From a technical pov, this is just
a different flavor of the same code infrastructure, so this is pretty
aligned with the rest of the proposed API.

Will you use this in Symfony DIC for something? While I can understand the
argument that it's easy to integrate with the lazy object code that you
have, I don't see how the argument "without having to write nor maintain
any decorating logic" is true.

The example in the RFC is very much written code. Yes, it's only one line of
new ReflectionClass()->initialize($this)->send($data), but something like
https://gist.github.com/beberlei/568719a1c5536cc5f59a60381c37aa05 is not
more code and works fine.

The (Lazy-)Connection example can be re-written as:

class LazyConnection extends Connection
{
    public function create(): Connection
    {
        return (new ReflectionClass(Connection::class))->newLazyGhost(function (Connection $connection) {
            $connection->__construct(); // Or any heavier initialization logic
            $connection->ttl = 2.0;
        });
    }

    private function __construct()
    {
        parent::__construct();
    }
}

This to me reads easier, especially when Connection has more than one
public method (send) it requires way less code.

Given the complexities of newLazy* already, I am just trying to find
arguments to keep the public surface of this API as small as possible, as
its intricacies are hard to grasp and simplicity / fewer ways to use it will
be a benefit.

So far I don't see that with resetAsLazy* you can implement something new
that cannot also be done with newLazy* methods.

5.) The RFC does not spell it out, but I assume this does not have any
effect on stacktraces, i.e. since properties are proxied, there are no
„magic“ frames appearing in the stacktraces?

Nothing special on this domain indeed, there are no added frames (unlike
inheritance proxies since they'd decorate methods).

As a general note, an important design criterion for the RFC has been to
make it a superset of what we can achieve in userland already. Ghost
objects, state proxies, capabilities of resetAsLazy* methods, etc are all
possible today. Making the RFC a subset of those existing capabilities
would defeat the purpose of this proposal, since it would mean we'd have to
keep maintaining the existing code to support the use cases it enables,
with all the associated drawbacks for the PHP community at large.

I very much appreciate the benefits this brings as primary language concept.

Nicolas

1 year ago by tim@bastelstu.be — view source

Hi

Given the complexities of newLazy* already, I am just trying to find
arguments to keep the public surface of this API as small as possible, as
its intricacies are hard to grasp and simplicity / fewer ways to use it will
be a benefit.

So far I don't see that with resetAsLazy* you can implement something new
that cannot also be done with newLazy* methods.

For the record, I share these concerns and also mentioned them in the
email that I just sent. I've also previously asked this in this email,
including the named constructor as a workaround, just as Benjamin did:

https://externals.io/message/123503#123525

Best regards
Tim Düsterhus

1 year ago by kontakt@beberlei.de — view source

On 21.06.2024 at 12:24:20, Nicolas Grekas <nicolas.grekas+php@gmail.com>
wrote:

Hi Ben,
Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
    public function newLazyProxy(callable $factory, int $options): object {}
    public function newLazyGhost(callable $initializer, int $options): object {}
    public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}
    public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}
    public function initialize(object $object): object {}
    public function isInitialized(object $object): bool {}

    // existing methods
}

class ReflectionProperty
{
    public function setRawValueWithoutInitialization(object $object, mixed $value): void {}
    public function skipInitialization(object $object): void {}

    // existing methods
}

Comments / rationale:
- Adding methods on ReflectionClass instead of ReflectionObject is
better from a performance point of view, as mentioned earlier
- Keeping the word "Lazy" in method names is clearer, especially for
"newLazyProxy" as the "Proxy" pattern has many use cases that are
not related to laziness. However we removed the word "Instance" to
make the names shorter.
- We have renamed "make" methods to "reset", following your feedback
about the word "make". It should better convey the behavior of these
methods, and clarify that it's modifying the object in-place as well
as resetting its state
- setRawValueWithoutInitialization() has the same behavior as
setRawValue() (from the hooks RFC), except it doesn't trigger
initialization
- Renamed $initializer to $factory for proxy methods

WDYT?

Best Regards,
Arnaud
Oh, that looks so much more self-explanatory and readable. I love
it. Thanks! (Looks like the RFC text hasn't been updated yet.)
Happy you like it so much! The text of the RFC is now up to date. Note
that we renamed ReflectionProperty::skipInitialization() and
setRawValueWithoutInitialization() to skipLazyInitialization() and
setRawValueWithoutLazyInitialization() after we realized that
ReflectionProperty already has an isInitialized() method for something
quite different.

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?
Thank you for updating the API, the RFC is now much easier to grasp.

My few comments on the updated RFC:

1.) The ReflectionClass API is already very large, so new methods should be
named carefully to make sure that users identify them as belonging to a
sub-feature (lazy objects) in particular. I would prefer we rename some
of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy /
lazyInitialize - other methods in this RFC are already very outspoken, so I
don’t mind being very specific here as well.

The reason is „initialized“ is such a generic word, best not have API
users make assumptions about what this relates to (readonly, lazy, …)
I get this aspect, I'm fine with either option, dunno if anyone has a
strong preference?
Under this argument, mine is isLazyObject + initializeLazyObject.

The RFC still has the isInitialized and initialize methods, let's go with
your suggestions isLazyObject, initializeLazyObject, and also maybe
markLazyObjectAsInitialized instead of markAsInitialized?

2.) I am 100% behind the implementation of lazy ghosts, it's really great
work with all the behaviors. Speaking with my Doctrine ORM core developer
hat, this has my full support.

\o/

3.) the lazy proxies have me worried that we are opening up a can of
worms by having the two objects and the magic of using only the properties
of one and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a
compelling argument for having it, but it is a leaky abstraction solving
the identity issue only on one side: the factory code might not know
it's used for a proxy and make all sorts of decisions based on identity that
lead to problems.

Correct me if I am wrong or missing something, but if the factory does
not know about proxying, then it would also be fine to build a lazy ghost
and copy over all state after using the factory.

Unfortunately no, copying doesn't work in the generic case: when the
object's dependencies involve a circular reference with the object itself,
the copying strategy can lead to a sort of "brain split" situation where we
have two objects (the proxy and the real object) which still coexist but
can have diverging states.

This is what virtual state proxies solve, by making sure that while we
have two objects, we're sure by design that they have synchronized state.

Yes, $this can leak with proxies, but this is reduced to the strict
minimum in the state-proxy design. Compared to the "brain split" I
mentioned, this is a minor concern.

State-synchronization is costly currently since it relies on magic methods
on every single property access.
From this angle, state-proxies are the ones that benefit the most from
being in the engine.

4.) I am wondering, do we need the resetAs* methods? You can already
implement lazy proxies in userland code by manually writing the code, we
don’t need engine support for that. Not having these two methods would
reduce the surface of the RFC / API considerably. And given the „real
world“ example is not really real world, only the Doctrine
(createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) are,
this shows maybe it's not needed.

Yes, this use case of making an object lazy after it's been created is
quite useful. It makes it straightforward to turn a class lazy using
inheritance for example (LazyClass extends NonLazyClass), without having to
write nor maintain any decorating logic. From a technical pov, this is just
a different flavor of the same code infrastructure, so this is pretty
aligned with the rest of the proposed API.

5.) The RFC does not spell it out, but I assume this does not have any
effect on stacktraces, i.e. since properties are proxied, there are no
„magic“ frames appearing in the stacktraces?

Nothing special on this domain indeed, there are no added frames (unlike
inheritance proxies since they'd decorate methods).

As a general note, an important design criterion for the RFC has been to
make it a superset of what we can achieve in userland already. Ghost
objects, state proxies, capabilities of resetAsLazy* methods, etc are all
possible today. Making the RFC a subset of those existing capabilities
would defeat the purpose of this proposal, since it would mean we'd have to
keep maintaining the existing code to support the use cases it enables,
with all the associated drawbacks for the PHP community at large.

Nicolas

1 year ago by Nicolas Grekas — view source

On Fri, Jul 12, 2024 at 12:55, Benjamin Außenhofer <kontakt@beberlei.de>
wrote:

On 21.06.2024 at 12:24:20, Nicolas Grekas <nicolas.grekas+php@gmail.com>
wrote:
Hi Ben,
Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
    public function newLazyProxy(callable $factory, int $options): object {}
    public function newLazyGhost(callable $initializer, int $options): object {}
    public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}
    public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}
    public function initialize(object $object): object {}
    public function isInitialized(object $object): bool {}

    // existing methods
}

class ReflectionProperty
{
    public function setRawValueWithoutInitialization(object $object, mixed $value): void {}
    public function skipInitialization(object $object): void {}

    // existing methods
}

Comments / rationale:
- Adding methods on ReflectionClass instead of ReflectionObject is
better from a performance point of view, as mentioned earlier
- Keeping the word "Lazy" in method names is clearer, especially for
"newLazyProxy" as the "Proxy" pattern has many use cases that are
not related to laziness. However we removed the word "Instance" to
make the names shorter.
- We have renamed "make" methods to "reset", following your feedback
about the word "make". It should better convey the behavior of these
methods, and clarify that it's modifying the object in-place as well
as resetting its state
- setRawValueWithoutInitialization() has the same behavior as
setRawValue() (from the hooks RFC), except it doesn't trigger
initialization
- Renamed $initializer to $factory for proxy methods

WDYT?

Best Regards,
Arnaud
Oh, that looks so much more self-explanatory and readable. I love
it. Thanks! (Looks like the RFC text hasn't been updated yet.)
Happy you like it so much! The text of the RFC is now up to date. Note
that we renamed ReflectionProperty::skipInitialization() and
setRawValueWithoutInitialization() to skipLazyInitialization() and
setRawValueWithoutLazyInitialization() after we realized that
ReflectionProperty already has an isInitialized() method for something
quite different.

While Arnaud works on moving the code to the updated API, are there
more comments on this RFC before we consider opening the vote?
Thank you for updating the API, the RFC is now much easier to grasp.

My few comments on the updated RFC:

1.) The ReflectionClass API is already very large, so new methods should be
named carefully to make sure that users identify them as belonging to a
sub-feature (lazy objects) in particular. I would prefer we rename some
of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy /
lazyInitialize - other methods in this RFC are already very outspoken, so I
don’t mind being very specific here as well.

The reason is „initialized“ is such a generic word, best not have API
users make assumptions about what this relates to (readonly, lazy, …)
I get this aspect, I'm fine with either option, dunno if anyone has a
strong preference?
Under this argument, mine is isLazyObject + initializeLazyObject.
The RFC still has the isInitialized and initialize methods, let's go with
your suggestions isLazyObject, initializeLazyObject, and also maybe
markLazyObjectAsInitialized instead of markAsInitialized?

I've updated the RFC with these method names. I've a slight preference
towards shorter names that don't contain the "lazy object" wording but I
feel like the consensus might be to have the more explicit versions, so
fine to me.

1 year ago by tim@bastelstu.be — view source

Hi

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?

I plan to give the RFC another read-through, but will likely not get
around to it before the next week.

Best regards
Tim Düsterhus

1 year ago by tim@bastelstu.be — view source

Hi

I finally got around to giving the RFC another read. Apologies if
this email asks questions that have already been answered elsewhere, as
the current mailing list volume makes it hard for me to keep up.

Is there any reason to call the makeLazyX() methods on an object that
was not just freshly created with ->newInstanceWithoutConstructor()
then?

There are not many reasons to do that. The only intended use case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:

Okay. But the RFC (and your email) does not explain why I would want to do
that. It appears that much of the RFC's complexity (e.g. around readonly
properties and destructors) stems from the wish to support turning an
existing object into a lazy object. If there is no strong reason to
support that, I would suggest dropping that. It could always be added in
a future PHP version.

The return value of the initializer has to be an instance of a parent
or a child class of the lazy-object and it must have the same properties.

Would returning a parent class not violate the LSP? Consider the
following example:
   class A { public string $s; }
   class B extends A { public function foo() { } }

   $o = new B();
   ReflectionLazyObject::makeLazyProxy($o, function (B $o) {
     return new A();
   });

   $o->foo(); // works
   $o->s = 'init';
   $o->foo(); // breaks
$o->foo() calls B::foo() in both cases here, as $o is always the proxy
object. We need to double check, but we believe that this rule doesn't
break LSP.
I don't understand what happens with the 'A' object then, but perhaps
this will become clearer once you add the requested examples.
The 'A' object is what is called the "actual instance" in the RFC. $o
acts as a proxy to the actual instance: Any property access on $o is
forwarded to the actual instance A.

I've read the updated RFC and it's still not clear to me that returning
an arbitrary “actual instance” object is sound. Especially when private
properties - which for all intents and purposes are not visible outside
of the class - are involved. Consider the following:

 class A {
   public function __construct(
     public string $property,
   ) {}
 }

 class B extends A {
   public function __construct(
     string $property,
     private string $foo,
   ) { parent::__construct($property); }

   public function getFoo() {
     return $this->foo;
   }
}

$r = new ReflectionClass(B::class);
$obj = $r->newLazyProxy(function ($obj) {
  return new A('value');
});
var_dump($obj->property); // 'value'
var_dump($obj->getFoo()); // Implicitly accesses A::${'\0B\0foo'}

(i.e. the mangled B::$foo property)?

Now you might say that B does not have the same properties as A and
creating the proxy is not legal, but then the addition of a new private
property would immediately break the use of the lazy proxy, which
specifically is something that private properties should not be able to do.

Best regards
Tim Düsterhus

1 year ago by tim@bastelstu.be — view source

Hi

I've read the updated RFC and it's still not clear to me that returning
an arbitrary “actual instance” object is sound. Especially when private
properties - which for all intents and purposes are not visible outside
of the class - are involved. Consider the following:

I initially wanted to include any new questions in a completely separate
thread to keep stuff organized, but I realized that the cloning behavior
is very closely related to what I already remarked above:

The cloning behavior appears to be unsound to me. Consider the following:

 class A {
    public function __construct(
      public string $property,
    ) {}
 }
 class B extends A {
    public function foo() { }
 }

 function only_b(B $b) { $b->foo(); }

 $r = new ReflectionClass(B::class);
 $b = $r->newLazyProxy(function ($obj) {
   return new A('value');
 });

 $b->property = 'init_please';

 $notActuallyB = clone $b;
 only_b($b); // legal
 only_b($notActuallyB); // illegal

I'm cloning what I believe to be an instance of B, but get back an A.

Best regards
Tim Düsterhus

1 year ago by Nicolas Grekas — view source

Hi Valentin, Marco, Benjamin, Tim, Rob,

Thanks for the detailed feedback again, it's very helpful!
Let me try to answer many emails at once, in chronological order:

The RFC says that Virtual state-proxies are necessary because of circular
references. It's difficult to accept this reasoning, because using
circular references is a bad practice and the given example is something I
try to avoid by all means in my code.

While discussing this argument about circular references with Arnaud, we
realized that with this reasoning, we wouldn't have a garbage collector in
the engine. Yet and fortunately, there is one because circular references
are an important thing that exists in practice. We have to account for
circular references, that's not an option.

don't touch readonly because of lazy objects: this feature is too niche
to cripple a major-major feature like readonly. I would suggest deferring
until after the first bits of this RFC landed.

Following Marco's advice, we've decided to remove all the flags related to
the various ways to handle readonly. This also removes the secondary vote.
The behavior related to readonly properties is now that they are skipped if
already initialized when calling resetAsLazy* methods, throw in the
initializer as usual, and are resettable only if the class is not final, as
already allowed in userland (and as explained in the RFC).

I finally got around to giving the RFC another read. Apologies if
this email asks questions that have already been answered elsewhere, as
the current mailing list volume makes it hard for me to keep up.

Is there any reason to call the makeLazyX() methods on an object that
was not just freshly created with ->newInstanceWithoutConstructor()
then?

There are not many reasons to do that. The only intended use case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:

Okay. But the RFC (and your email) does not explain why I would want to do
that. It appears that much of the RFC's complexity (e.g. around readonly
properties and destructors) stems from the wish to support turning an
existing object into a lazy object. If there is no strong reason to
support that, I would suggest dropping that. It could always be added in
a future PHP version.

This capability is needed for two reasons: 1. completeness and 2. feature
parity with what can be currently done using magic methods (so that it's
already used to solve real-world problems).

This relates to Benjamin's question about using a static factory instead of
a constructor. This is a valid alternative, but it can be used only when
you are in control of the instantiation logic. That's not always the case.
E.g. Doctrine uses the "new $class" pattern in its configuration system.
Whether this is a good idea or not is not the topic. But this pattern means
that as a user of Doctrine, you sometimes have to provide a class name and
can't use any other constructor. Doctrine is just an example of course.
Another example is when you have a library that wants to make one of its
classes lazy: let's say __construct() is the way for the users of this lib
to use it (pretty common), then moving to a static factory is not possible
without a BC break.

So yes, turning an existing instance lazy is definitely needed.
About readonly, see the simplification above.
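
As an illustration of the "new $class" case, here is a minimal sketch of an object that makes itself lazy from its own constructor, so callers that only know a class name still get a lazy instance. It assumes PHP 8.4 or later and the method names as they eventually shipped; the Connection class, its ttl property and the value 2.0 are hypothetical stand-ins for heavier setup.

```php
<?php
// Sketch only: requires PHP 8.4+.
final class Connection
{
    public float $ttl;

    public function __construct()
    {
        // The object turns itself into a lazy ghost; the real work is
        // deferred until a property is first accessed.
        (new ReflectionClass(self::class))->resetAsLazyGhost(
            $this,
            static function (Connection $connection): void {
                $connection->ttl = 2.0; // stand-in for heavier initialization
            }
        );
    }
}

// The caller only has a class name, e.g. read from configuration.
$class = Connection::class;
$connection = new $class();

$reflector = new ReflectionClass($connection);
$lazyBefore = $reflector->isUninitializedLazyObject($connection);
$ttl = $connection->ttl; // first property access runs the initializer
$lazyAfter = $reflector->isUninitializedLazyObject($connection);

var_dump($lazyBefore, $ttl, $lazyAfter);
```

A static factory could not be used here, because the instantiation site is outside the class author's control.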

The return value of the initializer has to be an instance of a parent
or a child class of the lazy-object and it must have the same properties.

Would returning a parent class not violate the LSP? Consider the
following example:
   class A { public string $s; }
   class B extends A { public function foo() { } }

   $o = new B();
   ReflectionLazyObject::makeLazyProxy($o, function (B $o) {
     return new A();
   });

   $o->foo(); // works
   $o->s = 'init';
   $o->foo(); // breaks
$o->foo() calls B::foo() in both cases here, as $o is always the proxy
object. We need to double check, but we believe that this rule doesn't
break LSP.
I don't understand what happens with the 'A' object then, but perhaps
this will become clearer once you add the requested examples.
The 'A' object is what is called the "actual instance" in the RFC. $o
acts as a proxy to the actual instance: Any property access on $o is
forwarded to the actual instance A.
I've read the updated RFC and it's still not clear to me that returning
an arbitrary “actual instance” object is sound. Especially when private
properties - which for all intents and purposes are not visible outside
of the class - are involved. Consider the following:
 class A {
   public function __construct(
     public string $property,
   ) {}
 }

 class B extends A {
   public function __construct(
     string $property,
     private string $foo,
   ) { parent::__construct($property); }

   public function getFoo() {
     return $this->foo;
   }
}

$r = new ReflectionClass(B::class);
$obj = $r->newLazyProxy(function ($obj) {
  return new A('value');
});
var_dump($obj->property); // 'value'
var_dump($obj->getFoo()); // Implicitly accesses A::${'\0B\0foo'}
(i.e. the mangled B::$foo property)?

Now you might say that B does not have the same properties as A and
creating the proxy is not legal, but then the addition of a new private
property would immediately break the use of the lazy proxy, which
specifically is something that private properties should not be able to do.

True, thanks for raising this point. After brainstorming with Arnaud, we
improved this behavior by:

- allowing only parent classes, not child classes
- requiring that all properties from a real instance have a corresponding
one on the proxy OR that the extra properties on the proxy are skipped/set
before initialization.

This means that it's now possible for a child class to add a property,
private or not. There's one requirement: the property must be skipped or
set before initialization.

For the record, with magic methods, we currently have no choice but to
create an inheritance proxy. This means the situation of having Proxy
extend Real like in your example is the norm. While doing so, it's pretty
common to attach some interface so that we can augment Real with extra
capabilities (let's say Proxy implements LazyObjectInterface). Being able
to use class Real as a backing store for Proxy gives us a very smooth
upgrade path (the implementation of the laziness can remain an internal
detail), and it's also sometimes the only way to leverage a factory that
returns Real, not Proxy.
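
The "Real as a backing store for Proxy" arrangement can be sketched as follows. This assumes PHP 8.4 or later and the shipped newLazyProxy() signature; classes A and B follow the naming used in the examples quoted above, and B deliberately adds no properties, which sidesteps the question of extra properties on the proxy.

```php
<?php
// Sketch only: requires PHP 8.4+.
class A
{
    public function __construct(public string $property) {}
}

class B extends A
{
    public function foo(): string
    {
        // Runs with $this being the proxy; the property read is forwarded
        // to the real instance returned by the factory.
        return 'B::foo sees ' . $this->property;
    }
}

$reflector = new ReflectionClass(B::class);

// The factory only knows how to build an A (the "Real" class).
$proxy = $reflector->newLazyProxy(static function (B $proxy): A {
    return new A('value');
});

$isB = $proxy instanceof B;
$result = $proxy->foo();

var_dump($isB, $result);
```

Methods come from the proxy's class while state lives in the real instance, so a factory that returns the parent class can still back an object typed as the child.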

The cloning behavior appears to be unsound to me. Consider the following:

 class A {
    public function __construct(
      public string $property,
    ) {}
 }
 class B extends A {
    public function foo() { }
 }

 function only_b(B $b) { $b->foo(); }

 $r = new ReflectionClass(B::class);
 $b = $r->newLazyProxy(function ($obj) {
   return new A('value');
 });

 $b->property = 'init_please';

 $notActuallyB = clone $b;
 only_b($b); // legal
 only_b($notActuallyB); // illegal

I'm cloning what I believe to be an instance of B, but get back an A.

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.
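
A minimal sketch of the updated cloning behavior, reusing the classes from the example quoted above. It assumes PHP 8.4 or later and that cloning an initialized proxy clones both the proxy and its real instance, as described in this reply.

```php
<?php
// Sketch only: requires PHP 8.4+.
class A
{
    public function __construct(public string $property) {}
}

class B extends A
{
    public function foo(): void {}
}

function only_b(B $b): void
{
    $b->foo();
}

$reflector = new ReflectionClass(B::class);
$b = $reflector->newLazyProxy(static fn (B $proxy): A => new A('value'));

$b->property = 'init_please'; // triggers initialization, then writes through

$clone = clone $b;

only_b($b);
only_b($clone); // accepted: the clone is still a B, not an A

// The clone has its own real instance, so the states are independent.
$clone->property = 'changed';

var_dump($clone instanceof B, $b->property, $clone->property);
```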

flags should be a list<SomeEnumAroundProxies> instead. A bitmask for
a new API feels unsafe and anachronistic, given the tiny performance hit.

Unfortunately this leads to a 30% slowdown in newLazyGhost() when switching
to an array of enums, in a micro benchmark. I'm not sure how this would
impact a real application, but given this is a performance critical

I'm curious, what did the implementation look like?

I'll let Arnaud answer this one.

Any access to a non-existent (i.e. dynamic) property will trigger
initialization and this is not preventable using
'skipLazyInitialization()' and 'setRawValueWithoutLazyInitialization()'
because these only work with known properties?

While dynamic properties are deprecated, this should be clearly spelled
out in the RFC for voters to make an informed decision.

Absolutely. From a behavioral PoV, whether a property is dynamic or
declared doesn't matter: both kinds are uninitialized at this stage and
the engine will trigger object handlers in the same way (it will just
not trigger the same object handlers).
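
For declared properties, the two methods named above can be sketched as follows. This is a minimal illustration, assuming the API names that shipped in PHP 8.4 (ReflectionClass::newLazyGhost() and ReflectionProperty::setRawValueWithoutLazyInitialization()); the Post class and the $loads counter are hypothetical and only there for the demo.

```php
<?php
// Hypothetical entity with two declared properties.
final class Post
{
    public int $id;
    public string $title;
}

$loads = 0; // counts how often the initializer runs (demo only)
$reflector = new ReflectionClass(Post::class);

// Create a lazy ghost: the initializer runs on first access to a lazy property.
$post = $reflector->newLazyGhost(function (Post $post) use (&$loads): void {
    $loads++;
    $post->title = 'Loaded title';
});

// Pre-populate a declared property without running the initializer.
$reflector->getProperty('id')->setRawValueWithoutLazyInitialization($post, 42);

$idBefore = $post->id;      // already set, so no initialization is needed
$loadsAfterId = $loads;

$title = $post->title;      // still lazy, so this runs the initializer
$loadsAfterTitle = $loads;
```

Both methods are reached through a ReflectionProperty, which is why they are discussed here in terms of known properties only.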

If the object is already lazy, a ReflectionException is thrown with
the message “Object is already lazy”.

What happens when calling the method on an initialized proxy object?
i.e. the following:

 class Obj { public function __construct(public string $name) {} }
 $obj = new Obj('obj1');
 $r->resetAsLazyProxy($obj, ...);
 $r->initialize($obj);
 $r->resetAsLazyProxy($obj, ...);

What happens when calling it for the actual object of an initialized
proxy object?

Once initialized, a lazy object should be indistinguishable from a non-lazy
one.
This means that the second call to resetAsLazyProxy will just do that:
reset the object like it does for any regular object.

It's probably not possible to prevent this, but will this
allow for proxy chains? Example:

 class Obj { public function __construct(public string $name) {} }
 $obj1 = new Obj('obj1');
 $r->resetAsLazyProxy($obj1, function () use (&$obj2) {
     $obj2 = new Obj('obj2');
     return $obj2;
 });
 $r->resetAsLazyProxy($obj2, function () {
     return new Obj('obj3');
 });
 var_dump($obj1->name); // what will this print?

This example doesn't work because $obj2 doesn't exist when trying to make
it lazy but you probably mean this instead?

 class Obj { public function __construct(public string $name) {} }

 $obj1 = new Obj('obj1');
 $obj2 = new Obj('obj2');
 $r->resetAsLazyProxy($obj1, function () use ($obj2) {
     return $obj2;
 });
 $r->resetAsLazyProxy($obj2, function () {
     return new Obj('obj3');
 });
 var_dump($obj1->name); // what will this print?

This will print "obj3": each object is separate from the other from a
behavioral perspective, but with such a chain, accessing $obj1 will trigger
its initializer and will then access $obj2->name, which will trigger the
second initializer then access $obj3->name, which contains "obj3".
(I just confirmed with the implementation I have, which is from a previous
API flavor, but the underlying mechanisms are the same).

I just noticed in the RFC that I don't see any mention of what happens when
running get_class, get_debug_type, etc., on the proxies, but it does
mention var_dump.

Yes, because there is nothing to say on the topic: turning an instance lazy
doesn't change anything regarding the type system, so these functions
return the same result as before: the class of the object.
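
A minimal sketch of that claim, assuming the PHP 8.4 method names (newLazyGhost(), newLazyProxy(), isUninitializedLazyObject()); the Service class is hypothetical.

```php
<?php
// Hypothetical class used only for this illustration.
class Service
{
    public string $name = 'svc';
}

$reflector = new ReflectionClass(Service::class);

$ghost = $reflector->newLazyGhost(function (Service $service): void {});
$proxy = $reflector->newLazyProxy(fn () => new Service());

// Type inspection does not touch any property.
$ghostClass = get_class($ghost);
$proxyType = get_debug_type($proxy);
$isService = $ghost instanceof Service && $proxy instanceof Service;

// Both objects are still uninitialized after the type checks.
$stillLazy = $reflector->isUninitializedLazyObject($ghost)
    && $reflector->isUninitializedLazyObject($proxy);
```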

The RFC is in sync with this message, please have a look for clarifications.

Please let me know if any topics remain unanswered.

Nicolas

1 year ago by tim@bastelstu.be

Hi

Thanks for the detailed feedback again, it's very helpful!
Let me try to answer many emails at once, in chronological order:

Note that this kind of bulk reply makes it very hard for me to keep track
of mailing list threads. It breaks threading, which makes it much harder
for me to find original context of a quoted part, especially since you
did not include the author / date for the quotes.

That said, I've taken a look at the differences since my email and also
gave the entire RFC another read.

don't touch readonly because of lazy objects: this feature is too niche
to cripple a major-major feature like readonly. I would suggest deferring
until after the first bits of this RFC landed.

Following Marco's advice, we've decided to remove all the flags related to
the various ways to handle readonly. This also removes the secondary vote.
The behavior related to readonly properties is now that they are skipped if
already initialized when calling resetAsLazy* methods, throw in the
initializer as usual, and are resettable only if the class is not final, as
already allowed in userland (and as explained in the RFC).

The 'readonly' section still mentions 'makeInstanceLazy', which likely
is a left-over from a previous version of the RFC. You should have
another look and clean up the naming there.

There are not many reasons to do that. The only intended use-case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:

Okay. But the RFC (and your email) does not explain why I would want to do
that. It appears that much of the RFC's complexity (e.g. around readonly
properties and destructors) stems from the wish to support turning an
existing object into a lazy object. If there is no strong reason to
support that, I would suggest dropping that. It could always be added in
a future PHP version.

This capability is needed for two reasons: 1. completeness and 2. feature
parity with what can currently be done using magic methods (and which is
already used to solve real-world problems).

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head around why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

True, thanks for raising this point. After brainstorming with Arnaud, we
improved this behavior by:

- allowing only parent classes, not child classes

- requiring that all properties from a real instance have a corresponding
  one on the proxy OR that the extra properties on the proxy are skipped/set
  before initialization.

This means that it's now possible for a child class to add a property,
private or not. There's one requirement: the property must be skipped or
set before initialization.

For the record, with magic methods, we currently have no choice but to
create an inheritance proxy. This means the situation of having Proxy
extend Real like in your example is the norm. While doing so, it's pretty
common to attach some interface so that we can augment Real with extra
capabilities (let's say Proxy implements LazyObjectInterface). Being able
to use class Real as a backing store for Proxy gives us a very smooth
upgrade path (the implementation of the laziness can remain an internal
detail), and it's also sometimes the only way to leverage a factory that
returns Real, not Proxy.

I'm not entirely convinced that this is sound now, but I'm not in a
state to think this through in detail.

I have one question regarding the updated initialization sequence. The
RFC writes:

Properties that are declared on the real instance are uninitialized on the proxy instance (including overlapping properties used with ReflectionProperty::skipLazyInitialization() or setRawValueWithoutLazyInitialization()) to synchronize the state shared by both instances.

I do not understand this. Specifically I do not understand the "to
synchronize the state" bit. My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?! A more expansive explanation would be helpful. Possibly with
an example that explains what would break if this did not happen.

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:

 function (object $clonedProxy) use ($originalProxy) {
     return clone $originalProxy->getRealObject();
 }

? Then I believe this is unsound. Consider the following:

 $myProxy = $r->newLazyProxy(...);
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $myProxy->someProp++;
 var_dump($clonedProxy->someProp);

The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.

Any access to a non-existent (i.e. dynamic) property will trigger
initialization and this is not preventable using
'skipLazyInitialization()' and 'setRawValueWithoutLazyInitialization()'
because these only work with known properties?

While dynamic properties are deprecated, this should be clearly spelled
out in the RFC for voters to make an informed decision.

Absolutely. From a behavioral PoV, dynamic vs non-dynamic properties
doesn't matter: both kinds are uninitialized at this stage and the engine
will trigger object handlers in the same way (it will just not trigger the
same object handlers).

Unless I missed it, you didn't update the RFC to mention this. Please do
so, I find it important to have a record of all details that were
discussed (e.g. for the documentation or when evaluating bug reports).

If the object is already lazy, a ReflectionException is thrown with
the message “Object is already lazy”.

What happens when calling the method on an initialized proxy object?
i.e. the following:

  class Obj { public function __construct(public string $name) {} }
  $obj = new Obj('obj1');
  $r->resetAsLazyProxy($obj, ...);
  $r->initialize($obj);
  $r->resetAsLazyProxy($obj, ...);

What happens when calling it for the actual object of an initialized
proxy object?

Once initialized, a lazy object should be indistinguishable from a non-lazy
one.
This means that the second call to resetAsLazyProxy will just do that:
reset the object like it does for any regular object.

It's probably not possible to prevent this, but will this
allow for proxy chains? Example:

  class Obj { public function __construct(public string $name) {} }
  $obj1 = new Obj('obj1');
  $r->resetAsLazyProxy($obj1, function () use (&$obj2) {
      $obj2 = new Obj('obj2');
      return $obj2;
  });
  $r->resetAsLazyProxy($obj2, function () {
      return new Obj('obj3');
  });
  var_dump($obj1->name); // what will this print?

This example doesn't work because $obj2 doesn't exist when trying to make
it lazy but you probably mean this instead?

Ah, yes you are right. An initialization is missing in the middle of the
two reset calls (like in the previous example). My question was
specifically about resetting an initialized proxy, so your adjusted
example is not quite what I was looking for, but the results should
probably be the same?

  class Obj { public function __construct(public string $name) {} }
  $obj1 = new Obj('obj1');
  $obj2 = new Obj('obj2');
  $r->resetAsLazyProxy($obj1, function () use ($obj2) {
      return $obj2;
  });
  $r->resetAsLazyProxy($obj2, function () {
      return new Obj('obj3');
  });
  var_dump($obj1->name); // what will this print?

This will print "obj3": each object is separate from the other from a
behavioral perspective, but with such a chain, accessing $obj1 will trigger
its initializer and will then access $obj2->name, which will trigger the
second initializer then access $obj3->name, which contains "obj3".
(I just confirmed with the implementation I have, which is from a previous
API flavor, but the underlying mechanisms are the same).

Okay, that works as expected then.

Please let me know if any topics remain unanswered.

I've indeed found two more questions.

Just to confirm my understanding: The RFC mentions that the initializer
of a proxy receives the proxy object as the first parameter. It further
mentions that making changes is legal (but likely useless).

My understanding is that attempting to read a property of the
initializer object will most likely fail, because it still is
uninitialized? Or are the properties of the proxy object initialized
with their default value before calling the initializer?

For ghost objects the behavior is clear, just not for proxies.

Properties are not initialized to their default value yet (they are
initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?

Best regards
Tim Düsterhus

1 year ago by Nicolas Grekas

On Fri, Jul 5, 2024 at 21:49, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Thanks for the detailed feedback again, it's very helpful!
Let me try to answer many emails at once, in chronological order:

Note that this kind of bulk reply makes it very hard for me to keep track
of mailing list threads. It breaks threading, which makes it much harder
for me to find original context of a quoted part, especially since you
did not include the author / date for the quotes.

Noted.

That said, I've taken a look at the differences since my email and also
gave the entire RFC another read.

don't touch readonly because of lazy objects: this feature is too niche
to cripple a major-major feature like readonly. I would suggest deferring
until after the first bits of this RFC landed.

Following Marco's advice, we've decided to remove all the flags related to
the various ways to handle readonly. This also removes the secondary vote.
The behavior related to readonly properties is now that they are skipped if
already initialized when calling resetAsLazy* methods, throw in the
initializer as usual, and are resettable only if the class is not final, as
already allowed in userland (and as explained in the RFC).

The 'readonly' section still mentions 'makeInstanceLazy', which likely
is a left-over from a previous version of the RFC. You should have
another look and clean up the naming there.

I found a few other outdated occurrences. They should all be updated now.

There are not many reasons to do that. The only intended use-case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:

Okay. But the RFC (and your email) does not explain why I would want to do
that. It appears that much of the RFC's complexity (e.g. around readonly
properties and destructors) stems from the wish to support turning an
existing object into a lazy object. If there is no strong reason to
support that, I would suggest dropping that. It could always be added in
a future PHP version.

This capability is needed for two reasons: 1. completeness and 2. feature
parity with what can be currently done using magic methods (so that it's
already used to solve real-world problems).

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:
https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().
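
The pattern under discussion, an object that makes itself lazy in its own constructor, can be sketched roughly as below. This is only an illustration under assumptions: it uses ReflectionClass::resetAsLazyGhost() with the signature that shipped in PHP 8.4, and the LazyReport class, its $rows property and the $loads counter are hypothetical (this is not the Doctrine code linked above).

```php
<?php
// Hypothetical class that defers its expensive setup until first use.
final class LazyReport
{
    public static int $loads = 0; // counts initializer runs (demo only)
    public array $rows;

    public function __construct()
    {
        $reflector = new ReflectionClass(self::class);

        // Turn the object under construction into a lazy ghost. The
        // initializer runs on the first access to a lazy property.
        $reflector->resetAsLazyGhost($this, static function (self $report): void {
            self::$loads++;
            $report->rows = ['a', 'b', 'c']; // stand-in for expensive loading
        });
    }
}

$report = new LazyReport();
$loadsAfterConstruct = LazyReport::$loads; // nothing has been loaded yet

$rowCount = count($report->rows);          // first access runs the initializer
$loadsAfterAccess = LazyReport::$loads;
```

Callers simply write new LazyReport() and never see the laziness, which is the convenience this approach trades against having each caller make the object lazy themselves.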

True, thanks for raising this point. After brainstorming with Arnaud, we
improved this behavior by:

- allowing only parent classes, not child classes

- requiring that all properties from a real instance have a corresponding
  one on the proxy OR that the extra properties on the proxy are skipped/set
  before initialization.

This means that it's now possible for a child class to add a property,
private or not. There's one requirement: the property must be skipped or
set before initialization.

For the record, with magic methods, we currently have no choice but to
create an inheritance proxy. This means the situation of having Proxy
extend Real like in your example is the norm. While doing so, it's pretty
common to attach some interface so that we can augment Real with extra
capabilities (let's say Proxy implements LazyObjectInterface). Being able
to use class Real as a backing store for Proxy gives us a very smooth
upgrade path (the implementation of the laziness can remain an internal
detail), and it's also sometimes the only way to leverage a factory that
returns Real, not Proxy.

I'm not entirely convinced that this is sound now, but I'm not in a
state to think this through in detail.

I have one question regarding the updated initialization sequence. The
RFC writes:

Properties that are declared on the real instance are uninitialized on
the proxy instance (including overlapping properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization()) to synchronize the state shared by
both instances.

I do not understand this. Specifically I do not understand the "to
synchronize the state" bit.

We reworded this sentence a bit. Clearer?

Properties that are declared on the real instance are bound to the proxy
instance, so that accessing any of these properties on the proxy forwards
the operation to the corresponding property on the real instance. This
includes properties used with ReflectionProperty::skipLazyInitialization()
or setRawValueWithoutLazyInitialization().

My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?!

It follows that more properties can exist on the proxy itself (declared by
child classes of the real object that the proxy implements).

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:
 function (object $clonedProxy) use ($originalProxy) {
     return clone $originalProxy->getRealObject();
 }

Nope, that's not what we describe in the RFC so I hope you can read it
again and get where you were confused and tell us if we're not clear enough
(to me we are :) )

The $originalProxy is not shared with $clonedProxy. Instead, it is the
initializers that are shared between clones.
And then, when we call that shared initializer in the $clonedProxy, we
clone the returned instance, so that even if the initializer returns a
shared instance, we don't share anything with the $originalProxy.

? Then I believe this is unsound. Consider the following:
 $myProxy = $r->newLazyProxy(...);
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $myProxy->someProp++;
 var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.

That's basically what we do and what we describe in the RFC, just with the
added lazy-clone operation on the instance returned by the initializer.

Any access to a non-existent (i.e. dynamic) property will trigger
initialization and this is not preventable using
'skipLazyInitialization()' and 'setRawValueWithoutLazyInitialization()'
because these only work with known properties?

While dynamic properties are deprecated, this should be clearly spelled
out in the RFC for voters to make an informed decision.

Absolutely. From a behavioral PoV, dynamic vs non-dynamic properties
doesn't matter: both kinds are uninitialized at this stage and the engine
will trigger object handlers in the same way (it will just not trigger the
same object handlers).

Unless I missed it, you didn't update the RFC to mention this. Please do
so, I find it important to have a record of all details that were
discussed (e.g. for the documentation or when evaluating bug reports).

Updated.

If the object is already lazy, a ReflectionException is thrown with
the message “Object is already lazy”.

What happens when calling the method on an initialized proxy object?
i.e. the following:

  class Obj { public function __construct(public string $name) {} }
  $obj = new Obj('obj1');
  $r->resetAsLazyProxy($obj, ...);
  $r->initialize($obj);
  $r->resetAsLazyProxy($obj, ...);

What happens when calling it for the actual object of an initialized
proxy object?

Once initialized, a lazy object should be indistinguishable from a non-lazy
one.
This means that the second call to resetAsLazyProxy will just do that:
reset the object like it does for any regular object.

It's probably not possible to prevent this, but will this
allow for proxy chains? Example:

  class Obj { public function __construct(public string $name) {} }
  $obj1 = new Obj('obj1');
  $r->resetAsLazyProxy($obj1, function () use (&$obj2) {
      $obj2 = new Obj('obj2');
      return $obj2;
  });
  $r->resetAsLazyProxy($obj2, function () {
      return new Obj('obj3');
  });
  var_dump($obj1->name); // what will this print?

This example doesn't work because $obj2 doesn't exist when trying to make
it lazy but you probably mean this instead?

Ah, yes you are right. An initialization is missing in the middle of the
two reset calls (like in the previous example). My question was
specifically about resetting an initialized proxy, so your adjusted
example is not quite what I was looking for, but the results should
probably be the same?

I guess so yes if I understood you correctly.

  class Obj { public function __construct(public string $name) {} }
  $obj1 = new Obj('obj1');
  $obj2 = new Obj('obj2');
  $r->resetAsLazyProxy($obj1, function () use ($obj2) {
      return $obj2;
  });
  $r->resetAsLazyProxy($obj2, function () {
      return new Obj('obj3');
  });
  var_dump($obj1->name); // what will this print?

This will print "obj3": each object is separate from the other from a
behavioral perspective, but with such a chain, accessing $obj1 will trigger
its initializer and will then access $obj2->name, which will trigger the
second initializer then access $obj3->name, which contains "obj3".
(I just confirmed with the implementation I have, which is from a previous
API flavor, but the underlying mechanisms are the same).

Okay, that works as expected then.

Please let me know if any topics remain unanswered.

I've indeed found two more questions.

Just to confirm my understanding: The RFC mentions that the initializer
of a proxy receives the proxy object as the first parameter. It further
mentions that making changes is legal (but likely useless).

My understanding is that attempting to read a property of the
initializer object will most likely fail, because it still is
uninitialized? Or are the properties of the proxy object initialized
with their default value before calling the initializer?

RFC updated. Those properties will remain uninitialized for proxies.

For ghost objects the behavior is clear, just not for proxies.

Properties are not initialized to their default value yet (they are
initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?

That's because this is observable using e.g. (array) or var_dump.

Nicolas

1 year ago by tim@bastelstu.be

Hi

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head around why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:
https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

I have one question regarding the updated initialization sequence. The
RFC writes:

Properties that are declared on the real instance are uninitialized on
the proxy instance (including overlapping properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization()) to synchronize the state shared by
both instances.

I do not understand this. Specifically I do not understand the "to
synchronize the state" bit.

We reworded this sentence a bit. Clearer?

Yes, I think it is clearer. Let me try to rephrase this differently to
see if my understanding is correct:

For every property that exists on the real instance, the property on
the proxy instance effectively [1] is replaced by a property hook like
the following:

 public PropertyType $propertyName {
     get {
         return $this->realInstance->propertyName;
     }
     set(PropertyType $value) {
         $this->realInstance->propertyName = $value;
     }
 }

Any value previously stored in the proxy's own property will be freed
(including calling the destructor if it was the last reference), as if
unset() was called on the property.

[1] No actual property hook will be created and the realInstance
property does not actually exist, but the semantics behave as if such a
hook would be applied.
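
The forwarding semantics described above can be observed with a short sketch. It assumes the PHP 8.4 method name ReflectionClass::newLazyProxy(); the Real class and the variable names are hypothetical.

```php
<?php
// Hypothetical class; the proxy and the real instance share it.
class Real
{
    public int $count = 0;
}

$real = new Real();
$reflector = new ReflectionClass(Real::class);

// The factory hands back a pre-existing object as the real instance.
$proxy = $reflector->newLazyProxy(fn () => $real);

// Writing through the proxy initializes it, then forwards the write.
$proxy->count = 5;
$realAfterWrite = $real->count;

// Changing the real instance directly is visible through the proxy.
$real->count = 7;
$viaProxy = $proxy->count;
```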

My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?!

It follows that more properties can exist on the proxy itself (declared by
child classes of the real object that the proxy implements).

Right, that's mentioned in (2), so all clear.

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:
  function (object $clonedProxy) use ($originalProxy) {
      return clone $originalProxy->getRealObject();
  }

Nope, that's not what we describe in the RFC so I hope you can read it
again and get where you were confused and tell us if we're not clear enough
(to me we are :) )

The "cloning of the real instance" bit is what led me to this
understanding.

The $originalProxy is not shared with $clonedProxy. Instead, it's
initializers that are shared between clones.
And then, when we call that shared initializer in the $clonedProxy, we
clone the returned instance, so that even if the initializer returns a
shared instance, we don't share anything with the $originalProxy.

Ah, so you mean if the initializer would look like this instead of
creating a fresh object within the initializer?

  $predefinedObject = new SomeObj();
  $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
      return $predefinedObject;
  });
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $r->initialize($clonedProxy);

It didn't even occur to me that one would be able to return a
pre-existing object: I assumed that simply reusing the initializer would
create a separate object and that this would be sufficient to ensure that
the cloned instance would be independent.

? Then I believe this is unsound. Consider the following:
  $myProxy = $r->newLazyProxy(...);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $myProxy->someProp++;
  var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.

That's basically what we do and what we describe in the RFC, just with the
added lazy-clone operation on the instance returned by the initializer.

This means that if I would return a completely new object within the
initializer then for a cloned proxy the new object would immediately be
cloned and the original object be destructed, yes?

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:

  $predefinedObject = new SomeObj();
  $initializer = function () use ($predefinedObject) {
      return $predefinedObject;
  };
  $myProxy = $r->newLazyProxy($initializer);
  $otherProxy = $r->newLazyProxy($initializer);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $r->initialize($otherProxy);
  $r->initialize($clonedProxy);

To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.
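
For comparison, here is a minimal sketch of the "clone triggers initialization" semantics. It assumes the behavior documented for the lazy objects API that shipped in PHP 8.4, where cloning an uninitialized lazy object initializes it first and, for proxies, clones the real instance as well. The Counter class and the $calls counter are hypothetical.

```php
<?php
// Hypothetical class used only for this illustration.
class Counter
{
    public int $value = 0;
}

$calls = 0; // counts factory runs (demo only)
$reflector = new ReflectionClass(Counter::class);

$proxy = $reflector->newLazyProxy(function () use (&$calls): Counter {
    $calls++;
    return new Counter();
});

// Assumption (PHP 8.4 semantics): cloning initializes $proxy first, then
// clones the proxy together with its real instance.
$clone = clone $proxy;
$callsAfterClone = $calls;
$originalStillLazy = $reflector->isUninitializedLazyObject($proxy);

// The two objects no longer share state.
$proxy->value = 5;
$cloneValue = $clone->value;
```

Under these semantics the factory runs once, and the later write to $proxy is not visible through $clone.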

I would assume that cloning a proxy is something that rarely happens,
because my understanding is that proxies are most useful for service
objects, whereas ghost objects would be used for entities / value
objects, so this should not be too much of a problem.

Properties are not initialized to their default value yet (they are
initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?

That's because this is observable using e.g. (array) or var_dump.

I see. Perhaps add a short sentence with the reasoning. Something like:

Properties are not initialized to their default value yet (they are
initialized before calling the initializer). As an example, this has an
impact on the behavior of an (array) cast on uninitialized objects and
also when the default value is based on a constant that is not yet
defined when creating the lazy object, but will be defined at the point
of initialization.
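
A minimal sketch of the observable difference mentioned above, assuming the API as shipped in PHP 8.4 (`newLazyGhost()`, `initializeLazyObject()`). The `Config` class and its `retries` property are made up for illustration. As far as I know, an `(array)` cast does not trigger initialization and only exposes properties that are already initialized.

```php
<?php
// Hypothetical class with a property that has a default value.
class Config
{
    public int $retries = 3;
}

$r = new ReflectionClass(Config::class);

// The initializer does nothing here; default values are applied by the
// engine right before it is called.
$ghost = $r->newLazyGhost(function (Config $object): void {
});

// Casting to array inspects the object without initializing it.
$before = (array) $ghost;
$stillLazy = $r->isUninitializedLazyObject($ghost);

// Force initialization, then inspect again.
$r->initializeLazyObject($ghost);
$after = (array) $ghost;

var_dump($before, $stillLazy, $after);
```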

Best regards
Tim Düsterhus

1 year ago by kontakt@beberlei.de — view source

unread

Am 11.07.2024, 20:31:44 schrieb Tim Düsterhus tim@bastelstu.be:

Hi

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:
https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

I agree with Tim here, the Doctrine ORM EntityRepository plus Symfony
Service Entity Repository extension are not a necessary real world case
that would require this RFC to include a way for classes to make
themselves lazy.

I took the liberty of rewriting the code of DefaultRepositoryFactory
(Doctrine code itself) and ContainerRepositoryFactory so that the
repositories become lazy without needing resetAsLazy, just
$reflector->createLazyProxy. In the second case, the
LazyServiceEntityRepository class could be deleted.

https://gist.github.com/beberlei/80d7a3219b6a2a392956af18e613f86a

Please let me know if this is not how it works or can work or if my
reasoning is flawed.

Unless you have no way of getting to the „new $object“ in the code, there
is always a way to just use newLazy*. And when a library does not expose
new $object for you to override, then that is an architectural choice (and
maybe a flaw that you have to accept).
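
A minimal sketch of that idea, assuming the API as shipped in PHP 8.4 (`ReflectionClass::newLazyProxy()`). The `Repository` class and the `createLazyRepository()` factory are hypothetical stand-ins, not the Doctrine code from the gist. The code that owns the `new` returns a lazy proxy, so the class itself never has to reset itself as lazy.

```php
<?php
// Hypothetical stand-in for a repository that is expensive to construct.
class Repository
{
    public static int $constructed = 0;

    public function __construct(public string $entityClass)
    {
        self::$constructed++;
    }

    public function describe(): string
    {
        return "repository for {$this->entityClass}";
    }
}

// Hypothetical factory: it controls the "new", so it can hand out a lazy
// proxy whose real instance is only built on first use.
function createLazyRepository(string $entityClass): Repository
{
    $reflector = new ReflectionClass(Repository::class);

    return $reflector->newLazyProxy(
        fn (): Repository => new Repository($entityClass)
    );
}

$repository = createLazyRepository('App\Entity\User');
$constructedBeforeUse = Repository::$constructed;

// The property read inside describe() triggers the factory.
$description = $repository->describe();
$constructedAfterUse = Repository::$constructed;

var_dump($constructedBeforeUse, $description, $constructedAfterUse);
```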

I still think that not having the reset* methods would greatly simplify
this RFC and would allow enforcing more constraints, with fewer footguns.

For example, we could simplify the API of newLazyProxy so that it does not
receive a $factory that can arbitrarily create and get objects from
somewhere, but also an initializer, and always force the lazy object to be
an instance created by newInstanceWithoutConstructor.

You said in a previous mail about reset*():

From a technical pov, this is just a different flavor of the same code
infrastructure, so this is pretty aligned with the rest of the proposed
API.

We are not specifically considering the technical POV, but even more
importantly the user-facing API. And this adds a lot to the surface of the
API for things that serve only a 1-5% edge case.

1 year ago by Rob Landers — view source

unread

For what it’s worth, I see “resetAsLazy()” being most useful for unit testing libraries that build proxies. While this feature will remove most of the tricky nuances around proxies, it doesn’t make generating the code for them any easier, so that still has to be tested. Being able to write a test like this (abbreviated):

$realObj = new $foo();
$proxy = clone $realObj;
makeTestProxy($proxy); // resets as lazy with initializer
assert($realObj == $proxy);

is really simple. Without a reset method, this isn’t straightforward.
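
One possible body for the abbreviated `makeTestProxy()` helper is sketched below, assuming the API as shipped in PHP 8.4 (`ReflectionClass::resetAsLazyGhost()`). The `Point` class, the extra `$source` parameter and the copy-the-state initializer are all assumptions made for illustration; the original message does not say how the helper works. The check compares properties directly rather than using `==`.

```php
<?php
// Hypothetical value class used as the object under test.
class Point
{
    public function __construct(public int $x = 0, public int $y = 0)
    {
    }
}

// Hypothetical helper: resets $object as a lazy ghost whose initializer
// restores the state of $source on first access.
function makeTestProxy(Point $object, Point $source): void
{
    $reflector = new ReflectionClass(Point::class);
    $reflector->resetAsLazyGhost(
        $object,
        function (Point $ghost) use ($source): void {
            $ghost->x = $source->x;
            $ghost->y = $source->y;
        }
    );
}

$realObj = new Point(1, 2);
$proxy = clone $realObj;
makeTestProxy($proxy, $realObj);

$reflector = new ReflectionClass(Point::class);
$lazyBeforeAccess = $reflector->isUninitializedLazyObject($proxy);

// Reading a property runs the initializer.
$x = $proxy->x;
$y = $proxy->y;
$lazyAfterAccess = $reflector->isUninitializedLazyObject($proxy);

var_dump($lazyBeforeAccess, $x, $y, $lazyAfterAccess);
```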

— Rob

1 year ago by kontakt@beberlei.de — view source

unread

Am 12.07.2024, 08:00:18 schrieb Rob Landers rob@bottled.codes:

Am 11.07.2024, 20:31:44 schrieb Tim Düsterhus tim@bastelstu.be:

Hi

Many things are already possible in userland. That does not always mean

that the cost-benefit ratio is appropriate for inclusion in core. I get

behind the two examples in the “About Lazy-Loading Strategies” section,

but I'm afraid I still can't wrap my head why I would want an object

that makes itself lazy in its own constructor: I have not yet seen a

real-world example.

Keeping this capability for userland is not an option for me as it would

mostly defeat my goal, which is to get rid of any userland code on this

topic (and is achieved by the RFC).

Here is a real-world example:

https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and

would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

I agree with Tim here, the Doctrine ORM EntityRepository plus Symfony
Service Entity Repository extension are not a necessary real world case
that would require this RFC to include a way for classes to make
themselves lazy.

I took the liberty at rewriting the code of DefaultRepositoryFactory
(Doctrine code itself) and ContainerRepositoryFactory in a way to make the
repositories lazy without needing resetAsLazy, just
$reflector->createLazyProxy. In case of the second the
LazyServiceEntityRepository class could be deleted.

https://gist.github.com/beberlei/80d7a3219b6a2a392956af18e613f86a

Please let me know if this is not how it works or can work or if my
reasoning is flawed.

Unless you have no way of getting to the „new $object“ in the code, there
is always a way to just use newLazy*. And when a library does not expose
new $object to you to override, then that is an architectural choice (and
maybe flaw that you have to accept).

I still think not having the reset* methods would greatly simplify this
RFC and would allow to force more constraints, have less footguns.

For example we could simplify the API of newLazyProxy to not receive a
$factory that can arbitrarily create and get objects from somewhere, but
also initializer and always force the lazy object to be an instance created
by newInstanceWithoutConstructor.

You said in a previous mail about reset*()

From a technical pov, this is just a different flavor of the same code
infrastructure, so this is pretty aligned with the rest of the proposed
API.

We are not specifically considering the technical POV, but even more
importantly the user facing API. And this just adds to the surface of the
API a lot of things that are pushing only a 1-5% edge case.

I have one question regarding the updated initialization sequence. The

RFC writes:

Properties that are declared on the real instance are uninitialized on

the proxy instance (including overlapping properties used with

ReflectionProperty::skipLazyInitialization() or

setRawValueWithoutLazyInitialization()) to synchronize the state shared
by

both instances.

I do not understand this. Specifically I do not understand the "to

synchronize the state" bit.

We reworded this sentence a bit. Clearer?

Yes, I think it is clearer. Let me try to rephrase this differently to
see if my understanding is correct:

For every property on that exists on the real instance, the property on
the proxy instance effectively [1] is replaced by a property hook like
the following:
public PropertyType $propertyName {
    get {
        return $this->realInstance->propertyName;
    }
    set(PropertyType $value) {
        $this->realInstance->propertyName = $value;
    }
}
And value that is stored in the property will be freed (including
calling the destructor if it was the last reference), as if unset()
was called on the property.

[1] No actual property hook will be created and the realInstance
property does not actually exist, but the semantics behave as if such a
hook would be applied.

My understanding is that the proxy will

always forward the property access, so there effectively is no state on

the proxy?!

It follows that more properties can exist on the proxy itself (declared by

child classes of the real object that the proxy implements).

Right, that's mentioned in (2), so all clear.

That is very true. I had a look at the userland implementation and

indeed,

we keep the wrapper while cloning the backing instance (it's not that we

have the choice, the engine doesn't give us any other options).

RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we

now

postpone calling $real->__clone to the moment where the proxy clone is

initialized.

Do I understand it correctly that the initializer of the cloned proxy is

effectively replaced by the following:
  function (object $clonedProxy) use ($originalProxy) {
      return clone $originalProxy->getRealObject();
  }
Nope, that's not what we describe in the RFC so I hope you can read it

again and get where you were confused and tell us if we're not clear enough

(to me we are :) )

The "cloning of the real instance" bit is what lead me to this
understanding.

The $originalProxy is not shared with $clonedProxy. Instead, it's

initializers that are shared between clones.

And then, when we call that shared initializer in the $clonedProxy, we

clone the returned instance, so that even if the initializer returns a

shared instance, we don't share anything with the $originalProxy.

Ah, so you mean if the initializer would look like this instead of
creating a fresh object within the initializer?
 $predefinedObject = new SomeObj();
 $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
     return $predefinedObject;
 });
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $r->initialize($clonedProxy);
It didn't even occur to me that one would be able to return a
pre-existing object: I assume that simply reusing the initializer would
create a separate object and that would be sufficient to ensure that the
cloned instance would be independent.

? Then I believe this is unsound. Consider the following:
  $myProxy = $r->newLazyProxy(...);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $myProxy->someProp++;
  var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the

value after modification!

Also: What happens if the cloned proxy is initialized before the

original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep

the same initializer. Then both proxies are actually fully independent

after cloning, as I would expect from the clone operation.

That's basically what we do and what we describe in the RFC, just with the

added lazy-clone operation on the instance returned by the initializer.

This means that if I would return a completely new object within the
initializer then for a cloned proxy the new object would immediately be
cloned and the original object be destructed, yes?

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:
 $predefinedObject = new SomeObj();
 $initializer = function () use ($predefinedObject) {
     return $predefinedObject;
 };
 $myProxy = $r->newLazyProxy($initializer);
 $otherProxy = $r->newLazyProxy($initializer);
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $r->initialize($otherProxy);
 $r->initialize($clonedProxy);
To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.

I would assume that cloning a proxy is something that rarely happens,
because my understanding is that proxies are most useful for service
objects, whereas ghost objects would be used for entities / value
objects, so this should not be too much of a problem.

Properties are not initialized to their default value yet (they are

initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is

the reason that you removed that? One possible reason that comes to my

mind is a default value that refers to a non-existing constant. It would

be observable because the initialization emits an error. Are there any

other reasons?

That's because this is observable using e.g. (array) or var_dump.

I see. Perhaps add a short sentence with the reasoning. Something like:

Properties are not initialized to their default value yet (they are
initialized before calling the initializer). As an example, this has an
impact on the behavior of an (array) cast on uninitialized objects and
also when the default value is based on a constant that is not yet
defined when creating the lazy object, but will be defined at the point
of initialization.
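
The observability point can be illustrated with an (array) cast. This is a minimal sketch, assuming PHP 8.4's released API (`newLazyGhost()`, `isUninitializedLazyObject()`, `initializeLazyObject()`) and assuming that an (array) cast does not trigger initialization; the `Settings` class is a hypothetical example.

```php
<?php
// Requires PHP 8.4+. Settings is a hypothetical class.
class Settings
{
    public int $retries = 3;
    public string $mode = 'fast';
}

$r = new ReflectionClass(Settings::class);

// The initializer does nothing, so only the declared defaults apply.
$ghost = $r->newLazyGhost(function (Settings $settings): void {
});

$before = (array) $ghost;                            // defaults are not applied yet
$stillLazy = $r->isUninitializedLazyObject($ghost);  // the cast did not initialize

$r->initializeLazyObject($ghost);
$after = (array) $ghost;                             // defaults are now visible
```

The difference between `$before` and `$after` is what makes the deferred default initialization observable from userland.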

Best regards
Tim Düsterhus

For what it’s worth, I see “resetAsLazy()” being most useful for unit
testing libraries that build proxies. While this feature will remove most
of the tricky nuances around proxies, it doesn’t make it any easier in
generating the code for them, so that has to be tested. Being able to write
a test like this (abbreviated):

$realObj = new $foo();
$proxy = clone $realObj;
makeTestProxy($proxy); // resets as lazy with initializer
assert($realObj == $proxy);

Is really simple. Without a reset method, this isn’t straightforward.

I don’t think this RFC can replace any logic from mock testing libraries
and doesn’t need the objects to be lazy. Maybe I am not seeing the use
case here though.

The code generation part of a mock library to add the assertion logic needs
to happen anyways and making them lazy to defer initialization does not
seem a useful thing for a test library to do from my POV.

You can already do with ReflectionClass::newInstanceWithoutConstructor
everything that is needed for building mocks.

The only thing a lazy proxy / ghost could reasonably do for mocking is to
allow saying what method was first called on the mock, but only when using
debug_backtrace in the factory method.

Maybe we could extend the proxy functionality in a follow-up RFC to allow
passing a $callInterceptor callback that gets invoked on every call to the
proxy. But this does not make reset* methods necessary.

— Rob

1 year ago by Rob Landers — view source

On 12.07.2024, 08:00:18, Rob Landers rob@bottled.codes wrote:
On 11.07.2024, 20:31:44, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head around why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:
https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

I agree with Tim here, the Doctrine ORM EntityRepository plus Symfony Service Entity Repository extension are not a necessary real world case that would require this RFC to include a way for classes to make themselves lazy.

I took the liberty of rewriting the code of DefaultRepositoryFactory (Doctrine code itself) and ContainerRepositoryFactory in a way to make the repositories lazy without needing resetAsLazy, just $reflector->createLazyProxy. In case of the second the LazyServiceEntityRepository class could be deleted.

https://gist.github.com/beberlei/80d7a3219b6a2a392956af18e613f86a

Please let me know if this is not how it works or can work or if my reasoning is flawed.

Unless you have no way of getting to the „new $object“ in the code, there is always a way to just use newLazy*. And when a library does not expose new $object to you to override, then that is an architectural choice (and maybe flaw that you have to accept).
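
The factory-side approach described above can be sketched as follows. This is a minimal sketch, assuming PHP 8.4's `ReflectionClass::newLazyProxy()` (the `createLazyProxy` name used above appears to be a draft-era or informal name for the same idea); `ReportRepository` and `lazyRepository()` are hypothetical and not Doctrine code.

```php
<?php
// Requires PHP 8.4+. ReportRepository and lazyRepository() are hypothetical.
class ReportRepository
{
    public static int $constructed = 0;

    public function __construct(public string $entityClass)
    {
        self::$constructed++;
    }

    public function describe(): string
    {
        return "repository for {$this->entityClass}";
    }
}

function lazyRepository(string $repositoryClass, string $entityClass): object
{
    $reflector = new ReflectionClass($repositoryClass);

    // The factory runs on first property access, not when the proxy is created.
    return $reflector->newLazyProxy(
        static fn (object $proxy): object => new $repositoryClass($entityClass)
    );
}

$repo = lazyRepository(ReportRepository::class, 'App\Entity\Report');
$constructedBeforeUse = ReportRepository::$constructed; // constructor has not run
$description = $repo->describe();                       // reads a property, so the factory runs
$constructedAfterUse = ReportRepository::$constructed;
```

The class itself stays unaware of laziness here; only the code that owns the `new` call decides to defer construction.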

I still think not having the reset* methods would greatly simplify this RFC and would allow enforcing more constraints, with fewer footguns.

For example we could simplify the API of newLazyProxy to not receive a $factory that can arbitrarily create and get objects from somewhere, but also an initializer, and to always force the lazy object to be an instance created by newInstanceWithoutConstructor.

You said in a previous mail about reset*()

From a technical pov, this is just a different flavor of the same code infrastructure, so this is pretty aligned with the rest of the proposed API.

We are not specifically considering the technical POV, but even more importantly the user facing API. And this just adds to the surface of the API a lot of things that are pushing only a 1-5% edge case.
I have one question regarding the updated initialization sequence. The
RFC writes:

Properties that are declared on the real instance are uninitialized on
the proxy instance (including overlapping properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization()) to synchronize the state shared by
both instances.

I do not understand this. Specifically I do not understand the "to
synchronize the state" bit.

We reworded this sentence a bit. Clearer?

Yes, I think it is clearer. Let me try to rephrase this differently to
see if my understanding is correct:

For every property that exists on the real instance, the property on
the proxy instance effectively [1] is replaced by a property hook like
the following:
public PropertyType $propertyName {
    get {
        return $this->realInstance->propertyName;
    }
    set(PropertyType $value) {
        $this->realInstance->propertyName = $value;
    }
}
Any value that is stored in the property will be freed (including
calling the destructor if it was the last reference), as if unset()
was called on the property.

[1] No actual property hook will be created and the realInstance
property does not actually exist, but the semantics behave as if such a
hook would be applied.
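
The forwarding semantics described by the hook analogy can be checked with a small script. This is a minimal sketch, assuming PHP 8.4's released `newLazyProxy()`; the `Counter` class and the by-reference capture of the real instance are illustrative choices, not part of the RFC text.

```php
<?php
// Requires PHP 8.4+. Counter is a hypothetical class.
class Counter
{
    public int $value = 0;
}

$r = new ReflectionClass(Counter::class);

$real = null;
$proxy = $r->newLazyProxy(function (Counter $proxy) use (&$real): Counter {
    // Keep a handle on the real instance so both sides can be compared.
    $real = new Counter();
    return $real;
});

$proxy->value = 5;              // triggers initialization, then the write is forwarded
$seenOnReal = $real->value;

$real->value = 7;               // a write on the real instance ...
$seenOnProxy = $proxy->value;   // ... is visible through the proxy
```

Both directions agree because the declared properties effectively live on the real instance only.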

My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?!

It follows that more properties can exist on the proxy itself (declared by
child classes of the real object that the proxy implements).

Right, that's mentioned in (2), so all clear.
That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:
  function (object $clonedProxy) use ($originalProxy) {
      return clone $originalProxy->getRealObject();
  }
Nope, that's not what we describe in the RFC so I hope you can read it
again and get where you were confused and tell us if we're not clear enough
(to me we are :) )
The "cloning of the real instance" bit is what led me to this
understanding.

The $originalProxy is not shared with $clonedProxy. Instead, it's
initializers that are shared between clones.
And then, when we call that shared initializer in the $clonedProxy, we
clone the returned instance, so that even if the initializer returns a
shared instance, we don't share anything with the $originalProxy.

Ah, so you mean if the initializer would look like this instead of
creating a fresh object within the initializer?
 $predefinedObject = new SomeObj();
 $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
     return $predefinedObject;
 });
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $r->initialize($clonedProxy);
It didn't even occur to me that one would be able to return a
pre-existing object: I assume that simply reusing the initializer would
create a separate object and that would be sufficient to ensure that the
cloned instance would be independent.
? Then I believe this is unsound. Consider the following:
  $myProxy = $r->newLazyProxy(...);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $myProxy->someProp++;
  var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.
That's basically what we do and what we describe in the RFC, just with the
added lazy-clone operation on the instance returned by the initializer.
This means that if I would return a completely new object within the
initializer then for a cloned proxy the new object would immediately be
cloned and the original object be destructed, yes?

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:
 $predefinedObject = new SomeObj();
 $initializer = function () use ($predefinedObject) {
     return $predefinedObject;
 };
 $myProxy = $r->newLazyProxy($initializer);
 $otherProxy = $r->newLazyProxy($initializer);
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $r->initialize($otherProxy);
 $r->initialize($clonedProxy);
To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.

I would assume that cloning a proxy is something that rarely happens,
because my understanding is that proxies are most useful for service
objects, whereas ghost objects would be used for entities / value
objects, so this should not be too much of a problem.

Properties are not initialized to their default value yet (they are
initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?

That's because this is observable using e.g. (array) or var_dump.

I see. Perhaps add a short sentence with the reasoning. Something like:

Properties are not initialized to their default value yet (they are
initialized before calling the initializer). As an example, this has an
impact on the behavior of an (array) cast on uninitialized objects and
also when the default value is based on a constant that is not yet
defined when creating the lazy object, but will be defined at the point
of initialization.

Best regards
Tim Düsterhus
For what it’s worth, I see “resetAsLazy()” being most useful for unit testing libraries that build proxies. While this feature will remove most of the tricky nuances around proxies, it doesn’t make it any easier in generating the code for them, so that has to be tested. Being able to write a test like this (abbreviated):

$realObj = new $foo();
$proxy = clone $realObj;
makeTestProxy($proxy); // resets as lazy with initializer
assert($realObj == $proxy);

Is really simple. Without a reset method, this isn’t straightforward.
I don’t think this RFC can replace any logic from mock testing libraries and doesn’t need the objects to be lazy. Maybe I am not seeing the use case here though.

I'm not talking about mocks, I'm talking about testing code generation of proxies. Those may or may not be used in mocks, DI containers, etc. Very often you have to have integration tests that make sure weird code is handled and generated properly and that those proxies resolve correctly. This is especially important when refactoring code to implement this feature to ensure everything still works as before and detect any regressions.

The code generation part of a mock library to add the assertion logic needs to happen anyways and making them lazy to defer initialization does not seem a useful thing for a test library to do from my POV.

You can already do with ReflectionClass::newInstanceWithoutConstructor everything that is needed for building mocks.

Can you proxy an abstract class? If so, newInstanceWithoutConstructor won't work.
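
The limitation with abstract classes is easy to reproduce on any recent PHP version. This is a minimal sketch; `AbstractService` is a hypothetical class, and the sketch says nothing about whether the lazy-object API itself accepts abstract classes.

```php
<?php
// AbstractService is a hypothetical class used only for illustration.
abstract class AbstractService
{
    abstract public function run(): string;
}

$reflector = new ReflectionClass(AbstractService::class);

try {
    // Abstract classes cannot be instantiated, even when the constructor is skipped.
    $reflector->newInstanceWithoutConstructor();
    $failed = false;
} catch (Throwable $e) {
    $failed = true;
    $message = $e->getMessage();
}
```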

— Rob

1 year ago by Nicolas Grekas — view source

On Fri, 12 Jul 2024 at 08:00, Rob Landers rob@bottled.codes wrote:

On 11.07.2024, 20:31:44, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head around why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:
https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

I agree with Tim here, the Doctrine ORM EntityRepository plus Symfony
Service Entity Repository extension are not a necessary real world case
that would require this RFC to include a way for classes to make
themselves lazy.

I took the liberty of rewriting the code of DefaultRepositoryFactory
(Doctrine code itself) and ContainerRepositoryFactory in a way to make the
repositories lazy without needing resetAsLazy, just
$reflector->createLazyProxy. In case of the second the
LazyServiceEntityRepository class could be deleted.

https://gist.github.com/beberlei/80d7a3219b6a2a392956af18e613f86a

Please let me know if this is not how it works or can work or if my
reasoning is flawed.

Unless you have no way of getting to the „new $object“ in the code, there
is always a way to just use newLazy*. And when a library does not expose
new $object to you to override, then that is an architectural choice (and
maybe flaw that you have to accept).

I still think not having the reset* methods would greatly simplify this
RFC and would allow enforcing more constraints, with fewer footguns.

For example we could simplify the API of newLazyProxy to not receive a
$factory that can arbitrarily create and get objects from somewhere, but
also an initializer, and to always force the lazy object to be an instance
created by newInstanceWithoutConstructor.

You said in a previous mail about reset*()

From a technical pov, this is just a different flavor of the same code
infrastructure, so this is pretty aligned with the rest of the proposed
API.

We are not specifically considering the technical POV, but even more
importantly the user facing API. And this just adds to the surface of the
API a lot of things that are pushing only a 1-5% edge case.

I have one question regarding the updated initialization sequence. The
RFC writes:

Properties that are declared on the real instance are uninitialized on
the proxy instance (including overlapping properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization()) to synchronize the state shared by
both instances.

I do not understand this. Specifically I do not understand the "to
synchronize the state" bit.

We reworded this sentence a bit. Clearer?

Yes, I think it is clearer. Let me try to rephrase this differently to
see if my understanding is correct:

For every property that exists on the real instance, the property on
the proxy instance effectively [1] is replaced by a property hook like
the following:
public PropertyType $propertyName {
    get {
        return $this->realInstance->propertyName;
    }
    set(PropertyType $value) {
        $this->realInstance->propertyName = $value;
    }
}
Any value that is stored in the property will be freed (including
calling the destructor if it was the last reference), as if unset()
was called on the property.

[1] No actual property hook will be created and the realInstance
property does not actually exist, but the semantics behave as if such a
hook would be applied.

My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?!

It follows that more properties can exist on the proxy itself (declared by
child classes of the real object that the proxy implements).

Right, that's mentioned in (2), so all clear.

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:
  function (object $clonedProxy) use ($originalProxy) {
      return clone $originalProxy->getRealObject();
  }
Nope, that's not what we describe in the RFC so I hope you can read it
again and get where you were confused and tell us if we're not clear enough
(to me we are :) )

The "cloning of the real instance" bit is what led me to this
understanding.

The $originalProxy is not shared with $clonedProxy. Instead, it's
initializers that are shared between clones.
And then, when we call that shared initializer in the $clonedProxy, we
clone the returned instance, so that even if the initializer returns a
shared instance, we don't share anything with the $originalProxy.

Ah, so you mean if the initializer would look like this instead of
creating a fresh object within the initializer?
 $predefinedObject = new SomeObj();
 $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
     return $predefinedObject;
 });
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $r->initialize($clonedProxy);
It didn't even occur to me that one would be able to return a
pre-existing object: I assume that simply reusing the initializer would
create a separate object and that would be sufficient to ensure that the
cloned instance would be independent.

? Then I believe this is unsound. Consider the following:
  $myProxy = $r->newLazyProxy(...);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $myProxy->someProp++;
  var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.

That's basically what we do and what we describe in the RFC, just with the
added lazy-clone operation on the instance returned by the initializer.

This means that if I would return a completely new object within the
initializer then for a cloned proxy the new object would immediately be
cloned and the original object be destructed, yes?

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:
 $predefinedObject = new SomeObj();
 $initializer = function () use ($predefinedObject) {
     return $predefinedObject;
 };
 $myProxy = $r->newLazyProxy($initializer);
 $otherProxy = $r->newLazyProxy($initializer);
 $clonedProxy = clone $myProxy;
 $r->initialize($myProxy);
 $r->initialize($otherProxy);
 $r->initialize($clonedProxy);
To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.

I would assume that cloning a proxy is something that rarely happens,
because my understanding is that proxies are most useful for service
objects, whereas ghost objects would be used for entities / value
objects, so this should not be too much of a problem.

Properties are not initialized to their default value yet (they are
initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?

That's because this is observable using e.g. (array) or var_dump.

I see. Perhaps add a short sentence with the reasoning. Something like:

Properties are not initialized to their default value yet (they are
initialized before calling the initializer). As an example, this has an
impact on the behavior of an (array) cast on uninitialized objects and
also when the default value is based on a constant that is not yet
defined when creating the lazy object, but will be defined at the point
of initialization.

Best regards
Tim Düsterhus

For what it’s worth, I see “resetAsLazy()” being most useful for unit
testing libraries that build proxies. While this feature will remove most
of the tricky nuances around proxies, it doesn’t make it any easier in
generating the code for them, so that has to be tested. Being able to write
a test like this (abbreviated):

$realObj = new $foo();
$proxy = clone $realObj;
makeTestProxy($proxy); // resets as lazy with initializer
assert($realObj == $proxy);

Is really simple. Without a reset method, this isn’t straightforward.

Testing is actually a good domain where resetting lazy objects might open
interesting use cases.
This reminded me about zenstruck/foundry, which leverages the
LazyProxyTrait to provide refreshable fixture objects
https://symfony.com/bundles/ZenstruckFoundryBundle/current/index.html#auto-refresh
and provides nice DX thanks to this capability.

1 year ago by tim@bastelstu.be — view source

Hi

Testing is actually a good domain where resetting lazy objects might open
interesting use cases.
This reminded me about zenstruck/foundry, which leverages the
LazyProxyTrait to provide refreshable fixture objects
https://symfony.com/bundles/ZenstruckFoundryBundle/current/index.html#auto-refresh
and provides nice DX thanks to this capability.

I have not used this library before, but I have taken a (very) brief
look at the code and documentation.

My understanding is that all the fixture objects are generated by a
corresponding Factory class. This factory clearly has the capability of
constructing objects by itself, so it could just create a lazy proxy
instead?

I'm seeing the instantiateWith() example in the documentation where
the user can return a constructed object themselves, but I'm not seeing
how that can safely be combined with the reset*() methods: Anything
special the user did to construct the object would be reverted, so the
user might as well rely on the default construction logic of the factory
then.

What am I missing?

Best regards
Tim Düsterhus

1 year ago by Nicolas Grekas — view source

Finding the spot where the reset method would be useful is not easy. Here
it is:
https://github.com/zenstruck/foundry/blob/v2.0.7/src/Persistence/IsProxy.php#L66-L76

Basically, the reset method is not needed when creating the lazy proxy. But
it's needed to refresh it when calling $object->_refresh(). The
implementation I just linked swaps the real object bound to the proxy for
another one (the line
"Configuration::instance()->persistence()->refresh($object);" swaps by
reference).

1 year ago by Nicolas Grekas — view source

Hi there,

After chatting a bit with Benjamin on Slack, I realized that the sentence
"The intended use-case is for an object to manage its own laziness by
calling the method in its constructor" was a bit restrictive and that there
are more use cases for reset methods.

Here is the revised part about resetAsLazyGhost in the RFC:

This method allows an object to manage its own laziness by calling the
method in its constructor, as demonstrated here
https://gist.github.com/arnaud-lb/9d52e2ba4e278411bff3addf75ce56be. In
such cases, the proposed lazy-object API can be used to achieve lazy
initialization at the implementation detail level.

Another use case for this method is to achieve resettable services. In
these scenarios, a service object already inserted into a complex
dependency graph can be reset to its initial state using the lazy object
infrastructure, without its implementation being aware of this concern. A
concrete example of this use case is the Doctrine EntityManager, which can
end up in a hard to recover https://github.com/doctrine/orm/issues/5933
"closed" state, preventing its use in long-running processes. However, thanks
to the lazy-loading code infrastructure
https://github.com/symfony/symfony/blob/1a16ebc32598faada074e0af12a6a698d2964a5e/src/Symfony/Bridge/Doctrine/ManagerRegistry.php#L42,
recovering from such a state is possible. This method would be instrumental
in achieving this capability without resorting to the current complex code
used in userland.
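
The resettable-service use case can be sketched in a few lines. This is a minimal sketch, assuming PHP 8.4's released `ReflectionClass::resetAsLazyGhost()` and `isUninitializedLazyObject()`; the `Connection` class and the `stdClass` consumer are hypothetical stand-ins and have nothing to do with the actual Doctrine or Symfony code.

```php
<?php
// Requires PHP 8.4+. Connection is a hypothetical service class.
class Connection
{
    public bool $closed = false;
    public int $generation = 0;

    public function __construct(int $generation)
    {
        $this->generation = $generation;
    }

    public function close(): void
    {
        $this->closed = true;
    }
}

$connection = new Connection(1);
$consumer = new stdClass();
$consumer->connection = $connection; // stands in for a larger dependency graph

$connection->close(); // the service is now in its unusable state

$reflector = new ReflectionClass(Connection::class);
$reflector->resetAsLazyGhost($connection, function (Connection $c): void {
    $c->__construct(2); // rebuild the state in place on next use
});

$isLazyAgain = $reflector->isUninitializedLazyObject($connection);
$closedAfterReset = $consumer->connection->closed;         // triggers the initializer
$generationAfterReset = $consumer->connection->generation;
```

Every holder of the reference sees the refreshed state, because the object identity never changes.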

I hope this helps.

Cheers,
Nicolas

1 year ago by Nicolas Grekas — view source

Dear all,

On Tue, 16 Jul 2024 at 17:51, Nicolas Grekas nicolas.grekas+php@gmail.com wrote:

Hi there,

On Tue, 16 Jul 2024 at 10:13, Nicolas Grekas nicolas.grekas+php@gmail.com wrote:

On Mon, 15 Jul 2024 at 21:42, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Testing is actually a good domain where resetting lazy objects might
open
interesting use cases.
This reminded me about zenstruck/foundry, which leverages the
LazyProxyTrait to provide refreshable fixture objects
https://symfony.com/bundles/ZenstruckFoundryBundle/current/index.html#auto-refresh
and provides nice DX thanks to this capability.

I have not used this library before, but I have taken a (very) brief
look at the code and documentation.

My understanding is that all the fixture objects are generated by a
corresponding Factory class. This factory clearly has the capability of
constructing objects by itself, so it could just create a lazy proxy
instead?

I'm seeing the instantiateWith() example in the documentation where
the user can return a constructed object themselves, but I'm not seeing
how that can safely be combined with the reset*() methods: Anything
special the user did to construct the object would be reverted, so the
user might as well rely on the default construction logic of the factory
then.

What am I missing?

Finding the spot where the reset method would be useful is not easy. Here
it is:

https://github.com/zenstruck/foundry/blob/v2.0.7/src/Persistence/IsProxy.php#L66-L76

Basically, the reset method is not needed when creating the lazy proxy.
But it's needed to refresh it when calling $object->_refresh(). The
implementation I just linked swaps the real object bound to the proxy for
another one (the line
"Configuration::instance()->persistence()->refresh($object);" swaps by
reference).

After chatting a bit with Benjamin on Slack, I realized that the sentence
"The indented use-case is for an object to manage its own laziness by
calling the method in its constructor" was a bit restrictive and that there
are more use cases for reset methods.

Here is the revised part about resetAsLazyGhost in the RFC:

This method allows an object to manage its own laziness by calling the
method in its constructor, as demonstrated here
https://gist.github.com/arnaud-lb/9d52e2ba4e278411bff3addf75ce56be. In
such cases, the proposed lazy-object API can be used to achieve lazy
initialization at the implementation detail level.

Another use case for this method is to achieve resettable services. In
these scenarios, a service object already inserted into a complex
dependency graph can be reset to its initial state using the lazy object
infrastructure, without its implementation being aware of this concern. A
concrete example of this use case is the Doctrine EntityManager, which can
end up in a hard to recover https://github.com/doctrine/orm/issues/5933
"closed" state, preventing its use in long-running processes. However, thanks
to the lazy-loading code infrastructure
https://github.com/symfony/symfony/blob/1a16ebc32598faada074e0af12a6a698d2964a5e/src/Symfony/Bridge/Doctrine/ManagerRegistry.php#L42,
recovering from such a state is possible. This method would be instrumental
in achieving this capability without resorting to the current complex code
used in userland.

I hope this helps.

A bit unrelated to the above topic: we've further clarified the RFC by
adding restrictions on what can be done with lazy proxies. Namely, when
the factory returns an object from a parent class, we describe that adding
more properties on the proxy class would throw, and we also explain why. We
also added a restriction to prevent a proxy from having an overridden
__clone or __destruct when the factory returns a parent, and again
explained why.

This should simplify the overall behavior by preventing edge cases that
wouldn't have easy answers. If those limitations prove too restrictive in
practice (my experience tells me they should be fine), they could be
lifted in the future.

On our side, this should close the last topics we wanted to address before
opening the vote.

Please let us know if anyone has other concerns.

Cheers,
Nicolas

1 year ago by Larry Garfield — view source

Minor point: Why is the $initializer return type null, instead of void? I don't see a purpose to allowing an explicit null return and nothing else.

Otherwise, I'm quite looking forward to this.

--Larry Garfield

1 year ago by Nicolas Grekas — view source

On Thu, Jul 18, 2024 at 00:13, Larry Garfield larry@garfieldtech.com wrote:

Minor point: Why is the $initializer return type null, instead of void? I
don't see a purpose to allowing an explicit null return and nothing else.

Updated to use "void". Both would work :)

Otherwise, I'm quite looking forward to this.

🤞

1 year ago by Philip Hofstetter — view source

Hi,

Minor point: Why is the $initializer return type null, instead of void? I
don't see a purpose to allowing an explicit null return and nothing else.

Updated to use "void". Both would work :)

Super minor nitpick: You have updated the prototype, but not the
explanation text which still says:

When initialization is required, the $initializer is called with the object
as first parameter. The initializer should initialize the object, and must
return null (or void). See the “Initialization Sequence” section.

However, given the :void return type, you can’t return null - that
would be a fatal error.

The phrase should probably be

When initialization is required, the $initializer is called with the object
as first parameter. The initializer should initialize the object. See the
“Initialization Sequence” section.

Philip

1 year ago by Arnaud Le Blanc — view source

Hi Philip,

On Thu, Jul 18, 2024 at 12:19 PM Philip Hofstetter
phofstetter@sensational.ch wrote:

Super minor nitpick: You have updated the prototype, but not the explanation text which still says:

When initialization is required, the $initializer is called with the object as first parameter. The initializer should initialize the object, and must return null (or void). See the “Initialization Sequence” section.

I've updated the text as suggested, as the Initialization Sequence
section also specifies how the function should return. The
Initialization Sequence section says approximately the same thing,
however: "The initializer must return null or no value. If the
initializer returns something other than null, a TypeError is thrown".
The reason is that all function calls evaluate to a value, including
when calling a void function, so void functions actually return null.
Unfortunately we have no good way to enforce a void return, unless we
check the declared return type of the function, but it would not work
for all kinds of callables.

Documenting the signature as returning void is fine, though.

Note that the return value is not important currently: We just check
that it's null so that we reserve the possibility to give the return
value a meaning in the future.
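
As a small illustration of why a void return cannot be told apart from a null return at the call site, consider the sketch below. The callInitializer() helper is hypothetical: it is a userland imitation of the check described above, not the engine's code.

```php
<?php
// A function declared void: it has no return value in its signature...
function initialize(object $object): void
{
    // no return statement
}

// ...but the call is still an expression, and it evaluates to null.
$result = initialize(new stdClass());
var_dump($result === null);

// A closure with no declared return type looks exactly the same to the caller.
$untyped = function (object $object) {};
var_dump($untyped(new stdClass()) === null);

// Hypothetical userland imitation of the check: only the returned value can
// be inspected reliably, whatever kind of callable was passed in.
function callInitializer(callable $initializer, object $object): void
{
    if ($initializer($object) !== null) {
        throw new TypeError('The initializer must return null or no value');
    }
}

callInitializer('initialize', new stdClass()); // accepted
callInitializer($untyped, new stdClass());     // accepted

try {
    callInitializer(fn (object $object) => $object, new stdClass());
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true; // a non-null return value is refused
}
var_dump($rejected);
```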

Best Regards,
Arnaud

1 year ago by Larry Garfield — view source

Otherwise, I'm quite looking forward to this.

🤞

Another thought that occurred to me. Given how lightweight it looks to be (may not actually be, but looks it), how much overhead would there be to having a compiled DI container that is lazy by default? Just make everything lazy with a fairly standard initializer or factory, unless a specific case says you shouldn't. That way you can use optional dependencies in a constructor pretty much at will with no overhead of needing to create a chain of dependencies as a result.

Would that be a bad idea for some reason, or would it actually work?

(This doesn't really affect my vote, more just a thought that came up.)

--Larry Garfield

1 year ago by Rob Landers — view source

Otherwise, I'm quite looking forward to this.

🤞

Another thought that occurred to me. Given how lightweight it looks to be (may not actually be, but looks it), how much overhead would there be to having a compiled DI container that is lazy by default? Just make everything lazy with a fairly standard initializer or factory, unless a specific case says you shouldn't. That way you can use optional dependencies in a constructor pretty much at will with no overhead of needing to create a chain of dependencies as a result.

Would that be a bad idea for some reason, or would it actually work?

(This doesn't really affect my vote, more just a thought that came up.)

--Larry Garfield

I'm not convinced a DI container is possible at all with this implementation, see "surprising behaviors."

When you are building a DI container, you usually have absolutely no idea what a user will return from a proxy. At most, you might know they are returning SomeInterface or SomeBaseClass, but the user might return MyFancyClass that implements the interface or base class, which isn't allowed via this implementation. The type must be exactly the same or a parent class:

the factory of a lazy proxy is allowed to return an instance of the same class as the proxy, or of a parent class.

Returning an instance of a sub-class is not allowed

— Rob

1 year ago by Rob Landers — view source

Sorry, misspoke:

what a user will return from a proxy.

should be "what a user will return from a factory."

— Rob

1 year ago by Nicolas Grekas — view source

On Wed, Jul 24, 2024 at 16:05, Larry Garfield larry@garfieldtech.com wrote:

Otherwise, I'm quite looking forward to this.

🤞

Another thought that occurred to me. Given how lightweight it looks to
be (may not actually be, but looks it), how much overhead would there be to
having a compiled DI container that is lazy by default? Just make
everything lazy with a fairly standard initializer or factory, unless a
specific case says you shouldn't. That way you can use optional
dependencies in a constructor pretty much at will with no overhead of
needing to create a chain of dependencies as a result.

Would that be a bad idea for some reason, or would it actually work?

(This doesn't really affect my vote, more just a thought that came up.)

I see no blockers.

I don't have numbers, nor do I plan to play with this idea in the short
term, but that could be fun to try :)
E.g. the Symfony DI component could be updated to leverage the new
mechanism (I did it already with a previous version of the RFC) and we'd
try running an app with all services defined as lazy by default.
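
For readers who want to picture the "lazy by default" idea, here is a rough sketch. The LazyContainer class is hypothetical and has nothing to do with Symfony's actual container; it assumes the ReflectionClass::newLazyProxy() API as it shipped in PHP 8.4, and it only handles services registered under a concrete class name.

```php
<?php
// Hypothetical container, not Symfony's: every service is a lazy proxy by default.
final class LazyContainer
{
    /** @var array<class-string, Closure> */
    private array $factories = [];
    /** @var array<class-string, object> */
    private array $services = [];

    public function register(string $class, Closure $factory): void
    {
        $this->factories[$class] = $factory;
    }

    public function get(string $class): object
    {
        // The real instance is only built when the proxy's state is first accessed.
        return $this->services[$class] ??= (new ReflectionClass($class))
            ->newLazyProxy(fn (object $proxy): object => ($this->factories[$class])($this));
    }
}

final class Mailer
{
    public static int $built = 0; // counts real constructions

    public function __construct(public string $dsn)
    {
        self::$built++;
    }
}

final class Notifier
{
    public function __construct(public Mailer $mailer) {}
}

$container = new LazyContainer();
$container->register(Mailer::class, fn () => new Mailer('smtp://localhost'));
$container->register(Notifier::class, fn (LazyContainer $c) => new Notifier($c->get(Mailer::class)));

$notifier = $container->get(Notifier::class);
$builtBefore = Mailer::$built;     // nothing has been constructed yet
echo $notifier->mailer->dsn, "\n"; // initializes Notifier, then Mailer
$builtAfter = Mailer::$built;
var_dump($builtBefore, $builtAfter);
```

Passing the Mailer proxy to Notifier's typed constructor does not initialize it; only reading $dsn does, which is what would make optional dependencies cheap in such a design.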

To Rob: proxying by interface can be implemented using regular code
generation so it's not a blocker. Symfony does it already, and will
continue to do it to cover the use case.

Nicolas

1 year ago by Rob Landers — view source

To Rob: proxying by interface can be implemented using regular code generation so it's not a blocker. Symfony does it already, and will continue to do it to cover the use case.

I'm not sure what you mean, as the RFC makes it look like this should be completely legal, until you get to the end:

interface Fancy {
    function sayHi(): void;
}
class A implements Fancy {
    function sayHi(): void { echo "hi!"; }
}

$container->register(proxyType: Fancy::class, factory: fn() => new A());

// Now the container only knows that it needs to return a Fancy, and this is "legal looking":

$lazy = new ReflectionClass($proxyType)->newLazyProxy($factory);
// same as
$lazy = new ReflectionClass(Fancy::class)->newLazyProxy(fn() => new A());

This ^ compiles, it's perfectly legal and makes 100% sense that I should be able to call $lazy->sayHi() and it calls A::sayHi(), which is what you would expect. Instead, this is a runtime error.

— Rob

1 year ago by Nicolas Grekas — view source

On Wed, Jul 24, 2024 at 21:44, Rob Landers rob@bottled.codes wrote:

To Rob: proxying by interface can be implemented using regular code
generation so it's not a blocker. Symfony does it already, and will
continue to do it to cover the use case.

I'm not sure what you mean, as the RFC makes it look like this should be
completely legal, until you get to the end:

interface Fancy {
    function sayHi(): void;
}
class A implements Fancy {
    function sayHi(): void { echo "hi!"; }
}

$container->register(proxyType: Fancy::class, factory: fn() => new A());

// Now the container only knows that it needs to return a Fancy, and this is "legal looking":

$lazy = new ReflectionClass($proxyType)->newLazyProxy($factory);
// same as
$lazy = new ReflectionClass(Fancy::class)->newLazyProxy(fn() => new A());

This ^ compiles, it's perfectly legal and makes 100% sense that I should
be able to call $lazy->sayHi() and it calls A::sayHi(), which is what you
would expect. Instead, this is a runtime error.

The reason why this doesn't work is that the RFC is about state proxies,
which means properties are required to hook the lazy behavior.
The RFC forbids your snippet because the factory must return an instance
of Fancy::class or of one of its parents, which A::class isn't.

Please have a closer read of the RFC; we tried to describe the reasoning as
much as possible, and this is covered.
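
The difference between the two cases can be sketched as follows, assuming the API as it shipped in PHP 8.4. The snippet adapts the Fancy/A example from the thread: sayHi() returns a string instead of echoing, and A gets a (hypothetical) $greeting property so that there is some state for the proxy to hook into.

```php
<?php
interface Fancy
{
    public function sayHi(): string;
}

class A implements Fancy
{
    public function __construct(private string $greeting = 'hi!') {}

    public function sayHi(): string
    {
        return $this->greeting; // property access is what triggers initialization
    }
}

// An interface has no state and cannot be instantiated, so a state proxy
// cannot be created for it: the call below fails at runtime.
try {
    (new ReflectionClass(Fancy::class))->newLazyProxy(fn () => new A());
    $interfaceRejected = false;
} catch (Throwable $e) {
    $interfaceRejected = true;
}
var_dump($interfaceRejected);

// Proxying the concrete class is fine: the factory returns the same class.
$reflector = new ReflectionClass(A::class);
$lazy = $reflector->newLazyProxy(fn () => new A());

$lazyBefore = $reflector->isUninitializedLazyObject($lazy); // not built yet
$greeting = $lazy->sayHi();                                 // builds the real A
$lazyAfter = $reflector->isUninitializedLazyObject($lazy);

var_dump($lazy instanceof Fancy, $lazyBefore, $lazyAfter);
echo $greeting, "\n";
```

A container that only knows the interface would therefore need either the concrete class name up front or a generated proxy class, which is the code-generation route mentioned in this thread.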

Nicolas

1 year ago by Rob Landers — view source

My point was that this can’t be used for any DI container on the planet (unless you parse the code and statically figure out what concrete type is returned — hope it is only one type — and will only work for the simplest of cases). There is nothing here that can be used for DI unless people are injecting concrete types and not using inheritance, which goes against the “code to the interface” rule we’ve been using for the last 20-40 years…

— Rob

1 year ago by Rob Landers — view source

If you cannot use an instance of a subclass as the actual object, then the methods probably shouldn't exist on ReflectionClass, since you use that to reflect on interfaces and abstract classes. This also limits the usability: for example, in a container, you might know that the type you need to return is MyInterface, but not know what type the actual factory will return.

— Rob

1 year ago by Nicolas Grekas — view source

On Thu, Jul 18, 2024 at 09:08, Rob Landers rob@bottled.codes wrote:

If you cannot use an instance of a subclass as the actual object, then
the methods probably shouldn't exist on ReflectionClass, since you use that
to reflect on interfaces and abstract classes. This also limits the
usability: for example, in a container, you might know that the type you
need to return is MyInterface, but not know what type the actual factory
will return.

Proxying by interface is not in the scope of this RFC. For that, one can
use code generation, and libraries like symfony/var-exporter provide all
the tools to make it easy to do.

Nicolas

1 year ago by tim@bastelstu.be — view source

Hi

A bit unrelated to the above topic: we've further clarified the RFC by
adding restrictions on what can be done with lazy proxies. Namely, when
the factory returns an object from a parent class, we describe that adding
more properties on the proxy class would throw, and we also explain why. We
also added a restriction to prevent a proxy from having an overridden
__clone or __destruct when the factory returns a parent, and again
explained why.

Note that in the RFC you typoed it as '__destructor' ('or' suffix).

Please let us know if anyone has other concerns.

I've replied regarding the cloning semantics in an earlier email.

Regarding the reset*() methods: even with the additional examples, I
remain unconvinced that they are necessary for anything other than working
around existing design issues in userland libraries. However, I guess that
we will not reach an agreement here, and I also do not consider myself the
target audience of this RFC. I'm just here to find edge cases :-)

Except for the cloning semantics I cannot find any obvious problems with
the described semantics.

Best regards
Tim Düsterhus

1 year ago by Nicolas Grekas — view source

Hi,

On Thu, Jul 18, 2024 at 21:58, Tim Düsterhus tim@bastelstu.be wrote:

Hi

A bit unrelated to the above topic: we've further clarified the RFC by
adding restrictions on what can be done with lazy proxies. Namely, when
the factory returns an object from a parent class, we describe that adding
more properties on the proxy class would throw, and we also explain why. We
also added a restriction to prevent a proxy from having an overridden
__clone or __destruct when the factory returns a parent, and again
explained why.

Note that in the RFC you typoed it as '__destructor' ('or' suffix).

Please let us know if anyone has other concerns.

I've replied regarding the cloning semantics in an earlier email.

Regarding the reset*() methods: even with the additional examples, I
remain unconvinced that they are necessary for anything other than working
around existing design issues in userland libraries. However, I guess that
we will not reach an agreement here, and I also do not consider myself the
target audience of this RFC. I'm just here to find edge cases :-)

Except for the cloning semantics I cannot find any obvious problems with
the described semantics.

Cloning has kept us busy over the last few days, and after many
brainstorming sessions we've decided to follow your initial proposal: make
the clone operator trigger the initialization of the original object before
cloning it.
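
In code, the cloning behavior described here looks roughly like the sketch below. It assumes the API as it shipped in PHP 8.4; the Document class is hypothetical.

```php
<?php
class Document
{
    public string $title;
}

$reflector = new ReflectionClass(Document::class);
$calls = 0;

$ghost = $reflector->newLazyGhost(function (Document $doc) use (&$calls): void {
    $calls++; // counts how often the initializer runs
    $doc->title = 'Lazy Objects';
});

$lazyBeforeClone = $reflector->isUninitializedLazyObject($ghost);
$copy = clone $ghost; // initializes $ghost first, then clones it
$lazyAfterClone = $reflector->isUninitializedLazyObject($ghost);

var_dump($lazyBeforeClone, $lazyAfterClone, $calls);
echo $copy->title, "\n"; // the clone is a regular, already initialized object
```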

The RFC is updated, with a note about "Lazy cloning" in the "Future scope"
section.

Since this should be the last topic in this thread, we plan to open the
vote on Friday.

I invite everybody to give the RFC a new read.

Thanks,
Nicolas

1 year ago by tim@bastelstu.be — view source

Hi

Cloning has kept us busy over the last few days, and after many
brainstorming sessions we've decided to follow your initial proposal: make
the clone operator trigger the initialization of the original object before
cloning it.

Thank you. That cloning behavior certainly is much easier to reason about.

I invite everybody to give the RFC a new read.

I'm seeing there are some more changes, and not just to the cloning
section. I've gone through the entire RFC once again and here are my
(hopefully final) five remarks. They are only about textual
clarification of some behaviors; I don't have any further semantic concerns.

1. __destruct() is still called __destructor() in some places.

2. In the "Initialization Triggers" section, it says:

"The following special cases do not trigger initialization of a lazy
object: Cloning, unless __clone() is implemented and accesses a property."

This is no longer true with the latest changes.

3. In the "Proxy objects" initialization section, it says:

"The value of properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization() is discarded."

I assume that the destructors of the values will be called if the
reference count drops to zero? Then perhaps add "as if unset() was
called" or "as if null was assigned to them" to make it clear that this
is a regular reassignment and not some lazy object speciality.

4. In the "Proxy objects" initialization section, it says:

"get_class() or the ::class constant evaluate to the class name of
the proxy, regardless of the actual instance."

This feels misplaced, because my understanding is that it is not
something about the initialization, but rather about proxy objects in
general. Perhaps this is best moved to the "Real instance implementation"
section in the Notes. It should also mention the behavior of the instanceof
operator and of type declarations.

Perhaps some phrasing like the following is best: "The externally
visible type of a lazy proxy is the type of the proxy object, even if
the real object is of a parent type. This includes the get_class()
function, the ::class constant, the instanceof operator and type
checking in parameter, return and property types."

5. In the explanation of
"ReflectionClass::markLazyObjectAsInitialized()", it says:

"Its behavior is the same as described for Ghost Objects in the
Initialization Sequence section, except that the initializer is not
called."

This means that calling markLazyObjectAsInitialized() on a lazy proxy
turns it into a regular object of the proxy class, as if
newInstanceWithoutConstructor() was used, right?
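
The phrasing suggested above for the externally visible type of a lazy proxy can be illustrated with a short sketch. It assumes the API as it shipped in PHP 8.4; Base, Derived and needsDerived() are hypothetical names.

```php
<?php
class Base
{
    public string $label = 'base';
}

// Declares no additional properties and no __clone/__destruct override,
// so a Base instance is an acceptable real instance for a Derived proxy.
class Derived extends Base {}

function needsDerived(Derived $object): string
{
    return $object::class;
}

$reflector = new ReflectionClass(Derived::class);
$proxy = $reflector->newLazyProxy(fn () => new Base()); // real instance is of the parent class

$label = $proxy->label; // triggers initialization and reads from the Base instance

var_dump($label);
var_dump(get_class($proxy));         // the proxy's class, not Base
var_dump($proxy instanceof Derived); // instanceof sees the proxy's class
var_dump(needsDerived($proxy));      // so do parameter type checks
```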

Best regards
Tim Düsterhus

1 year ago by Arnaud Le Blanc — view source

Hi Tim,

I'm seeing there are some more changes, and not just to the cloning
section. I've gone through the entire RFC once again and here are my
(hopefully final) five remarks. They are only about textual
clarification of some behaviors; I don't have any further semantic concerns.

Thank you! We have updated the RFC accordingly.

In the explanation of
"ReflectionClass::markLazyObjectAsInitialized()", it says:

Its behavior is the same as described for Ghost Objects in the
Initialization Sequence section, except that the initializer is not
called.

This means that calling markLazyObjectAsInitialized() on a lazy proxy
turns it into a regular object of the proxy class, as if
newInstanceWithoutConstructor() was used, right?

Yes, except for the value of properties that were already initialized.
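
A minimal sketch of that answer, assuming the PHP 8.4 method names (newLazyProxy(), setRawValueWithoutLazyInitialization(), markLazyObjectAsInitialized(), isUninitializedLazyObject()). The Job class is hypothetical:

```php
<?php

// Hypothetical class for illustration.
class Job
{
    public int $id;
    public string $status = 'pending';
}

$reflector = new ReflectionClass(Job::class);
$factoryCalls = 0;

$job = $reflector->newLazyProxy(function () use (&$factoryCalls): Job {
    $factoryCalls++;
    return new Job();
});

// Initialize one property up front without triggering the factory.
$reflector->getProperty('id')->setRawValueWithoutLazyInitialization($job, 42);

// Turn the proxy into a regular Job; the factory is never called.
$reflector->markLazyObjectAsInitialized($job);

$stillLazy = $reflector->isUninitializedLazyObject($job);
$id = $job->id;         // value set before marking is kept
$status = $job->status; // remaining property falls back to its default
```

The already-initialized property keeps its value, and the others take their declared defaults, much like an object created with newInstanceWithoutConstructor().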

Best Regards,
Arnaud

1 year ago by tim@bastelstu.be

Hi

Thank you! We have updated the RFC accordingly.

LGTM :-)

In the explanation of
"ReflectionClass::markLazyObjectAsInitialized()", it says:

Its behavior is the same as described for Ghost Objects in the
Initialization Sequence section, except that the initializer is not
called.

This means that calling markLazyObjectAsInitialized() on a lazy proxy
turns it into a regular object of the proxy class, as if
newInstanceWithoutConstructor() was used, right?

Yes, except for the value of properties that were already initialized.

Right, thank you for the clarification.

Best regards
Tim Düsterhus

1 year ago by kontakt@beberlei.de

On 17.07.2024, 20:31:02, Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:

Dear all,

On Tue, 16 Jul 2024 at 17:51, Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:

Hi there,

On Tue, 16 Jul 2024 at 10:13, Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:

On Mon, 15 Jul 2024 at 21:42, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Testing is actually a good domain where resetting lazy objects might open
interesting use cases.
This reminded me about zenstruck/foundry, which leverages the
LazyProxyTrait to provide refreshable fixture objects
<https://symfony.com/bundles/ZenstruckFoundryBundle/current/index.html#auto-refresh>
and provides nice DX thanks to this capability.

I have not used this library before, but I have taken a (very) brief
look at the code and documentation.

My understanding is that all the fixture objects are generated by a
corresponding Factory class. This factory clearly has the capability of
constructing objects by itself, so it could just create a lazy proxy
instead?

I'm seeing the instantiateWith() example in the documentation where
the user can return a constructed object themselves, but I'm not seeing
how that can safely be combined with the reset*() methods: Anything
special the user did to construct the object would be reverted, so the
user might as well rely on the default construction logic of the factory
then.

What am I missing?

Finding the spot where the reset method would be useful is not easy.
Here it is:

https://github.com/zenstruck/foundry/blob/v2.0.7/src/Persistence/IsProxy.php#L66-L76

Basically, the reset method is not needed when creating the lazy proxy.
But it's needed to refresh it when calling $object->_refresh(). The
implementation I just linked swaps the real object bound to the proxy for
another one (the line
"Configuration::instance()->persistence()->refresh($object);" swaps by
reference).

After chatting a bit with Benjamin on Slack, I realized that the sentence
"The intended use-case is for an object to manage its own laziness by
calling the method in its constructor" was a bit restrictive and that there
are more use cases for reset methods.

Here is the revised part about resetAsLazyGhost in the RFC:

This method allows an object to manage its own laziness by calling the
method in its constructor, as demonstrated here
https://gist.github.com/arnaud-lb/9d52e2ba4e278411bff3addf75ce56be. In
such cases, the proposed lazy-object API can be used to achieve lazy
initialization at the implementation detail level.
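
A minimal sketch of an object managing its own laziness from its constructor, assuming ReflectionClass::resetAsLazyGhost() as it shipped in PHP 8.4. The Settings class and its fake loading step are hypothetical and are not taken from the linked gist:

```php
<?php

// Hypothetical class: it makes itself lazy in its own constructor.
final class Settings
{
    public array $values;
    public static int $loads = 0;

    public function __construct(string $path)
    {
        $reflector = new ReflectionClass(self::class);
        $reflector->resetAsLazyGhost(
            $this,
            static function (Settings $settings) use ($path): void {
                self::$loads++;
                // Stand-in for an expensive load from $path.
                $settings->values = ['path' => $path];
            }
        );
    }
}

$settings = new Settings('/etc/app.ini');
$loadsBefore = Settings::$loads;    // nothing has been loaded yet
$path = $settings->values['path'];  // first access runs the initializer
$loadsAfter = Settings::$loads;
```

Consumers simply write new Settings(...) and never see the laziness, which is the "implementation detail" aspect the paragraph above describes.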

Another use case for this method is to achieve resettable services. In
these scenarios, a service object already inserted into a complex
dependency graph can be reset to its initial state using the lazy object
infrastructure, without its implementation being aware of this concern. A
concrete example of this use case is the Doctrine EntityManager, which can
end up in a hard to recover https://github.com/doctrine/orm/issues/5933
"closed" state, preventing its use in long-running processes. However, thanks
to the lazy-loading code infrastructure
https://github.com/symfony/symfony/blob/1a16ebc32598faada074e0af12a6a698d2964a5e/src/Symfony/Bridge/Doctrine/ManagerRegistry.php#L42,
recovering from such a state is possible. This method would be instrumental
in achieving this capability without resorting to the current complex code
used in userland.
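
The resettable-service idea can be sketched as follows, again assuming the PHP 8.4 signature of resetAsLazyGhost(). Store and Consumer are hypothetical stand-ins and are far simpler than the Doctrine EntityManager:

```php
<?php

// Hypothetical service and consumer, for illustration only.
final class Store
{
    public bool $closed = false;
    public array $pending = [];
}

final class Consumer
{
    public function __construct(public Store $store) {}
}

$store = new Store();
$consumer = new Consumer($store);

// Simulate the service ending up in a broken state.
$store->closed = true;
$store->pending[] = 'half-written change';

// Reset the very same object; defaults are restored lazily on next access.
(new ReflectionClass(Store::class))->resetAsLazyGhost(
    $store,
    static function (Store $s): void {
        // Nothing to do: property defaults are applied before this runs.
    }
);

$sameObject = $consumer->store === $store; // identity is preserved
$closed = $consumer->store->closed;
$pending = $consumer->store->pending;
```

Every holder of the object keeps its reference, and the Store class itself contains no reset logic.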

I hope this helps.

A bit unrelated to the above topic: we've further clarified the RFC by
adding restrictions to what can be done with lazy proxies. Namely, when
the factory returns an object from a parent class, we describe that adding
more properties on the proxy class would throw, and we also explain why. We
also added a restriction to prevent a proxy from having an overridden
__clone or __destruct when the factory returns a parent, and explained why
again.

This should simplify the overall behavior by preventing edge cases that
wouldn't have easy answers. If those limitations prove too restrictive in
practice (my experience tells me they should be fine), they could be
relaxed in the future.

On our side, this should close the last topics we wanted to address before
opening the vote.

Please let us know if anyone has other concerns.

I have discussed all my open topics and hope this has improved and
clarified the RFC in a way that it gets accepted!

The note about resetting an object back to its initial state with reset*
was the relevant extra information that was missing for me to understand
the need for resetAs*.

While that is not a use-case I need myself I can see its usefulness
especially when running PHP in worker-mode context either as message queue
processing daemon or for FrankenPHP and related application frameworks.

Cheers,
Nicolas

1 year ago by tim@bastelstu.be

Hi

For what it’s worth, I see “resetAsLazy()” being most useful for unit testing libraries that build proxies. While this feature will remove most of the tricky nuances around proxies, it doesn’t make it any easier in generating the code for them, so that has to be tested. Being able to write a test like this (abbreviated):

$realObj = new $foo();
$proxy = clone $realObj;
makeTestProxy($proxy); // resets as lazy with initializer
assert($realObj == $proxy);

Is really simple. Without a reset method, this isn’t straightforward.

I'm not sure if I follow this example. With the native support for lazy
objects there should no longer be a need for code generation, no?
Testing that PHP works correctly in your own test suite is not a
value-add, PHP has its own tests for that.

Best regards
Tim Düsterhus

1 year ago by Rob Landers

Hi

For what it’s worth, I see “resetAsLazy()” being most useful for unit testing libraries that build proxies. While this feature will remove most of the tricky nuances around proxies, it doesn’t make it any easier in generating the code for them, so that has to be tested. Being able to write a test like this (abbreviated):

$realObj = new $foo();
$proxy = clone $realObj;
makeTestProxy($proxy); // resets as lazy with initializer
assert($realObj == $proxy);

Is really simple. Without a reset method, this isn’t straightforward.

I'm not sure if I follow this example. With the native support for lazy
objects there should no longer be a need for code generation, no?
Testing that PHP works correctly in your own test suite is not a
value-add, PHP has its own tests for that.

Best regards
Tim Düsterhus

There are ghost objects, and then there are proxies. This RFC allows both. Beyond that, there are many types of proxies. For example, I have a library that proxies an interface or class, generates a spy proxy, and when the user calls those methods on it, it records these calls for asynchronous RPC (for remote objects) or locally — kinda like erlang.

This code obviously has to be tested, both the generation of the code, as well as the execution. Now that properties can exist on interfaces, I also have to figure out how that will affect things as well, and test that. Being able to defer generation and/or state hydration to actual usage will be a massive improvement, especially in cases where you may have nested objects.

Now, you asked why I would want to test PHP’s own behavior? Well, it’s a new language feature. I’m going to test the hell out of it, learn its edges, and make sure it works as advertised and my use case. If I find any bugs, I will report them. If this feature had existed for years, I would probably still test it to make sure php upgrades don’t break expected behavior and it still works on lower PHP versions. Are you suggesting that I don’t need to have tests for this, that PHP will never change this feature?

— Rob

1 year ago by Nicolas Grekas

On Fri, 12 Jul 2024 at 01:40, Benjamin Außenhofer kontakt@beberlei.de wrote:

On 11.07.2024, 20:31:44, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:

https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

I agree with Tim here, the Doctrine ORM EntityRepository plus Symfony
Service Entity Repository extension are not a necessary real world case
that would require this RFC to include a way for classes to make
themselves lazy.

I took the liberty of rewriting the code of DefaultRepositoryFactory
(Doctrine code itself) and ContainerRepositoryFactory in a way to make the
repositories lazy without needing resetAsLazy, just
$reflector->createLazyProxy. In case of the second the
LazyServiceEntityRepository class could be deleted.

https://gist.github.com/beberlei/80d7a3219b6a2a392956af18e613f86a

Thanks for the example. Here is the quick diff:

Before:
return $this->managedRepositories[$repositoryHash] =
    new $repositoryClassName($entityManager, $metadata);

After:
$reflector = new ReflectionClass($repositoryClassName);
return $this->managedRepositories[$repositoryHash] =
    $reflector->newLazyProxy(static fn () => new $repositoryClassName($entityManager, $metadata));
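
For readers who want to run something similar, here is a self-contained sketch of the same pattern, assuming ReflectionClass::newLazyProxy() as it shipped in PHP 8.4. Repository and RepositoryFactory are hypothetical simplifications, not Doctrine's actual classes:

```php
<?php

// Hypothetical stand-in for a repository with a costly constructor.
final class Repository
{
    public static int $constructed = 0;

    public function __construct(public string $entity)
    {
        self::$constructed++;
    }

    public function find(int $id): string
    {
        return "{$this->entity}#{$id}";
    }
}

// Hypothetical factory that hands out lazy proxies and caches them.
final class RepositoryFactory
{
    /** @var array<string, Repository> */
    private array $managed = [];

    public function get(string $entity): Repository
    {
        return $this->managed[$entity] ??= (new ReflectionClass(Repository::class))
            ->newLazyProxy(static fn (): Repository => new Repository($entity));
    }
}

$factory = new RepositoryFactory();
$repo = $factory->get('User');
$constructedBefore = Repository::$constructed; // constructor not run yet
$label = $repo->find(7);                       // first property read builds it
$constructedAfter = Repository::$constructed;
$cached = $factory->get('User') === $repo;
```

The repository class needs no awareness of laziness here; the factory is the only place that knows about it.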

But in the case of repositories registered as services, this code path
won't be reached for the class we're talking about (we'll take the
"$container->has()" code path).

Yet we need to account for the history of such a piece of code: the service
repository class I shared uses laziness internally because this was the
most effective way to handle circular references seamlessly without
breaking any of the currently working consumer apps.

With refactoring, we can fix all design issues in all software. Yet that's
too theoretical and this is missing the point IMHO. What we want to provide
is a complete coverage of the lazy object domain. Simplifying too much can
lead to an incomplete solution that'd be too restrictive.

Please let me know if this is not how it works or can work or if my
reasoning is flawed.

Unless you have no way of getting to the „new $object“ in the code, there
is always a way to just use newLazy*. And when a library does not expose
new $object to you to override, then that is an architectural choice (and
maybe flaw that you have to accept).

I still think not having the reset* methods would greatly simplify this RFC
and would allow us to enforce more constraints and have fewer footguns.

For example we could simplify the API of newLazyProxy to not receive a
$factory that can arbitrarily create and get objects from somewhere, but
also an initializer, and always force the lazy object to be an instance
created by newInstanceWithoutConstructor.

You said in a previous mail about reset*()

From a technical pov, this is just a different flavor of the same code
infrastructure, so this is pretty aligned with the rest of the proposed
API.

We are not specifically considering the technical POV, but even more
importantly the user facing API. And this just adds to the surface of the
API a lot of things that are pushing only a 1-5% edge case.

I share this angle: figuring out the minimal API to cover the domain.

Yet I showed reset methods proved useful already (see also my example about
refreshable fixtures) - not only as "just tools", but more importantly from
a theoretical aspect as a useful API to cover all situations that userland
will encounter (and does already). If we don't have them, I know I will
have to maintain a library that already provides this resetting capability
because there are valid use cases for them. That'd be a failure to address
the needs of userland on the topic to me.

Cheers,
Nicolas

1 year ago by tim@bastelstu.be

Hi

With refactoring, we can fix all design issues in all software. Yet that's
too theoretical and this is missing the point IMHO. What we want to provide
is a complete coverage of the lazy object domain. Simplifying too much can
lead to an incomplete solution that'd be too restrictive.

I disagree that the right solution to avoiding refactoring in a PHP
library is pushing the effort upstream into the PHP project itself,
because performing refactoring in PHP is much much harder, due to the
larger impact of any change.

Yes, you said that providing the reset*() methods is only a small
incremental change over the lazy object functionality in general, but it
doesn't stop there.

For any bit of API surface we provide, the following needs to happen:

- Documentation pages need to be written.
- Documentation pages need to be translated.
- In case of bug reports, the person triaging the report will need to
  reference the documentation and RFC to find out what the intended
  behavior is.
- In case it actually is a bug, someone will need to fix the bug.
- In case the bug is not fixable due to a bad choice made in the design
  phase, either the API needs to stay buggy or a replacement API needs to
  be provided and the old one deprecated.
- Any bugfix needs to be weighed against possible breakage, because
  Hyrum's Law exists.
- Any new RFC will need to take the API and the interactions between the
  RFC topic and the API in question into account (for example if lazy
  objects came first and hooks afterwards, then the hook authors would
  need to think about the interaction with lazy objects).

Any bit of API surface we add to PHP will need to remain stable for the
next 20+ years, because that's the backwards compatibility expectation
our users have. These expectations are so strong, that users expect us
to not even deprecate things (see the bulk deprecation discussion thread).

I want to see strong arguments in favor of the inclusion in PHP itself
that make it clear that the benefits to the users of PHP outweigh the
maintenance effort for the next generation of PHP maintainers. This
feature is a complex feature, because of all the interaction with other
features (case in point: The number of edge cases I pointed out), thus
it needs stronger arguments compared to the addition of a new function
(e.g. the array_find() function in PHP 8.4).

I see these arguments in favor of the inclusion in PHP for the new*()
methods, I do not see these arguments for the resetAs*() methods.

I share this angle: figuring out the minimal API to cover the domain.

Yet I showed reset methods proved useful already (see also my example about
refreshable fixtures) - not only as "just tools", but more importantly from
a theoretical aspect as a useful API to cover all situations that userland
will encounter (and does already). If we don't have them, I know I will
have to maintain a library that already provides this resetting capability
because there are valid use cases for them. That'd be a failure to address
the needs of userland on the topic to me.

Best regards
Tim Düsterhus

1 year ago by Rob Landers

Hi

... snip

The $originalProxy is not shared with $clonedProxy. Instead, it's
initializers that are shared between clones.
And then, when we call that shared initializer in the $clonedProxy, we
clone the returned instance, so that even if the initializer returns a
shared instance, we don't share anything with the $originalProxy.

Ah, so you mean if the initializer would look like this instead of
creating a fresh object within the initializer?
  $predefinedObject = new SomeObj();
  $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
      return $predefinedObject;
  });
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $r->initialize($clonedProxy);
It didn't even occur to me that one would be able to return a
pre-existing object: I assume that simply reusing the initializer would
create a separate object and that would be sufficient to ensure that the
cloned instance would be independent.
Then I believe this is unsound. Consider the following:
  $myProxy = $r->newLazyProxy(...);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $myProxy->someProp++;
  var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.

That's basically what we do and what we describe in the RFC, just with the
added lazy-clone operation on the instance returned by the initializer.

This means that if I would return a completely new object within the
initializer then for a cloned proxy the new object would immediately be
cloned and the original object be destructed, yes?

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:
  $predefinedObject = new SomeObj();
  $initializer = function () use ($predefinedObject) {
      return $predefinedObject;
  };
  $myProxy = $r->newLazyProxy($initializer);
  $otherProxy = $r->newLazyProxy($initializer);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $r->initialize($otherProxy);
  $r->initialize($clonedProxy);
To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.

I think it would be up to the developer writing the proxy framework to use or abuse this, for example, I've been trying for years to get some decent semantics of value objects in PHP (I may or may not create an RFC for it once I've finished all my research), but, this seems like a perfectly usable case that creates the principle of least astonishment for value objects. For example, if you have an immutable Money(10) and clone Money(10) .... is there any reason to create a new Money(10)? Currently, clone's default behavior is already astonishing for value objects! The instance doesn't matter; it's the value that matters. For service objects, it may be the same thing -- at least, IMHO, services shouldn't have state, just behavior. For non-value objects, such as those in the domain, maybe they should be fetched anew from the DB, created newly from a cache, or cloned from an existing instance.

The point is, that this can have framework-level behavior that simply isn't possible right now because there is no way to control a clone operation properly. I'm actually quite excited to have some more control over cloning (even in this limited form) because the current behavior of __clone is so cobbled that it is barely usable except for the most basic of programs, and currently, the only solution is to disable cloning when it will break assumptions.

— Rob

1 year ago by tim@bastelstu.be

Hi

I think it would be up to the developer writing the proxy framework to use or abuse this

As I've just explained in my reply to Nicolas this is observable to the
user and thus leaks implementation details of the proxy framework.

Currently, clone's default behavior is already astonishing for value
objects!

clone has specific well-defined semantics: You get a distinct object
with identical state (unless a __clone() method is implemented). Lazy
proxies as specified in the RFC violate that.

That you do not like the existing cloning semantics is an unrelated
matter that is not relevant to this discussion.

Best regards
Tim Düsterhus

1 year ago by Nicolas Grekas

On Thu, 11 Jul 2024 at 20:31, Tim Düsterhus tim@bastelstu.be wrote:

Hi

Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.

Keeping this capability for userland is not an option for me as it would
mostly defeat my goal, which is to get rid of any userland code on this
topic (and is achieved by the RFC).

Here is a real-world example:

https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php

This class currently uses a poor-man's implementation of lazy objects and
would greatly benefit from resetAsLazyGhost().

Sorry, I was probably a little unclear with my question. I was not
specifically asking if anyone did that, because I am fairly sure that
everything possible has been done before.

I was interested in learning why I would want to promote a
"LazyServiceEntityRepository" instead of the user of my library just
making the "ServiceEntityRepository" lazy themselves.

I understand that historically making the "ServiceEntityRepository" lazy
yourself would have been very complicated, but the new RFC makes this
super easy.

So based on my understanding the "LazyServiceEntityRepository"
(c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
necessary tools to do it yourself, no? That would also match your goal
of getting rid of userland code on this topic.

That class is not a candidate for deprecation even after this RFC. But
thanks to this RFC, PHP 8.4 would allow removing all the code related to
the magic methods in the current implementation.

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

That's exactly it: instead of using a third party lib (or in this case
implementing a poor man's subset of a correct lazy object implementation),
the engine would enable using a native feature to achieve a fully correct
behavior in a very simple way. In this case, LazyServiceEntityRepository
would directly use ReflectionClass::resetAsLazyGhost to make itself lazy,
so that its consumers get a packaged behavior and don't need to care about
the topic while consuming the class.

I have one question regarding the updated initialization sequence. The
RFC writes:

Properties that are declared on the real instance are uninitialized on
the proxy instance (including overlapping properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization()) to synchronize the state shared
by both instances.

I do not understand this. Specifically I do not understand the "to
synchronize the state" bit.

We reworded this sentence a bit. Clearer?

Yes, I think it is clearer. Let me try to rephrase this differently to
see if my understanding is correct:

For every property that exists on the real instance, the property on
the proxy instance effectively [1] is replaced by a property hook like
the following:
 public PropertyType $propertyName {
     get {
         return $this->realInstance->propertyName;
     }
     set(PropertyType $value) {
         $this->realInstance->propertyName = $value;
     }
 }
Any value that is stored in the property will be freed (including
calling the destructor if it was the last reference), as if unset()
was called on the property.

[1] No actual property hook will be created and the realInstance
property does not actually exist, but the semantics behave as if such a
hook would be applied.

Conceptually, you've got it right yes!
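
The forwarding semantics described above can be observed with a short sketch, assuming ReflectionClass::newLazyProxy() as it shipped in PHP 8.4. The Config class is hypothetical:

```php
<?php

// Hypothetical class for illustration.
final class Config
{
    public string $env = 'dev';
}

$real = new Config();

// The factory hands back a pre-existing object as the real instance.
$proxy = (new ReflectionClass(Config::class))
    ->newLazyProxy(static fn (): Config => $real);

$proxy->env = 'prod';    // initializes the proxy, then writes through
$realEnv = $real->env;

$real->env = 'staging';  // a change on the real instance...
$proxyEnv = $proxy->env; // ...is visible through the proxy
```

Reads and writes in both directions land on the same state, which matches the "as if a property hook forwarded the access" mental model.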

My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?!

It follows that more properties can exist on the proxy itself (declared by
child classes of the real object that the proxy implements).

Right, that's mentioned in (2), so all clear.

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:
  function (object $clonedProxy) use ($originalProxy) {
      return clone $originalProxy->getRealObject();
  }
Nope, that's not what we describe in the RFC so I hope you can read it
again and get where you were confused and tell us if we're not clear enough
(to me we are :) )

The "cloning of the real instance" bit is what led me to this
understanding.

The $originalProxy is not shared with $clonedProxy. Instead, it's
initializers that are shared between clones.
And then, when we call that shared initializer in the $clonedProxy, we
clone the returned instance, so that even if the initializer returns a
shared instance, we don't share anything with the $originalProxy.

Ah, so you mean if the initializer would look like this instead of
creating a fresh object within the initializer?
  $predefinedObject = new SomeObj();
  $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
      return $predefinedObject;
  });
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $r->initialize($clonedProxy);
It didn't even occur to me that one would be able to return a
pre-existing object: I assume that simply reusing the initializer would
create a separate object and that would be sufficient to ensure that the
cloned instance would be independent.

E.g. Doctrine's Entity Manager always returns the same instance.

Then I believe this is unsound. Consider the following:
  $myProxy = $r->newLazyProxy(...);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $myProxy->someProp++;
  var_dump($clonedProxy->someProp);
The clone was created before someProp was modified, but it outputs the
value after modification!

Also: What happens if the cloned proxy is initialized before the
original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.

That's basically what we do and what we describe in the RFC, just with the
added lazy-clone operation on the instance returned by the initializer.

This means that if I would return a completely new object within the
initializer then for a cloned proxy the new object would immediately be
cloned and the original object be destructed, yes?

That's correct. This preserves the lifecycle of cloned objects: __clone()
has to be called when creating clones if the method exists.

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:
  $predefinedObject = new SomeObj();
  $initializer = function () use ($predefinedObject) {
      return $predefinedObject;
  };
  $myProxy = $r->newLazyProxy($initializer);
  $otherProxy = $r->newLazyProxy($initializer);
  $clonedProxy = clone $myProxy;
  $r->initialize($myProxy);
  $r->initialize($otherProxy);
  $r->initialize($clonedProxy);
To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?

Correct. The sharing is not specifically related to cloning. But when
cloning happens, the expected behavior is well defined: we should have
separate states.

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.

Forcing an initialization when cloning would be unexpected. E.g. in
Doctrine, when you clone an uninitialized entity, you don't trigger a
database roundtrip. Instead, you create a new object that still references
the original state internally, but under a different object identity. This
cloning behavior is the one we've had for more than 10 years and I think
it's also the least astonishing one - at least if we consider this example
as a real world trial of this principle.
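
For anyone wanting to check the cloning semantics on a released build, here is a minimal sketch using the PHP 8.4 method names. The Counter class is hypothetical. As far as I know, the version that shipped in PHP 8.4 initializes a lazy object before cloning it, which differs from the draft behavior discussed at this point in the thread, so treat the comments as an assumption to verify:

```php
<?php

// Hypothetical class for illustration.
final class Counter
{
    public int $someProp = 0;
}

$reflector = new ReflectionClass(Counter::class);
$factoryCalls = 0;

$myProxy = $reflector->newLazyProxy(function () use (&$factoryCalls): Counter {
    $factoryCalls++;
    return new Counter();
});

// Clone while the proxy is still uninitialized.
$clonedProxy = clone $myProxy;

// Assumption: on PHP 8.4 the clone operation initialized the original.
$originalInitializedByClone = !$reflector->isUninitializedLazyObject($myProxy);

$myProxy->someProp++;
$original = $myProxy->someProp;
$clone = $clonedProxy->someProp; // must not observe the increment
```

Whatever the exact timing of initialization, the property that matters for the soundness concern is that the clone does not see modifications made to the original afterwards.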

I would assume that cloning a proxy is something that rarely happens,
because my understanding is that proxies are most useful for service
objects, whereas ghost objects would be used for entities / value
objects, so this should not be too much of a problem.

I share this opinion. Then, since the difference between ghost vs proxy
should be internal from a consumer PoV, defining an observable behavior for
ghost objects also defines the behavior for proxy ones.

Properties are not initialized to their default value yet (they are
initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?

That's because this is observable using e.g. (array) or var_dump.

I see. Perhaps add a short sentence with the reasoning. Something like:

Properties are not initialized to their default value yet (they are
initialized before calling the initializer). As an example, this has an
impact on the behavior of an (array) cast on uninitialized objects and
also when the default value is based on a constant that is not yet
defined when creating the lazy object, but will be defined at the point
of initialization.

Thanks for the proposal. I've added it to the RFC.

Cheers,
Nicolas

1 year ago by tim@bastelstu.be

Hi

To me this is what the language evolution should do: Enable users to do
things that previously needed to be provided by userland libraries,
because they were complicated and fragile, not enabling userland
libraries to simplify things that they should not need to provide in the
first place because the language already provides it.

That's exactly it: instead of using a third party lib (or in this case
implementing a poor man's subset of a correct lazy object implementation),
the engine would enable using a native feature to achieve a fully correct
behavior in a very simple way. In this case, LazyServiceEntityRepository
would directly use ReflectionClass::resetAsLazyGhost to make itself lazy,
so that its consumers get a packaged behavior and don't need to care about
the topic while consuming the class.

I guess we have to agree to disagree here.

Yes, I think it is clearer. Let me try to rephrase this differently to
see if my understanding is correct:

For every property that exists on the real instance, the property on
the proxy instance effectively [1] is replaced by a property hook like
the following:
  public PropertyType $propertyName {
      get {
          return $this->realInstance->propertyName;
      }
      set(PropertyType $value) {
          $this->realInstance->propertyName = $value;
      }
  }
Any value that is stored in the property will be freed (including
calling the destructor if it was the last reference), as if unset()
was called on the property.

[1] No actual property hook will be created and the realInstance
property does not actually exist, but the semantics behave as if such a
hook would be applied.
Conceptually, you've got it right yes!
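
To make the "as if such a hook would be applied" semantics concrete, here is a rough userland approximation that uses __get/__set instead of engine support. ForwardingProxy and its $factory argument are made-up names for illustration only. Unlike the proxies in the RFC, this wrapper does not share the class of the real instance, so treat it as a mental model rather than an equivalent.

```php
<?php
declare(strict_types=1);

// Hypothetical userland stand-in for a lazy proxy: every property read or
// write is forwarded to a real instance that is created on first access.
final class ForwardingProxy
{
    private ?object $realInstance = null;

    public function __construct(private Closure $factory) {}

    private function real(): object
    {
        // Run the factory once, on first access.
        return $this->realInstance ??= ($this->factory)();
    }

    public function __get(string $name): mixed
    {
        return $this->real()->$name;
    }

    public function __set(string $name, mixed $value): void
    {
        $this->real()->$name = $value;
    }

    public function isInitialized(): bool
    {
        return $this->realInstance !== null;
    }
}

class SomeObj
{
    public string $foo = 'A';
}

$proxy = new ForwardingProxy(fn () => new SomeObj());
var_dump($proxy->isInitialized()); // bool(false)
$proxy->foo = 'B';                 // triggers the factory, then forwards the write
var_dump($proxy->isInitialized()); // bool(true)
var_dump($proxy->foo);             // string(1) "B"
```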

Sweet. Unless I've missed anything the bit about the value being unset
and the destructor implications is missing in the RFC text. It should be
added. Also the "Properties that are declared on the real instance are
bound to the proxy instance" bit became slightly misleading with the
newest change, because the proxy may no longer define additional properties.

May I suggest something along the lines of the following:

The proxy's properties will be bound to the proxy instance, so that
accessing any of these properties on the proxy forwards the operation to
the corresponding property on the real instance as if the proxy's
property was a virtual property. Any value stored within the proxy's
properties will be unset() and the destructor will be called if the
proxy held the last reference. This includes properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization().

Frankly, thinking about this cloning behavior gives me a headache,
because it quickly leads to very weird semantics. Consider the following
example:
   $predefinedObject = new SomeObj();
   $initializer = function () use ($predefinedObject) {
       return $predefinedObject;
   };
   $myProxy = $r->newLazyProxy($initializer);
   $otherProxy = $r->newLazyProxy($initializer);
   $clonedProxy = clone $myProxy;
   $r->initialize($myProxy);
   $r->initialize($otherProxy);
   $r->initialize($clonedProxy);
To my understanding both $myProxy and $otherProxy would share the
$predefinedObject as the real instance and $clonedProxy would have a
clone of the $predefinedObject at the time of the initialization as its
real instance?
Correct. The sharing is not specifically related to cloning. But when
cloning happens, the expected behavior is well defined: we should have
separate states.

Yes, it's clear that they should have separate states. The issue I'm
having here is that the actual cloning does not happen at the time of
the clone operation, but at an arbitrary later point in time and this
can have some odd consequences for the object lifecycles. Perhaps my
example was too simplified.

Let me try to expand the example a little.

 class SomeObj { public string $foo = 'A'; public string $dummy; }

 $predefinedObject = new SomeObj();
 $initializer = function () use ($predefinedObject) {
     return $predefinedObject;
 };
 $myProxy = $r->newLazyProxy($initializer);
 $otherProxy = $r->newLazyProxy($initializer);
 $r->getProperty('foo')->skipLazyInitialization($myProxy);
 $clonedProxy = clone $myProxy;
 var_dump($clonedProxy->foo);
 $r->initialize($myProxy);
 $r->initialize($otherProxy);
 $myProxy->foo = 'B';
 $r->initialize($clonedProxy);
 var_dump($clonedProxy->foo);

I would expect that this dumps 'A' both of the times, because at the
time of cloning the $foo property held the value 'A'. But my
understanding is that it returns 'A' at the first time and 'B' at the
second time, because $predefinedObject is cloned at the time of the
$r->initialize($clonedProxy); call.

To me this sounds like cloning an uninitialized proxy would need to
trigger an initialization to result in semantics that do not violate the
principle of least astonishment.

Forcing an initialization when cloning would be unexpected. E.g. in
Doctrine, when you clone an uninitialized entity, you don't trigger a
database roundtrip. Instead, you create a new object that still references
the original state internally, but under a different object identity. This
cloning behavior is the one we've had for more than 10 years and I think
it's also the least astonishing one - at least if we consider this example
as a real world trial of this principle.

See above.

Best regards
Tim Düsterhus

1 year ago by Nicolas Grekas

Hi Tim,

Thanks for the detailed feedback. Arnaud already answered most of your
questions, here is the remaining one:

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

I've given the RFC three or four passes and I'm not quite sure if I
follow everything, here's a list of some questions / remarks that came
to mind, roughly ordered by the order of things appearing in the RFC.

"been tested successfully on the Doctrine and on the Symfony projects"

Is there a PoC patch showcasing how the code would change / be
simplified for those pre-existing codebases?

Yes!
See https://github.com/nicolas-grekas/symfony/pull/44 for Symfony. All the
complex code is gone \o/
And if anyone is wondering: No, we're not moving this complexity into the
engine. As Arnaud wrote somewhere: Implementation in core is simple
compared to userland as we are at the right level of abstraction. No code
gen, no edge cases with relative type hints, visibility, readonly, hooks,
etc. We get more consistent and transparent behavior as well compared to
userland impl.

For Doctrine, the URL is
https://github.com/nicolas-grekas/doctrine-orm/pull/6 for now, with the
most important line being the removal of the symfony/var-exporter
dependency.

After your and Valentin's feedback, we're considering an updated API that
would provide the same capabilities but that might be more consensual.

The RFC isn't updated but below is what we have in our drafts. Let me know
what you think already if you want (otherwise, let us work on the updated
implementation/RFC and we'll let you know about them ASAP).

Nicolas
PS: I understand that the concepts in the RFC might be difficult to grasp.
They were certainly challenging for me to summarize. I would happily accept
any help to improve the wording if anyone is willing.

class ReflectionLazyClass extends ReflectionClass
{
    public const int SKIP_INITIALIZATION_ON_SERIALIZE = 1;
    public const int SKIP_DESTRUCTOR = 2;

    public function __construct(object|string $objectOrClass);

    public function newLazyGhostInstance(callable $initializer, int $options = 0): object;
    public function newLazyProxyInstance(callable $initializer, int $options = 0): object;

    public function makeInstanceLazyGhost(object $object, callable $initializer, int $options = 0): void;
    public function makeInstanceLazyProxy(object $object, callable $initializer, int $options = 0): void;

    public static function isInitialized(object $instance): bool;

    /**
     * Initializes a lazy object (no-op if the object is already initialized).
     *
     * The backing object is returned, which can be another instance than the
     * lazy object when the virtual strategy is used.
     */
    public function initialize(object $object, bool $skipInitializer = false): object;

    /**
     * Marks a property as not triggering initialization when being accessed.
     *
     * This method is useful to bypass initialization when setting a property.
     */
    public function skipInitializerForProperty(object $object, string $name, ?string $class = null): void;

    /**
     * Sets a property without triggering initialization while skipping hooks
     * if any.
     */
    public function setRawPropertyValue(object $object, string $name, mixed $value, ?string $class = null): void;
}

1 year ago by Robert Landers

On Wed, Jun 5, 2024 at 6:58 PM Nicolas Grekas
nicolas.grekas+php@gmail.com wrote:

[...]

For virtual proxies, it would be nice if the instance returned didn't
have to be the same type or child type.

For example, in Durable PHP, I allow accessing remote entities via a
Spying Proxy:

$ctx->signal($id, fn(MyEntity $obj) => $obj->doSomething($x));

In this example, ->signal() generates/loads the proxy which implements
MyEntity with code that captures the value of $x to be forwarded to a
remote instance. It then calls the passed closure and captures the
method + arguments passed.

Most of the complexity is generating the objects themselves to
implement the methods. It would be much simpler to implement magic
methods that can intercept the method calls and not worry about
generating an entire class every time a dev wants to use this
functionality.

In my case, none of the complexity actually goes away since I still
need to generate concrete types of the correct type. That being said,
I don't exactly have millions of users so if I still need to manually
create this, I will. I just thought I'd toss out the idea of not
needing an object of the correct type, just one that can handle the
expected behavior.
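
The call capturing described above can be approximated today with __call, at the cost of losing the MyEntity type. CallRecorder and the free function signal() are hypothetical names used only for this sketch; this is not how Durable PHP implements it.

```php
<?php
declare(strict_types=1);

// Hypothetical recorder: captures the method name and arguments of each call
// instead of executing anything, so they could be forwarded elsewhere later.
final class CallRecorder
{
    /** @var list<array{method: string, args: array}> */
    public array $calls = [];

    public function __call(string $method, array $args): mixed
    {
        $this->calls[] = ['method' => $method, 'args' => $args];
        return null;
    }
}

// Stand-in for $ctx->signal(): runs the closure against a recorder and
// returns what was captured.
function signal(Closure $callback): array
{
    $recorder = new CallRecorder();
    $callback($recorder);
    return $recorder->calls;
}

$x = 42;
// The parameter cannot be typed as MyEntity, because CallRecorder is not a
// MyEntity. That is the limitation discussed above.
$captured = signal(fn ($obj) => $obj->doSomething($x));
var_dump($captured[0]['method']); // string(11) "doSomething"
var_dump($captured[0]['args']);   // array holding int(42)
```

A typed closure parameter is what forces a generated class of the correct type; the untyped version here only works because nothing checks the type.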

Robert Landers
Software Engineer
Utrecht NL

1 year ago by tim@bastelstu.be

Hi

Yes!
See https://github.com/nicolas-grekas/symfony/pull/44 for Symfony. All the
complex code is gone \o/
[...]
For Doctrine, the URL is
https://github.com/nicolas-grekas/doctrine-orm/pull/6 for now, with the
most important line being the removal of the symfony/var-exporter
dependency.

Thank you, I'll have a look at it when I have the time.

The RFC isn't updated but below is what we have in our drafts. Let me know
what you think already if you want (otherwise, let us work on the updated
implementation/RFC and we'll let you know about them ASAP).

I've already mentioned in the sub-thread with Larry that this is also
what I had in mind after Arnaud's clarification, so that sounds good to me.

I'll give the RFC another full read once you finished incorporating the
existing feedback. Doesn't really make sense to give feedback on
something that still is in-flux and that after all might not exist in
the updated proposal.

PS: I understand that the concepts in the RFC might be difficult to grasp.
They were certainly challenging for me to summarize. I would happily accept
any help to improve the wording if anyone is willing.

I've already said it in my email to Arnaud: Examples showcasing the
usage of the API in a (simplified) real-world use case would definitely
help making the RFC easier to understand.

And for me it is important that any interactions with the existing
functionality [1] are explicitly spelled out. I'm happier spending the
time reading an RFC that is five times as long, but clearly defines the
entire behavior, than reading an RFC that leaves many open questions
that will end up being "implementation-defined" rather than fully
thought-through.

Best regards
Tim Düsterhus

[1] e.g. my WeakReference question, instanceof behavior, whether or not
private properties are accessible in the initializer, readonly, etc.

1 year ago by Chris Riley

On Tue, 4 Jun 2024 at 13:29, Nicolas Grekas nicolas.grekas+php@gmail.com
wrote:

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for
over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

I'm wondering why this has been attached to the existing reflection API
instead of being a new thing in and of itself? It doesn't seem strictly
related to reflection other than currently the solutions for this rely on
reflection to work.

~C

1 year ago by Marco Pivetta

Hey Nicolas, Arnaud,

On Tue, 4 Jun 2024 at 14:29, Nicolas Grekas nicolas.grekas+php@gmail.com
wrote:

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

First of all, let me say that this is a fantastic RFC: having maintained
both mine and Doctrine's version of lazy proxies for many years, this is
indeed a push in the right direction, making laziness an engine detail,
rather than a complex userland topic with hacks.

Moving this forward will allow us (in a far future) to get rid of some BC
boundaries around the weird unset() semantics of properties, which were
indeed problematic with typed properties and readonly properties.

Stellar work: well done!

TL;DR of my feedback:

- RFC is good / useful / needed: will vote for it
- ghost proxies are well designed / fantastic feature
- lazy proxies should probably explore replacing object identity further
- don't like expanding ReflectionClass further: LazyGhost and
  LazyProxy classes (or such) instead
- initialize() shouldn't have a boolean behavioral switch parameter:
  make 2 methods
- flags should be a list<SomeEnumAroundProxies> instead. A bitmask for a
  new API feels unsafe and anachronistic, given the tiny performance hit.
- don't touch readonly because of lazy objects: this feature is too
  niche to cripple a major-major feature like readonly. I would suggest
  deferring until after the first bits of this RFC landed.

That said, I took some notes that I'd like you to both consider / answer.

Raw feedback

I did skim through the thread, but did not read it all, so please excuse me
if some feedback is potentially duplicate.
Notes are succinct/to the point: I abuse bullet points heavily, sorry :-)

From an abstraction point of view, lazy objects from this RFC are
indistinguishable from non-lazy ones

do the following aspects always apply? I understand they don't for lazy
proxies, just for ghosts?
- spl_object_id($object) === spl_object_id($proxy)?
- get_class($object) === get_class($proxy)?

Execution of methods or property hooks does not trigger initialization
until one of them accesses a backed property.

excellent! The entire design revolving around object state makes it so
much easier to reason about the entire behavior too!

Proxies: The initializer returns a new instance, and interactions with
the proxy object are forwarded to this instance

I am not sure why this is needed, given ghost objects.
- I understand that you want this for when instantiation is delegated to
  a third party (in Symfony's DIC, a factory), but I feel like the ghost
  design is so much more fool-proof than the proxy approach.
- Perhaps worth taking another stab at sharing object identity, before
  implementing these?
another note on naming: I used "value holder" inside ProxyManager,
because using just "proxy" as a name led to a lot of confusion. This also
applies to me trying to distinguish proxy types inside the RFC and this
discussion.

Internal objects are not supported because their state is usually not
managed via regular properties.

sad, but understandable limitation
what happens if a class extends an internal engine class, like the
gruesome ArrayObject?

The API uses various flags given as options

    public const int SKIP_INITIALIZATION_ON_SERIALIZE = 1;
    public const int SKIP_DESTRUCTOR = 2;
    public const int SKIP_INITIALIZED_READONLY = 4;
    public const int RESET_INITIALIZED_READONLY = 8;

IMO, these should be enum types, given in as a list<TOption> to the
call site
- bitmasks are really only relevant in serialization / storage
  contexts, IMO, for compressing space as much as possible
- the bitmasks in the reflection API are already very confusing and
  hard to use, and I say that as someone that wrapped the entire reflection
  API various times, out of despair
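
For comparison, this is roughly what the two option styles look like at the call site. LazyOption and toBitmask() are hypothetical and only illustrate the suggestion; the values 1 and 2 mirror the first two flags quoted above.

```php
<?php
declare(strict_types=1);

// Hypothetical enum mirroring two of the draft flags.
enum LazyOption: int
{
    case SkipInitializationOnSerialize = 1;
    case SkipDestructor = 2;
}

/**
 * Folds a list of enum cases into the equivalent integer bitmask.
 *
 * @param list<LazyOption> $options
 */
function toBitmask(array $options): int
{
    $mask = 0;
    foreach ($options as $option) {
        $mask |= $option->value;
    }
    return $mask;
}

// Bitmask style: any int is accepted, including meaningless ones such as 64.
$bitmask = 1 | 2;

// Enum-list style: only declared cases can be passed.
$fromEnums = toBitmask([
    LazyOption::SkipInitializationOnSerialize,
    LazyOption::SkipDestructor,
]);

var_dump($bitmask === $fromEnums); // bool(true)
```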

public function newLazyGhost(callable $initializer, int $options = 0):
object {}
public function newLazyProxy(callable $factory, int $options = 0): object {}

Given the recent improvements around closures and the ... syntax (
https://wiki.php.net/rfc/first_class_callable_syntax), is it worth having
Closure only as argument type?
should we declare generic types in the RFC/docs, even if just in the
stubs?
- they would serve as massive documentation improvement for Psalm and
  PHPStan
- it would be helpful to document $initializer and $factory as
  :void or :object functions
  - can the engine check that, perhaps? I have no idea if Closure
    can provide such information on return types, inside the engine.

public function initialize(object $object, bool $skipInitializer = false):
object {}

worth dividing this into
- initialize()
- markAsAlreadyInitialized()
- don't use a boolean flag for two functions that do different things
all methods were added to ReflectionClass
- IMO worth having this as separate class/object where this is attached
  - if one needs to decorate/stub such API in userland, it is
    therefore a completely separate decoration
    - ReflectionClass is already gigantic
    - a smaller API surface that only does proxies (perhaps
      different based on proxy strategy) would be very beneficial
- suggestion: something like new GhostObject($className) and new Proxy($className)
  - I understand that the interactions with
    ReflectionClass#getProperty() felt natural, but the use-case is narrow,
    and ReflectionClass is already really humongous

The resetAsLazy*() methods accept an already created instance.
This allows writing classes that manage their own laziness

overall understandable
- a bit weird to support this, for now
- useful for resettable interfaces: Symfony DIC could benefit from this
what happens if ReflectionClass#reset*() methods are used on a
different class instance?
- considered?

$reflector->getProperty('id')->skipLazyInitialization($post);

perfect for partial objects / understanding why this was implemented
- would it make sense to have an API to set "bulk" values in an object
  this way, instead of having to do this for each property manually?
  - avoids instantiating reflection properties / marking individual
    properties manually
  - perhaps future scope?
    - thinking (new GhostObject($class))->initializePropertiesTo(['foo' => 'bar', 'baz' => 'tab'])

Initialization Triggers

really happy to see all these edge cases being considered here!
- how much of this new API has been tried against the test suite of
  (for example) ocramius/proxy-manager?
  - mostly asking because there's tons of edge cases noted in there

Cloning, unless __clone() is implemented and accesses a property.

how is the initializer of a cloned proxy used?
- Is the initializer cloned too?
- what about the object that a lazy proxy forwards state access to? Is
  it cloned too?

The following special cases do not trigger initialization of a lazy
object:

Will accessing a property via a debugger (such as XDebug) trigger
initialization here?
- asking because debugging proxy initialization often led to problems,
  in the past
  - sometimes even IDEs crashing, or segfaults
this wording is a bit confusing:

Proxy Objects
The actual instance is set to the return value.

considering the following paragraph:

The proxy object is not replaced or substituted for the actual instance.

After initialization, property accesses on the proxy are forwarded to the
actual instance.
Observing properties of the proxy has the same result as observing
properties of the actual instance.

This is some sort of "quantum locking" of both objects?
How hard is it to break this linkage?
- Can properties be unset(), for example?
- what happens to dynamic properties?
  - I don't use them myself, and I discourage their usage, but it
    would be OK to just document the expected behavior

The proxy and actual instance have distinct identities.

Given that we went to great lengths to "quantum lock" two objects'
properties, wasn't it perhaps feasible to replace the
proxy instance?
- I have no idea if that would be possible/wished
- would require merging spl_object_id() within the scope of the
  initializer stack frame, with any outer ones
- would solve any identity problems that still riddle the lazy proxy
  design (which I think is incomplete, right now)

The scope and $this of the initializer function is not changed

good / thoughtful design
- using __construct() or reflection properties suffices for most users

If the initializer throws, the object properties are reverted to their
pre-initialization state and the object is
marked as lazy again.

this is some sort of "transactional" behavior
- welcome API, but is it worth having this complexity?
  - is there a performance tradeoff?
  - is a copy of the original state kept during initializer calls?
  - OK with it myself, just probing design considerations
the example uses setRawValueWithoutLazyInitialization(), and
initialization then accesses public properties
- shouldn't a property that now has a value not trigger
  initialization anymore?
  - or does that require
    ReflectionProperty#skipLazyInitialization() calls, for that to work?
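
One way to picture the "transactional" behavior in userland, purely as a mental model, is snapshot, run, restore on failure. initializeOrRollback() and Post are hypothetical names, and the sketch only handles public properties. Whether the engine keeps a full copy of the state is exactly the open question raised above; this sketch does not answer it.

```php
<?php
declare(strict_types=1);

/**
 * Runs $initializer against $object; if it throws, restores the public
 * properties to what they were before the call and rethrows.
 * Simplification: only properties visible from this scope are handled.
 */
function initializeOrRollback(object $object, Closure $initializer): void
{
    $snapshot = get_object_vars($object);

    try {
        $initializer($object);
    } catch (Throwable $e) {
        // Remove anything the initializer added, then restore old values.
        foreach (array_keys(get_object_vars($object)) as $name) {
            if (!array_key_exists($name, $snapshot)) {
                unset($object->$name);
            }
        }
        foreach ($snapshot as $name => $value) {
            $object->$name = $value;
        }
        throw $e;
    }
}

final class Post
{
    public int $id = 0;
    public string $title = '';
}

$post = new Post();

try {
    initializeOrRollback($post, function (Post $p): void {
        $p->id = 7; // partial write before the failure
        throw new RuntimeException('load failed');
    });
} catch (RuntimeException $e) {
    // Expected: the failure is rethrown after the rollback.
}

var_dump($post->id); // int(0)
```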

ReflectionClass::SKIP_INITIALIZATION_ON_SERIALIZE: By default,
serializing a lazy object triggers its initialization
This flag disables that behavior, allowing lazy objects to be serialized
as empty objects.

how would one deserialize an empty object into a proxy again?
- would this understanding be deferred to the (de-)serializer of choice?
- exercise for userland?

ReflectionClass::newLazyProxy()
The factory should return a new object: the actual instance.

what happens if the user mis-implements the factory as function (object $proxy): object { return $proxy; }?
- this is obviously a mistake on their end, but is it somehow
  preventable?

The resetAsLazyGhost() method resets an existing object and marks it as
lazy.
The intended use-case is for an object to manage its own laziness by
calling the method in its constructor.

this certainly makes it easier to design "out of the box" lazy objects
- perhaps more useful for tools like ORMs, (de-)serializers and DICs
  though
- using the proxy API internally in classes like DB connections feels a
  bit overkill, to me

ReflectionClass::SKIP_INITIALIZED_READONLY
If this flag is set, these properties are skipped and no exception is
thrown.
The behavior around readonly properties is explained in more detail
later.
ReflectionClass::RESET_INITIALIZED_READONLY

while I can see this as useful, it effectively completely breaks the
readonly design
- this is something I'd probably vote against: not worth breaking
  readonly for just the reset*() API here
  - reset*() is already a niche API inside the (relatively) niche
    use-case of laziness: I wouldn't bypass readonly for it
- readonly provided developer value is bigger than lazy object value,
  in my mind

ReflectionClass::resetAsLazyProxy()
The proxy and the actual instance are distinct objects, with distinct
identities.

When creating a lazy proxy, all property accesses are forwarded to a new
instance
- are all property accesses re-bound to the new instance?
- are there any leftovers pointing to the old instance anywhere?
  - thinking dynamic properties and similar

If $skipInitializer is true, the behavior is the one described for
Ghost Objects in the Initialization Sequence
section, except that the initializer is not called.

please make a separate method for this
- it's not worth cramming a completely different behavior in the same
  method
- can document the methods completely independently, this way
- can deprecate the new method separately, if a design flaw is found in
  future

ReflectionProperty::setRawValueWithoutLazyInitialization()
The method does not call hooks, if any, when setting the property value.

So far, it has been possible to unset($object->property) to force
__get and __set to be called
- will setRawValueWithoutLazyInitialization skip also this "unset
  properties" behavior that is possible in userland?
  - this is fine, just a documentation detail to note
  - if it is like that, is it worth renaming the method
    setValueWithoutCallingHooks or such?
    - not important, just noting this opportunity

Readonly properties

while I appreciate the effort into digging in readonly properties, this
section feels extremely tricky
- I would suggest not implementing (for now) either of
  - SKIP_INITIALIZED_READONLY
  - RESET_INITIALIZED_READONLY
- I would suggest leaving some time for these, and re-evaluating after
  the RFC lands / starts being used

Destructors
The destructor of proxy objects is never called. We rely on the
destructor of the proxied instance instead.

raising an edge case here: spl_object_* and object identity checks may
be used inside a destructor
- for example, a DB connection de-registering itself from a connection
  pool somewhere, such as $pool->deRegister($this)
  - the connection pool may have the spl_object_id() of the proxy,
    not the real instance
- this is not a blocker, just an edge case that may require
  documentation
  - it reconnects with "can we replace the object in-place?" question
    above: replacing objects worth exploring
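
The mismatch described in this edge case can be reproduced without any lazy-object support by letting a plain wrapper object stand in for the proxy. Pool, Connection and the stdClass wrapper are illustrative assumptions, not part of the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical connection pool that tracks members by object id.
final class Pool
{
    /** @var array<int, true> */
    public array $ids = [];

    public function register(object $connection): void
    {
        $this->ids[spl_object_id($connection)] = true;
    }

    public function deRegister(object $connection): void
    {
        unset($this->ids[spl_object_id($connection)]);
    }
}

final class Connection
{
    public function __construct(private Pool $pool) {}

    public function __destruct()
    {
        // Runs on the real instance, so $this is the real instance here.
        $this->pool->deRegister($this);
    }
}

$pool = new Pool();
$real = new Connection($pool);

// Stand-in for a lazy proxy: a distinct object that holds the real instance.
$proxy = new stdClass();
$proxy->real = $real;

// Consumers only ever see the proxy, so that is what gets registered.
$pool->register($proxy);

// Dropping both references destroys the real instance and runs its destructor.
unset($real, $proxy);

// The destructor deregistered the real instance's id, which was never
// registered, so the proxy's id is still in the pool.
var_dump(count($pool->ids)); // int(1)
```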

It employs the ghost strategy by default unless the dependency is to be
instantiated
and initialized by a user-provided factory

one question arises here: can this RFC create proxies of interfaces
at all?
- if not, does it throw appropriate exceptions?
- the reason this question comes up is that, especially in the context
  of DICs, factories are for interfaces
  - concrete classes are implementation details of factories, usually
    - very difficult to use lazy proxy and ghost object design with
      services that decorate each other
- this is semi-discussed in "About Proxies" below, around
  "inheritance-proxy"
  - still worth mentioning "no interfaces" as a clear limitation,
    perhaps?

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

1 year ago by Arnaud Le Blanc

Hi Marco,

Thank you for the very detailed feedback. Please find my answers below.
Nicolas will follow-up on some points.

lazy proxies should probably explore replacing object identity further

We have thought about a few approaches for that, but none is really
conclusive:

Fusing the identity of the proxy and the real instance, such that both
objects are considered to have the same identity with regard to
spl_object_id(), SplObjectStorage, WeakMap, and strict equality after
initialization will lead to weird effects.

For example, we don't know what the behavior of this should be:

$reflector = new ReflectionClass(MyClass::class);

$realInstance = new MyClass();
$proxy = $reflector->newLazyProxy(function () use ($realInstance) {
    return $realInstance;
});

$store = new SplObjectStorage();
$store[$realInstance] = 1;
$store[$proxy] = 2;

$reflector->initialize($proxy);

var_dump(count($store)); // 1 or 2 ?

A second approach, suggested by Benjamin, would be that properties of the
instance returned by the factory are copied to the proxy, the proxy marked
as initialized (so it's not a proxy anymore), and the instance returned by
the factory discarded. One problem of this approach is that the instance
may continue to exist independently of the proxy if it's referenced
somewhere. So, proper usage of the lazy proxy API would require the factory
to be aware of that, and to return an object that is not referenced
anywhere. As we implemented the proxy strategy primarily for use-cases
where we don't control the factory, this approach would not work.

A third approach would be to replace all references to the proxy by
references to the backing instance after initialization. Implementing this
approach would be prohibitively slow as it requires scanning the entire
object graph of the process.

None of these approaches is conclusive, unfortunately. Having two distinct
identities for the proxy and the real instance doesn't appear to be an
issue in practice, however (e.g. in Symfony).
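To make the "two distinct identities" point concrete, here is a minimal sketch of the SplObjectStorage question above. It assumes the API as it later shipped in PHP 8.4, where the method this thread calls initialize() is named initializeLazyObject(); MyClass and its $value property are placeholders for illustration.

```php
<?php
// Assumption: lazy objects as shipped in PHP 8.4; exit cleanly on older versions.
if (PHP_VERSION_ID < 80400) { exit("Lazy objects require PHP 8.4+\n"); }

class MyClass
{
    public int $value = 0;
}

$reflector = new ReflectionClass(MyClass::class);

$realInstance = new MyClass();
$proxy = $reflector->newLazyProxy(function (MyClass $proxy) use ($realInstance) {
    // The factory hands back the pre-existing, non-lazy instance.
    return $realInstance;
});

$store = new SplObjectStorage();
$store[$realInstance] = 1;
$store[$proxy] = 2; // storing the proxy does not read any of its properties

// For a proxy, this returns the real instance rather than the proxy.
$returned = $reflector->initializeLazyObject($proxy);

var_dump($returned === $realInstance);                             // bool(true)
var_dump($proxy === $realInstance);                                // bool(false)
var_dump(spl_object_id($proxy) === spl_object_id($realInstance));  // bool(false)
var_dump(count($store));                                           // int(2)
```

With distinct identities the storage simply keeps two entries, which is the behavior the proposal settles on.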

don't like expanding ReflectionClass further: LazyGhost and
LazyProxy classes (or such) instead

The rationale for expanding ReflectionClass and ReflectionProperty is that
code creating lazy objects tends to also use these two classes, for
introspection or to initialize the object. Also, we feel that the new
methods are directly related to existing methods. E.g. newLazyGhost() is
just another variant of newInstance() and newInstanceWithoutConstructor(),
and setRawValueWithoutLazyInitialization() is just another variant of
setValue() and setRawValue() (added in the hooks RFC).

initialize() shouldn't have a boolean behavioral switch parameter: make
2 methods

Agreed. We have updated the RFC to split the method into initialize() and
markAsInitialized().

flags should be a list<SomeEnumAroundProxies> instead. A bitmask for
a new API feels unsafe and anachronistic, given the tiny performance hit.

Unfortunately this leads to a 30% slowdown in newLazyGhost() when switching
to an array of enums, in a micro benchmark. I'm not sure how this would
impact a real application, but given this is a performance critical
feature, the slowdown is an issue. Besides that, no existing API in core is
using an array of enums, and we don't want to introduce this concept in
this RFC.

From an abstraction point of view, lazy objects from this RFC are
indistinguishable from non-lazy ones

do the following aspects always apply? I understand they don't for lazy
proxies, just for ghosts?

spl_object_id($object) === spl_object_id($proxy)?

get_class($object) === get_class($proxy)?

Good catch, this phrase is not true for proxies, with relation to the
identity of a proxy and its real instance. Apart from that, the main point
of this phrase remains: interaction with a proxy has the same behavior as
interaction with a real instance, and they can be used without knowing they
are lazy.

Proxies: The initializer returns a new instance, and interactions with
the proxy object are forwarded to this instance

another note on naming: I used "value holder" inside ProxyManager,
because using just "proxy" as a name led to a lot of confusion. This also
applies to me trying to distinguish proxy types inside the RFC and this
discussion.

The name "Value Holder" appears to be taken for another pattern already:
https://martinfowler.com/eaaCatalog/lazyLoad.html
The exact name of the pattern used in this RFC is the "Virtual
State-Proxy", but others have suggested using just "Proxy" in the RFC. In
the API we use "LazyProxy".

Internal objects are not supported because their state is usually not
managed via regular properties.

sad, but understandable limitation

what happens if a class extends an internal engine class, like the
gruesome ArrayObject?

This also applies to sub-classes of internal objects. This is specified
under ReflectionClass::newLazyGhost(), but I've clarified that here too.

Given the recent improvements around closures and the ... syntax (
https://wiki.php.net/rfc/first_class_callable_syntax), is it worth having
Closure only as argument type?

should we declare generic types in the RFC/docs, even if just in the
stubs?

they would serve as massive documentation improvement for Psalm and
PHPStan

it would be helpful to document $initializer and $factory as
:void or :object functions

can the engine check that, perhaps? I have no idea if Closure
can provide such information on return types, inside the engine.

Could you say more about the benefits of type hinting as Closure instead of
callable? Is this purely to be able to type check the return type earlier?
We may be able to achieve this while retaining the callable type hint, when
a Closure is given, but this would be unusual. Maybe this is something we
can push in a different RFC?

what happens if ReflectionClass#reset*() methods are used on a
different class instance?

considered?

The object must be an instance of the class represented by ReflectionClass
(including of a sub-class)

$reflector->getProperty('id')->skipLazyInitialization($post);
perfect for partial objects / understanding why this was implemented

would it make sense to have an API to set "bulk" values in an object
this way, instead of having to do this for each property manually?

avoids instantiating reflection properties / marking individual
properties manually

perhaps future scope?

thinking (new GhostObject($class))->initializePropertiesTo(['foo' => 'bar', 'baz' => 'tab'])

This is something we thought about but it's not as simple as it seems: we
have to support setting private properties of parent classes, so the
argument has to represent that. A format that would be able to represent
that is as follows:

The argument is a map of class names to properties
The properties are a map of property name to property value. To support
skipping a property (and setting it to its default value), numeric keys (no
key specified) denote that the value is a property name to skip.

Example:

class ParentOfA {
    private $c;
    private $d = 1;
}
class A extends ParentOfA {
    private $a;
    private $b;
}

->initializePropertiesTo([
    A::class => ['a' => 'value-a', 'b' => 'value-b'],
    ParentOfA::class => ['c' => 'value-c', 'd'], // 'd' is set to its default value
])
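For what it's worth, the map format described above can already be approximated in userland. The following is only a sketch: initializePropertiesTo() is a hypothetical helper (not part of the RFC), the getter methods exist purely to show the result, and the code assumes the PHP 8.4 reflection methods skipLazyInitialization() and setRawValueWithoutLazyInitialization().

```php
<?php
// Assumption: lazy objects as shipped in PHP 8.4; exit cleanly on older versions.
if (PHP_VERSION_ID < 80400) { exit("Lazy objects require PHP 8.4+\n"); }

/**
 * Hypothetical helper. $valuesByClass maps class names to property maps;
 * a numeric key means "the value is a property name to reset to its default".
 */
function initializePropertiesTo(object $object, array $valuesByClass): void
{
    foreach ($valuesByClass as $class => $properties) {
        foreach ($properties as $key => $value) {
            if (is_int($key)) {
                // No key given: $value names a property to skip.
                $property = new ReflectionProperty($class, $value);
                $property->skipLazyInitialization($object);
            } else {
                $property = new ReflectionProperty($class, $key);
                $property->setRawValueWithoutLazyInitialization($object, $value);
            }
        }
    }
}

class ParentOfA
{
    private $c;
    private $d = 1;

    public function getParentState(): array
    {
        return [$this->c, $this->d];
    }
}

class A extends ParentOfA
{
    private $a;
    private $b;

    public function getOwnState(): array
    {
        return [$this->a, $this->b];
    }
}

$reflector = new ReflectionClass(A::class);
$object = $reflector->newLazyGhost(function (A $object) {
    throw new LogicException('The initializer is not expected to run here');
});

initializePropertiesTo($object, [
    A::class => ['a' => 'value-a', 'b' => 'value-b'],
    ParentOfA::class => ['c' => 'value-c', 'd'],
]);

var_dump($object->getOwnState());    // 'value-a', 'value-b'
var_dump($object->getParentState()); // 'value-c', 1
```

Resolving each ReflectionProperty against its declaring class is what lets the helper reach private properties of parent classes. The cost is one ReflectionProperty per property, which is exactly the overhead a native bulk API would avoid.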

An alternative is to use the same binary format as
get_mangled_object_vars().

WDYT?

Initialization Triggers

really happy to see all these edge cases being considered here!

how much of this new API has been tried against the test suite of
(for example) ocramius/proxy-manager?

mostly asking because there's tons of edge cases noted in there

Nicolas has tested the implementation extensively on the VarExporter and
DIC components, and on Doctrine. We would be happy if you could check with
the ocramius/proxy-manager test suite.

Cloning, unless __clone() is implemented and accesses a property.

how is the initializer of a cloned proxy used?

Is the initializer cloned too?

The initializer is not cloned. I've clarified this in the RFC. The
Initialization Sequence section has an example of how an initializer can
detect being called for a clone of the origin lazy object, if necessary:

$init = function ($object) use (&$originalObject) {
    if ($object !== $originalObject) {
        // we are initializing a clone
    }
};
$originalObject = $reflector->newLazyProxy($init);

what about the object that a lazy proxy forwards state access to? Is
it cloned too?

Cloning an initialized proxy is the same as cloning the real instance (this
clones the real instance and returns it).

The following special cases do not trigger initialization of a lazy
object:

Will accessing a property via a debugger (such as XDebug) trigger
initialization here?

asking because debugging proxy initialization often led to problems,
in the past

sometimes even IDEs crashing, or segfaults

Good point, I will check this. The implementation is biased towards
initialization, so accessing a lazy object will usually initialize it.
However the internal APIs used by var_dump and get_mangled_object_vars() do
not trigger initialization, so if a debugger uses that, it shouldn't
trigger initialization.

this wording is a bit confusing:

Proxy Objects
The actual instance is set to the return value.

considering the following paragraph:

The proxy object is not replaced or substituted for the actual
instance.

Indeed. "actual instance" is supposed to designate the instance that the
proxy forwards to, but I see why it's confusing. I've renamed "actual
instance" to "real instance". WDYT?

After initialization, property accesses on the proxy are forwarded to
the actual instance.
Observing properties of the proxy has the same result as observing
properties of the actual instance.

This is some sort of "quantum locking" of both objects?

How hard is it to break this linkage?

Can properties be unset(), for example?

what happens to dynamic properties?

I don't use them myself, and I discourage their usage, but it
would be OK to just document the expected behavior

The linkage cannot be broken. unset() and dynamic properties are not
special, in that these are just property accesses. All property accesses on
the proxy are forwarded to the real instance.

If the initializer throws, the object properties are reverted to their
pre-initialization state and the object is
marked as lazy again.

this is some sort of "transactional" behavior

welcome API, but is it worth having this complexity?

is there a performance tradeoff?

is a copy of the original state kept during initializer calls?

OK with it myself, just probing design considerations

I believe it's worth having this, as it prevents leaving an object in a
corrupt state, which could then be accessed later, in case of temporary
initializer failure.

The performance overhead and complexity are small. A shallow copy of the
original state is kept during initialization. As the copy is shallow this
is just a few pointer copies and refcount increases.
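A small sketch of the rollback behavior, assuming the PHP 8.4 API (isUninitializedLazyObject() is the shipped name for the laziness check); the Connection class and the DSN strings are made up for illustration.

```php
<?php
// Assumption: lazy objects as shipped in PHP 8.4; exit cleanly on older versions.
if (PHP_VERSION_ID < 80400) { exit("Lazy objects require PHP 8.4+\n"); }

class Connection
{
    public string $dsn = 'unset';
}

$reflector = new ReflectionClass(Connection::class);
$failOnce = true;

$connection = $reflector->newLazyGhost(function (Connection $c) use (&$failOnce) {
    $c->dsn = 'half-configured'; // partial state written before the failure
    if ($failOnce) {
        $failOnce = false;
        throw new RuntimeException('temporary failure');
    }
    $c->dsn = 'mysql:host=db';
});

try {
    $ignored = $connection->dsn; // triggers the initializer, which throws
} catch (RuntimeException $e) {
    echo 'first attempt failed: ', $e->getMessage(), "\n";
}

// The partial write was rolled back and the object is lazy again.
$stillLazy = $reflector->isUninitializedLazyObject($connection);
var_dump($stillLazy); // bool(true)

// The next access runs the initializer again, this time successfully.
$dsn = $connection->dsn;
var_dump($dsn); // string(13) "mysql:host=db"
```

Without the revert, the second access would see 'half-configured' and never retry, which is the corrupt-state scenario described above.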

the example uses setRawValueWithoutLazyInitialization(), and
initialization then accesses public properties

shouldn't a property that now has a value not trigger
initialization anymore?

or does that require
ReflectionProperty#skipLazyInitialization() calls, for that to work?

setRawValueWithoutLazyInitialization() has the same effect as
skipLazyInitialization(), in addition to setting the specified value. I've
clarified this in the RFC.
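As a sketch of that behavior (assuming the PHP 8.4 API; BlogPost and the $initialized flag are illustrative only): a property set with setRawValueWithoutLazyInitialization() can be read without triggering the initializer, while any property that is still lazy triggers it.

```php
<?php
// Assumption: lazy objects as shipped in PHP 8.4; exit cleanly on older versions.
if (PHP_VERSION_ID < 80400) { exit("Lazy objects require PHP 8.4+\n"); }

class BlogPost
{
    public int $id;
    public string $title;
}

$reflector = new ReflectionClass(BlogPost::class);
$initialized = false;

$post = $reflector->newLazyGhost(function (BlogPost $post) use (&$initialized) {
    $initialized = true;
    $post->title = 'Loaded from storage'; // stands in for a database fetch
});

// Sets the value and marks 'id' as non-lazy in one step.
$reflector->getProperty('id')->setRawValueWithoutLazyInitialization($post, 42);

$id = $post->id;               // 'id' is no longer lazy: no initialization
$initializedAfterId = $initialized;

$title = $post->title;         // 'title' is still lazy: runs the initializer
$initializedAfterTitle = $initialized;

var_dump($id, $initializedAfterId);       // int(42), bool(false)
var_dump($title, $initializedAfterTitle); // "Loaded from storage", bool(true)
var_dump($post->id);                      // int(42), kept across initialization
```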

ReflectionClass::SKIP_INITIALIZATION_ON_SERIALIZE: By default,
serializing a lazy object triggers its initialization
This flag disables that behavior, allowing lazy objects to be serialized
as empty objects.

how would one deserialize an empty object into a proxy again?

would this understanding be deferred to the (de-)serializer of
choice?

exercise for userland?

Yes, this is left as an exercise to userland when this flag is used.

ReflectionClass::newLazyProxy()
The factory should return a new object: the actual instance.

what happens if the user mis-implements the factory as function (object $proxy): object { return $proxy; }?

this is obviously a mistake on their end, but is it somehow
preventable?

Returning a lazy object (including an initialized proxy) is not allowed and
will throw. I've clarified this in the RFC (the RFC specified that
returning a lazy object was not allowed, but whether this included
initialized proxies was not clear).

ReflectionClass::resetAsLazyProxy()
The proxy and the actual instance are distinct objects, with distinct
identities.

When creating a lazy proxy, all property accesses are forwarded to a new
instance

are all property accesses re-bound to the new instance?

are there any leftovers pointing to the old instance anywhere?

thinking dynamic properties and similar

Yes, all property accesses are forwarded to the new instance after that.
There cannot be any leftovers pointing to the old instance anywhere, including in
dynamic properties (the resetAsLazy*() methods reset the object entirely).

ReflectionProperty::setRawValueWithoutLazyInitialization()
The method does not call hooks, if any, when setting the property value.

So far, it has been possible to unset($object->property) to force
__get and __set to be called

will setRawValueWithoutLazyInitialization skip also this "unset
properties" behavior that is possible in userland?

this is fine, just a documentation detail to note

if it is like that, is it worth renaming the method
setValueWithoutCallingHooks or such?

not important, just noting this opportunity

Yes, setRawValueWithoutLazyInitialization() skips magic methods and hooks.

The "setRawValue" part of the method name was borrowed from the
ReflectionProperty::setRawValue() method introduced by the hooks RFC, which
is an equivalent to setValue() but doesn't call hooks.

Destructors
The destructor of proxy objects is never called. We rely on the
destructor of the proxied instance instead.

raising an edge case here: spl_object_* and object identity checks may
be used inside a destructor

for example, a DB connection de-registering itself from a connection
pool somewhere, such as $pool->deRegister($this)

the connection pool may have the spl_object_id() of the proxy,
not the real instance

this is not a blocker, just an edge case that may require
documentation

it reconnects with "can we replace the object in-place?"
question above: replacing objects worth exploring

This is an interesting case. This may be an issue if the proxy itself was
registered in the pool. In case the initializer or the constructor
registers the connection, then the real instance will have been
registered.

Best Regards,
Arnaud

1 year ago by tim@bastelstu.be — view source

Hi

flags should be a list<SomeEnumAroundProxies> instead. A bitmask for
a new API feels unsafe and anachronistic, given the tiny performance hit.

Unfortunately this leads to a 30% slowdown in newLazyGhost() when switching
to an array of enums, in a micro benchmark. I'm not sure how this would
impact a real application, but given this is a performance critical

I'm curious, what did the implementation look like? Is there a proof of
concept commit or patch available somewhere? As the author of the first
internal enum (Random\IntervalBoundary) I had the pleasure of finding
out that there was no trivial way to efficiently match the various enum
cases. See the PR review here:
https://github.com/php/php-src/pull/9679#discussion_r1045943444

I was able to find a hacky work-around, but if we add additional enums,
perhaps we should add the proper infrastructure to the arginfo files or
so to match enum cases to switch-case in C, even without making them an
integer-backed enum.

what happens if ReflectionClass#reset*() methods are used on a
different class instance?

considered?

The object must be an instance of the class represented by ReflectionClass
(including of a sub-class)

I believe the sub-class part is not spelled out in the RFC text. I'm
also not sure if allowing sub-classes here is sound, given the reasoning
of my previous emails.

 * what about the object that a lazy proxy forwards state access to? Is
it cloned too?
Cloning an initialized proxy is the same as cloning the real instance (this
clones the real instance and returns it).

See my previous email.

After initialization, property accesses on the proxy are forwarded to
the actual instance.
Observing properties of the proxy has the same result as observing
properties of the actual instance.

This is some sort of "quantum locking" of both objects?

How hard is it to break this linkage?

Can properties be unset(), for example?

what happens to dynamic properties?

I don't use them myself, and I discourage their usage, but it
would be OK to just document the expected behavior

The linkage cannot be broken. unset() and dynamic properties are not
special, in that these are just property accesses. All property accesses on
the proxy are forwarded to the real instance.

The dynamic property bit is good, I had the same question. To rephrase:

Any access to a non-existent (i.e. dynamic) property will trigger
initialization and this is not preventable using
'skipLazyInitialization()' and 'setRawValueWithoutLazyInitialization()'
because these only work with known properties?

While dynamic properties are deprecated, this should be clearly spelled
out in the RFC for voters to make an informed decision.

ReflectionClass::newLazyProxy()
The factory should return a new object: the actual instance.

what happens if the user mis-implements the factory as function (object $proxy): object { return $proxy; }?

this is obviously a mistake on their end, but is it somehow
preventable?

Returning a lazy object (including an initialized proxy) is not allowed and
will throw. I've clarified this in the RFC (the RFC specified that
returning a lazy object was not allowed, but whether this included
initialized proxies was not clear).

Relatedly: In the 'resetAsLazyGhost()' explanation we have this sentence:

If the object is already lazy, a ReflectionException is thrown with
the message “Object is already lazy”.

What happens when calling the method on an initialized proxy object?
i.e. the following:

 class Obj { public function __construct(public string $name) {} }
 $obj1 = new Obj('obj1');
 $r->resetAsLazyProxy($obj1, ...);
 $r->initialize($obj1);
 $r->resetAsLazyProxy($obj1, ...);

What happens when calling it for the actual object of an initialized
proxy object? It's probably not possible to prevent this, but will this
allow for proxy chains? Example:

 class Obj { public function __construct(public string $name) {} }
 $obj1 = new Obj('obj1');
 $r->resetAsLazyProxy($obj1, function () use (&$obj2) {
     $obj2 = new Obj('obj2');
     return $obj2;
 });
 $r->resetAsLazyProxy($obj2, function () {
     return new Obj('obj3');
 });
 var_dump($obj1->name); // what will this print?

Best regards
Tim Düsterhus

1 year ago by Rob Landers — view source

I just noticed in the RFC that I don't see any mention of what happens when running get_class, get_debug_type, etc., on the proxies, but it does mention var_dump.

1 year ago by Arnaud Le Blanc — view source

Hi Tim,

flags should be a list<SomeEnumAroundProxies> instead. A bitmask for
a new API feels unsafe and anachronistic, given the tiny performance hit.

Unfortunately this leads to a 30% slowdown in newLazyGhost() when switching
to an array of enums, in a micro benchmark. I'm not sure how this would
impact a real application, but given this is a performance critical

I'm curious, what did the implementation look like? Is there a proof of
concept commit or patch available somewhere? As the author of the first
internal enum (Random\IntervalBoundary) I had the pleasure of finding
out that there was no trivial way to efficiently match the various enum
cases. See the PR review here:
https://github.com/php/php-src/pull/9679#discussion_r1045943444

I've benchmarked this implementation:
https://github.com/arnaud-lb/php-src/commit/f5f87d8a7abeba2f406407606949e5c6e512baab.
Using a backed enum to have a more direct way to map enum cases to
integers didn't make a significant difference.
Here is the benchmark:
https://gist.github.com/arnaud-lb/76f77b5d7409a9e4deea995c179c6e96.
Caching the options array between calls had a less dramatic slowdown
(around 10%): https://gist.github.com/arnaud-lb/87e7f58cc11463dd3aa098218eb95a90.

Best Regards,
Arnaud

1 year ago by tim@bastelstu.be — view source

Hi

I'm curious, what did the implementation look like? Is there a proof of
concept commit or patch available somewhere? As the author of the first
internal enum (Random\IntervalBoundary) I had the pleasure of finding
out that there was no trivial way to efficiently match the various enum
cases. See the PR review here:
https://github.com/php/php-src/pull/9679#discussion_r1045943444

I've benchmarked this implementation:
https://github.com/arnaud-lb/php-src/commit/f5f87d8a7abeba2f406407606949e5c6e512baab.
Using a backed enum to have a more direct way to map enum cases to
integers didn't make a significant difference.
Here is the benchmark:
https://gist.github.com/arnaud-lb/76f77b5d7409a9e4deea995c179c6e96.
Caching the options array between calls had a less dramatic slowdown
(around 10%): https://gist.github.com/arnaud-lb/87e7f58cc11463dd3aa098218eb95a90.

Your Gists don't seem to include the actual numbers, so the second link
is not particularly useful.

However you said that using a backed enum does not improve the
situation, so there probably really is not much that can be done.

For completeness I want to note, though: You might be able to improve
the type checking performance by directly checking the CE instead of
going through instanceof_function(), because you know that inheritance
is not a thing for enums. Not sure if this makes much of a difference,
given the fact that instanceof_function() already does this and is
force-inlined, but it might remove the branch with the call to
instanceof_function_slow().

Best regards
Tim Düsterhus

1 year ago by Gina P. Banyard — view source

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

Hello,

I don't have any strong opinions about the feature in general, mainly because I don't understand the problem space.

However, I have some remarks.

The fact that an initialize() method has a $skipInitializer parameter doesn't make a lot of sense to me.
Because at a glance, I don't see how passing true to it, and not calling the method is different?
This should probably be split into two distinct methods.

Does get_mangled_object_vars() trigger initialization or not?
This should behave like an (array) cast (and should be favoured instead of an array cast as it was introduced for that purpose).

What does a lazy object look like when it has been dumped?

The initializer must return null or no value
Technically all functions in PHP return a value, which by default is null, so this is somewhat redundant.
Also, would this throw a TypeError if a value other than null is returned?

Best regards,
Gina P. Banyard

1 year ago by Arnaud Le Blanc — view source

Hi Gina,

The fact that an initialize() method has a $skipInitializer parameter
doesn't make a lot of sense to me.
Because at a glance, I don't see how passing true to it, and not calling
the method is different?

Calling initialize() with $skipInitializer set to true has the same effect
as calling skipLazyInitialization() on all properties that are still lazy
on the object. This can be used to manually initialize a lazy object
afterwards, as property accesses will not trigger initialization anymore.
This also ensures that the initializer function can be decref'ed.
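A sketch of what the split-out method looks like in use. It assumes the PHP 8.4 name markLazyObjectAsInitialized() for what this thread calls markAsInitialized(); the Settings class and the $calls counter are illustrative.

```php
<?php
// Assumption: lazy objects as shipped in PHP 8.4; exit cleanly on older versions.
if (PHP_VERSION_ID < 80400) { exit("Lazy objects require PHP 8.4+\n"); }

class Settings
{
    public string $env = 'prod';
    public bool $debug = false;
}

$reflector = new ReflectionClass(Settings::class);
$calls = 0;

$settings = $reflector->newLazyGhost(function (Settings $s) use (&$calls) {
    $calls++;
    $s->env = 'from-initializer';
});

// Mark the object as initialized without ever running the initializer.
// Properties that were still lazy take their default values.
$reflector->markLazyObjectAsInitialized($settings);

var_dump($reflector->isUninitializedLazyObject($settings)); // bool(false)
var_dump($settings->env);                                   // string(4) "prod"
var_dump($calls);                                           // int(0)

// The object can now be populated manually, like any regular instance.
$settings->debug = true;
```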

This should probably be split into two distinct methods.

Agreed

Does get_mangled_object_vars() trigger initialization or not?

No. get_mangled_object_vars() and array cast are among the few cases that
do not trigger initialization. They are listed in
https://wiki.php.net/rfc/lazy-objects#initialization_triggers.

This should behave like an (array) cast (and should be favoured instead of
an array cast as it was introduced for that purpose).

Exactly

What does a lazy object look like when it has been dumped?

The output of var_dump() on a lazy object is the same as on an object all of
whose properties have been unset() (except those initialized with
setRawValueWithoutLazyInitialization() or skipLazyInitialization()). For
convenience we also prefix the output with either "lazy ghost" or "lazy proxy".

I've added a var_dump section in the RFC.

The initializer must return null or no value
Technically all functions in PHP return a value, which by default is
null, so this is somewhat redundant.

Agreed. I believe that formulating this like that makes it clear that any
of "return null;", "return;", or implicit return, will work.

Also, would this throw a TypeError if a value other than null is returned?

Agreed. Currently we throw an Error, but I will change that to TypeError.

Best Regards,
Arnaud

1 year ago by Levi Morrison — view source

On Tue, Jun 4, 2024 at 6:31 AM Nicolas Grekas
nicolas.grekas+php@gmail.com wrote:

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

I will vote no on this one. I do not believe the internal complexity
and maintenance is worth the feature. Additionally, I do not feel like
changing the language to support this feature is a good idea; if this
were a library only thing, I would not care.

1 year ago by Arnaud Le Blanc — view source

Hi Levi,

On Wed, Jun 26, 2024 at 12:07 AM Levi Morrison levi.morrison@datadoghq.com
wrote:

I will vote no on this one. I do not believe the internal complexity
and maintenance is worth the feature. Additionally, I do not feel like
changing the language to support this feature is a good idea; if this
were a library only thing, I would not care.

The proposed implementation is adding very little complexity as it's not
adding any special case outside of object handlers (except in json_encode()
and serialize() because these functions trade abstractions for speed).
Furthermore all operations that may trigger an object initialization are
already effectful, due to magic methods or hooks (so we are not making pure
operations effectful). This means that we do not have to worry about lazy
objects or to be aware of them anywhere in the code base, outside of object
handlers.

To give you an idea, it's implemented by hooking into the code path that
handles accesses to undefined properties. This code path may call __get or
__set methods if any, or trigger errors, and with this proposal, may
trigger the initialization. Userland implementations achieve this
functionality in a very similar way (with unset() and a generated sub-class
with magic methods), but they have considerably more edge cases to handle
due to being at a different abstraction level.
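For readers unfamiliar with the userland technique mentioned here, this is a deliberately minimal sketch of it. User and LazyUser are hypothetical names, and real libraries generate far more elaborate subclasses; the example runs on PHP 8.0+ and does not use the proposed API.

```php
<?php
declare(strict_types=1);

class User
{
    public string $name = 'anonymous';
}

// Hand-written stand-in for a generated proxy subclass (illustration only).
final class LazyUser extends User
{
    private ?Closure $initializer;

    public function __construct(Closure $initializer)
    {
        $this->initializer = $initializer;
        // Unsetting a declared property routes later reads through __get().
        unset($this->name);
    }

    public function __get(string $property): mixed
    {
        if ($this->initializer !== null) {
            $initializer = $this->initializer;
            $this->initializer = null; // run at most once
            $initializer($this);       // expected to assign the unset properties
        }

        return $this->$property;
    }
}

$calls = 0;
$user = new LazyUser(function (User $u) use (&$calls) {
    $calls++;
    $u->name = 'Alice';
});

echo $calls, "\n";      // 0: nothing has been loaded yet
echo $user->name, "\n"; // Alice: the first read triggered the initializer
echo $calls, "\n";      // 1
```

Even this small version shows some of the edge cases: it cannot extend a final class, it changes get_class(), a write before the first read silently bypasses the initializer, and a real implementation also has to cover __set, __isset, __unset, cloning and serialization.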

Best Regards,
Arnaud

1 year ago by Marco Pivetta — view source

Hey Arnaud,

The proposed implementation is adding very little complexity as it's not
adding any special case outside of object handlers (except in json_encode()
and serialize() because these functions trade abstractions for speed).
Furthermore all operations that may trigger an object initialization are
already effectful, due to magic methods or hooks (so we are not making pure
operations effectful). This means that we do not have to worry about lazy
objects or to be aware of them anywhere in the code base, outside of object
handlers.

To give you an idea, it's implemented by hooking into the code path that
handles accesses to undefined properties. This code path may call __get or
__set methods if any, or trigger errors, and with this proposal, may
trigger the initialization. Userland implementations achieve this
functionality in a very similar way (with unset() and a generated sub-class
with magic methods), but they have considerably more edge cases to handle
due to being at a different abstraction level.

Assuming this won't pass a vote (I hope it does, but I want to be
optimistic): is this something that could be implemented in an extension,
or is it only feasible in core?

Greets,

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

1 year ago by Rob Landers — view source

Hey Arnaud,

The proposed implementation is adding very little complexity as it's not adding any special case outside of object handlers (except in json_encode() and serialize() because these functions trade abstractions for speed). Furthermore all operations that may trigger an object initialization are already effectful, due to magic methods or hooks (so we are not making pure operations effectful). This means that we do not have to worry about lazy objects or to be aware of them anywhere in the code base, outside of object handlers.

To give you an idea, it's implemented by hooking into the code path that handles accesses to undefined properties. This code path may call __get or __set methods if any, or trigger errors, and with this proposal, may trigger the initialization. Userland implementations achieve this functionality in a very similar way (with unset() and a generated sub-class with magic methods), but they have considerably more edge cases to handle due to being at a different abstraction level.

Assuming this won't pass a vote (I hope it does, but I want to be optimistic): is this something that could be implemented in an extension, or is it only feasible in core?

Greets,

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

I really hope it passes, not just for your libraries but also for mine. I'm looking forward to going on a deletion spree and having a nice standardized proxy API.

— Rob

1 year ago by Arnaud Le Blanc — view source

Hi Marco,

Hey Arnaud,

The proposed implementation is adding very little complexity as it's not adding any special case outside of object handlers (except in json_encode() and serialize() because these functions trade abstractions for speed). Furthermore all operations that may trigger an object initialization are already effectful, due to magic methods or hooks (so we are not making pure operations effectful). This means that we do not have to worry about lazy objects or to be aware of them anywhere in the code base, outside of object handlers.

To give you an idea, it's implemented by hooking into the code path that handles accesses to undefined properties. This code path may call __get or __set methods if any, or trigger errors, and with this proposal, may trigger the initialization. Userland implementations achieve this functionality in a very similar way (with unset() and a generated sub-class with magic methods), but they have considerably more edge cases to handle due to being at a different abstraction level.

Assuming this won't pass a vote (I hope it does, but I want to be optimistic): is this something that could be implemented in an extension, or is it only feasible in core?

An extension could achieve similar behavior by decorating the default
object handlers. However, it may have to re-implement a significant
part of the object handlers logic, so that initialization is triggered
at the right time.

Best Regards,
Arnaud

1 year ago by Rob Landers — view source

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

Can you add to the RFC how to proxy final classes as well? This is mentioned (unless I misunderstood) but in the proxy example it shows the proxy class extending the proxied class (which I think is an error if the base class is final). How would this work? Or would it need to implement a shared interface (this is totally fine IMHO)?

— Rob

1 year ago by Arnaud Le Blanc — view source

Hi Rob,

Can you add to the RFC how to proxy final classes as well? This is mentioned (unless I misunderstood) but in the proxy example it shows the proxy class extending the proxied class (which I think is an error if the base class is final). How would this work? Or would it need to implement a shared interface (this is totally fine IMHO)?

The example you are referring to in the "About Proxies" section is a
digression about how the lazy-loading inheritance-proxy pattern could
be achieved on top of the lazy-loading state-proxy pattern implemented
by this RFC, but it doesn't represent the main use-case.

To proxy a final class with this RFC, you can simply call the
newLazyProxy method:

final class MyClass {
    public $a;
}

$reflector = new ReflectionClass(MyClass::class);
$obj = $reflector->newLazyProxy(function () {
    return new MyClass();
});

Best Regards,
Arnaud

1 year ago by come@chilliet.eu — view source

Le mardi 4 juin 2024, 14:28:53 UTC+2 Nicolas Grekas a écrit :

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for
over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

Hello,

I’m a bit late to the party, but after reading the RFC I still do not understand the difference between Ghost and Proxy.
I do understand the technical internal difference, I think, of 1 vs 2 objects, and the API difference of passing a constructor vs a factory function.
But I do not understand the difference from the caller's point of view: how would I choose between one and the other for my use case?

What does Proxy allow that Ghost does not?

Or does Proxy only make sense when returning a parent class from the factory method?

Côme

1 year ago by Marco Pivetta

Hey Côme,

What does Proxy allow that Ghost does not?

Ghosts work well when you need to use the object identity, such as:

$a === $b checks
spl_object_id(...)
SplObjectStorage

Ghost objects operate at property level, and can only ever work with
concrete classes.

IMO, proxies could also operate at the interface level (probably future scope),
and could be extended so that no concrete implementation is needed until the
initialization callback is reached.
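
The identity point above can be sketched as follows. This is a minimal illustration, not taken from the thread: it assumes the ReflectionClass::newLazyGhost(), newLazyProxy() and initializeLazyObject() methods as they shipped in PHP 8.4, and the Point class is hypothetical.

```php
<?php
// Hypothetical example class; not part of the RFC or this thread.
class Point {
    public function __construct(public int $x = 0) {}
}

$reflector = new ReflectionClass(Point::class);

// Ghost: the initializer populates the lazy object itself (one object).
$ghost = $reflector->newLazyGhost(function (Point $ghost): void {
    $ghost->__construct(1);
});

// Proxy: the initializer returns a separate real instance (two objects).
$proxy = $reflector->newLazyProxy(function (Point $proxy): Point {
    return new Point(2);
});

// initializeLazyObject() forces initialization and returns the backing
// instance: the ghost itself, or the proxy's real instance.
$ghostReal = $reflector->initializeLazyObject($ghost);
$proxyReal = $reflector->initializeLazyObject($proxy);

var_dump($ghostReal === $ghost); // bool(true): the ghost is the real object
var_dump($proxyReal === $proxy); // bool(false): the proxy wraps another object
```

With a ghost, every reference handed out before initialization still points at the one and only object, so ===, spl_object_id() and SplObjectStorage see a single identity. With a proxy, code that obtains the real instance directly sees a different identity than code holding the proxy.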

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

1 year ago by kontakt@beberlei.de

On 06.08.2024, 11:01:54, Côme Chilliet come@chilliet.eu wrote:

On Tuesday, 4 June 2024, 14:28:53 UTC+2, Nicolas Grekas wrote:

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for
over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

Hello,

I’m a bit late to the party, but after reading the RFC I still do not
understand the difference between Ghost and Proxy.
I do understand the technical internal difference, I think, of 1 vs 2
objects, and the API difference of passing a constructor vs a factory
function.
But I do not understand the difference from the caller's point of view: how
would I choose between one and the other for my use case?

What does Proxy allow that Ghost does not?

The primary use case for a proxy is when the class has a complex factory
method and you want to make the class lazy on top. From a pattern point of
view, you can think of the proxy as a decorator: it calls the complex factory
in the initializer and from then on delegates all calls to the resulting
instance.

Taking Doctrine ORM as an example:

$reflectionClass = new ReflectionClass(EntityManager::class);
$entityManager = $reflectionClass->newLazyProxy(function (EntityManager $proxy) use ($options) {
    return EntityManager::create($options);
});

This is not possible with a Ghost, because a Ghost does not delegate; it
"becomes the original class/object".
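
A self-contained variant of the factory case might look like the sketch below. It assumes ReflectionClass::newLazyProxy() as shipped in PHP 8.4; the Service class, its create() factory and the DSN string are hypothetical stand-ins for EntityManager and its options.

```php
<?php
// Hypothetical service that can only be built through a static factory.
final class Service {
    public static int $created = 0;

    private function __construct(public string $dsn) {}

    public static function create(string $dsn): self {
        self::$created++; // count factory calls to observe laziness
        return new self($dsn);
    }
}

$reflector = new ReflectionClass(Service::class);

// The proxy is created without calling the constructor or the factory.
$service = $reflector->newLazyProxy(
    fn (Service $proxy): Service => Service::create('sqlite::memory:')
);

var_dump(Service::$created); // int(0): the factory has not run yet
var_dump($service->dsn);     // first property access triggers the factory
var_dump(Service::$created); // int(1): the factory ran exactly once
```

The factory decides how the real instance is built, which is why the initializer returns an object rather than populating the one it receives. The sketch also shows that a final class with a private constructor can be proxied this way.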

Or does Proxy only make sense when returning a parent class from the factory
method?

Côme