Protected destructors

9 months ago by Jonathan Vollebregt — view source

unread

Hello all,

On github I reported a bug causing errors when objects go out of scope
with a protected/private destructor:
https://github.com/php/php-src/issues/16077

Example code: https://3v4l.org/bKijA

The magic methods documentation says:

All magic methods, with the exception of __construct(), __destruct(), and __clone(), must be declared as public

And true to its word __destruct can be declared protected/private
without a warning, unlike other magic methods.

Other magic methods however still work even if they're declared private:
https://3v4l.org/iCRSe

This lead to discussion around whether the engine should be fixed to
always successfully call __destruct or __destruct should warn when not
public like other magic methods, or contrarily warn when public, or
whether there should be special logic stopping manual calls to
__construct/__destruct to prevent double initialization as well...

But focusing on this specific issue I don't think these BC breaks are
even warranted. Since it doesn't work now making it work wouldn't be a
BC break unless you were relying on destructor call error handlers as
some kind of cursed event dispatcher.

A github search for "private function __destruct" found a whopping 298
hits. 296 for "protected function __destruct"

The biggest implementation problem I can think of is what to do about
class hierarchies. If you require all destructors to be protected (or
public) that puts the onus on the programmer to ensure the parent
destructor is called, but private destructors would be a headache.

It seems the protected but not private destructors idea was already
implemented for PHP 7 (https://3v4l.org/6DFGp) but fell victim to a
regression in 7.3.15/7.4.3 and no-one noticed because it was never added
to the changelog in the first place.

Long story short I'd like to suggest:

Allow the engine to call protected destructors (again)
Warning when declaring a private destructor as with other magic methods
Documentation update to confirm private destructors aren't allowed

Does this sound good?

Jonathan

9 months ago by Christoph M. Becker — view source

unread

Long story short I'd like to suggest:

Allow the engine to call protected destructors (again)

Warning when declaring a private destructor as with other magic methods

Documentation update to confirm private destructors aren't allowed

Does this sound good?

Hmm, I wonder about the use-cases of userland destructors. It seems to
me they are mostly useful for sanity checks, and maybe to close
resources. Are there others?

If not, I wouldn't worry much about the visibility of destructors,
because resources are scheduled for replacement anyway.

And since manually calling magic methods came up in the ticket: in my
opinion, whenever you have ->__ in production code, you're doing
something wrong.

Christoph

9 months ago by Jonathan Vollebregt — view source

unread

Hmm, I wonder about the use-cases of userland destructors. It seems to
me they are mostly useful for sanity checks, and maybe to close
resources. Are there others?

If not, I wouldn't worry much about the visibility of destructors,
because resources are scheduled for replacement anyway.

Besides closing resources and killing processes I've seen them store
data to disk for caching, remove temp files, call callbacks/dispatch
events, change state on other objects, dump stored errors to error_log
in a loop in an error handler...

It looks like there's quite a lot of use-cases for them (Which can go
wrong if called twice) that don't necessarily require resources to be
involved

9 months ago by Christoph M. Becker — view source

unread

Hmm, I wonder about the use-cases of userland destructors. It seems to
me they are mostly useful for sanity checks, and maybe to close
resources. Are there others?

If not, I wouldn't worry much about the visibility of destructors,
because resources are scheduled for replacement anyway.

Besides closing resources and killing processes I've seen them store
data to disk for caching, remove temp files, call callbacks/dispatch
events, change state on other objects, dump stored errors to error_log
in a loop in an error handler...

Okay. My point is that you cannot know (unless there are no circular
dependencies) when a destructor is called by the engine; it may be
called during some GC run, or during the request shutdown sequence. As
it's now, that happens pretty early during shutdown, but that might
change when stream resources are converted to objects. So you cannot be
absolutely sure that everything works as expected in destructors. This
is a general issue for garbage collected languages; some of these have
no destructors at all, for such reasons.

It looks like there's quite a lot of use-cases for them (Which can go
wrong if called twice) that don't necessarily require resources to be
involved

Like I said, I wouldn't be particularly worried about clients calling a
destructor manually (that's a bit different for the engine, since
segfaultish conditions should be avoided).

But I don't have a strong opinion about the visibility of destructors
anyway. I'm fine with allowing protected constructors.

Christoph

9 months ago by Rob Landers — view source

unread

Hmm, I wonder about the use-cases of userland destructors. It seems to
me they are mostly useful for sanity checks, and maybe to close
resources. Are there others?

If not, I wouldn't worry much about the visibility of destructors,
because resources are scheduled for replacement anyway.

Besides closing resources and killing processes I've seen them store
data to disk for caching, remove temp files, call callbacks/dispatch
events, change state on other objects, dump stored errors to error_log
in a loop in an error handler...

Okay. My point is that you cannot know (unless there are no circular
dependencies) when a destructor is called by the engine; it may be
called during some GC run, or during the request shutdown sequence. As
it's now, that happens pretty early during shutdown, but that might
change when stream resources are converted to objects. So you cannot be
absolutely sure that everything works as expected in destructors. This
is a general issue for garbage collected languages; some of these have
no destructors at all, for such reasons.

It’s undocumented, AFAIK, but the destructor behavior is pretty dependable atm. For example, local variables are almost always destructed at the end of the current (function) scope. I am not sure what streams have to do with GC, but I can’t see how streams would change this behavior.

It looks like there's quite a lot of use-cases for them (Which can go
wrong if called twice) that don't necessarily require resources to be
involved

Like I said, I wouldn't be particularly worried about clients calling a
destructor manually (that's a bit different for the engine, since
segfaultish conditions should be avoided).

But I don't have a strong opinion about the visibility of destructors
anyway. I'm fine with allowing protected constructors.

Christoph

Destructors should (IMHO) be public. Not necessarily because they can be called, but classes with destructors hint at underlying behavior when destructed. For performance, you might want to defer that by retaining a reference. If a class has a hidden destructor, you have to go read the code to find it.

That’s my 2¢

— Rob

9 months ago by Jonathan Vollebregt — view source

unread

Okay. My point is that you cannot know (unless there are no circular
dependencies) when a destructor is called by the engine;

The benefit of non-public visibility isn't when it's called, but how
many times it's called. If you can declare your destructor non-public
you can be confident it'll only be called once per instance (By the engine)

Or is there a scenario where the engine will call a destructor more than
once on the same instance?

Destructors should (IMHO) be public. Not necessarily because they can be called, but classes with destructors hint at underlying behavior when destructed. For performance, you might want to defer that by retaining a reference. If a class has a hidden destructor, you have to go read the code to find it.

Wouldn't you have to read the code to see if it had a public destructor too?

9 months ago by John Bafford — view source

unread

Okay. My point is that you cannot know (unless there are no circular
dependencies) when a destructor is called by the engine;

The benefit of non-public visibility isn't when it's called, but how many times it's called. If you can declare your destructor non-public you can be confident it'll only be called once per instance (By the engine)

Or is there a scenario where the engine will call a destructor more than once on the same instance?

Destructors should (IMHO) be public. Not necessarily because they can be called, but classes with destructors hint at underlying behavior when destructed. For performance, you might want to defer that by retaining a reference. If a class has a hidden destructor, you have to go read the code to find it.

Wouldn't you have to read the code to see if it had a public destructor too?

I would argue that, semantically, destructors should be private. You should never need to know if a class has a destructor, and you should never call it manually. The engine should automatically handle calling parent destructors when necessary.

If there really is some logic in the destructor that a user of a class might legitimately want to use, then that should be exposed in a separate method (with a more appropriate name, and an implementation that handles that it might be called more than once) and have the destructor call that method.

-John

9 months ago by Arnaud Le Blanc — view source

unread

Hi Christoph,

Besides closing resources and killing processes I've seen them store
data to disk for caching, remove temp files, call callbacks/dispatch
events, change state on other objects, dump stored errors to error_log
in a loop in an error handler...

Okay. My point is that you cannot know (unless there are no circular
dependencies) when a destructor is called by the engine; it may be
called during some GC run, or during the request shutdown sequence. As
it's now, that happens pretty early during shutdown, but that might
change when stream resources are converted to objects. So you cannot be
absolutely sure that everything works as expected in destructors. This
is a general issue for garbage collected languages; some of these have
no destructors at all, for such reasons.

I agree. To expand on this, I think that the use of destructors should
be discouraged for many reasons:

Objects with destructors slow down garbage collection due to
potential resurrections. It's not possible to detect resurrections, so
in the presence of destructors the GC has to forget most of its state,
run destructors, and restart.
They may be called in undefined order during GC and shutdown,
including after their dependencies have already been destroyed
They may cause concurrency issues (even in non-concurrent
programs) because they can be called at any time by the GC
They give a false sense of security as destructors will not be
called in case of fatal errors or crashes
Using them to manage non-memory resources (such as file
descriptors, temp files, ...) is not a good idea because the GC is
only triggered by memory metrics. The process may run out of file
descriptors before it triggers a garbage collection, for example.

Part of these problems would be solved by disabling the GC, but
ensuring that large code bases are free of cycles is not an easy task.

Java has deprecated destructors (finalizers) [2] and recommends using
Cleaners [1] instead. Cleaners would resolve the 1st issue, and
partially the 2nd one (for GC, not shutdown), but I'm not entirely
sure they are transposable. Ruby's ObjectSpace.define_finalizer is
similar to Cleaners, AFAIK, in that the finalizer must not reference
the object.

Some use-cases of destructors could be replaced with patterns like
Python's with() [3], Java's try-with [4], or Go's defer [5].

[1] https://docs.oracle.com/javase/9/docs/api/java/lang/ref/Cleaner.html
[2] https://docs.oracle.com/javase/9/docs/api/java/lang/Object.html#finalize--
[3] https://docs.python.org/3/reference/compound_stmts.html#with
[4] https://docs.oracle.com/javase/8/docs/technotes/guides/language/try-with-resources.html
[5] https://go.dev/doc/effective_go#defer

Best Regards,
Arnaud

9 months ago by K Sandvik — view source

unread

Some use-cases of destructors could be replaced with patterns like
Python's with() [3], Java's try-with [4], or Go's defer [5].

defer would be neat in PHP. --Kent

9 months ago by Larry Garfield — view source

unread

Some use-cases of destructors could be replaced with patterns like
Python's with() [3], Java's try-with [4], or Go's defer [5].

defer would be neat in PHP. --Kent

I would have said with() would be neat in PHP. :-)

--Larry Garfield

9 months ago by Rowan Tommins [IMSoP] — view source

unread

I would have said with() would be neat in PHP. :-)

I have been considering for a while proposing Context Managers [Python's with(), not to be confused with VisualBasic & JavaScript unrelated feature with the same keyword].

My primary example use case is safe database transactions, which I've seen implemented in PHP in two ways:

Callback style, where the code to run in a transaction has to be wrapped in a function, usually an anonymous closure. This is often cited as a use case for implicit capture in closures, but even with that it adds a layer of indirection, and changes the meaning of "return" and "yield" inside the wrapped block.
"Resource Acquisition Is Initialization" style, where the destructor rolls back the transaction if it hasn't been committed or rolled back manually. This requires fewer changes to the wrapped code, but as Arnaud points out, it's not 100% reliable / predictable in PHP, due to details of the GC.

Context Managers present a third option, where the code in the transaction remains a normal sequence of statements, but there is a more explicit guarantee about what will happen when the with{} block is exited. The Python design document has interesting background on what they included and excluded: https://peps.python.org/pep-0343/

C#'s "using statement" is similar, but explicitly designed for ensuring the correct "disposal" of an object rather than hooking entry to and exit from a "context": https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/statements/using

Regards,

Rowan Tommins
[IMSoP]

9 months ago by Larry Garfield — view source

unread

I would have said with() would be neat in PHP. :-)

I have been considering for a while proposing Context Managers
[Python's with(), not to be confused with VisualBasic & JavaScript
unrelated feature with the same keyword].

My primary example use case is safe database transactions, which I've
seen implemented in PHP in two ways:

Callback style, where the code to run in a transaction has to be
wrapped in a function, usually an anonymous closure. This is often
cited as a use case for implicit capture in closures, but even with
that it adds a layer of indirection, and changes the meaning of
"return" and "yield" inside the wrapped block.

"Resource Acquisition Is Initialization" style, where the destructor
rolls back the transaction if it hasn't been committed or rolled back
manually. This requires fewer changes to the wrapped code, but as
Arnaud points out, it's not 100% reliable / predictable in PHP, due to
details of the GC.

Context Managers present a third option, where the code in the
transaction remains a normal sequence of statements, but there is a
more explicit guarantee about what will happen when the with{} block is
exited. The Python design document has interesting background on what
they included and excluded: https://peps.python.org/pep-0343/

C#'s "using statement" is similar, but explicitly designed for ensuring
the correct "disposal" of an object rather than hooking entry to and
exit from a "context":
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/statements/using

Regards,

Rowan Tommins
[IMSoP]

I would support having Python-esque context managers in PHP, and would be happy to help make it happen (if research and English writing would be useful to whoever is doing the implementation).

--Larry Garfield

9 months ago by Arnaud Le Blanc — view source

unread

Hi Jonathan,

It seems the protected but not private destructors idea was already
implemented for PHP 7 (https://3v4l.org/6DFGp) but fell victim to a
regression in 7.3.15/7.4.3 and no-one noticed because it was never added
to the changelog in the first place.

Apparently the engine can successfully call non-public destructors
when in the right scope: https://3v4l.org/FEV2fT. This is intentional
behavior, as we have explicit visibility checks in the code that calls
destructors, and tests for this. This was implemented in the 5.0
development period at the same time as other work on magic method
visibility, so my interpretation is that it was likely done purely for
consistency. What changed in 7.3.15 is that local variables are now
destroyed after popping the current stack frame, which affects the
scope in which the object is destroyed in your snippet.

This is similar to __construct() and __clone() (whose visibility is
used to restrict who can perform the related operation), except that
the visibility of __destruct() doesn't really prevent the operation
from being performed: the object is still freed after the engine
failed to call the destructor. I'm not aware of use-cases for this.
Also it would be unreliable as the timing of destructor calls is not
always obvious or predictable.

Although the current behavior doesn't seem useful, the proposed
changes would only address the problem for destructors: We can not
apply the same trick to __construct() and __clone().

Best Regards,
Arnaud