Hi internals,
I'd like to propose a new custom object serialization mechanism intended to
replace the broken Serializable interface:
https://wiki.php.net/rfc/custom_object_serialization
This was previously discussed in https://externals.io/message/98834;
this just brings it into RFC form. The latest motivation for this is
https://bugs.php.net/bug.php?id=77302, a compatibility issue in 7.3
affecting Symfony, caused by Serializable. We can't fix Serializable, but
we can at least make sure that a working alternative exists.
Regards,
Nikita
Not really fussed about having another implicit interface for serialization
(via magic methods).
Wouldn't a new interface make this clear, explicit, and make the
deprecation path easier (together with the migration)?
Marco Pivetta
On 24.01.2019 at 15:09, Marco Pivetta wrote:
> Not really fussed about having another implicit interface for serialization
> (via magic methods).
I second that emotion.
On Thu, Jan 24, 2019 at 15:18, Sebastian Bergmann sebastian@php.net wrote:
> On 24.01.2019 at 15:09, Marco Pivetta wrote:
> > Not really fussed about having another implicit interface for
> > serialization (via magic methods).
> I second that emotion.
The more I think about it, the more I'm convinced we should not add an
interface for that.
An interface defines a semantics - an API to a domain - that people
can type-hint for to get a specific implementation of that API.
Here, neither aspect is desired: we don't want people to type-hint for
e.g. Serializable - and too bad it exists, because I've already seen people
think: "hey, I'll type-hint or extend it to express I want a serializable
thing". BUT that's not the contract of Serializable or any variant of it,
because: 1. any PHP object is serializable; 2. Serializable is indeed
allowed to throw an exception precisely to forbid an object from being
serialized.
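To make the two points above concrete, here is a minimal sketch. The class names Point and Connection are hypothetical, and the hook name __serialize() is assumed from the RFC (it only takes effect on engines that implement the proposal, i.e. PHP 7.4+):

```php
<?php
// Hypothetical classes, for illustration only.

// Point 1: a plain class with no interface and no magic methods
// still round-trips through serialize()/unserialize().
final class Point
{
    public $x;
    public $y;

    public function __construct(int $x, int $y)
    {
        $this->x = $x;
        $this->y = $y;
    }
}

$copy = unserialize(serialize(new Point(1, 2)));
var_dump($copy instanceof Serializable); // bool(false): the interface plays no part

// Point 2: a serialization hook may refuse outright by throwing.
// Assumption: __serialize() is the hook proposed by the RFC (PHP 7.4+).
final class Connection
{
    public function __serialize(): array
    {
        throw new LogicException('Connection must not be serialized');
    }
}

try {
    serialize(new Connection());
    $refused = false;
} catch (LogicException $e) {
    $refused = true; // the hook vetoed serialization
}
```

So a type-hint on a serialization interface would neither be needed for an object to be serializable nor guarantee that serializing it succeeds.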
From this reasoning, I conclude that we really want magic methods here
because what we need is a behavior. We want to hook into the engine to
benefit from some special features it provides. That's what all magic
methods are about, and hooking into serialize()/unserialize() is no
different.
The parallel with the Symfony Serializer is interesting but it stops there:
we don't need any help from core to build it. We should not aim for that
goal IMHO, as it blurs the lines of what we need: a core primitive to
build more useful things.
Magic methods would provide just that, without polluting the semantics
space.
My 2cts,
Nicolas
On Sun, Jan 27, 2019 at 5:37 PM Nicolas Grekas nicolas.grekas+php@gmail.com
wrote:
> On Thu, Jan 24, 2019 at 15:18, Sebastian Bergmann sebastian@php.net wrote:
> > On 24.01.2019 at 15:09, Marco Pivetta wrote:
> > > Not really fussed about having another implicit interface for
> > > serialization (via magic methods).
> > I second that emotion.
> The more I think about it, the more I'm convinced we should not add an
> interface for that.
> An interface defines a semantics - an API to a domain - that people
> can type-hint for to get a specific implementation of that API.
An interface declares an API surface for an object: in this case, the
consumer would be serialize() or unserialize(), and that's a perfectly
valid use-case scenario.
> Here, neither aspect is desired: we don't want people to type-hint for
> e.g. Serializable - and too bad it exists, because I've already seen people
> think: "hey, I'll type-hint or extend it to express I want a serializable
> thing".
That's actually a very correct thing to do: by declaring that something is
Serializable, you are expressing your intent to anybody inspecting the
structure of the object.
> BUT that's not the contract of Serializable or any variant of it,
> because: 1. any PHP object is serializable; 2. Serializable is indeed
> allowed to throw an exception precisely to forbid an object from being
> serialized.
That's a completely different problem, which is that PHP has no way to
declare APIs as functionally pure, or exception-less. That's something to
be explored, in my opinion, but the lack of it does not warrant dismissing
interfaces altogether (your current argument).
Serializability is something to be declared: currently, PHP is very much
ill-designed on this particular scope, but that doesn't mean that we should
make it even worse.
> From this reasoning, I conclude that we really want magic methods here
> because what we need is a behavior. We want to hook into the engine to
> benefit from some special features it provides. That's what all magic
> methods are about, and hooking into serialize()/unserialize() is no
> different.
Yet more magic methods just add an impressive amount of complexity to the
language (I'm already thinking of the permutations of edge cases that I'm
personally going to have to write).
It's yet another way to define something that would work correctly if an
interface (existing mechanism defined by the language) was used instead.
> Magic methods would provide just that, without polluting the semantics
> space.
The newly introduced magic methods pollute the entire language
semantics: now you have another edge case in the language, instead of an
interface that works in combination with the serialize() and
unserialize() functions.
I'll also add that the problem being solved is much smaller than the issues
introduced by the proposed additions.
Marco Pivetta
(I should have reviewed myself in the first message - sentences edited in
this reply)
On Mon, Jan 28, 2019 at 08:58, Marco Pivetta ocramius@gmail.com wrote:
> On Sun, Jan 27, 2019 at 5:37 PM Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:
> > On Thu, Jan 24, 2019 at 15:18, Sebastian Bergmann sebastian@php.net wrote:
> > > On 24.01.2019 at 15:09, Marco Pivetta wrote:
> > > > Not really fussed about having another implicit interface for
> > > > serialization (via magic methods).
> > > I second that emotion.
> > The more I think about it, the more I'm convinced we should not add an
> > interface for that.
> > An interface defines a semantics - an API to a domain - that people
> > can type-hint for to get a specific implementation of that API.
> An interface declares an API surface for an object: in this case, the
> consumer would be serialize() or unserialize(), and that's a perfectly
> valid use-case scenario.
Implementing serialize/unserialize methods is meant to hook into the
same-name functions.
That's a very technical and concrete thing - a behavior - not anything
abstract, as an interface is.
> > Here, neither aspect is desired: we don't want people to type-hint for
> > e.g. Serializable - and too bad it exists, because I've already seen people
> > think: "hey, I'll type-hint or extend it to express I want a serializable
> > thing".
> That's actually a very correct thing to do: by declaring that something is
> Serializable, you are expressing your intent to anybody inspecting the
> structure of the object.
"Intent" is the issue here: there is no such abstract thing here. The PHP
engine is a very concrete object that provides behaviors to userland, and
we need a proper hook to configure the behavior of the
serialize/unserialize functions. There is no such thing as
interface/intent/abstraction here, just a hook, and that's perfect because
the engine should define as little semantics as possible: that's the job of
userland!
> > BUT that's not the contract of Serializable or any variant of it,
> > because: 1. any PHP object is serializable; 2. Serializable is indeed
> > allowed to throw an exception precisely to forbid an object from being
> > serialized.
> That's a completely different problem, which is that PHP has no way to
> declare APIs as functionally pure, or exception-less. That's something to
> be explored, in my opinion, but the lack of it does not warrant dismissing
> interfaces altogether (your current argument).
You completely ignored 1.: any object is serializable. That's another hint
that there isn't any abstraction here: if Serializable were one, it would be
like a small wall in a wide field.
The only way we could make "Serializable2" an abstraction is by forbidding
any object that does not implement it from being serializable. That's
unrealistic - and unneeded, to me.
> Serializability is something to be declared: currently, PHP is very much
> ill-designed on this particular scope, but that doesn't mean that we should
> make it even worse.
> > From this reasoning, I conclude that we really want magic methods here
> > because what we need is a behavior. We want to hook into the engine to
> > benefit from some special features it provides. That's what all magic
> > methods are about, and hooking into serialize()/unserialize() is no
> > different.
> Yet more magic methods just add an impressive amount of complexity to the
> language (I'm already thinking of the permutations of edge cases that I'm
> personally going to have to write).
Please do, I'd be happy to better understand your pov.
> It's yet another way to define something that would work correctly if an
> interface (existing mechanism defined by the language) was used instead.
It would not abstract anything, and thus would just be broken, illusory
syntactic sugar.
> > Magic methods would provide just that, without polluting the semantics
> > space.
> The newly introduced magic methods pollute the entire language
> semantics: now you have another edge case in the language, instead of an
> interface that works in combination with the serialize() and
> unserialize() functions.
"Language semantics" is another thing. I'm talking about "domain
semantics". See reasoning above :)
> I'll also add that the problem being solved is much smaller than the issues
> introduced by the proposed additions.
I promise the contrary: my personal experience is that Serializable does
real harm (please work on https://github.com/symfony/symfony/issues/29951
and related issues to get the feeling).
Magic methods would solve all this crap.
Cheers,
Nicolas
On Jan 28, 2019 at 08:58, Marco Pivetta ocramius@gmail.com wrote:
> > Here, neither aspect is desired: we don't want people to type-hint for
> > e.g. Serializable - and too bad it exists, because I've already seen people
> > think: "hey, I'll type-hint or extend it to express I want a serializable
> > thing".
> That's actually a very correct thing to do: by declaring that something is
> Serializable, you are expressing your intent to anybody inspecting the
> structure of the object.
This interface (as well as JsonSerializable) is incorrectly named. If I believe the manual, it is an "interface for customized serializing"; therefore it should have been called CustomizedSerialization or something like that.
If we use a new interface, at the very least, let its name match its function (or its function match its name).
—Claude
On 2019-01-24 at 15:09, Marco Pivetta wrote:
> Not really fussed about having another implicit interface for serialization
> (via magic methods).
> Wouldn't a new interface make this clear, explicit, and make the
> deprecation path easier (together with the migration)?
How important is the following statement from the RFC regarding magic
methods vs Interfaces?
- "Using an interface instead requires either raising the version
requirement to PHP 7.4, or dealing with the definition of a stub
interface in a compatible manner."
Sounds like magic methods would make it easier to adopt this feature in
code/libraries targeting 7.x versions.
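A rough sketch of why that might be: the class below declares the hooks under the names proposed in the RFC (__serialize() and __unserialize(), assumed here) and implements no new interface. Money is a hypothetical class. On an engine without the feature the two methods are just ordinary methods that nobody calls, so the class still loads and falls back to default property serialization; a class implementing a not-yet-existing interface would fail to load unless a stub were defined.

```php
<?php
// Hypothetical value object, for illustration only.
final class Money
{
    private $amount;
    private $currency;

    public function __construct(int $amount, string $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    // Assumed hook names from the RFC. Engines that do not know them
    // ignore these methods and serialize the properties directly.
    public function __serialize(): array
    {
        return ['a' => $this->amount, 'c' => $this->currency];
    }

    public function __unserialize(array $data): void
    {
        $this->amount = $data['a'];
        $this->currency = $data['c'];
    }

    public function format(): string
    {
        return $this->amount . ' ' . $this->currency;
    }
}

// Round-trips either way: through the hooks where they are supported,
// through default property serialization where they are not.
$restored = unserialize(serialize(new Money(500, 'EUR')));
echo $restored->format(), "\n"; // prints "500 EUR"
```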
r//Björn Larsson
Hi Nikita,
https://wiki.php.net/rfc/custom_object_serialization
In the RFC, you mention that "Executing arbitrary code in the middle of
unserialization is dangerous and has led to numerous unserialize()
vulnerabilities in the past. For this reason __wakeup() calls are now
delayed until the end of unserialization."
How about destructors?
Some vulnerabilities come from destructors doing things with unserialized
state.
Would it be possible/a good idea to not call any destructors unless the
"wakeup" stage has been successful? Any exceptions thrown during
__wakeup/__unserialize would mean the unserialized data structure should be
destroyed without calling any destructors?
WDYT?
Nicolas
On Wed, Jan 30, 2019 at 10:20 AM Nicolas Grekas <
nicolas.grekas+php@gmail.com> wrote:
> Hi Nikita,
> https://wiki.php.net/rfc/custom_object_serialization
> In the RFC, you mention that "Executing arbitrary code in the middle of
> unserialization is dangerous and has led to numerous unserialize()
> vulnerabilities in the past. For this reason __wakeup() calls are now
> delayed until the end of unserialization."
> How about destructors?
> Some vulnerabilities come from destructors doing things with unserialized
> state.
> Would it be possible/a good idea to not call any destructors unless the
> "wakeup" stage has been successful? Any exceptions thrown during
> __wakeup/__unserialize would mean the unserialized data structure should be
> destroyed without calling any destructors?
> WDYT?
This is already how it works. If a class has __wakeup and unserialization
fails (or call of __wakeup fails), then we will not call the destructor.
(The same would be true for __unserialize under this proposal.)
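A small sketch of how one might observe this, using a hypothetical TempFile class whose __wakeup() always rejects the payload. If the behaviour is as described above, the counter should stay at zero after the failed unserialize() call:

```php
<?php
// Hypothetical class, for illustration only.
final class TempFile
{
    public static $destructorCalls = 0;
    public $path = '/tmp/example';

    public function __wakeup()
    {
        // Reject the payload, as a validation hook might.
        throw new UnexpectedValueException('refusing to wake up');
    }

    public function __destruct()
    {
        self::$destructorCalls++;
    }
}

// The temporary object is destructed once serialize() returns,
// so reset the counter to count only what unserialize() triggers.
$payload = serialize(new TempFile());
TempFile::$destructorCalls = 0;

try {
    unserialize($payload);
} catch (UnexpectedValueException $e) {
    // __wakeup() failed; the half-built object is discarded.
}

echo TempFile::$destructorCalls, "\n";
```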
Nikita