Hi!
I'm currently in the process of cleaning up the error handling of the
new ext/random and now came across the implementation of __unserialize().
Looking through the source code for existing examples I found that the
following is used:
- ext/gmp: Exception
- ext/hash: Exception
- ext/date: Error
- ext/spl: UnexpectedValueException
- ext/random: Currently Exception.
… all of those with different error messages. unserialize()
itself will
emit a regular 'Notice' (yes, you read that right) when encountering
garbage data.
In the end I've opted to follow ext/date's lead and created a PR
changing ext/random to use an Error, as unserialization failures likely
mean that you are unserializing untrusted data which you should not do
in the first place. Also any errors during unserializing are not really
recoverable: You can't go and fix up the input string.
The PR can be found here:
https://github.com/php/php-src/pull/9185
During review cmb noted that using an 'Error' here might not be the best
choice, as while it is likely to be a programmer error if unserializing
fails, we do not want to educate users to catch(Error).
As the existing behavior already is inconsistent and the Notice really
should be upgraded to something … more alarming in the future, I'm
putting this topic up for discussion on the list.
My suggested path forward would be:
For 8.2:
- Update ext/random to use the ext/date wording (I like that one best):
"Invalid serialization data for <classname> object."
- Revert the change from Exception to Error in my ext/random PR #9185.
- Update ext/date to Exception (the Error for __unserialize() was added
in 8.2, but there already was an Error for __wakeup() before).
For whatever comes after 8.2:
- Add a new UnserializationFailureException extends Exception (or
whatever color the bikeshed should have). Any existing catch blocks
should still match, as Exception is more general. It would break for
ext/spl, though. Unless UnserializationFailureException extends
UnexpectedValueException. - Add a helper function that throws this Exception with a consistent
wording. - Upgrade
unserialize()
from Notice to UnserializationFailureException.
Opinions?
Best regards
Tim Düsterhus
Hi!
I'm currently in the process of cleaning up the error handling of the
new ext/random and now came across the implementation of __unserialize().Looking through the source code for existing examples I found that the
following is used:
- ext/gmp: Exception
- ext/hash: Exception
- ext/date: Error
- ext/spl: UnexpectedValueException
- ext/random: Currently Exception.
… all of those with different error messages.
unserialize()
itself will
emit a regular 'Notice' (yes, you read that right) when encountering
garbage data.In the end I've opted to follow ext/date's lead and created a PR
changing ext/random to use an Error, as unserialization failures likely
mean that you are unserializing untrusted data which you should not do
in the first place. Also any errors during unserializing are not really
recoverable: You can't go and fix up the input string.The PR can be found here:
https://github.com/php/php-src/pull/9185
During review cmb noted that using an 'Error' here might not be the best
choice, as while it is likely to be a programmer error if unserializing
fails, we do not want to educate users to catch(Error).
I probably have a very different philosophy on this as I don't mind users
catching Errors, but it probably be mostly done by Framework/libraries
doing some funky stuff.
And passing user input which can fail to unserialize is not really a
programming error and something that can be "expected" to happen, I agree
that an Exception does make more sense here.
As the existing behavior already is inconsistent and the Notice really
should be upgraded to something … more alarming in the future, I'm
putting this topic up for discussion on the list.My suggested path forward would be:
For 8.2:
- Update ext/random to use the ext/date wording (I like that one best):
"Invalid serialization data for <classname> object."
- Revert the change from Exception to Error in my ext/random PR #9185.
Those make sense to me
- Update ext/date to Exception (the Error for __unserialize() was added
in 8.2, but there already was an Error for __wakeup() before).
I'm not a super fan of this, both __wakeup and __unserialize should use the
same one.
As far as I can tell the reason __wakeup() is still defined is if someone
called it explicitly.
But in any case if someone was handing an unserialisation failure (that was
handled by __wakeup prior to 8.2) and caugh an Error explicitly this would
now break.
Thus I'd prefer handling this with the rest after 8.2
For whatever comes after 8.2:
- Add a new UnserializationFailureException extends Exception (or
whatever color the bikeshed should have). Any existing catch blocks
should still match, as Exception is more general. It would break for
ext/spl, though. Unless UnserializationFailureException extends
UnexpectedValueException.
As we would be breaking BC with ext/date anyway I think having it just
extend Exception would be fine
- Add a helper function that throws this Exception with a consistent
wording.- Upgrade
unserialize()
from Notice to UnserializationFailureException.
Notice to Exception might be a big jump, E_WARNING
definitely an then
promote to an Exception in 9.0 is probably more in line with what we did
for 7.x/8.0
George P. Banyard
Hi
During review cmb noted that using an 'Error' here might not be the best
choice, as while it is likely to be a programmer error if unserializing
fails, we do not want to educate users to catch(Error).I probably have a very different philosophy on this as I don't mind users
catching Errors, but it probably be mostly done by Framework/libraries
doing some funky stuff.
Which effectively translates into "don't catch(Error)". No rule without
exception :-)
In my userland code I also have some blanket catch(Throwable) around
generic code (i.e. arbitrary callables that can throw whatever type of
throwable).
My suggested path forward would be:
For 8.2:
- Update ext/random to use the ext/date wording (I like that one best):
"Invalid serialization data for <classname> object."
- Revert the change from Exception to Error in my ext/random PR #9185.
Those make sense to me
I've already made those changes (i.e. Exception + ext/date wording).
Feel free to add your approval if you agree:
https://github.com/php/php-src/pull/9185
For the other bits and pieces of your reply I don't have a strong
opinion and I'm happy with whatever results comes out of it.
Best regards
Tim Düsterhus
Hi
- Add a new UnserializationFailureException extends Exception (or
whatever color the bikeshed should have). Any existing catch blocks
should still match, as Exception is more general. It would break for
ext/spl, though. Unless UnserializationFailureException extends
UnexpectedValueException.As we would be breaking BC with ext/date anyway I think having it just
extend Exception would be fine
- Add a helper function that throws this Exception with a consistent
wording.- Upgrade
unserialize()
from Notice to UnserializationFailureException.Notice to Exception might be a big jump,
E_WARNING
definitely an then
promote to an Exception in 9.0 is probably more in line with what we did
for 7.x/8.0
I've created a PoC implementation here:
https://github.com/php/php-src/pull/9425.
-
Instead of requiring everyone to adjust their
__unserialize()
implementation, the PoC instead wraps any exception that is thrown
during unserialization into the UnserializationFailureException. That
way, classes can emit their own meaningful error messages, while
developers can just catchUnserializationFailedException
to handle
everything. -
I've noticed that 'unserialize()' already emits
E_WARNING
for some
types of error (e.g. out-of-bounds integers), so users already need to
be prepared forE_WARNING
to be emitted. I've adjusted the
aforementioned Notice to Warning, but we might be able to directly jump
to UnserializationFailedException from the existing warnings?
Of course I also plan to create a proper RFC for this, once PHP 8.2 RC
dust has settled a bit.
Best regards
Tim Düsterhus
- I've noticed that 'unserialize()' already emits
E_WARNING
for some types of error (e.g. out-of-bounds integers), so users already need to be prepared forE_WARNING
to be emitted. I've adjusted the aforementioned Notice to Warning, but we might be able to directly jump to UnserializationFailedException from the existing warnings?
IMHO, any change from Warning or lower to Exception or Error is a clear Breaking Change, because it can make a program that runs successfully under one version abort mid-process in another.
So, while it probably makes sense for all serialisation errors to consistently throw, that should be planned for 9.0, and everything that's not already an exception could raise a consistently formatted E_WARNING
in 8.3.
Regards,
--
Rowan Tommins
[IMSoP]
Hi
- I've noticed that 'unserialize()' already emits
E_WARNING
for some types of error (e.g. out-of-bounds integers), so users already need to be prepared forE_WARNING
to be emitted. I've adjusted the aforementioned Notice to Warning, but we might be able to directly jump to UnserializationFailedException from the existing warnings?IMHO, any change from Warning or lower to Exception or Error is a clear Breaking Change, because it can make a program that runs successfully under one version abort mid-process in another.
While this is technically correct, I am not sure if there is actually a
case where a script that is behaving correctly for the current version
will break (i.e. everything that will break is already subtly broken).
-
If you are unserializing only trusted data from a compatible PHP
version (compatible PHP version, because the serialization output might
differ, e.g. for 'C:' and 'O:'), thenunserialize()
should not fail /
not emit any notices or warnings. -
If you are not in the situation of (1), then
unserialize()
might
already throw arbitrary Throwables for some of the inputs, as the
implementation of__unserialize()
or__wakeup()
might throw:
Best regards
Tim Düsterhus