Allow exceptions in __toString()

10 years ago by Nikita Popov — view source

unread

Hi internals!

Currently it's not possible to throw an exception from __toString() -
instead you will get a (real) fatal error. This is becoming more and more
problematic as time goes on, e.g. with PHP 7 an exception can be triggered
as a result of a VM error or even a parse error.

I'd like to lift this restriction. A patch can be found here:
https://github.com/php/php-src/pull/1364

Apart from allowing exceptions, the patch also converts two leftover
recoverable fatal errors relating to __toString() into Error exceptions.

Furthermore the patch makes sure that we correctly (i.e. without leaks and
without superfluous notices or other side-effects) handle __toString()
exceptions in the engine. This includes concatenation and interpolation,
but also things like writing $foo->$object or ${$object}.
zend_parse_parameters() and Fast ZPP also deal with exceptions correctly.

However what it doesn't do, and what I wouldn't consider feasible to do, is
ensure that every single string conversion in library functions is
exception safe. Personally I don't think this is a blocking issue, as the
worst that can happen is usually an additional superfluous warning to be
thrown, or something similar. If cases like this turn up, we can
specifically target them.

It should also be noted that whatever issues with exceptions-safety may
remain, they already exist now (plus those the patch fixes), because the
two aforementioned recoverable fatal errors can be converted to exceptions
(and anyone doing blanket ErrorException conversions will do this).

So basically the question here is, is this partial solution acceptable for
us?

Thanks,
Nikita

10 years ago by johannes@schlueters.de — view source

unread

However what it doesn't do, and what I wouldn't consider feasible to do, is
ensure that every single string conversion in library functions is
exception safe. Personally I don't think this is a blocking issue, as the
worst that can happen is usually an additional superfluous warning to be
thrown, or something similar. If cases like this turn up, we can
specifically target them.

I don't agree to the assesment that this isn'T a problem. Consider this
extension pseudo-code:

zval *data = get_data();
convert_to_string(data);
store_into_database(Z_STRVAL_P(data));
return TRUE;

This will store wrong data in the database and report to the user that
there was an error before storing, so the user assumes nothing was
stored.

The only way we can do that imo is by bailing out if an exception is
thrown and introducing yet another set of string conversion functions
for marking exception safe uses. (i.e. convert_to_string_throwing())

johannes

10 years ago by Dmitry Stogov — view source

unread

On Thu, Jun 25, 2015 at 4:55 PM, Johannes Schlüter johannes@schlueters.de
wrote:

However what it doesn't do, and what I wouldn't consider feasible to do,
is
ensure that every single string conversion in library functions is
exception safe. Personally I don't think this is a blocking issue, as the
worst that can happen is usually an additional superfluous warning to be
thrown, or something similar. If cases like this turn up, we can
specifically target them.

I don't agree to the assesment that this isn'T a problem. Consider this
extension pseudo-code:

zval *data = get_data();
convert_to_string(data);
store_into_database(Z_STRVAL_P(data));
return TRUE;

This will store wrong data in the database and report to the user that
there was an error before storing, so the user assumes nothing was
stored.

We have hundreds such cases :(

The only way we can do that imo is by bailing out if an exception is
thrown and introducing yet another set of string conversion functions
for marking exception safe uses. (i.e. convert_to_string_throwing())

This may lead to memory and resource leaks.

In general, we may throw exceptions, only from "safe" places, but this
going to make a mesh.

Thanks. Dmitry.

johannes

9 years ago by Nikita Popov — view source

unread

On Thu, Jun 25, 2015 at 3:55 PM, Johannes Schlüter johannes@schlueters.de
wrote:

However what it doesn't do, and what I wouldn't consider feasible to do,
is
ensure that every single string conversion in library functions is
exception safe. Personally I don't think this is a blocking issue, as the
worst that can happen is usually an additional superfluous warning to be
thrown, or something similar. If cases like this turn up, we can
specifically target them.

I don't agree to the assesment that this isn'T a problem. Consider this
extension pseudo-code:

zval *data = get_data();
convert_to_string(data);
store_into_database(Z_STRVAL_P(data));
return TRUE;

This will store wrong data in the database and report to the user that
there was an error before storing, so the user assumes nothing was
stored.

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you can
end up in this situation, the two most common being:

a) Error handlers can (and often do) throw. E.g. it is possible to convert
the two recoverable fatal errors that __toString() currently throws into
exceptions and end up in the situation you describe, only with even less
safety and more leaks. However this applies to all occurrences of
zend_error or php_error_docref that generate a non-fatal error. From a
quick grep I estimate we have about 4000 places in the codebase where this
can occur and I'm sure many of them do not check for EG(exception)
immediately after throwing the error.

b) Destructors can throw. We have approximately 2500 places in php-src
destroying a zval, which could potentially throw. I don't think I've ever
seen an EG(exception) check after a zval_ptr_dtor().

Given that we rely on best-effort exception handling in so many cases, it's
not clear to me why exceptions thrown from __toString() should be expressly
forbidden.

If you want to protect against side-effects after exceptions in critical
places like database interaction, we can introduce exception checks right
before the operation is executed and prevent anything bad from happening
regardless of where the exception came from. But as said, I think this is
only tangential to the question of __toString(), as this is a general issue.

Nikita

9 years ago by Stanislav Malyshev — view source

unread

Hi!

a) Error handlers can (and often do) throw. E.g. it is possible to convert
the two recoverable fatal errors that __toString() currently throws into
exceptions and end up in the situation you describe, only with even less
safety and more leaks. However this applies to all occurrences of
zend_error or php_error_docref that generate a non-fatal error. From a
quick grep I estimate we have about 4000 places in the codebase where this
can occur and I'm sure many of them do not check for EG(exception)
immediately after throwing the error.

b) Destructors can throw. We have approximately 2500 places in php-src
destroying a zval, which could potentially throw. I don't think I've ever
seen an EG(exception) check after a zval_ptr_dtor().

You don't usually use the value being destroyed after zval_ptr_dtor().
You could, however, use the value that was supposed to be converted
after __toString, and you probably are if you're converting it. The
problem is not that exception is thrown, the problem is that the code
assumes there's some workable value and there's none or at least not the
value it expected.

With error handlers, it is trickier but the code that produces the error
usually does not expect everything to be fine after the error handler is
done - on the contrary, it explicitly expects it not to be fine, since
it just reported that something is not fine by invoking the error
handler! Of course, we have there all kinds of issues with error
handlers messing with the environment, but it's much harder to cause
this problem by accident.

The problem, as I see it, is not the throwing of exception by itself,
but the fact that the code assumes the defined state of the variable
being converted, but when exception is thrown, we don't get that state,
but we also don't know something bad happened.

Of course, we could do something like set the value to empty string, but
since again there's no check there the code would just assume we've got
legitimate conversion to empty string - which is not the same as "we've
got error" and could lead to very unexpected consequences (such as
somebody erroneously supplying object as a password and system treating
it as empty password).

If you want to protect against side-effects after exceptions in critical
places like database interaction, we can introduce exception checks right
before the operation is executed and prevent anything bad from happening

I don't see how this is possible - we have no idea which places can be
"critical" - any interaction with now-undefined variable could lead to a
disaster downstream.

Stas Malyshev
smalyshev@gmail.com

9 years ago by johannes@schlueters.de — view source

unread

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you
can end up in this situation, the two most common being:

I summarize this a bit freely as "We already have places which are hard
to understand an predict, so let's add more!"
We should improve this but clearer rules.

johannes

9 years ago by Nikita Popov — view source

unread

On Sun, Jul 5, 2015 at 10:20 PM, Johannes Schlüter johannes@schlueters.de
wrote:

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you
can end up in this situation, the two most common being:

I summarize this a bit freely as "We already have places which are hard
to understand an predict, so let's add more!"

That, or "Let's not be hypocrites".

Nikita

9 years ago by Stanislav Malyshev — view source

unread

Hi!

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you
can end up in this situation, the two most common being:

I summarize this a bit freely as "We already have places which are hard
to understand an predict, so let's add more!"

That, or "Let's not be hypocrites".

I don't think the position of "let's not add more unpredictable behavior
to the engine" is adequately described as "hypocrite". We know there are
interruption problems and such in the engine, we know some of them are
hard to fix. However, adding a different class of those doesn't make it
better in any way. If we find a solution that allows us to not introduce
new unpredictable behaviors - I'm all for it, it's great. Maybe the code
that allows to throw exceptions the same way we do with errors, maybe
some way to not get unpredictable results on conversions. But we can't
just ignore it because there are other problems in the engine too.

Stas Malyshev
smalyshev@gmail.com

9 years ago by Nikita Popov — view source

unread

On Sun, Jul 5, 2015 at 10:33 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:

Hi!

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you
can end up in this situation, the two most common being:

I summarize this a bit freely as "We already have places which are hard
to understand an predict, so let's add more!"

That, or "Let's not be hypocrites".

I don't think the position of "let's not add more unpredictable behavior
to the engine" is adequately described as "hypocrite". We know there are
interruption problems and such in the engine, we know some of them are
hard to fix. However, adding a different class of those doesn't make it
better in any way. If we find a solution that allows us to not introduce
new unpredictable behaviors - I'm all for it, it's great. Maybe the code
that allows to throw exceptions the same way we do with errors, maybe
some way to not get unpredictable results on conversions. But we can't
just ignore it because there are other problems in the engine too.

Sorry, I just find it somewhat weird that we consider some of the
exceptions generated as the result of __toString calls to be okay (error
handler exceptions), while others are considered to be too dangerous to
allow (direct exceptions).

But maybe I just approached this from the wrong direction. Instead of
allowing exceptions in __toString in combination with some hardening of the
VM, maybe we should just convert the recoverable fatal errors __toString
currently throws to actual fatal errors? Not a huge fan of doing that, but
at least it's consistent.

Nikita

9 years ago by Nikita Popov — view source

unread

On Sun, Jul 5, 2015 at 10:33 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:

Hi!

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you
can end up in this situation, the two most common being:

I summarize this a bit freely as "We already have places which are hard
to understand an predict, so let's add more!"

That, or "Let's not be hypocrites".

I don't think the position of "let's not add more unpredictable behavior
to the engine" is adequately described as "hypocrite". We know there are
interruption problems and such in the engine, we know some of them are
hard to fix. However, adding a different class of those doesn't make it
better in any way. If we find a solution that allows us to not introduce
new unpredictable behaviors - I'm all for it, it's great. Maybe the code
that allows to throw exceptions the same way we do with errors, maybe
some way to not get unpredictable results on conversions. But we can't
just ignore it because there are other problems in the engine too.

Sorry, I just find it somewhat weird that we consider some of the
exceptions generated as the result of __toString calls to be okay (error
handler exceptions), while others are considered to be too dangerous to
allow (direct exceptions).

But maybe I just approached this from the wrong direction. Instead of
allowing exceptions in __toString in combination with some hardening of the
VM, maybe we should just convert the recoverable fatal errors __toString
currently throws to actual fatal errors? Not a huge fan of doing that, but
at least it's consistent.

I just realized that this wouldn't quite cut it, because our object to
integer and object to float conversions currently throw notices, which can
be converted to exceptions and to-int conversions at least are all over the
place. Presumably it's not acceptable to change these notices to fatal
errors.

Furthermore the array-to-string conversion also throws a notice since 5.4,
which can also result in an exception. While we approved an RFC (not
implemented) to make this a recoverable error, I suspect making this a real
fatal error will not be met with much enthusiasm.

Nikita

9 years ago by Bob Weinand — view source

unread

Am 05.07.2015 um 23:45 schrieb Nikita Popov nikita.ppv@gmail.com:

On Sun, Jul 5, 2015 at 10:33 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:

Hi!

I can see your concern here -- however this is nothing specific to
exceptions thrown from __toString(). There is are a number of ways you
can end up in this situation, the two most common being:

I summarize this a bit freely as "We already have places which are hard
to understand an predict, so let's add more!"

That, or "Let's not be hypocrites".

I don't think the position of "let's not add more unpredictable behavior
to the engine" is adequately described as "hypocrite". We know there are
interruption problems and such in the engine, we know some of them are
hard to fix. However, adding a different class of those doesn't make it
better in any way. If we find a solution that allows us to not introduce
new unpredictable behaviors - I'm all for it, it's great. Maybe the code
that allows to throw exceptions the same way we do with errors, maybe
some way to not get unpredictable results on conversions. But we can't
just ignore it because there are other problems in the engine too.

Sorry, I just find it somewhat weird that we consider some of the
exceptions generated as the result of __toString calls to be okay (error
handler exceptions), while others are considered to be too dangerous to
allow (direct exceptions).

But maybe I just approached this from the wrong direction. Instead of
allowing exceptions in __toString in combination with some hardening of the
VM, maybe we should just convert the recoverable fatal errors __toString
currently throws to actual fatal errors? Not a huge fan of doing that, but
at least it's consistent.

I just realized that this wouldn't quite cut it, because our object to
integer and object to float conversions currently throw notices, which can
be converted to exceptions and to-int conversions at least are all over the
place. Presumably it's not acceptable to change these notices to fatal
errors.

Furthermore the array-to-string conversion also throws a notice since 5.4,
which can also result in an exception. While we approved an RFC (not
implemented) to make this a recoverable error, I suspect making this a real
fatal error will not be met with much enthusiasm.

Nikita

We have a rather messy behavior here…

Failed object -> string is an error (if not caused by exception: E_RECOVERABLE)
Everything else a notice with return value 1?

So, how are we handling code which likes to convert recoverables into exceptions? That's not that rare.
Sure… we're passing the empty string forward? and the code calling zval_get_string() is continued…

We already have that behavior. Can we please add consistency here and always return an empty string to be passed forward?
Or… as Nikita proposed always do a real fatal (E_ERROR). In case we display E_ERROR, I'd like to ensure that the exception is displayed properly in any case.

I totally am on Nikitas side that either everything should be exception or fatal.
Recoverable isn't doing any good here.

But I strongly feel for allowing exceptions here.

P.s.: Ideally, we'd also change the object -> int/float cases to behave consistently with the strings. But that's maybe now off-scope for 7.0…

Bob

9 years ago by johannes@schlueters.de — view source

unread

On Sun, Jul 5, 2015 at 10:51 PM, Nikita Popov nikita.ppv@gmail.com
wrote:
On Sun, Jul 5, 2015 at 10:33 PM, Stanislav Malyshev
smalyshev@gmail.com wrote:
Hi!

            >>> I can see your concern here -- however this is
            nothing specific to
            >>> exceptions thrown from __toString(). There is are
            a number of ways you
            >>> can end up in this situation, the two most common
            being:
            >>
            >> I summarize this a bit freely as "We already have
            places which are hard
            >> to understand an predict, so let's add more!"
            >>
            >
            > That, or "Let's not be hypocrites".
            
            I don't think the position of "let's not add more
            unpredictable behavior
            to the engine" is adequately described as "hypocrite".
            We know there are
            interruption problems and such in the engine, we know
            some of them are
            hard to fix. However, adding a different class of
            those doesn't make it
            better in any way. If we find a solution that allows
            us to not introduce
            new unpredictable behaviors - I'm all for it, it's
            great. Maybe the code
            that allows to throw exceptions the same way we do
            with errors, maybe
            some way to not get unpredictable results on
            conversions. But we can't
            just ignore it because there are other problems in the
            engine too.
    
    
    Sorry, I just find it somewhat weird that we consider some of
    the exceptions generated as the result of __toString calls to
    be okay (error handler exceptions), while others are
    considered to be too dangerous to allow (direct exceptions).
    
    
    But maybe I just approached this from the wrong direction.
    Instead of allowing exceptions in __toString in combination
    with some hardening of the VM, maybe we should just convert
    the recoverable fatal errors __toString currently throws to
    actual fatal errors? Not a huge fan of doing that, but at
    least it's consistent.

I just realized that this wouldn't quite cut it, because our object to
integer and object to float conversions currently throw notices, which
can be converted to exceptions and to-int conversions at least are all
over the place. Presumably it's not acceptable to change these notices
to fatal errors.

Furthermore the array-to-string conversion also throws a notice since
5.4, which can also result in an exception. While we approved an RFC
(not implemented) to make this a recoverable error, I suspect making
this a real fatal error will not be met with much enthusiasm.

I fully agree that the current situation is bad. I just don't see an
improvement by this change, rather a new set of WTFs for the user.

In the ideal world we can throw everywhere without issues (well, except
maybe exceptions escaping from destructors in shutdown maybe ... while
maybe that's not different from other uncaught exceptions) Maybe we have
to restructure internal exception handling for that? Maybe we can create
a plan leading there for a future version?

johannes
(who dreams of a complete new and unified error&exception handling,
but never managed to sit down to create a good proposal ...)

10 years ago by Stanislav Malyshev — view source

unread

Hi!

However what it doesn't do, and what I wouldn't consider feasible to do, is
ensure that every single string conversion in library functions is
exception safe. Personally I don't think this is a blocking issue, as the
worst that can happen is usually an additional superfluous warning to be
thrown, or something similar. If cases like this turn up, we can
specifically target them.

This sounds kind of dangerous, especially in combination with
convert_to_string. Of course, when using ZPP it's probably fine since
ZPP result will be checked, but when not I'm not sure what happens when
conversion is initiated otherwise - i.e. what the result would be and
what happens if the result is used since the code assumed if string
conversion didn't bail it returned some useful string.
I think this was the main reason we didn't allow exceptions in
__toString() - it's hard to handle them properly.

--
Stas Malyshev
smalyshev@gmail.com

10 years ago by Anatol Belski — view source

unread

Hi Nikita,

-----Original Message-----
From: Nikita Popov [mailto:nikita.ppv@gmail.com]
Sent: Thursday, June 25, 2015 2:53 PM
To: PHP internals; Dmitry Stogov
Subject: [PHP-DEV] Allow exceptions in __toString()

Hi internals!

Currently it's not possible to throw an exception from __toString() - instead you
will get a (real) fatal error. This is becoming more and more problematic as time
goes on, e.g. with PHP 7 an exception can be triggered as a result of a VM error
or even a parse error.

I'd like to lift this restriction. A patch can be found here:
https://github.com/php/php-src/pull/1364

Apart from allowing exceptions, the patch also converts two leftover
recoverable fatal errors relating to __toString() into Error exceptions.

Furthermore the patch makes sure that we correctly (i.e. without leaks and
without superfluous notices or other side-effects) handle __toString() exceptions
in the engine. This includes concatenation and interpolation, but also things like
writing $foo->$object or ${$object}.
zend_parse_parameters() and Fast ZPP also deal with exceptions correctly.

However what it doesn't do, and what I wouldn't consider feasible to do, is
ensure that every single string conversion in library functions is exception safe.
Personally I don't think this is a blocking issue, as the worst that can happen is
usually an additional superfluous warning to be thrown, or something similar. If
cases like this turn up, we can specifically target them.

It should also be noted that whatever issues with exceptions-safety may remain,
they already exist now (plus those the patch fixes), because the two
aforementioned recoverable fatal errors can be converted to exceptions (and
anyone doing blanket ErrorException conversions will do this).

So basically the question here is, is this partial solution acceptable for us?

Given the expressed concerns, despite the change itself is a good thing when working properly, IMHO we shouldn't urge it into 7.0. If having it now can lead to subtle improper behavior and even worse bugs, preferable were to have a complete patch in 7.1.

Regards

Anatol

Allow exceptions in __toString()

I don't see how this is possible - we have no idea which places can be "critical" - any interaction with now-undefined variable could lead to a disaster downstream.

I don't see how this is possible - we have no idea which places can be
"critical" - any interaction with now-undefined variable could lead to a
disaster downstream.