[RFC] Stream Error Handling Improvements

1 month ago by ignace nyamagana butera — view source — reply

unread

Hello,

I would like to introduce a new stream error handling RFC that is part of
my stream evolution work (PHP Foundation project funded by Sovereign Tech
Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

Hi Jakub,

This is a nice RFC. Since I use a lot of PHP streaming capabilities I am
looking forward to its voting outcome. Having said that, did you think,
instead adding more constants in the global namespace, of using Enum as an
alternative for error code. There is a precedent for that in the new URI
extension, This would
reduce the number of new constants added and perhaps make for a more modern
DX for the feature.
What do you think?

Best regards,
Ignace

1 month ago by Bob Weinand — view source — reply

unread

Hey Jakub,

Hello,

I would like to introduce a new stream error handling RFC that is part
of my stream evolution work (PHP Foundation project funded by
Sovereign Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

I super appreciate that you've tackled this. This has been a big wart in
PHP's stream handling for a long time.

I have a couple questions:

Why have you chosen a big list of constants rather than enums?
I think we should push for enums where applicable, meaning enum
StreamErrorMode, enum StreamErrorStore, enum StreamError.

You define what a terminal error is, but does it correlate to specific
error codes? Like, is a given error code always terminal or not
terminal? Or can some error codes sometimes be terminal or not terminal
depending on context?
I've tried to look at the code, but I don't quite understand it, e.g.
php_plain_files_rename: When chown or chmod fail, the error is terminal.
When the copying fails (did you forget to adapt php_copy_file_ctx?),
that's non-terminal, though. Reading through the code, I have the
feeling that operations which do not call out to other operations, which
can error, are terminal. And all others are non-terminal. E.g. failing
to open a file is terminal. That the copying where the file opening is
part of, failed, is non-terminal.
And then additionally some operations, which don't prevent success are
also marked non-terminal - squeezing it into the boolean for the purpose
of not erroring.
Which makes some sense to me, but the description in the RFC is really
confusing here. And the meanings are muddied.

Thus, if I understand that correctly, there should be an unique mapping
from error code to terminal. If we were to go with enums, we could make
it trivially a property of the individual enum values instead.

Should StreamException be attached the non-terminal errors which are
caused by the terminal errors? I.e. StreamException clearly is
STREAM_ERROR_CODE_PERMISSION_DENIED for example, but it happens during a
copy, so, should the information that a STREAM_ERROR_CODE_COPY_FAILED
was caused by that, also be present on the StreamException?

Are stream errors which happen as part of a fully wrapped functionality
marked as wrapper error as well? Like if the server sent an invalid TLS
response to a file_get_contents("https://...") operation, making the
whole operation fail? You obviously don't have access to the
intermediary internal stream resource there, where the stream error is
attached.
Is this what the logged errors section is about? Is that section talking
about "php_stream_display_wrapper_errors", which can concat some errors?
I see that this logic is mostly retained.

Do I understand the implementation correctly, that user error handlers
can be called multiple times for a same operation? E.g. when copying a
file, you'll get first a PERMISSION_DENIED then a COPY_FAILED? Should
these not be somehow merged? Especially with global user_error handlers,
these cannot be aware of the fact that the permission denied is part of
anything else - at least not until possibly a second user_error
invocation happens.
Also, is the internal resource used for e.g. file_get_contents() leaked
into user error handlers? I'm not sure whether that's desirable. If
that's intended, that's fine too, I think.

Further suggestion on stream_get_errors() - why not return a proper
object instead of a loosely typed array? Proper typing, and creation is
actually faster. Also "docref" is so... antiquated? maybe just
"function" and have it contain the function name of the current operation.

Does stream_get_errors() ever reset? Looking at the code it doesn't seem
so. It will just forever grow. E.g. fwrite() will report an error every
single time when the buffer is full and a write is attempted. If errors
aren't freed, that's just a memory leak (as long as the socket lives).
This is a serious flaw.

Lastly, stream_get_errors() is overkill in a lot of cases (especially
when using fread()/fwrite() on simple sockets). I just care about the
last error code (there's just going to be one anyway). I'd love having a
stream_get_last_error() returning just the code.

Overall, I like the direction of the RFC, but I'd like to see improved
handling for "terminal error inside non-terminal operation having its
own error" in particular. Enums are rather a suggestion, not a
requirement, but they would be in my opinion very fitting here.

Thanks,

Bob

1 month ago by Jakub Zelenka — view source — reply

unread

Hi Bob,

Thanks for the feedback, the replies here also cover the feedback from
Jordi and Ignace.

Why have you chosen a big list of constants rather than enums?
I think we should push for enums where applicable, meaning enum
StreamErrorMode, enum StreamErrorStore, enum StreamError.

Ok I will change it. StreamErrorMode and StreamErrorStore should be
straightforward. The code one should be probably named StreamErrorCode (to
not interact with what Jordi proposed) and think it will need to be backed
as there should be a straightforward way how to compare it with exception
code. I might need to extend the gen stub as I'm not sure it allows macro
import for enum value like for constants or I will need to come up with
some mapping. I just don't wan to duplicate the numbers. I will figure
something out.

You define what a terminal error is, but does it correlate to specific
error codes? Like, is a given error code always terminal or not
terminal? Or can some error codes sometimes be terminal or not terminal
depending on context?
I've tried to look at the code, but I don't quite understand it, e.g.
php_plain_files_rename: When chown or chmod fail, the error is terminal.
When the copying fails (did you forget to adapt php_copy_file_ctx?),
that's non-terminal, though. Reading through the code, I have the
feeling that operations which do not call out to other operations, which
can error, are terminal. And all others are non-terminal. E.g. failing
to open a file is terminal. That the copying where the file opening is
part of, failed, is non-terminal.
And then additionally some operations, which don't prevent success are
also marked non-terminal - squeezing it into the boolean for the purpose
of not erroring.
Which makes some sense to me, but the description in the RFC is really
confusing here. And the meanings are muddied.

So currently the terminal / non terminal does not attach to code but it
could be done. The current logic is really just supposed to reflect what is
there already so the exception can be thrown in places where the function
would fail. So the errors after which the function fails are terminal and
the errors where the function still returns success are non terminal.

Thus, if I understand that correctly, there should be an unique mapping

from error code to terminal. If we were to go with enums, we could make
it trivially a property of the individual enum values instead.

I think making it property of enum make more sense. Then I can drop that
boolean property.

Should StreamException be attached the non-terminal errors which are
caused by the terminal errors? I.e. StreamException clearly is
STREAM_ERROR_CODE_PERMISSION_DENIED for example, but it happens during a
copy, so, should the information that a STREAM_ERROR_CODE_COPY_FAILED
was caused by that, also be present on the StreamException?

I think I get what you mean from the user point of view but I'm not sure
how could this be reasonably linked.

I could maybe check during stream error if there is EG(exception) it's a
StreamException and if so, I could the update it with this error (e.g. some
property that would hold those error can could be retrieved). I guess it
could be potentially extended for terminal errors as well so it could have
method like getAllErrors which would return all errors because sometimes
there can be more terminal errors and some of them might get lost (last one
will be added). However this would work just for exception so maybe more
generic solution is needed here (see below).

Are stream errors which happen as part of a fully wrapped functionality
marked as wrapper error as well? Like if the server sent an invalid TLS
response to a file_get_contents("https://...") operation, making the
whole operation fail? You obviously don't have access to the
intermediary internal stream resource there, where the stream error is
attached.

I actually thought about this yesterday that it's not ideal that some
errors might get lost in this way so I think it should go to wrapper errors
as well.

Is this what the logged errors section is about? Is that section talking
about "php_stream_display_wrapper_errors", which can concat some errors?
I see that this logic is mostly retained.

Yeah the logged errors is an existing mechanism for grouping errors to a
single one (concatenating message). The display is a bit messy because it
separated them by new line or <br/> (if html errors enabled) but I just
kept it to limit BC impact.

Do I understand the implementation correctly, that user error handlers
can be called multiple times for a same operation? E.g. when copying a
file, you'll get first a PERMISSION_DENIED then a COPY_FAILED? Should
these not be somehow merged? Especially with global user_error handlers,
these cannot be aware of the fact that the permission denied is part of
anything else - at least not until possibly a second user_error
invocation happens.

This is somehow related to the exception error grouping as it requires some
sort of linking between the errors.

I guess to make this work we would need some top level wrapping similar to
what logged errors do but instead of concatenating, it would group the
errors in a better way which could then be used in exception or handler.
This would, however, increase the complexity significantly as it is
possible to nest the stream calls (e.g. stream calls in notification
callback or in the new error handler) so there would probably need to be
some stack. Also I'm not sure about the API. Should we pass always array of
errors to the callback (which would mostly have just one element anyway or
somehow chain the StreamError object (something like previousError()) and
what message should be used in exception (e.g. last error and then helper
function to get all errors)?

I'm willing to look into this but would like to know if this is something
important for user space and what is the usual flow how could this be used
in frameworks and so on. In other words I would like to get more input to
make sure that this is really needed as it requires quite a bit of effort
to do.

Also, is the internal resource used for e.g. file_get_contents() leaked
into user error handlers? I'm not sure whether that's desirable. If
that's intended, that's fine too, I think.

Currently yes. I need to actually double check the consequences of that as
I didn't really think about internal resources.

Further suggestion on stream_get_errors() - why not return a proper
object instead of a loosely typed array? Proper typing, and creation is
actually faster. Also "docref" is so... antiquated? maybe just
"function" and have it contain the function name of the current operation.

Yeah I will go for it and introduce StreamError as Jordi suggested with
slight modification to use enum and dropping terminal and docref.

Does stream_get_errors() ever reset? Looking at the code it doesn't seem
so. It will just forever grow. E.g. fwrite() will report an error every
single time when the buffer is full and a write is attempted. If errors
aren't freed, that's just a memory leak (as long as the socket lives).
This is a serious flaw.

Lastly, stream_get_errors() is overkill in a lot of cases (especially
when using fread()/fwrite() on simple sockets). I just care about the
last error code (there's just going to be one anyway). I'd love having a
stream_get_last_error() returning just the code.

I guess this makes more sense but it kind of also implies the top level
wrapping as it should contain all errors in the call. Or alternatively it
could be just a ring with limited number of errors be behave in the same
way like openssl_error_string. I guess it depends if we go with that top
level wrapping.

Cheers,

Jakub

1 month ago by Bob Weinand — view source — reply

unread

Hey Jakub,

thanks for the detailed reply.

The code one should be probably named StreamErrorCode (to not interact
with what Jordi proposed) and think it will need to be backed as there
should be a straightforward way how to compare it with exception code.
I might need to extend the gen stub as I'm not sure it allows macro
import for enum value like for constants or I will need to come up
with some mapping. I just don't wan to duplicate the numbers. I will
figure something out.

How important is it to have exception code?

I found that for our purposes we generally skip the exception code (i.e.
keep it at zero).
As a simple example, ErrorException does take two integers - one for
severity, and one for code, instead of using the severity as code.
I would propose something similar here: call it StreamError and provide
a public $error property for the actual enum, rather than squeezing it
into $code.

This also solves your gen-stub issues by making them unncessary.

Should StreamException be attached the non-terminal errors which are
caused by the terminal errors? I.e. StreamException clearly is
STREAM_ERROR_CODE_PERMISSION_DENIED for example, but it happens
during a
copy, so, should the information that a STREAM_ERROR_CODE_COPY_FAILED
was caused by that, also be present on the StreamException?
I think I get what you mean from the user point of view but I'm not
sure how could this be reasonably linked.

I could maybe check during stream error if there is EG(exception) it's
a StreamException and if so, I could the update it with this error
(e.g. some property that would hold those error can could be
retrieved). I guess it could be potentially extended for terminal
errors as well so it could have method like getAllErrors which would
return all errors because sometimes there can be more terminal errors
and some of them might get lost (last one will be added). However this
would work just for exception so maybe more generic solution is needed
here (see below).

Replying below to this.

Are stream errors which happen as part of a fully wrapped
functionality
marked as wrapper error as well? Like if the server sent an
invalid TLS
response to a file_get_contents("https://...") operation, making the
whole operation fail? You obviously don't have access to the
intermediary internal stream resource there, where the stream
error is
attached.

I actually thought about this yesterday that it's not ideal that some
errors might get lost in this way so I think it should go to wrapper
errors as well.

If we properly wrap the nested operations into a single error, then it
would be automatically part of the wrapper error, with no need to
explicitly add it to stream error.

Is this what the logged errors section is about? Is that section
talking
about "php_stream_display_wrapper_errors", which can concat some
errors?
I see that this logic is mostly retained.
Yeah the logged errors is an existing mechanism for grouping errors to
a single one (concatenating message). The display is a bit messy
because it separated them by new line or <br/> (if html errors
enabled) but I just kept it to limit BC impact.

I don't think changing the display / grouping here would be problematic
with respect to BC. But I get that this initial draft tries to limit
what it changes.

Do I understand the implementation correctly, that user error
handlers
can be called multiple times for a same operation? E.g. when
copying a
file, you'll get first a PERMISSION_DENIED then a COPY_FAILED? Should
these not be somehow merged? Especially with global user_error
handlers,
these cannot be aware of the fact that the permission denied is
part of
anything else - at least not until possibly a second user_error
invocation happens.
This is somehow related to the exception error grouping as it requires
some sort of linking between the errors.

I guess to make this work we would need some top level wrapping
similar to what logged errors do but instead of concatenating, it
would group the errors in a better way which could then be used in
exception or handler. This would, however, increase the complexity
significantly as it is possible to nest the stream calls (e.g. stream
calls in notification callback or in the new error handler) so there
would probably need to be some stack. Also I'm not sure about the API.
Should we pass always array of errors to the callback (which would
mostly have just one element anyway or somehow chain the StreamError
object (something like previousError()) and what message should be
used in exception (e.g. last error and then helper function to get all
errors)?

I'm willing to look into this but would like to know if this is
something important for user space and what is the usual flow how
could this be used in frameworks and so on. In other words I would
like to get more input to make sure that this is really needed as it
requires quite a bit of effort to do.

Doing this work will definitely tidy up the whole error story. It's not
fundamentally crucial, but it definitely will make for a nicer API.

And yes, chaining StreamError sounds and nesting exceptions to $previous
sounds like the most straightforward way.

Also a possibility for exceptions: this might be a bit of work, but you
could insert a fake frame for exceptions to indicate when an error
appears as a sub-operation. Like:

StreamException: Permission denied
#0 [internal function]: fopen("target.php", "w+")
#1 [internal function]: copy("source.php", "target.php")
#2 ...

I also like the suggestion by Jordi to pass the StreamError right away
into the user handler.

Also, is the internal resource used for e.g. `file_get_contents()`
leaked
into user error handlers? I'm not sure whether that's desirable. If
that's intended, that's fine too, I think.
Currently yes. I need to actually double check the consequences of
that as I didn't really think about internal resources.

If we decide to go with completely wrapping and calling the user error
handler only once per top-level operation, this point would be moot anyway.

Does stream_get_errors() ever reset? Looking at the code it
doesn't seem
so. It will just forever grow. E.g. `fwrite()` will report an error
every
single time when the buffer is full and a write is attempted. If
errors
aren't freed, that's just a memory leak (as long as the socket
lives).
This is a serious flaw.

Lastly, stream_get_errors() is overkill in a lot of cases (especially
when using `fread()`/fwrite() on simple sockets). I just care about the
last error code (there's just going to be one anyway). I'd love
having a
stream_get_last_error() returning just the code.
I guess this makes more sense but it kind of also implies the top
level wrapping as it should contain all errors in the call. Or
alternatively it could be just a ring with limited number of errors be
behave in the same way like openssl_error_string. I guess it depends
if we go with that top level wrapping.

I feel like the top-level wrapping would simplify things a lot (from
point of addressing shortcomings), and if this just works like
json_get_last_error(), where you only ever see the last error, there's
no risks of being unbounded nor having arbitrary size limits.

The only disadvantage of top-level wrapping is probably a bit more
book-keeping as in "storing that there is an outer operation before
invoking an inner operation"? Haven't thought much about the actual
implementation of this, but should be doable.

Thanks,
Bob

1 month ago by Jakub Zelenka — view source — reply

unread

Hi Bob,

Hey Jakub,

thanks for the detailed reply.

The code one should be probably named StreamErrorCode (to not interact
with what Jordi proposed) and think it will need to be backed as there
should be a straightforward way how to compare it with exception code. I
might need to extend the gen stub as I'm not sure it allows macro import
for enum value like for constants or I will need to come up with some
mapping. I just don't wan to duplicate the numbers. I will figure something
out.

How important is it to have exception code?

I think the exception code could be quite useful for checking for specific
errors. It's much better than trying to match it by error message which I
saw in past used in normal error handlers. So I can see definitely use case
for that. It might be also possible to group those erors if enums are used
so users could just match specific group of errors.

I found that for our purposes we generally skip the exception code (i.e.
keep it at zero).
As a simple example, ErrorException does take two integers - one for
severity, and one for code, instead of using the severity as code.

Severity is almost always warning which I don't think is a good
representation for code. I saw error codes used in various application and
I think they are useful. I don't think it hurts to have them. The
implementation of that enum should not be a big problem. I just need to
figure out how to make it nice.

I would propose something similar here: call it StreamError and provide a
public $error property for the actual enum, rather than squeezing it into
$code.

I'm not sure I understand this. The code is a generic identifier for the
error class that can be matched. StreamError would be the actual
representation containing the message and other info - this cannot be
matched in a generic way.

Is this what the logged errors section is about? Is that section talking

about "php_stream_display_wrapper_errors", which can concat some errors?
I see that this logic is mostly retained.

Yeah the logged errors is an existing mechanism for grouping errors to a
single one (concatenating message). The display is a bit messy because it
separated them by new line or <br/> (if html errors enabled) but I just
kept it to limit BC impact.

I don't think changing the display / grouping here would be problematic
with respect to BC. But I get that this initial draft tries to limit what
it changes.

Exactly I would like to limit the external changes.

Do I understand the implementation correctly, that user error handlers

can be called multiple times for a same operation? E.g. when copying a
file, you'll get first a PERMISSION_DENIED then a COPY_FAILED? Should
these not be somehow merged? Especially with global user_error handlers,
these cannot be aware of the fact that the permission denied is part of
anything else - at least not until possibly a second user_error
invocation happens.

This is somehow related to the exception error grouping as it requires
some sort of linking between the errors.

I guess to make this work we would need some top level wrapping similar to
what logged errors do but instead of concatenating, it would group the
errors in a better way which could then be used in exception or handler.
This would, however, increase the complexity significantly as it is
possible to nest the stream calls (e.g. stream calls in notification
callback or in the new error handler) so there would probably need to be
some stack. Also I'm not sure about the API. Should we pass always array of
errors to the callback (which would mostly have just one element anyway or
somehow chain the StreamError object (something like previousError()) and
what message should be used in exception (e.g. last error and then helper
function to get all errors)?

I'm willing to look into this but would like to know if this is something
important for user space and what is the usual flow how could this be used
in frameworks and so on. In other words I would like to get more input to
make sure that this is really needed as it requires quite a bit of effort
to do.

Doing this work will definitely tidy up the whole error story. It's not
fundamentally crucial, but it definitely will make for a nicer API.

And yes, chaining StreamError sounds and nesting exceptions to $previous
sounds like the most straightforward way.

Also a possibility for exceptions: this might be a bit of work, but you
could insert a fake frame for exceptions to indicate when an error appears
as a sub-operation. Like:

StreamException: Permission denied
#0 [internal function]: fopen("target.php", "w+")
#1 [internal function]: copy("source.php", "target.php")
#2 ...

I will check this out.

I also like the suggestion by Jordi to pass the StreamError right away
into the user handler.

Yeah agreed.

Also, is the internal resource used for e.g. file_get_contents() leaked

into user error handlers? I'm not sure whether that's desirable. If
that's intended, that's fine too, I think.

Currently yes. I need to actually double check the consequences of that as
I didn't really think about internal resources.

If we decide to go with completely wrapping and calling the user error
handler only once per top-level operation, this point would be moot anyway.

Does stream_get_errors() ever reset? Looking at the code it doesn't seem

so. It will just forever grow. E.g. fwrite() will report an error every
single time when the buffer is full and a write is attempted. If errors
aren't freed, that's just a memory leak (as long as the socket lives).
This is a serious flaw.

Lastly, stream_get_errors() is overkill in a lot of cases (especially
when using fread()/fwrite() on simple sockets). I just care about the
last error code (there's just going to be one anyway). I'd love having a
stream_get_last_error() returning just the code.

I guess this makes more sense but it kind of also implies the top level
wrapping as it should contain all errors in the call. Or alternatively it
could be just a ring with limited number of errors be behave in the same
way like openssl_error_string. I guess it depends if we go with that top
level wrapping.

I feel like the top-level wrapping would simplify things a lot (from point
of addressing shortcomings), and if this just works like
json_get_last_error(), where you only ever see the last error, there's no
risks of being unbounded nor having arbitrary size limits.

Yeah I think it would be more logical and improve things so I will check it
out and see when I have my next stream errors slot (should be later next
month).

Cheers,

Jakub

1 month ago by Bob Weinand — view source — reply

unread

Hey Jakub,

The code one should be probably named StreamErrorCode (to not
interact with what Jordi proposed) and think it will need to be
backed as there should be a straightforward way how to compare it
with exception code. I might need to extend the gen stub as I'm
not sure it allows macro import for enum value like for constants
or I will need to come up with some mapping. I just don't wan to
duplicate the numbers. I will figure something out.
How important is it to have exception code?
I think the exception code could be quite useful for checking for
specific errors. It's much better than trying to match it by error
message which I saw in past used in normal error handlers. So I can
see definitely use case for that. It might be also possible to group
those erors if enums are used so users could just match specific group
of errors.

For the purpose of grouping it would be vastly preferable to add
functions to the enum class to mark enum values as pertaining to
specific groups "function isIoError(): bool", "function
isFileSystemError(): bool" etc. or just simply "function errorGroup():
StreamErrorGroup" which is another enum with some grooups like "Io",
"Filesystem" etc. rather than comparing ranges.

I found that for our purposes we generally skip the exception code
(i.e. keep it at zero).
As a simple example, ErrorException does take two integers - one
for severity, and one for code, instead of using the severity as code.
Severity is almost always warning which I don't think is a good
representation for code. I saw error codes used in various application
and I think they are useful. I don't think it hurts to have them. The
implementation of that enum should not be a big problem. I just need
to figure out how to make it nice.
I would propose something similar here: call it StreamError and
provide a public $error property for the actual enum, rather than
squeezing it into $code.
I'm not sure I understand this. The code is a generic identifier for
the error class that can be matched. StreamError would be the actual
representation containing the message and other info - this cannot be
matched in a generic way.

My point here is that $code generally does not get used for our
Exceptions. Whenever we need to attach extra information to an
exception, we do it as extraneous properties. (That's what my example
with ErrorException was about.)

I.e. ignore the existence of $code (just assign zero).

And add an extra property for the StreamError enum (which now needs no
backing).

class StreamException extends Exception {
public __construct(
public StreamError $error,
public string $wrapperName,
string $message = "",
public string|null $param = "",
?Throwable $previous = null
) {
parent::__construct($message, 0, $previous);
}
}

Does this make more sense, described as code?

Thank you,
Bob

1 month ago by Jakub Zelenka — view source — reply

unread

Hi Bob,

Hey Jakub,

The code one should be probably named StreamErrorCode (to not interact
with what Jordi proposed) and think it will need to be backed as there
should be a straightforward way how to compare it with exception code. I
might need to extend the gen stub as I'm not sure it allows macro import
for enum value like for constants or I will need to come up with some
mapping. I just don't wan to duplicate the numbers. I will figure something
out.

How important is it to have exception code?

I think the exception code could be quite useful for checking for specific
errors. It's much better than trying to match it by error message which I
saw in past used in normal error handlers. So I can see definitely use case
for that. It might be also possible to group those erors if enums are used
so users could just match specific group of errors.

For the purpose of grouping it would be vastly preferable to add functions
to the enum class to mark enum values as pertaining to specific groups
"function isIoError(): bool", "function isFileSystemError(): bool" etc. or
just simply "function errorGroup(): StreamErrorGroup" which is another enum
with some grooups like "Io", "Filesystem" etc. rather than comparing ranges.

Yeah that's what I was thinking. Internally I would still probably
represent it as num ranges (as it is now) but userspace should use those
enum functions.

I found that for our purposes we generally skip the exception code (i.e.

keep it at zero).
As a simple example, ErrorException does take two integers - one for
severity, and one for code, instead of using the severity as code.

Severity is almost always warning which I don't think is a good
representation for code. I saw error codes used in various application and
I think they are useful. I don't think it hurts to have them. The
implementation of that enum should not be a big problem. I just need to
figure out how to make it nice.

I would propose something similar here: call it StreamError and provide a
public $error property for the actual enum, rather than squeezing it into
$code.

I'm not sure I understand this. The code is a generic identifier for the
error class that can be matched. StreamError would be the actual
representation containing the message and other info - this cannot be
matched in a generic way.

My point here is that $code generally does not get used for our
Exceptions. Whenever we need to attach extra information to an exception,
we do it as extraneous properties. (That's what my example with
ErrorException was about.)

I.e. ignore the existence of $code (just assign zero).

And add an extra property for the StreamError enum (which now needs no
backing).

class StreamException extends Exception {
public __construct(
public StreamError $error,
public string $wrapperName,
string $message = "",
public string|null $param = "",
?Throwable $previous = null
) {
parent::__construct($message, 0, $previous);
}
}

Does this make more sense, described as code?

As I see what you mean now. I would prefer to use StreamError name for the
class holding all the details including code, wrapper name, message and
param as proposed by Jordi because this could be also passed to handler and
retrieved from the stored errors. It just seems not right to use
StreamException as a container without being thrown (is that done
anywhere?) so I think separate class for handler and stored errors (or
single last error) seems cleaner. Although StreamException would be then
quite similar. I could pass StreamError there as a property but then it
would duplicate the message so not sure.

In any case I think it's better to call that enum StreamErrorCode. I will
still find a clean way to map it to the defines (STREAM_ERROR_CODE_*
macros) because that's what I use internally to set code in the error. So
not using int value for the exception code won't make that much difference
from the implementation PoV.

Kind regards,

Jakub

1 month ago by Bob Weinand — view source — reply

unread

Hey Jakub,

My point here is that $code generally does not get used for our
Exceptions. Whenever we need to attach extra information to an
exception, we do it as extraneous properties. (That's what my
example with ErrorException was about.)

I.e. ignore the existence of $code (just assign zero).

And add an extra property for the StreamError enum (which now
needs no backing).

class StreamException extends Exception {
    public __construct(
        public StreamError $error,
        public string $wrapperName,
        string $message = "",
        public string|null $param = "",
        ?Throwable $previous = null
    ) {
        parent::__construct($message, 0, $previous);
    }
}

Does this make more sense, described as code?
As I see what you mean now. I would prefer to use StreamError name for
the class holding all the details including code, wrapper name,
message and param as proposed by Jordi because this could be also
passed to handler and retrieved from the stored errors. It just seems
not right to use StreamException as a container without being thrown
(is that done anywhere?) so I think separate class for handler and
stored errors (or single last error) seems cleaner. Although
StreamException would be then quite similar. I could pass StreamError
there as a property but then it would duplicate the message so not sure.

In any case I think it's better to call that enum StreamErrorCode. I
will still find a clean way to map it to the defines
(STREAM_ERROR_CODE_* macros) because that's what I use internally to
set code in the error. So not using int value for the exception code
won't make that much difference from the implementation PoV.

I don't mind the name, whether you call it StreamError or
StreamErrorCode. Or maybe call the class "StreamFailure" and the enum
"StreamError"? Just a suggestion, I don't care too much about the actual
names.

For the internal implementation, does it actually matter passing around
an int value or a char pointer with the name? Which would allow you to
directly pass it into the enum lookup. But yup, do whatever fits you
best here :-)

(Ranty tangent: Umm, what's even the internal proper way to get an enum
value? Manual zend_hash_find_ptr(CE_CONSTANTS_TABLE(ce), constant_name)
including zend_update_class_constant()?! We ought to figure a way to
create the internal enum case objects at compile time, then we could
just reference them directly... Working with enums internally can be
annoying and fetching feels slow (even if it's just a HashTable lookup).)

Bob

1 month ago by Jordi Boggiano — view source — reply

unread

Hey Jakub,

Very nice improvement overall. Beyond the enum comment from Bob/Ignace
I would also think that stream_get_errors() could return an array of
StreamError objects instead of arrays?

readonly class StreamError
{
public function __construct(
public string $message,
public int $code,
public int $severity,
public bool $terminal,
public string $wrapper,
public ?string $param = null,
public ?string $docref = null,
) {}
}

This object could also be passed to the error handlers perhaps as:

function(string $wrapper, ?resource $stream, StreamError $error): void

Just some ideas :)

Best,
Jordi

1 month ago by Lynn — view source — reply

unread

Hello,

I would like to introduce a new stream error handling RFC that is part of
my stream evolution work (PHP Foundation project funded by Sovereign Tech
Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

The naming of "terminal" is a little confusing to me. Reading the name with
a bool made me think it was whether or not the CLI was being used, which
made little sense in the context of stream errors. After reading the
explanation "Terminal errors are those that prevent an operation from
completing (e.g., file not found, permission denied). Non-terminal errors
are warnings or notices that don't stop execution (e.g., buffer
truncation).", I feel like it might be more clear if this property was
named after what the result is, perhaps something like "operationBlocked"
or "operationPrevented".

1 month ago by ignace nyamagana butera — view source — reply

unread

Hello,

I would like to introduce a new stream error handling RFC that is part of
my stream evolution work (PHP Foundation project funded by Sovereign Tech
Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

The naming of "terminal" is a little confusing to me. Reading the name
with a bool made me think it was whether or not the CLI was being used,
which made little sense in the context of stream errors. After reading the
explanation "Terminal errors are those that prevent an operation from
completing (e.g., file not found, permission denied). Non-terminal errors
are warnings or notices that don't stop execution (e.g., buffer
truncation).", I feel like it might be more clear if this property was
named after what the result is, perhaps something like "operationBlocked"
or "operationPrevented".

I believe these types of issues can be resolved by replacing boolean values
by Enums, hence my suggestion. Enum would be more self explanatory and
would provide better insight on the outcome. I also like Jordi's
suggestion, anything that can help static analysis is the way to go IMHO.

Best regards,
Ignace

1 month ago by Lynn — view source — reply

unread

On Wed, Nov 19, 2025 at 10:48 AM ignace nyamagana butera <
nyamsprod@gmail.com> wrote:

Hello,

I would like to introduce a new stream error handling RFC that is part
of my stream evolution work (PHP Foundation project funded by Sovereign
Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

The naming of "terminal" is a little confusing to me. Reading the name
with a bool made me think it was whether or not the CLI was being used,
which made little sense in the context of stream errors. After reading the
explanation "Terminal errors are those that prevent an operation from
completing (e.g., file not found, permission denied). Non-terminal errors
are warnings or notices that don't stop execution (e.g., buffer
truncation).", I feel like it might be more clear if this property was
named after what the result is, perhaps something like "operationBlocked"
or "operationPrevented".

I believe these types of issues can be resolved by replacing boolean
values by Enums, hence my suggestion. Enum would be more self explanatory
and would provide better insight on the outcome. I also like Jordi's
suggestion, anything that can help static analysis is the way to go IMHO.

Best regards,
Ignace

I ended up removing this from my first reply, but an enum was also what I
thought of as it wouldn't surprise me if a 3rd state could be introduced:
fully prevented, fully executed but with warnings, and partially executed
like a buffer truncation. I just don't know enough about the underlying
errors/warnings to suggest this as alternative implementation.

1 month ago by Jakub Zelenka — view source — reply

unread

Hi,

On Wed, Nov 19, 2025 at 10:48 AM ignace nyamagana butera <
nyamsprod@gmail.com> wrote:

Hello,

I would like to introduce a new stream error handling RFC that is part
of my stream evolution work (PHP Foundation project funded by Sovereign
Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

The naming of "terminal" is a little confusing to me. Reading the name
with a bool made me think it was whether or not the CLI was being used,
which made little sense in the context of stream errors. After reading the
explanation "Terminal errors are those that prevent an operation from
completing (e.g., file not found, permission denied). Non-terminal errors
are warnings or notices that don't stop execution (e.g., buffer
truncation).", I feel like it might be more clear if this property was
named after what the result is, perhaps something like "operationBlocked"
or "operationPrevented".

I believe these types of issues can be resolved by replacing boolean
values by Enums, hence my suggestion. Enum would be more self explanatory
and would provide better insight on the outcome. I also like Jordi's
suggestion, anything that can help static analysis is the way to go IMHO.

Best regards,
Ignace

I ended up removing this from my first reply, but an enum was also what I
thought of as it wouldn't surprise me if a 3rd state could be introduced:
fully prevented, fully executed but with warnings, and partially executed
like a buffer truncation. I just don't know enough about the underlying
errors/warnings to suggest this as alternative implementation.

I don't think we will add another state. The enum based approach seems most
reasonable to me. I should also change the name terminal / non terminal to
terminating / non terminating as I think that's the correct name for this...

Cheers,

Jakub

1 month ago by Gina P. Banyard — view source — reply

unread

Hello,

I would like to introduce a new stream error handling RFC that is part of my stream evolution work (PHP Foundation project funded by Sovereign Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Hi Jakub,

Thank you for working on this, this has been a long-standing pain point of our stream layer.
However, I do have various comments about the design of this.

Echoing multiple voices, please use enums rather than constants where applicable, and use an object rather than an associative array for individual stream errors.
As others have pointed out, the notion of (non-)"terminal" errors is not clear, and I'm struggling to understand what is the point of it?

How does this interact with the stream notification stuff?
(I guess surrounding PHP_STREAM_NOTIFY_SEVERITY_INFO and PHP_STREAM_NOTIFY_SEVERITY_ERR)

Regarding "docref" what is the point of including this in the error?
This is mainly used with php_docref to generate links to INI settings by referencing an XML ID that exists in doc-en, this does not seem applicable here.
I guess you added this due to the two usages of "streams.crypto" as a docref in main/streams/transports.c, but this ID doesn't exist, and I fear this might never have worked properly.
As such, I would recommend just getting rid of this.

The last thing that I feel very strongly about is that there should only be 3 modes:

Silent
Warnings
Exceptions

As such, the cases where we currently emit a notice should either be promoted to warnings (which for some of them I don't even understand why they are just notices) or removed (mainly looking at the one in main/streams/userspace.c)
And the singular case which bailouts using E_ERROR should be converted to always throw an exception.
Bailouts cause all sorts of issues due to the lack of unwinding, things that exceptions solve.
This would also simplify both the internal, and userland API and get rid of the confusing notion of "errors" meaning "diagnostics/warnings".

Best regards,
Gina P. Banyard

1 month ago by Jakub Zelenka — view source — reply

unread

Hi Gina,

On Tuesday, 18 November 2025 at 18:41, Jakub Zelenka bukka@php.net
wrote:

Hello,

I would like to introduce a new stream error handling RFC that is part of
my stream evolution work (PHP Foundation project funded by Sovereign Tech
Fund) :

https://wiki.php.net/rfc/stream_errors

Hi Jakub,

Thank you for working on this, this has been a long-standing pain point of
our stream layer.
However, I do have various comments about the design of this.

Echoing multiple voices, please use enums rather than constants where
applicable, and use an object rather than an associative array for
individual stream errors.
As others have pointed out, the notion of (non-)"terminal" errors is not
clear, and I'm struggling to understand what is the point of it?

I just answered this in replay to Bob. The non terminating errors are those
that don't change return value of the function (it means not returning
failure). Some of those errors are still useful and users should be
notified but they are not critical for correct result. This was mainly
introduced for exceptions as those cases should not throw but it still
makes sense to somehow notify users (e.g. store the error).

How does this interact with the stream notification stuff?
(I guess surrounding PHP_STREAM_NOTIFY_SEVERITY_INFO and
PHP_STREAM_NOTIFY_SEVERITY_ERR)

It doesn't interact with it. Although PHP_STREAM_NOTIFY_SEVERITY_ERR
somehow overlaps with it but it's limited in use (currently used only in
ftp and http wrapper). Obviously I couldn't just use notifications and I
needed to keep them working so the only possibility was to introduce a new
API that handles just errors. Going forward I'm not
sure PHP_STREAM_NOTIFY_SEVERITY_ERR should be kept but that's more a future
scope I guess.

Regarding "docref" what is the point of including this in the error?
This is mainly used with php_docref to generate links to INI settings by
referencing an XML ID that exists in doc-en, this does not seem applicable
here.
I guess you added this due to the two usages of "streams.crypto" as a
docref in main/streams/transports.c, but this ID doesn't exist, and I
fear this might never have worked properly.
As such, I would recommend just getting rid of this.

Agreed I'm not going to expose it. I will keep it just for error mode to
keep it the same but I guess it might be better to drop it completely in
the future.

The last thing that I feel very strongly about is that there should only
be 3 modes:

Silent

Warnings

Exceptions

As such, the cases where we currently emit a notice should either be
promoted to warnings (which for some of them I don't even understand why
they are just notices) or removed (mainly looking at the one in
main/streams/userspace.c)

And the singular case which bailouts using E_ERROR should be converted to

always throw an exception.
Bailouts cause all sorts of issues due to the lack of unwinding, things
that exceptions solve.
This would also simplify both the internal, and userland API and get rid
of the confusing notion of "errors" meaning "diagnostics/warnings".

My aim is really not to touch the current logic and the error
classification except one case where reporting was done accidentally. I
think that non terminating errors make some sense as they give some hint
that something could be improved but it's not necessarily preventing the
functioning. This is something that has been already there and I don't
think it needs to be changed as part of this proposal.

Cheers,

Jakub

1 month ago by Larry Garfield — view source — reply

unread

Hello,

I would like to introduce a new stream error handling RFC that is part
of my stream evolution work (PHP Foundation project funded by Sovereign
Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

HI Jakub.

I agree with what others have said so far. I'm going to ask a little bit further, though.

AIUI, there's two main issues with the stream API:

It's very inconsistent.
It's very clunky, non-obvious, and hard to use if you don't know exactly what you're doing.

This RFC seems to be addressing the first point, which is fine. However, should we be trying to smooth out the current unpleasant API, or should the effort be put toward a more intuitive API that is smoother from the start? This RFC could be a stepping stone toward that, I'm not sure, but if so that's not clear to me.

Are there any longer term plans here for a more complete stream overhaul?

--Larry Garfield

1 month ago by Jakub Zelenka — view source — reply

unread

Hi,

On Wed, Nov 19, 2025 at 5:35 PM Larry Garfield larry@garfieldtech.com
wrote:

Hello,

I would like to introduce a new stream error handling RFC that is part
of my stream evolution work (PHP Foundation project funded by Sovereign
Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

HI Jakub.

I agree with what others have said so far. I'm going to ask a little bit
further, though.

AIUI, there's two main issues with the stream API:

It's very inconsistent.

It's very clunky, non-obvious, and hard to use if you don't know
exactly what you're doing.

This RFC seems to be addressing the first point, which is fine. However,
should we be trying to smooth out the current unpleasant API, or should the
effort be put toward a more intuitive API that is smoother from the start?
This RFC could be a stepping stone toward that, I'm not sure, but if so
that's not clear to me.

Are there any longer term plans here for a more complete stream overhaul?

This RFC is part of the bigger effort to address various stream issues and
provide other stream related improvements that I selected based on my TODO
list. You can see more detailed description of the whole scope in this
foundation blog:
https://thephp.foundation/blog/2025/10/30/php-streams-evolution/ .

The overhaul of the whole API is out of scope as there are lots of unknowns
and would require longer migration. It is thus harder to deliver in the
selected time frame. It doesn't mean that it won't happen. It's just out of
scope for this block of work and I don't have specific plans for it.

Kind regards,

Jakub

30 days ago by Arnaud Le Blanc — view source — reply

unread

Hi Jakub,

Hello,

I would like to introduce a new stream error handling RFC that is part of my stream evolution work (PHP Foundation project funded by Sovereign Tech Fund) :

https://wiki.php.net/rfc/stream_errors

Kind regards,

Jakub

Thank you for working on this.

The RFC doesn't specify when exactly the error_handler is invoked
(especially in case of non-fatal errors). Is it as soon as the error
happens, or are invocations batched at the end of the operation?

The reason I'm asking is because there is a long track of bugs related
to the global error handler [1], and it's likely that the
error_handler proposed here could suffer from the same issues. For
example the error_handler can reenter the operation being performed,
or change the state of the stream. For safety and simplicity, I would
suggest that error_handler calls are delayed until just before
returning to user code.

[1] https://github.com/php/php-src/issues/20018

Best Regards,
Arnaud

12 days ago by Jakub Zelenka — view source — reply

unread

Hello,

I would like to introduce a new stream error handling RFC that is part of
my stream evolution work (PHP Foundation project funded by Sovereign Tech
Fund) :

https://wiki.php.net/rfc/stream_errors

Hi, I just published version 2.0 which is a significant redesign including:

Introducing StreamErrorStore, StreamErrorMode and StreamErrorCode enums
(the code also includes various suggested helper methods). The code is
backed (int) because I need to keep the number to easily match the category
and simplify the mapping. I think that in this case a backed enum is better.
The error data are now in a StreamError class instead of an associative
array. This also simplifies StreamException which now has just a single
method to return the StreamError instance. The error handler callback
signature was also simplified to take just a single StreamError parameter
instead of multiple individual parameters.
Errors are now grouped per operation and reported at the end of the call.
This also supports nested calls (e.g. in user streams) so they don't
overlap. And it delays the execution of error handlers (not just the stream
error callback but also the usual error handler if default standard error
triggering is used) which should be safer as noted by Arnaud.
The storing of errors is done at the end of the operation grouping and
contains only the last grouped errors. So the function name was changed to
stream_get_last_error(). It returns just a single StreamError which has a
next property pointing to the next error if there is any.
Stream is no longer exposed and errors are not collected on streams but
globally in the same way as wrappers. It is actually no longer even
collected per wrapper because it doesn't make much sense when the errors
are grouped per operation.
Terminal (and NonTerminal) was renamed to Terminating (and
NonTerminating).
Invalid enum types for context options (error_mode and error_store) now
throw TypeError.
Implemented operation pooling to minimize memory allocations during error
handling (pre-allocated pool for up to 8 nested operations).

I hope this addresses all concerns.

Kind regards,

Jakub

12 days ago by Morgan — view source — reply

unread

Hi, I just published version 2.0 which is a significant redesign
including:

Introducing ... StreamErrorCode
enums.... The
code is backed (int) because I need to keep the number to easily match
the category and simplify the mapping. I think that in this case a
backed enum is better.

Just spitballing here, but perhaps a StreamErrorCategory enum? The Code
could have a method/property that matches it to the corresponding
Category; categorising errors of different kinds can be done by
branching on the Category, rather than (possibly multiple) is*Error() tests.

12 days ago by Christian Schneider — view source — reply

unread

Am 29.12.2025 um 19:52 schrieb Jakub Zelenka bukka@php.net:

The storing of errors is done at the end of the operation grouping and contains only the last grouped errors. So the function name was changed to stream_get_last_error(). It returns just a single StreamError which has a next property pointing to the next error if there is any.

I assume the chaining was based on Exception chaining but the example code under
https://wiki.php.net/rfc/stream_errors#error_chaining
looks a bit clunky to me and I was wondering if returning an array of StreamError would not be more straightforward, i.e. one could then use foreach, empty(), count(), array_first(), array_last(), array_find() etc. instead of methods like while, $e->count(), $e->hasCode() which would make it feel more idiomatic to me.

But then again it might be a matter of taste and/or I might have missed something.

Regards,

Chris