Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:129298 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by lists.php.net (Postfix) with ESMTPS id 42AC01A00BC for ; Wed, 19 Nov 2025 11:52:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1763553173; bh=oYkP51snKMRugbqoNTE1ylew+zYsixLJ0tt1khQharw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=HPHZa7yX475yoN4TWFZNN4E35sIsK5Pa7aJRKUl8rgoUAJUcr/q8+rEfLBHvYk/Fe 2XgvPy+mlXDi6hnKn//mkXmTgbclK8rjY0WKH9B80wRP6H7REuIMmqxOSd/mdlzBxB MLpNSzVSQGdUH6koyls/T+mHx5LqXuyPDeUdwX1mXmYrUvK4d4kXqHMxYGOmh4dNH7 IgLwPbUAzGupTZP3suesoz+EwYqbKa8eOgXIDtPaSEnKtApXpKKqxSKzcaB6DktUx1 aSEhGAAomSqcBXAemOcLmzSVzAgReNbH4JqseibSSryVnjPHA9UW51O/J69sScSAYf sD3u6HQxTsZbA== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 799E018004D for ; Wed, 19 Nov 2025 11:52:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.1 (2024-03-25) on php-smtp4.php.net X-Spam-Level: * X-Spam-Status: No, score=1.8 required=5.0 tests=BAYES_50,DMARC_NONE, FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS, HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=4.0.1 X-Spam-Virus: No X-Envelope-From: Received: from mail-oo1-f47.google.com (mail-oo1-f47.google.com [209.85.161.47]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Wed, 19 Nov 2025 11:52:52 +0000 (UTC) Received: by mail-oo1-f47.google.com with SMTP id 006d021491bc7-6567a0d456bso2669615eaf.1 for ; Wed, 19 Nov 2025 03:52:47 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763553166; x=1764157966; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=oYkP51snKMRugbqoNTE1ylew+zYsixLJ0tt1khQharw=; b=GFFJk3o1CLb8E6SC/Rb2TD4ikWUlT9VUpxcB2HlU9nhR5n9whW9olxomBImgQa8cpV O67roLYV9PN75qAfmQpgZKnZ58Ns0VsCKVFOHq8O4qSJuAy2SPYyHQnAHteV6WlOPMGk lf3XQoRcrlmBlVnYDpwQY0KfmISnZ5sYdLZ6MOqknIFOAPiDrNXxfqWRYK7TkQ9dnvDt WSctJqHO6OnosDqMw2Zs/j3kCIP/qWjNyXpuLfFbW5OB7sRAKl8zukCbAltd1Vj5Le+N MUK85eYsyBxL6tsOCddZ/rC1/s7ahRtBkobmUO2DBk8wHy6WQjkct8sLv0Y/AXh638Fg TSMw== X-Gm-Message-State: AOJu0YwHGT4amjhC4UIRiMD1zybCuzrjTeABLM9ojhTIwJbEmZB3lrf0 g3KAIv/69eSGOKipKSzdzQHsNeGtg/YGRQF+UmJ1PYmIuSDjET5pouFc2SKpsWV8AgBL1/njC1m Bik+OgxObt2d+CeupsTgW3F9r7Z3mS/k= X-Gm-Gg: ASbGncvHFhG9dC+C/xsp9/x4huMZMoou99fvihtE4tH/Ju9tlM0X07u+BAWkZcSeTxM +hByyPjcOlhPxscaMqQL75p7ctRsMgQze+MGW2BLg1yKRhaJhEa3qW1umpRGmpTCMDO5uzJJ5Ic cq9TE6g0IU2FMFgjsUIjGA7B6FH5mTFL3KandUUJyBl4mwz+u7KkUNNPxCTj7U97OQINyXk6qPe J/6xXqAkpxfTz+aO9QlM11Ollq4rSD91Ftla+2hKS+/NGA3wV7kKqtL1rcniC24/PbHj7yd/UaB U20d X-Google-Smtp-Source: AGHT+IFHOA+F06s4Dq/FohgXcLEekh7nxdDs0pAbiDa1ts4TEGF4T/VKGqKI5wRDeTRMPsOO0OGUhOtkbbhSOdbkpnE= X-Received: by 2002:a05:6870:d68e:b0:3ec:6dff:3d3c with SMTP id 586e51a60fabf-3ec6dff4050mr2152967fac.14.1763553166592; Wed, 19 Nov 2025 03:52:46 -0800 (PST) Precedence: list list-help: list-unsubscribe: list-post: List-Id: x-ms-reactions: disallow MIME-Version: 1.0 References: In-Reply-To: Date: Wed, 19 Nov 2025 12:52:35 +0100 X-Gm-Features: AWmQ_bl89Wsix2i1EpPoZyicWiQU94eSZxiNJo90n8dUMVgSYcmYKA_y6F7IgW0 Message-ID: Subject: Re: [PHP-DEV] [RFC] Stream Error Handling Improvements To: Bob Weinand Cc: PHP internals list , ignace nyamagana butera , Jordi Boggiano Content-Type: multipart/alternative; boundary="000000000000098d9e0643f138c2" From: bukka@php.net (Jakub Zelenka) --000000000000098d9e0643f138c2 Content-Type: text/plain; charset="UTF-8" Hi Bob, Thanks for the feedback, the replies here also cover the feedback from Jordi and Ignace. > Why have you chosen a big list of constants rather than enums? > I think we should push for enums where applicable, meaning enum > StreamErrorMode, enum StreamErrorStore, enum StreamError. > Ok I will change it. StreamErrorMode and StreamErrorStore should be straightforward. The code one should be probably named StreamErrorCode (to not interact with what Jordi proposed) and think it will need to be backed as there should be a straightforward way how to compare it with exception code. I might need to extend the gen stub as I'm not sure it allows macro import for enum value like for constants or I will need to come up with some mapping. I just don't wan to duplicate the numbers. I will figure something out. > You define what a terminal error is, but does it correlate to specific > error codes? Like, is a given error code always terminal or not > terminal? Or can some error codes sometimes be terminal or not terminal > depending on context? > I've tried to look at the code, but I don't quite understand it, e.g. > php_plain_files_rename: When chown or chmod fail, the error is terminal. > When the copying fails (did you forget to adapt php_copy_file_ctx?), > that's non-terminal, though. Reading through the code, I have the > feeling that operations which do not call out to other operations, which > can error, are terminal. And all others are non-terminal. E.g. failing > to open a file is terminal. That the copying where the file opening is > part of, failed, is non-terminal. > And then *additionally* some operations, which don't prevent success are > also marked non-terminal - squeezing it into the boolean for the purpose > of not erroring. > Which makes some sense to me, but the description in the RFC is really > confusing here. And the meanings are muddied. > So currently the terminal / non terminal does not attach to code but it could be done. The current logic is really just supposed to reflect what is there already so the exception can be thrown in places where the function would fail. So the errors after which the function fails are terminal and the errors where the function still returns success are non terminal. Thus, if I understand that correctly, there should be an unique mapping > from error code to terminal. If we were to go with enums, we could make > it trivially a property of the individual enum values instead. > I think making it property of enum make more sense. Then I can drop that boolean property. > Should StreamException be attached the non-terminal errors which are > caused by the terminal errors? I.e. StreamException clearly is > STREAM_ERROR_CODE_PERMISSION_DENIED for example, but it happens during a > copy, so, should the information that a STREAM_ERROR_CODE_COPY_FAILED > was caused by that, also be present on the StreamException? > I think I get what you mean from the user point of view but I'm not sure how could this be reasonably linked. I could maybe check during stream error if there is EG(exception) it's a StreamException and if so, I could the update it with this error (e.g. some property that would hold those error can could be retrieved). I guess it could be potentially extended for terminal errors as well so it could have method like getAllErrors which would return all errors because sometimes there can be more terminal errors and some of them might get lost (last one will be added). However this would work just for exception so maybe more generic solution is needed here (see below). > Are stream errors which happen as part of a fully wrapped functionality > marked as wrapper error as well? Like if the server sent an invalid TLS > response to a file_get_contents("https://...") operation, making the > whole operation fail? You obviously don't have access to the > intermediary internal stream resource there, where the stream error is > attached. > I actually thought about this yesterday that it's not ideal that some errors might get lost in this way so I think it should go to wrapper errors as well. > Is this what the logged errors section is about? Is that section talking > about "php_stream_display_wrapper_errors", which can concat some errors? > I see that this logic is mostly retained. > Yeah the logged errors is an existing mechanism for grouping errors to a single one (concatenating message). The display is a bit messy because it separated them by new line or
(if html errors enabled) but I just kept it to limit BC impact. > Do I understand the implementation correctly, that user error handlers > can be called multiple times for a same operation? E.g. when copying a > file, you'll get first a PERMISSION_DENIED then a COPY_FAILED? Should > these not be somehow merged? Especially with global user_error handlers, > these cannot be aware of the fact that the permission denied is part of > anything else - at least not until possibly a second user_error > invocation happens. > This is somehow related to the exception error grouping as it requires some sort of linking between the errors. I guess to make this work we would need some top level wrapping similar to what logged errors do but instead of concatenating, it would group the errors in a better way which could then be used in exception or handler. This would, however, increase the complexity significantly as it is possible to nest the stream calls (e.g. stream calls in notification callback or in the new error handler) so there would probably need to be some stack. Also I'm not sure about the API. Should we pass always array of errors to the callback (which would mostly have just one element anyway or somehow chain the StreamError object (something like previousError()) and what message should be used in exception (e.g. last error and then helper function to get all errors)? I'm willing to look into this but would like to know if this is something important for user space and what is the usual flow how could this be used in frameworks and so on. In other words I would like to get more input to make sure that this is really needed as it requires quite a bit of effort to do. > Also, is the internal resource used for e.g. file_get_contents() leaked > into user error handlers? I'm not sure whether that's desirable. If > that's intended, that's fine too, I think. > > Currently yes. I need to actually double check the consequences of that as I didn't really think about internal resources. > Further suggestion on stream_get_errors() - why not return a proper > object instead of a loosely typed array? Proper typing, and creation is > actually faster. Also "docref" is so... antiquated? maybe just > "function" and have it contain the function name of the current operation. > > Yeah I will go for it and introduce StreamError as Jordi suggested with slight modification to use enum and dropping terminal and docref. > Does stream_get_errors() ever reset? Looking at the code it doesn't seem > so. It will just forever grow. E.g. fwrite() will report an error every > single time when the buffer is full and a write is attempted. If errors > aren't freed, that's just a memory leak (as long as the socket lives). > This is a serious flaw. > > Lastly, stream_get_errors() is overkill in a lot of cases (especially > when using fread()/fwrite() on simple sockets). I just care about the > last error code (there's just going to be one anyway). I'd love having a > stream_get_last_error() returning just the code. > > I guess this makes more sense but it kind of also implies the top level wrapping as it should contain all errors in the call. Or alternatively it could be just a ring with limited number of errors be behave in the same way like openssl_error_string. I guess it depends if we go with that top level wrapping. Cheers, Jakub --000000000000098d9e0643f138c2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Bob,

Thanks for the feedback, the replies here also cover t= he feedback from Jordi and Ignace.
=C2= =A0
Why have you chosen a big list of constants rather than enums?
I think we should push for enums where applicable, meaning enum
StreamErrorMode, enum StreamErrorStore, enum StreamError.
<= div>
Ok I will change it.=C2=A0StreamErrorMode and=C2=A0Strea= mErrorStore should be straightforward. The code one should be probably name= d StreamErrorCode (to not interact with what Jordi proposed) and think it w= ill need to be backed as there should be a straightforward way how to compa= re it with exception code. I might need to extend the gen stub as I'm n= ot sure it allows macro import for enum value like for constants or I will = need to come up with some mapping. I just don't wan to duplicate the nu= mbers. I will figure something out.
=C2=A0
You define what a terminal error is, but does it correlate to specific
error codes? Like, is a given error code always terminal or not
terminal? Or can some error codes sometimes be terminal or not terminal depending on context?
I've tried to look at the code, but I don't quite understand it, e.= g.
php_plain_files_rename: When chown or chmod fail, the error is terminal. When the copying fails (did you forget to adapt php_copy_file_ctx?),
that's non-terminal, though. Reading through the code, I have the
feeling that operations which do not call out to other operations, which can error, are terminal. And all others are non-terminal. E.g. failing
to open a file is terminal. That the copying where the file opening is
part of, failed, is non-terminal.
And then *additionally* some operations, which don't prevent success ar= e
also marked non-terminal - squeezing it into the boolean for the purpose of not erroring.
Which makes some sense to me, but the description in the RFC is really
confusing here. And the meanings are muddied.

So currently the terminal / non terminal does not attach to code but= it could be done. The current logic is really just supposed to reflect wha= t is there already so the exception can be thrown in places where the funct= ion would fail. So the errors after which the function fails are terminal a= nd the errors where the function still returns success are non terminal.=C2= =A0

Thus, if I understand that correctly, there should be an unique mapping from error code to terminal. If we were to go with enums, we could make it trivially a property of the individual enum values instead.

I think making it property of enum make more sense.= Then I can drop that boolean property.
=C2=A0
Should StreamException be attached the non-terminal errors which are
caused by the terminal errors? I.e. StreamException clearly is
STREAM_ERROR_CODE_PERMISSION_DENIED for example, but it happens during a copy, so, should the information that a STREAM_ERROR_CODE_COPY_FAILED
was caused by that, also be present on the StreamException?

I think I get what you mean from the user point of vie= w but I'm not sure how could this be reasonably linked.=C2=A0

I could maybe check during stream error if there is EG(exce= ption) it's a StreamException and if so, I could the update it with thi= s error (e.g. some property that would hold those error can could be retrie= ved). I guess it could be potentially extended for terminal errors as well = so it could have method like getAllErrors which would return all errors bec= ause sometimes there can be more terminal errors and some of them might get= lost (last one will be added). However this would work just for exception = so maybe more generic solution is needed here (see below).
=C2=A0=
Are stream errors which happen as part of a fully wrapped functionality marked as wrapper error as well? Like if the server sent an invalid TLS response to a file_get_contents("https://...") operation, making = the
whole operation fail? You obviously don't have access to the
intermediary internal stream resource there, where the stream error is
attached.

I actually thought about this= yesterday that it's not ideal that some errors might get lost in this = way so I think it should go to wrapper errors as well.
=C2=A0
Is this what the logged errors section is about? Is that section talking about "php_stream_display_wrapper_errors", which can concat some = errors?
I see that this logic is mostly retained.

Yeah the logged errors is an existing mechanism for grouping errors to a= single one (concatenating message). The display is a bit messy because it = separated them by new line or <br/> (if html errors enabled) but I ju= st kept it to limit BC impact.
=C2=A0
Do I understand the implementation correctly, that user error handlers
can be called multiple times for a same operation? E.g. when copying a
file, you'll get first a PERMISSION_DENIED then a COPY_FAILED? Should <= br> these not be somehow merged? Especially with global user_error handlers, these cannot be aware of the fact that the permission denied is part of anything else - at least not until possibly a second user_error
invocation happens.

This is somehow rel= ated to the exception error grouping as it requires some sort of linking be= tween the errors.

I guess to make this work we wou= ld need some top level wrapping similar to what logged errors do but instea= d of concatenating, it would group the errors in a better way which could t= hen be used in exception or handler. This would, however, increase the comp= lexity significantly as it is possible to nest the stream calls (e.g. strea= m calls in notification callback or in the new error handler) so there woul= d probably need to be some stack. Also I'm not sure about the API. Shou= ld we pass always array of errors to the callback (which would mostly have = just one element anyway or somehow chain the StreamError object (something = like previousError()) and what message should be used in exception (e.g. la= st error and then helper function to get all errors)?=C2=A0

<= /div>
I'm willing to look into this but would like to know if this = is something important for user space and what is the usual flow how could = this be used in frameworks and so on. In other words I would like to get mo= re input to make sure that this is really needed as it requires quite a bit= of effort to do.
=C2=A0
Also, is the internal resource used for e.g. file_get_contents() leaked into user error handlers? I'm not sure whether that's desirable. If=
that's intended, that's fine too, I think.


Currently yes. I need to actually doub= le check the consequences of that as I didn't really think about intern= al resources.=C2=A0
=C2=A0
Further suggestion on stream_get_errors() - why not return a proper
object instead of a loosely typed array? Proper typing, and creation is actually faster. Also "docref" is so... antiquated? maybe just "function" and have it contain the function name of the current o= peration.


Yeah I will go for it and introduce St= reamError as Jordi suggested with slight modification to use enum and dropp= ing terminal and docref.
=C2=A0
Does stream_get_errors() ever reset? Looking at the code it doesn't see= m
so. It will just forever grow. E.g. fwrite() will report an error every single time when the buffer is full and a write is attempted. If errors aren't freed, that's just a memory leak (as long as the socket live= s).
This is a serious flaw.

Lastly, stream_get_errors() is overkill in a lot of cases (especially
when using fread()/fwrite() on simple sockets). I just care about the
last error code (there's just going to be one anyway). I'd love hav= ing a
stream_get_last_error() returning just the code.


I guess this makes more sense but it k= ind of also implies the top level wrapping as it should contain all errors = in the call. Or alternatively it could be just a ring with limited number o= f errors be behave in the same way like=C2=A0openssl_error_string. I guess = it depends if we go with that top level wrapping.

= Cheers,

Jakub
--000000000000098d9e0643f138c2--