Dear internal developers,
I recently discovered several failures in error detection involving
file access, stream compression and source inclusion that may bring the
program to process missing or invalid data (very severe safety bug) or
simply crash without apparent reason. I reported all these issues with
their test script trying to do as much as I can to really understand
what happen here. I think it's the time for some real internal expert
to take over these issues and kindly reply to the following questions:
-
Is there something very basic I'm missing? I'm doing something wrong?
-
If yes, what can I do to fix so that i/o errors can be detected?
-
If no, why i/o errors do not propagate through the engine, but are
mostly ignored? and why the user's program does not get signaled
about this, and keeps receiving empty strings or garbage instead?
There is the chance to close all these:
fread()
does not detect file access error
https://bugs.php.net/bug.php?id=71384
(Also includes disk image containing a damaged file for testing.)
require* and include* do not detect input/output error
https://bugs.php.net/bug.php?id=71385
(Probably related to the bug above.)
fread()
does not detect decoding errors from filter zlib.inflate
https://bugs.php.net/bug.php?id=71417
(Probably missing propagation of errors through streams.)
fread()
does not detects decoding errors from filter bzip2.decompress
https://bugs.php.net/bug.php?id=71263
(Probably the same as above.)
Fix for zlib.deflate and zlib.inflate filters; fix for example #1
https://bugs.php.net/bug.php?id=68556
(This latter is actually a doc problem, but it also shows how much is
difficult to explain to the user what exactly happens if something goes
wrong and how to make safe applications).
Regards,
/|\ Umberto Salsi
/_/ www.icosaedro.it
Hi Umberto,
I recently discovered several failures in error detection involving
file access, stream compression and source inclusion that may bring the
program to process missing or invalid data (very severe safety bug) or
simply crash without apparent reason. I reported all these issues with
their test script trying to do as much as I can to really understand
what happen here. I think it's the time for some real internal expert
to take over these issues and kindly reply to the following questions:
Is there something very basic I'm missing? I'm doing something wrong?
If yes, what can I do to fix so that i/o errors can be detected?
If no, why i/o errors do not propagate through the engine, but are
mostly ignored? and why the user's program does not get signaled
about this, and keeps receiving empty strings or garbage instead?
Plain file stream reads data by php_stdiop_read()
http://lxr.php.net/xref/PHP_5_6/main/streams/plain_wrapper.c#338
As you can see there is no way to return errors from it. We need
errno like error handling for PHP streams to propagate errors as well
as more robust code for unexpected.
So answer for 3 is "we need volunteers" for improvement, I suppose.
Anyone?
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi,
I recently discovered several failures in error detection involving
file access, stream compression and source inclusion that may bring the
program to process missing or invalid data (very severe safety bug) or
simply crash without apparent reason. I reported all these issues with
their test script trying to do as much as I can to really understand
what happen here. I think it's the time for some real internal expert
to take over these issues and kindly reply to the following questions:
Is there something very basic I'm missing? I'm doing something wrong?
If yes, what can I do to fix so that i/o errors can be detected?
If no, why i/o errors do not propagate through the engine, but are
mostly ignored? and why the user's program does not get signaled
about this, and keeps receiving empty strings or garbage instead?Plain file stream reads data by php_stdiop_read()
http://lxr.php.net/xref/PHP_5_6/main/streams/plain_wrapper.c#338
As you can see there is no way to return errors from it. We need
errno like error handling for PHP streams to propagate errors as well
as more robust code for unexpected.So answer for 3 is "we need volunteers" for improvement, I suppose.
Anyone?
Since I replied, I'll write an implementation idea.
We may add "error" and "clear_error" stream ops in php_stream_ops.
/* operations on streams that are file-handles /
typedef struct _php_stream_ops {
/ stdio like functions - these are mandatory! */
size_t (*write)(php_stream *stream, const char *buf, size_t count
TSRMLS_DC);
size_t (*read)(php_stream *stream, char *buf, size_t count TSRMLS_DC);
int (*close)(php_stream *stream, int close_handle TSRMLS_DC);
int (*flush)(php_stream *stream TSRMLS_DC);
const char *label; /* label for this ops structure */
/* these are optional */
int (*seek)(php_stream *stream, off_t offset, int whence, off_t
*newoffset TSRMLS_DC);
int (*cast)(php_stream *stream, int castas, void **ret TSRMLS_DC);
int (*stat)(php_stream *stream, php_stream_statbuf *ssb TSRMLS_DC);
int (*set_option)(php_stream *stream, int option, int value, void
*ptrparam TSRMLS_DC);
} php_stream_ops;
"error" adds whatever errors from stream to an array.
"clear_error" cleanups the array when stream is closed.
There should be an API to get errors in the array.
This is easy change. Difficult part is making codes more robust against
unexpected for all streams.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
yohgaki@ohgaki.net (Yasuo Ohgaki) wrote:
Plain file stream reads data by php_stdiop_read()
http://lxr.php.net/xref/PHP_5_6/main/streams/plain_wrapper.c#338
As you can see there is no way to return errors from it. We need
errno like error handling for PHP streams to propagate errors as well
as more robust code for unexpected.So answer for 3 is "we need volunteers" for improvement, I suppose.
Anyone?Since I replied, I'll write an implementation idea.
We may add "error" and "clear_error" stream ops in php_stream_ops.
[...]
"error" adds whatever errors from stream to an array.
"clear_error" cleanups the array when stream is closed.
There should be an API to get errors in the array.This is easy change. Difficult part is making codes more robust against
unexpected for all streams.
Hi Yasuo,
thank you very much for the reply, I now start understanding better what
happen and why currently i/o cannot be handled in user's space.
Rather than an array of error, which might be quite expensive to handle,
why not a simple "char * lasterror" field added to the stream struct,
normally set to NULL, where a description of the last error occurred may
be stored? The high-level functions, that is PHP fopen, fread, fwrite,
fflush, fseek, fclose, may then check that string and, if set, trigger
E_WARNING
with a meaningful message after having reset that string back
to NULL
again. Many C libraries already provide such error strings,
starting from the same C strerror().
The same should then be done with the filters, which currently only may
return PSFS_ERR_FATAL, but might have their "char * lasterror" string
too with the same purpose. Knowing the exact reason of the failure is
essential to distinguish, for example, a dead disk drive or corrupted file
system, from a bare file mistakenly opened with the wrong decompressor
or a file which was badly encoded.
Then all the specific stream and filter implementations can be adjusted
to set that error string; for those still not fixed that string remains
NULL
and fopen, ferror, etc. will not report the error just as they do
today, and will be fixed later step by step.
Just an idea.
Regards,
/|\ Umberto Salsi
/_/ www.icosaedro.it
Hi Umberto,
thank you very much for the reply, I now start understanding better what
happen and why currently i/o cannot be handled in user's space.
You're welcome.
I think you have pointed out important missing parts in PHP.
Rather than an array of error, which might be quite expensive to handle,
why not a simple "char * lasterror" field added to the stream struct,
normally set to NULL, where a description of the last error occurred may
be stored? The high-level functions, that is PHP fopen, fread, fwrite,
fflush, fseek, fclose, may then check that string and, if set, trigger
E_WARNING
with a meaningful message after having reset that string back
toNULL
again. Many C libraries already provide such error strings,
starting from the same C strerror().
Pgsql's "notice" logging is implemented this way at first. Then I changed
it to array of notices recently. (Notice is raised during
transaction, for example)
It's very difficult or impossible to detect where something wrong happened
if there is only the last error. That's the reason why pgsql module is changed
to have array of notices.
The same would happen in streams, so it's better to store all errors during
operation. Zend hash is extremely fast and good enough for logging errors,
so we don't have to worry overheads.
The same should then be done with the filters, which currently only may
return PSFS_ERR_FATAL, but might have their "char * lasterror" string
too with the same purpose. Knowing the exact reason of the failure is
essential to distinguish, for example, a dead disk drive or corrupted file
system, from a bare file mistakenly opened with the wrong decompressor
or a file which was badly encoded.
I agree.
Then all the specific stream and filter implementations can be adjusted
to set that error string; for those still not fixed that string remains
NULL
and fopen, ferror, etc. will not report the error just as they do
today, and will be fixed later step by step.
Agree here, too.
Someone have to volunteer for this. It's not difficult for people on this list.
I have too many backlogs for PHP project, so I hope someone volunteers
for this.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Umberto,
thank you very much for the reply, I now start understanding better what
happen and why currently i/o cannot be handled in user's space.You're welcome.
I think you have pointed out important missing parts in PHP.Rather than an array of error, which might be quite expensive to handle,
why not a simple "char * lasterror" field added to the stream struct,
normally set to NULL, where a description of the last error occurred may
be stored? The high-level functions, that is PHP fopen, fread, fwrite,
fflush, fseek, fclose, may then check that string and, if set, trigger
E_WARNING
with a meaningful message after having reset that string back
toNULL
again. Many C libraries already provide such error strings,
starting from the same C strerror().Pgsql's "notice" logging is implemented this way at first. Then I changed
it to array of notices recently. (Notice is raised during
transaction, for example)It's very difficult or impossible to detect where something wrong happened
if there is only the last error. That's the reason why pgsql module is changed
to have array of notices.The same would happen in streams, so it's better to store all errors during
operation. Zend hash is extremely fast and good enough for logging errors,
so we don't have to worry overheads.The same should then be done with the filters, which currently only may
return PSFS_ERR_FATAL, but might have their "char * lasterror" string
too with the same purpose. Knowing the exact reason of the failure is
essential to distinguish, for example, a dead disk drive or corrupted file
system, from a bare file mistakenly opened with the wrong decompressor
or a file which was badly encoded.I agree.
Then all the specific stream and filter implementations can be adjusted
to set that error string; for those still not fixed that string remains
NULL
and fopen, ferror, etc. will not report the error just as they do
today, and will be fixed later step by step.Agree here, too.
Someone have to volunteer for this. It's not difficult for people on this list.
I have too many backlogs for PHP project, so I hope someone volunteers
for this.Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net--
I think what we should do is sit around the table with people
interested (Daniel Lowrey may be one of them), and plan a full rewrite
of streams for PHP next major (PHP 8). I don't think addind stuff to
next minors is a good idea, knowing we must prevent BC breaks in such
versions. We'll probably have to introduce some BC breaks is our tidy.
All our stream API is missing lots of features, and failing in many parts.
We could as well add to the reflection Async I/O and scatter/gather,
something I already thought about
for PHP 7, but found no time nor motivation nor friend contributors to
push to the level of quality we expect it for PHP.
While beeing perfectly valid, fixing just one little problem among
many others is not a durable solution for our stream API,
which is old, patched everywhere, but pretty stable and serving its
purpose as-is.
There are lots of problems in extensions as well, regarding streams,
where it is far from beeing easy to change one of the stream layer
behavior, or implement your own.
This should get addressed as well.
Julien.Pauli
Hi Umberto,
On Fri, Jan 22, 2016 at 9:49 PM, Umberto Salsi salsi@icosaedro.it
wrote:thank you very much for the reply, I now start understanding better
what
happen and why currently i/o cannot be handled in user's space.You're welcome.
I think you have pointed out important missing parts in PHP.Rather than an array of error, which might be quite expensive to
handle,
why not a simple "char * lasterror" field added to the stream struct,
normally set to NULL, where a description of the last error occurred
may
be stored? The high-level functions, that is PHP fopen, fread, fwrite,
fflush, fseek, fclose, may then check that string and, if set, trigger
E_WARNING
with a meaningful message after having reset that string back
toNULL
again. Many C libraries already provide such error strings,
starting from the same C strerror().Pgsql's "notice" logging is implemented this way at first. Then I
changed
it to array of notices recently. (Notice is raised during
transaction, for example)It's very difficult or impossible to detect where something wrong
happened
if there is only the last error. That's the reason why pgsql module is
changed
to have array of notices.The same would happen in streams, so it's better to store all errors
during
operation. Zend hash is extremely fast and good enough for logging
errors,
so we don't have to worry overheads.The same should then be done with the filters, which currently only may
return PSFS_ERR_FATAL, but might have their "char * lasterror" string
too with the same purpose. Knowing the exact reason of the failure is
essential to distinguish, for example, a dead disk drive or corrupted
file
system, from a bare file mistakenly opened with the wrong decompressor
or a file which was badly encoded.I agree.
Then all the specific stream and filter implementations can be adjusted
to set that error string; for those still not fixed that string remains
NULL
and fopen, ferror, etc. will not report the error just as they do
today, and will be fixed later step by step.Agree here, too.
Someone have to volunteer for this. It's not difficult for people on
this list.
I have too many backlogs for PHP project, so I hope someone volunteers
for this.Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net--
I think what we should do is sit around the table with people
interested (Daniel Lowrey may be one of them), and plan a full rewrite
of streams for PHP next major (PHP 8). I don't think addind stuff to
next minors is a good idea, knowing we must prevent BC breaks in such
versions. We'll probably have to introduce some BC breaks is our tidy.All our stream API is missing lots of features, and failing in many parts.
We could as well add to the reflection Async I/O and scatter/gather,
something I already thought about
for PHP 7, but found no time nor motivation nor friend contributors to
push to the level of quality we expect it for PHP.While beeing perfectly valid, fixing just one little problem among
many others is not a durable solution for our stream API,
which is old, patched everywhere, but pretty stable and serving its
purpose as-is.There are lots of problems in extensions as well, regarding streams,
where it is far from beeing easy to change one of the stream layer
behavior, or implement your own.
This should get addressed as well.
I totally agree with you here.
The same applies imho to the session (about the other rfc targetting 7.1 or
other).
Hi Pierre,
I totally agree with you here.
I agree with Julien, too.
The same applies imho to the session (about the other rfc targetting 7.1 or
other).
I can understand your opinion for session RFC you've mentioned. The
problem is a little more complex because it related to
security/stability of session...
BTW, I don't think we need complete rewrite for session. It would not
differ much as new one should have state management because of user
save handlers anyway. Session module has many issues, but fundamental
design is good enough. It's just missing some critical parts, needs
some cleanups. IMHO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Le 26/01/2016 11:50, Julien Pauli a écrit :
I think what we should do is sit around the table with people
interested (Daniel Lowrey may be one of them), and plan a full rewrite
of streams for PHP next major (PHP 8). I don't think addind stuff to
next minors is a good idea, knowing we must prevent BC breaks in such
versions. We'll probably have to introduce some BC breaks is our tidy.All our stream API is missing lots of features, and failing in many parts.
We could as well add to the reflection Async I/O and scatter/gather,
+1 for a major rewrite in version 8. Examining every streams-related
feature requests would be a good start, IMO. I also have several
improvements in mind. So, I'd be glad to be included in the process.
Regards
François