Hi,
I prepared a patch to replace sapi_globals' request_info post_data and
raw_post_data with a temp stream and remove support for
HTTP_RAW_POST_DATA. [1]
PROS:
- save up to 300% on post_data_len memory (on non-form POSTs)
- a local siege (c=512/512, 2.4k form/2.2k json) showed no (negative)
performance impact; see attached logs - reusable php://input stream
- ...
CONS:
- no more $HTTP_RAW_POST_DATA, where BC could easily provided with a one-liner:
$GLOBALS["HTTP_RAW_POST_DATA"]=file_get_contents("php://input"); - memory is cheap
- ???
[1] https://github.com/m6w6/php-src/compare/slim-postdata-merge
--
Regards,
Mike
Hi,
I prepared a patch to replace sapi_globals' request_info post_data and
raw_post_data with a temp stream and remove support for
HTTP_RAW_POST_DATA. [1]PROS:
- save up to 300% on post_data_len memory (on non-form POSTs)
- a local siege (c=512/512, 2.4k form/2.2k json) showed no (negative)
performance impact; see attached logs- reusable php://input stream
- ...
CONS:
- no more $HTTP_RAW_POST_DATA, where BC could easily provided with a one-liner:
$GLOBALS["HTTP_RAW_POST_DATA"]=file_get_contents("php://input");- memory is cheap
- ???
I think it's generally a good idea, but I have some concerns:
-
The whole point of PG(enable_post_data_reading) is to be able to read
the input of the fly. If I read your patch correctly, the data is
completely gobbled when you open php://input and then the temp file is
reminded. This will result in a serious performance degradation on large
inputs, as processing can only start after everything has been read. The
behavior should be different if PG(enable_post_data_reading) is false. -
I don't see SG(request_info).request_body being closed anywhere. Is
this relying just on auto-cleanup? I recall that being problematic in
debug mode unless you use php_stream_auto_cleanup(). -
The php://input streams simply forward the calls to
SG(request_info).request_body, so if you open two, the per-stream cursor
position may not match the position of the wrapped stream (even if it
matched, we wouldn't want one stream to affect the position of the
other). I don't see any easy way out here; all I can think of is is
duping the file descriptor for the temporary file in these situations.
But then the book keeping for php://input would be more complicated. -
The cast added php_stream_get_resource_id() seems unnecessary; may
hide bugs. If you just want to be able to pass void* values, better to
put an inline function there with a php_stream* parameter, that way you
would have a compiler warning if you passed anything but a php_stream*
or a void* (though this would break taking a pointer).
--
Gustavo Lopes
Hi Gustavo, thank you for your review!
Hi,
I prepared a patch to replace sapi_globals' request_info post_data and
raw_post_data with a temp stream and remove support for
HTTP_RAW_POST_DATA. [1]PROS:
- save up to 300% on post_data_len memory (on non-form POSTs)
- a local siege (c=512/512, 2.4k form/2.2k json) showed no (negative)
performance impact; see attached logs- reusable php://input stream
- ...
CONS:
- no more $HTTP_RAW_POST_DATA, where BC could easily provided with a
one-liner:
$GLOBALS["HTTP_RAW_POST_DATA"]=file_get_contents("php://input");- memory is cheap
- ???
I think it's generally a good idea, but I have some concerns:
- The whole point of PG(enable_post_data_reading) is to be able to read the
input of the fly. If I read your patch correctly, the data is completely
gobbled when you open php://input and then the temp file is reminded. This
will result in a serious performance degradation on large inputs, as
processing can only start after everything has been read. The behavior
should be different if PG(enable_post_data_reading) is false.
Ok, I thought the whole point of enable_post_data_reading is not to
swamp your memory :) We can have that behaviour of course, but then
the input stream is not reusable.
- I don't see SG(request_info).request_body being closed anywhere. Is this
relying just on auto-cleanup? I recall that being problematic in debug mode
unless you use php_stream_auto_cleanup().
Yes, list destructors take care of that (I wrote that patch in debug mode).
- The php://input streams simply forward the calls to
SG(request_info).request_body, so if you open two, the per-stream cursor
position may not match the position of the wrapped stream (even if it
matched, we wouldn't want one stream to affect the position of the other). I
True! So my input stream handles concurrency as bad as the current
implementation... 8)
don't see any easy way out here; all I can think of is is duping the file
descriptor for the temporary file in these situations. But then the book
keeping for php://input would be more complicated.
I think that shouldn't be too hard to fix. But it depends on concern no.1 :)
- The cast added php_stream_get_resource_id() seems unnecessary; may hide
bugs. If you just want to be able to pass void* values, better to put an
inline function there with a php_stream* parameter, that way you would have
a compiler warning if you passed anything but a php_stream* or a void*
(though this would break taking a pointer).
Thanks, that's a left-over, because I had void *request_body
in the
first place.
--
Regards,
Mike
Hi Gustavo, thank you for your review!
I think it's generally a good idea, but I have some concerns:
- The whole point of PG(enable_post_data_reading) is to be able to read the
input of the fly. If I read your patch correctly, the data is completely
gobbled when you open php://input and then the temp file is reminded. This
will result in a serious performance degradation on large inputs, as
processing can only start after everything has been read. The behavior
should be different if PG(enable_post_data_reading) is false.Ok, I thought the whole point of enable_post_data_reading is not to
swamp your memory :) We can have that behaviour of course, but then
the input stream is not reusable.
...
- The php://input streams simply forward the calls to
SG(request_info).request_body, so if you open two, the per-stream cursor
position may not match the position of the wrapped stream (even if it
matched, we wouldn't want one stream to affect the position of the other). ITrue! So my input stream handles concurrency as bad as the current
implementation... 8)don't see any easy way out here; all I can think of is is duping the file
descriptor for the temporary file in these situations. But then the book
keeping for php://input would be more complicated.I think that shouldn't be too hard to fix. But it depends on concern no.1 :)
Fixed. The input stream is reusable and may be used JIT.
https://github.com/m6w6/php-src/compare/slim-postdata-merge
--
Regards,
Mike
Hi all!
I think it's generally a good idea, but I have some concerns:
...
Fixed. The input stream is reusable and may be used JIT.
Anything more to add before it hits master?
PS: To pre-empt Chris, I'm about to add notes to UPGRADING{,.INTERNALS} :)
--
Regards,
Mike
I think it's generally a good idea, but I have some concerns:
...
Fixed. The input stream is reusable and may be used JIT.Anything more to add before it hits master?
PS: To pre-empt Chris, I'm about to add notes to UPGRADING{,.INTERNALS}
:)
I think it looks very good now.
Regards
--
Gustavo Lopes