Hi!
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously.
If you still find anything lacking, please let me know, so I can expand the RFC accordingly.
And of course, everything else is up for discussion.
Thanks,
Mike
Hi Mike,
Awesome work!
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously. If you still find anything lacking, please let me know, so I can expand the RFC accordingly. And of course, everything else is up for discussion.
https://wiki.php.net/rfc/pecl_http
The URL is this, right?
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Mike,
Awesome work!
On Thu, Jan 29, 2015 at 8:14 PM, Michael Wallner <mike@php.net> wrote:
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously. If you still find anything lacking, please let me know, so I can expand the RFC accordingly. And of course, everything else is up for discussion.
https://wiki.php.net/rfc/pecl_http
The URL is this, right?
Yes, of course! Bad mistake, sorry.
--
Regards,
Mike
Hi Mike,
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously.
If you still find anything lacking, please let me know, so I can expand the RFC accordingly.
The RFC is an improvement in that it covers more of what pecl/http is, but it still doesn’t answer the most important question: why? It still doesn’t answer any of the following key questions:
- Why do we need pecl/http?
- Why should pecl/http be merged into PHP core?
- Why should pecl/http be enabled by default?
- Why should we have our own HTTP API and not follow PSR-7?
- What does it offer over PHP’s existing HTTP capabilities?
- Why should we merge this rather than, say, filling in gaps in PHP’s HTTP capabilities?
So, I think the RFC is still rather lacking. The Features section isn’t really any better than before, either. It only gives a sentence or two to each module, which isn’t terribly informative. Each module probably needs its own rationale, and a comparison to PHP’s existing facilities, as well.
Thanks.
--
Andrea Faulds
http://ajf.me/
- Why do we need pecl/http?
- Why should pecl/http be merged into PHP core?
- Why should pecl/http be enabled by default?
- Why should we have our own HTTP API and not follow PSR-7?
- What does it offer over PHP’s existing HTTP capabilities?
- Why should we merge this rather than, say, filling in gaps in PHP’s HTTP capabilities?
On one hand, third party packages are being pushed in place of the existing built-in functions; here, a new set of built-in functions is being proposed. Having used Apache for many years and now having moved to Nginx as my 'interface', just where does this fit into the overall jigsaw? Using Nginx to handle the static stuff and only passing dynamic calls to php_fpm, just what HTTP functionality is needed?
The main reason for asking this question is that I'm implementing a server for tzdist, which I can easily handle with the Nginx/php_fpm framework, and which should serve static pages until a change of version requires building a new data set. Is this the sort of process that can be handled entirely within PHP, and does that actually make sense where a large volume of static data is cached?
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
- Why should we have our own HTTP API and not follow PSR-7?
possible points:
- PHP-FIG propose no implementations; pecl_http does
- PHP-FIG focuses on frameworks; pecl_http in core is usable without dependencies by every simple script
- PSR-7 is a moving target; pecl_http has existed for ten years
- PSR-7 can be complementary to pecl_http, not the other way around (C code can't use PHP code?)
- native implementations should be faster
- Why should we have our own HTTP API and not follow PSR-7?
possible points:
- PHP-FIG propose no implementations; pecl_http does
- native implementations should be faster
I don’t see how that’s relevant: I’m talking here about the API, not the implementation. Why should PHP’s HTTP API not be PSR-7?
- PHP-FIG focus on frameworks; pecl_http in core is useable without dependencies by every simple script
Also irrelevant, there’s no reason it couldn’t use PSR-7’s API.
- PSR-7 is a moving target; pecl_http exists for ten years
Fair point.
- PSR-7 can be complementary to pecl_http not the other way around (c code can't use php code?)
Not necessarily true.
Andrea Faulds
http://ajf.me/
In my rainbows and ponies vision of the future, it would go something like this:
- PSR-7 is approved. People use it. People love it.
- Internals makes internal definitions of the interfaces from PSR-7, renamed to a PHP namespace but otherwise identical. Eg:
  \Php\Http\Message
  \Php\Http\Request
  \Php\Http\Response
  \Php\Http\ServerRequest
  FIG makes available an alternate set of interfaces that are PSR-7 but extend the internal ones for type hint compatibility (see the sketch after this list). Someone (FIG, internals, me?) makes a user-space definition of the new internals interfaces as a BC layer.
- Internals improves its userspace stream APIs such that the StreamInterface from PSR-7 can be retired completely. (It exists almost entirely as a way to avoid dealing with the PHP-native stream APIs.)
- Internals adds a function/routine/thing to spawn a ServerRequest that is equivalent to the user-space code to create a PSR-7 ServerRequest out of the superglobals. It is, of course, faster than doing it in userspace and also more standard. Internals also adds a "send this response object to this stream (default stdout)" routine.
- Everyone switches over to using the Internals-named interfaces and ServerRequest builder. Because the interface is the same aside from the name, this is about a 5 minute task per project and could be done by a small shell script.
- Internals builds a simple HTTP client that uses \Php\Http directly. It's probably not as powerful as Guzzle et al but easily extensible. Because user-space clients are already using those interfaces, swapping in the new Internals one is a really easy task. Guzzle et al can convert to being extensions on the core one, all using the common interfaces.
- The Ewoks, they dance!
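To make the type-hint compatibility idea in the second bullet above concrete, here is a minimal, purely hypothetical PHP sketch; the internal \Php\Http\Request interface and the extending PSR-7 interface are assumptions for illustration, not actual definitions from internals or FIG:

<?php
// Hypothetical sketch only; neither interface exists in this exact form.
namespace Php\Http {
    // Assumed internal interface, identical in shape to its PSR-7 counterpart.
    interface Request {
        public function getMethod();
        public function getUri();
    }
}
namespace Psr\Http\Message {
    // FIG variant that merely extends the internal one, so every PSR-7
    // request also satisfies a \Php\Http\Request type hint.
    interface RequestInterface extends \Php\Http\Request {}
}
namespace App {
    // Code hinting against the internal interface accepts both flavours.
    function dispatch(\Php\Http\Request $request) {
        return $request->getMethod() . ' ' . $request->getUri();
    }
}

Because the FIG interfaces would add no members of their own, the switch-over described in the list above really is just a rename.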
I don't expect it will happen exactly like that, of course, but the
closer we can get to that sort of chain of events the better I think it
will be for everyone.
Note that the interface definition parts are separate from the writing
of the HTTP client. I think it's important to address those two
separately. SRP applies to RFCs just as much as code. :-)
--Larry Garfield
Hi all,
On Thu, Jan 29, 2015 at 9:18 PM, Crypto Compress <cryptocompress@googlemail.com> wrote:
possible points:
- PHP-FIG propose no implementations; pecl_http does
- PHP-FIG focus on frameworks; pecl_http in core is useable without dependencies by every simple script
- PSR-7 is a moving target; pecl_http exists for ten years
- PSR-7 can be complementary to pecl_http not the other way around (c code can't use php code?)
- native implementations should be faster
General pros:
- PHP is made for the Web; an API like pecl_http should be included by default.
- Script and module implementations can co-exist. An example is mail/imap.
Now I think I understand why many people probably dislike this proposal. Native module and script implementations may both exist, and a new module may be added or replaced at any time if it's required/reasonable.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously.
If you still find anything lacking, please let me know, so I can expand the RFC accordingly.
And of course, everything else is up for discussion.
Just a tiny note: I expanded the features section a bit.
https://wiki.php.net/rfc/pecl_http#features
Regards,
Mike
On 29 01 2015, at 12:14, Michael Wallner <mike@php.net> wrote:
Hi!
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously. If you still find anything lacking, please let me know, so I can expand the RFC accordingly. And of course, everything else is up for discussion.
Update:
Removed optional dependencies on all three extensions (json, iconv,
hash), and the one INI entry related to it. ETags of dynamic response
bodies would now always be just sha1 hashes.
https://wiki.php.net/rfc/pecl_http#dependencies
--
Regards,
Mike
hi,
just a small question: what will be the upgrade path for existing pecl_http
users upgrading to 7.0?
if we would just move the current ext as is into php-src it would be no
problem for them, but based on the recent discussion that is unlikely and
if we change the userland api they will be forced to upgrade their code
(except if we are willing to provide a php7 compatible version of pecl_http
on pecl.php.net, but I think that would be a PITA for everybody).
ofc. this isn't a blocker problem, but something we should keep in mind.
--
Ferenc Kovács
@Tyr43l - http://tyrael.hu
On Fri, Jan 30, 2015 at 2:45 PM, Michael Wallner <mike@php.net> wrote:
https://wiki.php.net/rfc/pecl_http
hi,
just a small question: what will be the upgrade path for existing
pecl_http users upgrading to 7.0?
I'd consider PHP7's http extension a new major version, i.e. v3.
if we would just move the current ext as is into php-src it would be no
problem for them, but based on the recent discussion that is unlikely
and if we change the userland api they will be forced to upgrade their
code (except if we are willing to provide a php7 compatible version of
pecl_http on pecl.php.net, but I think that would
be a PITA for everybody).
I definitely would stay away from having a separate PHP7 compatible PECL
version, but coming v2 releases could take measures to prepare any
transition to the PHP7 API.
ofc. this isn't a blocker problem, but something we should keep in mind.
Alas, not much has changed so far:
- no automatic json_decode() into $_POST: I doubt anybody used that
- default etag hash algo for dynamic response bodies has changed to sha1 from crc32 and cannot be changed through an INI setting: etags on dynamic content are, well, subject to change anyway, so I don't see major troubles
- missing http\QueryString::xlate(): this is a real BC break, though.
--
Regards,
Mike
- default etag hash algo for dynamic response bodies has changed to sha1 from crc32 and cannot be changed through an INI setting: etags on dynamic content are, well, subject to change anyway, so I don't see major troubles
I presume that a calculated etag value can be returned as an
alternative. Something dictated by the service using this facility?
There is provision for the etag value being a version number or some
similarly defined value.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
- default etag hash algo for dynamic response bodies has changed to sha1 from crc32 and cannot be changed through an INI setting: etags on dynamic content are, well, subject to change anyway, so I don't see major troubles
I presume that a calculated etag value can be returned as an alternative. Something dictated by the service using this facility? There is provision for the etag value being a version number or some similarly defined value.
Definitely:
http://devel-m6w6.rhcloud.com/mdref/http/Env/Response/setEtag
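For illustration, a minimal sketch of that manual override, assuming the http\Env\Response API behaves as documented at the link above (the version string is a made-up example):

// set a version-based ETag instead of relying on the computed content hash
$response = new http\Env\Response;
$response->setEtag("tzdata-2015a");
$response->send();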
--
Regards,
Mike
Hi!
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously.
Thank you, Michael, this is much better!
I would still like to hear more about the two extensions, especially about raphf - it seems to be some framework for handling persistent resources, but it's not clear why we need persistent resources for the HTTP module at all (who keeps HTTP connections open between requests?), and the code in the example doesn't show any significant logic there - it just basically wraps curl functions - so I wonder why we need a whole new API in core for that?
About propro - it says that it improves by-reference access to handler
properties, which may be useful but I don't really see how it relates to
HTTP module either - why would you need by-ref access in this context?
Could you give any example of a common code pattern that was impossible
without it but is enabled with it?
--
Stas Malyshev
smalyshev@gmail.com
Hi Stas!
Hi!
I’ve rewritten the RFC for pecl_http and hopefully addressed most of the things mentioned previously.
Thank you, Michael, this is much better!
I would still like to hear more about 2 extensions, especially about
raphf - it seems to be some framework for handling persistent resources,
but it's not clear why we need persistent resources for HTTP module at
all (who keeps HTTP connections open between requests?) and the code in
the example doesn't show any significant logic there - it just basically
wraps curl functions, so I wonder why we need whole new API in core for
that?
API consumers come to mind; e.g. libcurl caches connections itself, so we're just re-using functionality from curl... with lots of good extras, like cached DNS entries, maybe TLS sessions etc.
I also use it in pecl/pq, a PostgreSQL client, which seems a perfect fit to me; i.e. any database extension not exposing its database handle as a userland resource should use raphf over zend_list, IMO. But Dmitry or any other expert might of course want to say a few better words about zend_list.
raphf comes with what's needed to maintain (persistent) service handles
within objects. As already explained, we don't need extra refcounting
because the objects are already refcounted, but we could need support
for handle copy ctors to easily support object cloning, still manually,
but without mind boggling code.
The benefit of retire and wakeup handlers is already shown in the RFC,
additionally, check out the resource factory ops of pecl/pq here:
https://github.com/php/pecl-database-pq/blob/master/src/php_pqconn.c#L520-L561
Nothing super-complicated either; though, a copy constructor is not
implemented, but that might only be a matter of creating a new
connection with information out of PQconninfo().
I guess a brave soul could implement similar stuff for zend_list, but all this zend_resource business is useless for internal handles.
About propro - it says that it improves by-reference access to handler
properties, which may be useful but I don't really see how it relates to
HTTP module either - why would you need by-ref access in this context?
Could you give any example of a common code pattern that was impossible
without it but is enabled with it?
It doesn't just improve it, it makes it possible in the first place. It's an aid for internal classes that expose properties to userland that actually represent state stored in foreign C struct members, and not in the zvals in the properties table.
Hope that sounds sane in any way?!
--
Regards,
Mike
Hi!
Do we want to discuss anything further before I put this to vote again?
https://wiki.php.net/rfc/pecl_http
Points explicitly marked for discussion in the RFC itself:
- pecl/propro
  Proxies for properties representing state in internal C structs
  https://wiki.php.net/rfc/pecl_http#peclpropro
- pecl/raphf
  (Persistent) handle management within objects instead of resources
  https://wiki.php.net/rfc/pecl_http#peclraphf
  Also, take special note of the INI setting:
  https://wiki.php.net/rfc/pecl_http#raphf_ini_setting
- POST parsing disregarding request method, solely based on content-type
  https://wiki.php.net/rfc/pecl_http#rinit
JFYI, I've created a POC for HTTP2 support:
http://git.php.net/?p=pecl/http/pecl_http.git;a=shortlog;h=refs/heads/http2-poc
This is for PHP5, though, and needs a recent libcurl built with nghttp2
support.
--
Regards,
Mike
Hello,
as I mentioned already in the other thread - there are currently no
coding standards related to namespace naming in
https://github.com/php/php-src/blob/master/CODING_STANDARDS and the
coding standards should probably be updated before voting, since this
extension contains namespaces.
Regards
Pavel Kouril
Hello,
as I mentioned already in the other thread - there are currently no
coding standards related to namespace naming in
https://github.com/php/php-src/blob/master/CODING_STANDARDS and the
coding standards should probably be updated before voting, since this
extension contains namespaces.
Do we agree that namespaces should follow how class names look?
So, I guess it would be Http, or maybe HTTP since it's an abbreviation.
--
Regards,
Mike
Hello,
as I mentioned already in the other thread - there are currently no
coding standards related to namespace naming in
https://github.com/php/php-src/blob/master/CODING_STANDARDS and the
coding standards should probably be updated before voting, since this extension contains namespaces.
Do we agree that namespaces should follow how class names look?
So, I guess it would be Http, or maybe HTTP since it's an abbreviation.
I would go with Http\
I would go with Http\
Why not the reserved Php\Http?
Hey,
I would go with Http\
Why not the reserved Php\Http?
This sounds good to me. php\ is already reserved, and it’s similar to the common community convention of vendor\packagename. (e.g. ajf\escapes.) Would work well with Composer and Packagist too, as it could be a virtual php/http package (Packagist naming convention).
Also, I’d like to say I’d prefer php\HTTP or php\http over php\Http. Capitalising an acronym doesn’t feel right to me, perhaps because case is usually significant, Following the Casing Rules Used by Titles. Of php\HTTP and php\http, php\http is probably better since the case matches that of php. It could also be PHP\HTTP, I guess, but lowercase is somehow more appealing to me.
Thoughts?
Andrea Faulds
http://ajf.me/
Hi,
Hey,
I would go with Http\
Why not the reserved Php\Http?
This sounds good to me. php\ is already reserved, and it’s similar to the common community convention of vendor\packagename. (e.g. ajf\escapes.) Would work well with Composer and Packagist too, as it could be a virtual php/http package (Packagist naming convention).
Also, I’d like to say I’d prefer php\HTTP or php\http over php\Http. Capitalising an acronym doesn’t feel right to me, perhaps because case is usually significant, Following the Casing Rules Used by Titles. Of php\HTTP and php\http, php\http is probably better since the case matches that of php. It could also be PHP\HTTP, I guess, but lowercase is somehow more appealing to me.
Thoughts?
I'm not sure about namespacing it in the first place, but otherwise I
agree with you - acronyms should be capitalised.
Some <put a popular styleguide here> nazis probably won't agree, but
capitalising only the first letter of an acronym does feel really
weird.
Cheers,
Andrey.
Hey,
I'm not sure about namespacing it in the first place, but otherwise I
agree with you - acronyms should be capitalised.
What’s your objection? We either namespace it, or we have to add some weird prefix to avoid conflicts. I’d rather we not have the prefix.
The global namespace is already plenty polluted. Sure, most extensions so far don’t use namespaces, but that’s because they were written before 5.3 came out. I think it’s time to move on. Users know what namespaces are now, I don’t think it’d cause problems.
Thanks.
Andrea Faulds
http://ajf.me/
Hi,
Hey,
I'm not sure about namespacing it in the first place, but otherwise I
agree with you - acronyms should be capitalised.
What’s your objection? We either namespace it, or we have to add some weird prefix to avoid conflicts. I’d rather we not have the prefix.
The global namespace is already plenty polluted. Sure, most extensions so far don’t use namespaces, but that’s because they were written before 5.3 came out. I think it’s time to move on. Users know what namespaces are now, I don’t think it’d cause problems.
I'm not really objecting to it, I just haven't had a detailed look at it
yet and so far there's no core/bundled extension that is namespaced
(the ever-present "consistency" argument ... cough).
Cheers,
Andrey.
Hey,
I would go with Http\
Why not the reserved Php\Http?
This sounds good to me. php\ is already reserved, and it’s similar to the common community convention of vendor\packagename. (e.g. ajf\escapes.) Would work well with Composer and Packagist too, as it could be a virtual php/http package (Packagist naming convention).
Also, I’d like to say I’d prefer php\HTTP or php\http over php\Http. Capitalising an acronym doesn’t feel right to me, perhaps because case is usually significant, Following the Casing Rules Used by Titles. Of php\HTTP and php\http, php\http is probably better since the case matches that of php. It could also be PHP\HTTP, I guess, but lowercase is somehow more appealing to me.
Thoughts?
Andrea Faulds
http://ajf.me/
--
Personally,
From my userland point of view, I would expect it to follow the same
capitalization rules as classes are supposed to follow, making it
"Php\Http".
Regards
Pavel Kouril
Hey,
On 4 Feb 2015, at 17:10, Crypto Compress <cryptocompress@googlemail.com> wrote:
I would go with Http\
Why not the reserved Php\Http?
This sounds good to me. php\ is already reserved, and it’s similar
to the common community convention of vendor\packagename. (e.g.
ajf\escapes.) Would work well with Composer and Packagist too, as
it could be a virtual php/http package (Packagist naming
convention).
Also, I’d like to say I’d prefer php\HTTP or php\http over
php\Http. Capitalising an acronym doesn’t feel right to me, perhaps
because case is usually significant, Following the Casing Rules
Used by Titles. Of php\HTTP and php\http, php\http is probably
better since the case matches that of php. It could also be
PHP\HTTP, I guess, but lowercase is somehow more appealing to me.
Thoughts?
Personally,
From my userland point of view, I would expect it to follow the same
capitalization rules as classes are supposed to follow, making it
"Php\Http".
So, should I make a separate vote out of this?
- http
- HTTP
- Http
- php\http
- PHP\HTTP
- Php\Http
- PHttP
The last one was a joke actually, well, lame, I know.
Can we rule any of these out definitely?
As already mentioned, the case is not as relevant because we don't
depend on an autoloader...
--
Regards,
Mike
Hi all,
On Thu, Feb 5, 2015 at 2:10 AM, Crypto Compress <cryptocompress@googlemail.com> wrote:
Why not the reserved Php\Http?
+1
We must have the rule in CODING_STANDARDS.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
Points explicitly marked for discussion in the RFC itself:
pecl/propro
Proxies for properties representing state in internal C structs
https://wiki.php.net/rfc/pecl_http#peclpropro
pecl/raphf
(Persistent) handle management within objects instead of resources
https://wiki.php.net/rfc/pecl_http#peclraphf
Also, take special note of the INI setting:
https://wiki.php.net/rfc/pecl_http#raphf_ini_setting
I'm still not sure why we need these two, to be frank. E.g., for the
former I can kind of get it, though I don't see any use-case that really
requires going to such lengths, for the latter I'm not even sure what
the case for that is - i.e., why exactly would one need persistent HTTP
connections surviving the request?
- POST parsing disregarding request method, solely based on content-type
https://wiki.php.net/rfc/pecl_http#rinit
I'm not sure what exactly happens there. Is _POST in the RFC code
populated for every kind of request automatically? That may be
unexpected, though the ability to parse incoming data as POST regardless
of method may be useful. But it'd help to have a description of what
exactly changes.
--
Stas Malyshev
smalyshev@gmail.com
Hi Stas!
Hi!
Points explicitly marked for discussion in the RFC itself:
pecl/propro
Proxies for properties representing state in internal C structs
https://wiki.php.net/rfc/pecl_http#peclpropro
pecl/raphf
(Persistent) handle management within objects instead of resources
https://wiki.php.net/rfc/pecl_http#peclraphf
Also, take special note of the INI setting:
https://wiki.php.net/rfc/pecl_http#raphf_ini_setting
I'm still not sure why we need these two, to be frank. E.g., for the
former I can kind of get it, though I don't see any use-case that really
requires going to such lengths, for the latter I'm not even sure what
the case for that is - i.e., why exactly would one need persistent HTTP
connections surviving the request?
Uh, for me it's actually the reverse :) While propro is nice to have, I think raphf is of far more practical use. Why should HTTP, or even more so HTTPS or HTTP2, be any different from any other service, especially when HTTP APIs are so common nowadays?
Compare the timings accessing google 20 times sequentially:
With default of raphf.persistent_handle.limit=-1 (unlimited):
█ mike@smugmug:~$ time php -r 'for ($i=0;$i<20;++$i) {(new
http\Client("curl","google"))->enqueue(new http\Client\Request("GET",
"http://www.google.at/"))->send();}'
0.03s user 0.01s system 2% cpu 1.530 total
With raphf effectively disabled:
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'
0.04s user 0.01s system 1% cpu 2.790 total
- POST parsing disregarding request method, solely based on content-type
https://wiki.php.net/rfc/pecl_http#rinit
I'm not sure what exactly happens there. Is _POST in the RFC code
populated for every kind of request automatically? That may be
unexpected, though the ability to parse incoming data as POST regardless
of method may be useful. But it'd help to have a description of what
exactly changes.
The sole code change would be removing the check for POST, i.e. !strcasecmp(SG(request_method),"POST"), so that any request method with a recognized content-type (i.e. application/form-data or application/x-www-form-urlencoded) would trigger standard POST data handling; see the illustration below.
The RINIT of the http_env module would be removed either way, because if this behaviour is accepted, it's accomplished by the change described above, and if not, it makes no sense to keep it for an extension bundled with the core and maybe enabled by default.
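A hypothetical illustration of that behavioural difference (the endpoint and field names are made up): with the method check removed, a PUT request carrying a urlencoded body would populate $_POST just like a POST does today.

// Request (hypothetical):
//   PUT /articles/42 HTTP/1.1
//   Content-Type: application/x-www-form-urlencoded
//
//   title=Hello&draft=1
//
// In the handling script, under the proposed behaviour:
var_dump($_POST['title']);   // string(5) "Hello"
var_dump($_POST['draft']);   // string(1) "1"
// With the current method check in place, $_POST stays empty for this request.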
--
Regards,
Mike
Hi!
think raphf is far more of practical use. Why should HTTP, or even more
HTTPS or HTTP2, be any different than another service, especially when
Which "another service"?
HTTP APIs are so common nowadays.
HTTP APIs are common, but almost none of them ever require persistent connections. HTTP is by nature a stateless protocol oriented toward short-lived connections (yes, I know there are exceptions, but they are rare and mostly abuse HTTP rather than use it for its intended purposes).
With default of raphf.persistent_handle.limit=-1 (unlimited):
█ mike@smugmug:~$ time php -r 'for ($i=0;$i<20;++$i) {(new
http\Client("curl","google"))->enqueue(new http\Client\Request("GET",
"http://www.google.at/"))->send();}'
I'm not sure why you need persistence here - it's all happening within one request - or why you would make 20 connections to the same service. If some service is used for multiple requests, either it should implement batching or HTTP keepalive should be used, but simulating that by keeping the HTTP connection open when it is supposed to be closed by the protocol sounds wrong. If you want to keep the HTTP connection, why not just have the client keep it?
0.03s user 0.01s system 2% cpu 1.530 total
With raphf effectively disabled:
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'0.04s user 0.01s system 1% cpu 2.790 total
So, the difference is microscopic even here. But with proper HTTP
handling - like batch requests or keepalive - it would be even less.
The sole code change would be removing the check for POST, i.e.
!strcasecmp(SG(request_method),"POST")
so that actually any request
method with a recognized content-type (i.e. application/form-data or
application/x-www-form-urlencoded) would trigger standard post data
handling.
By "standard post data handling" you mean _POST? I'm not sure it's a
good idea - it may lead some applications that assume _POST existence
means POST request into a wrong path, which may have some bad
consequences as GET and POST to the same URL may have completely
different meaning in REST application (e.g. GET may be read and POST may
be write). Why not just let the user ask for data if they need it, but
keep the environment as is for those that do not need it?
Stas Malyshev
smalyshev@gmail.com
Hi Stas!
Hi!
think raphf is far more of practical use. Why should HTTP, or even more
HTTPS or HTTP2, be any different than another service, especially when
Which "another service"?
Databases (see my pecl/pq example in the RFC), key/value stores, message
queues, whatever you can think of.
HTTP APIs are so common nowadays.
HTTP APIs are common, but almost none of them ever require persistent
connections. The nature of HTTP is a stateless protocol oriented for
short-lived connections (yes, I know there are exceptions, but they are
rare and mostly abuse HTTP rather than use it for intended purposes).
All of that is true, but it's also true that HTTP has Keep-Alive which
we should take advantage of if supported. Nothing else is happening here.
With default of raphf.persistent_handle.limit=-1 (unlimited):
█ mike@smugmug:~$ time php -r 'for ($i=0;$i<20;++$i) {(new
http\Client("curl","google"))->enqueue(new http\Client\Request("GET",
"http://www.google.at/"))->send();}'I'm not sure why you need persistence here - it's all happening within
one request - or why would you make 20 connections to the same service?
To demonstrate to you how it would work out over multiple requests.
If some service is used for multiple requests, it should implement
either batching or HTTP keepalive should be used, but simulating it
through keeping HTTP connection open when it is supposed to be closed by
the protocol sounds wrong. If you want to keep HTTP connection, why not
just have the client keep it?
Why do you think the connection should automatically be closed?
That's not the default case since HTTP/1.1, unless the server is explicitly configured to close each connection after serving.
0.03s user 0.01s system 2% cpu 1.530 total
With raphf effectively disabled:
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'0.04s user 0.01s system 1% cpu 2.790 total
So, the difference is microscopic even here. But with proper HTTP
handling - like batch requests or keepalive - it would be even less.
Microscopic?! 50%?! Could you please elaborate? :)
--
Regards,
Mike
Hi!
Databases (see my pecl/pq example in the RFC), key/value stores, message
queues, whatever you can think of.
HTTP and databases are principally different. The HTTP protocol is a stateless, message-oriented protocol, and database connection protocols have very little in common with HTTP.
To demonstrate to ou how it would work out over multiple requests.
But this example doesn't demonstrate doing something that people
actually would or should do. And it also doesn't demonstrate a need for
persisting HTTP connections across requests. In fact, it may be harmful
for HTTP servers if connections were not closed when the user is done
with them but the server is forced to keep them open beyond the time
needed to serve the actual client.
Why do you think the connection should automatically be closed?
That's not the default case since HTTP/1.1, except the server is
explicitely configured to close each connection after serving.
If the client is done with the connection, it is natural to assume it is closed. If it's not, then a client failing to finish the communication for some reason may leave the connection in a wrong state, and another client may unsuspectingly reuse it, leading to very weird results.
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'0.04s user 0.01s system 1% cpu 2.790 total
So, the difference is microscopic even here. But with proper HTTP
handling - like batch requests or keepalive - it would be even less.Microscopic?! 50%?! Could you please elaborate? :)
Sorry, I looked at the wrong number; there is a difference. But if you need to talk a lot to the same service, why not just use a keepalive connection for all your requests? As I said, sharing connections between unsuspecting clients may lead to problems.
Stas Malyshev
smalyshev@gmail.com
Hi!
Databases (see my pecl/pq example in the RFC), key/value stores, message
queues, whatever you can think of.
HTTP and databases are principally different. HTTP protocol is stateless
message-oriented protocol, and database connection protocols have very
little in common with HTTP.
To demonstrate to you how it would work out over multiple requests.
But this example doesn't demonstrate doing something that people
actually would or should do. And it also doesn't demonstrate a need for
persisting HTTP connections across requests. In fact, it may be harmful
for HTTP servers if connections were not closed when the user is done
with them but the server is forced to keep them open beyond the time
needed to serve the actual client.
Why do you think the connection should automatically be closed?
That's not the default case since HTTP/1.1, except the server is
explicitly configured to close each connection after serving.
If the client is done with the connection, it is natural to assume it is
closed. If it's not, then client failing to finish the communication for
some reason may leave connection in wrong state and the other client may
unsuspectingly reuse it, leading to very weird results.
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'0.04s user 0.01s system 1% cpu 2.790 total
So, the difference is microscopic even here. But with proper HTTP
handling - like batch requests or keepalive - it would be even less.
Microscopic?! 50%?! Could you please elaborate? :)
Sorry, I looked at wrong number, there is a difference. But if you need
to talk a lot to the same service, why not just use keepalive connection
for all your requests? As I said, sharing connections between
unsuspecting clients may lead to problems.
Stas, we're talking in circles here :) Of course this is Keep-Alive.
Keep in mind that it's curl which is working under the hood, managing
alive connections for us. I just keep curl alive to take advantage of
that fact.
You can pass a persistent handle identifier (just a unique name) as the second argument to the constructor of http\Client, so you can group related handles to max out that benefit.
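A small usage sketch of that grouping, based on the constructor signature described above (the identifier and URL are made-up examples):

// "billing-api" is the persistent handle identifier; clients created with the
// same id (and talking to the same authority) re-use cached curl handles,
// and with them whatever keep-alive connections curl still holds open.
$client = new http\Client("curl", "billing-api");
$client->enqueue(new http\Client\Request("GET", "http://billing.example.com/invoices"))
       ->send();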
--
Regards,
Mike
Hi Stas!
The sole code change would be removing the check for POST, i.e.
!strcasecmp(SG(request_method),"POST")
so that actually any request
method with a recognized content-type (i.e. application/form-data or
application/x-www-form-urlencoded) would trigger standard post data
handling.
By "standard post data handling" you mean _POST? I'm not sure it's a
good idea - it may lead some applications that assume _POST existence
means POST request into a wrong path, which may have some bad
consequences as GET and POST to the same URL may have completely
different meaning in REST application (e.g. GET may be read and POST may
be write). Why not just let the user ask for data if they need it, but
keep the environment as is for those that do not need it?
Yes, I mean $_POST (and $_FILES). It's been requested multiple times,
but I know it's quite controversial. I think this approach is better
than any other proposed yet (think $_PUT and stuff).
If I receive form-data or www-form-urlencoded I'd like to have it
readily accessible, but that may only be my opinion, and I definitely
won't insist on it, so if people think this is too much for the naive
application testing $_POST for being a POST, I'll remove it.
Anybody else having an opinion on that matter?
--
Regards,
Mike
Hi!
Yes, I mean $_POST (and $_FILES). It's been requested multiple times,
but I know it's quite controversial. I think this approach is better
than any other proposed yet (think $_PUT and stuff).
You're building an OO next-generation API, so why do you still need $_PUT or $_FILES or anything like that? Just make a nice OO API where the client can request the data cleanly, and hide the details of where it comes from. That would probably also add value by making these things actually testable with some minimal API support, without messing with superglobals and general extreme ugliness.
--
Stas Malyshev
smalyshev@gmail.com
Hi!
Yes, I mean $_POST (and $_FILES). It's been requested multiple times,
but I know it's quite controversial. I think this approach is better
than any other proposed yet (think $_PUT and stuff).
You're building an OO next-generation API, why you still need $_PUT or
$_FILES or anything like that? Just make a nice OO API where the client
can request the data cleanly, and hide the details of where they come
from. Which would probably also make an added value of making these
things actually testable with some minimal API support, without messing
with superglobals and general extreme ugliness.
Because PHP soaks up all data on POST and leaves us without php://input, so on POST it would be PHP that decides what happens, and on any other method it would be pecl/http. That seems to cry out loud for inconsistencies.
Anyway, that feature is not of such huge practical use that it deserves arguing about at length. I'll remove that section.
--
Regards,
Mike
Hi Stas!
Hi!
Points explicitely marked for discussion in the RFC itself:
pecl/propro
Proxies for properties representing state in internal C structs
https://wiki.php.net/rfc/pecl_http#peclpropro
pecl/raphf
(Persistent) handle management within objects instead of resources
https://wiki.php.net/rfc/pecl_http#peclraphf
Also, take special note of the INI setting:
https://wiki.php.net/rfc/pecl_http#raphf_ini_setting
I'm still not sure why we need these two, to be frank. E.g., for the
former I can kind of get it, though I don't see any use-case that really
requires going to such lengths, for the latter I'm not even sure what
the case for that is - i.e., why exactly would one need persistent HTTP
connections surviving the request?
Uh, for me it's actually the reverse :) While propro is nice to have, I
think raphf is far more of practical use. Why should HTTP, or even more
HTTPS or HTTP2, be any different than another service, especially when
HTTP APIs are so common nowadays.
Compare the timings accessing google 20 times sequentially:
With default of raphf.persistent_handle.limit=-1 (unlimited):
█ mike@smugmug:~$ time php -r 'for ($i=0;$i<20;++$i) {(new
http\Client("curl","google"))->enqueue(new http\Client\Request("GET",
"http://www.google.at/"))->send();}'0.03s user 0.01s system 2% cpu 1.530 total
With raphf effectively disabled:
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'0.04s user 0.01s system 1% cpu 2.790 total
While I like the idea, I would not take it as is. Many things could affect it, and I am not sure the persistent resource is what saves the time. Any profiling info with a delta?
Hi Pierre!
On Feb 5, 2015 3:17 PM, "Michael Wallner" <mike@php.net
mailto:mike@php.net> wrote:Compare the timings accessing google 20 times sequentually:
With default of raphf.persistent_handle.limit=-1 (unlimited):
█ mike@smugmug:~$ time php -r 'for ($i=0;$i<20;++$i) {(new
http\Client("curl","google"))->enqueue(new http\Client\Request("GET",
"http://www.google.at/"))->send();}'0.03s user 0.01s system 2% cpu 1.530 total
With raphf effectively disabled:
█ mike@smugmug:~$ time php -d raphf.persistent_handle.limit=0 -r 'for
($i=0;$i<20;++$i) {(new http\Client("curl","google"))->enqueue(new
http\Client\Request("GET", "http://www.google.at/"))->send();}'0.04s user 0.01s system 1% cpu 2.790 total
While I like the idea, I would not take it as is. Many things could affect it, and I am not sure the persistent resource is what saves the time. Any profiling info with a delta?
Does the following kcachegrind screenshot give an idea (I used a minimum
node cost of 10% to simplify the graph)?
Left is raphf enabled (24M Ir) and on the right raphf disabled (35M Ir):
http://dev.iworks.at/ext-http/raphf.png
Have a look at the top-most far-right highlighted block, which is solely devoted to tearing up curl instances when raphf is disabled.
--
Regards,
Mike
Hi!
Does the following kcachegrind screenshot give an idea (I used a minimum
node cost of 10% to simplify the graph)?
Left is raphf enabled (24M Ir) and on the right raphf disabled (35M Ir):
http://dev.iworks.at/ext-http/raphf.png
Have a look on the top-most far-right highlighted block, which is solely
devoted to tearing up curl instances when raphf is disabled.
I still don't understand why the comparison is made against the worst possible implementation (going through the whole connection cycle every time) as opposed to a logical implementation of an HTTP connection object supporting keepalive.
--
Stas Malyshev
smalyshev@gmail.com
Hi Stas!
Hi!
Does the following kcachegrind screenshot give an idea (I used a minimum
node cost of 10% to simplify the graph)?
Left is raphf enabled (24M Ir) and on the right raphf disabled (35M Ir):
http://dev.iworks.at/ext-http/raphf.png
Have a look on the top-most far-right highlighted block, which is solely
devoted to tearing up curl instances when raphf is disabled.
I still don't understand why the comparison is made against worst
possible implementation (going through all connection cycle every time)
as opposed to logical implementation of HTTP connection object
supporting keepalive.
Uhm, I'm not sure I understand :-? Wasn't I supposed to measure exactly that? Let me know if you wanted something else to be compared.
--
Regards,
Mike
Hi!
Uhm, I'm not sure I understand :-? Weren't I supposed to measure exacly
that? Let me know, if you wanted something else to be compared.
I wanted to know why we need persistent resources. You brought comparing
persistent resources to reopening the connection each time as an argument
that we need persistent resources. This, however, is not a good argument
for persistent resources, as the same performance improvement (or nearly
the same, discounting the closing/opening between requests - which may
be necessary anyway, see below) can be achieved without having
persistent resources, just by implementing HTTP keepalive within the
same connection object. It would also make it clearer who owns the
connection and in which state it is, right now I'm not sure what exactly
ensures the client can not end up with somebody else's connection in an
unclean state.
--
Stas Malyshev
smalyshev@gmail.com
Hi Stas!
Hi!
Uhm, I'm not sure I understand :-? Weren't I supposed to measure exacly
that? Let me know, if you wanted something else to be compared.
I wanted to know why we need persistent resources. You brought comparing
persistent resources to reopening connection each time as an argument
that we need persistent resources. This, however, is not a good argument
for persistent resources, as the same performance improvement (or nearly
the same, discounting the closing/opening between requests - which may
be necessary anyway, see below) can be achieved without having
persistent resources, just by implementing HTTP keepalive within the
same connection object. It would also make it clearer who owns the
connection and in which state it is, right now I'm not sure what exactly
ensures the client can not end up with somebody else's connection in an
unclean state.
Are you saying performance is not the reason we use persistent handles?
Stas, I really don't understand what the issue here is for you.
You yourself said that HTTP is a stateless protocol, so what would a connection in an "unclean state" look like, in your opinion?
There are just two states, alive and closed. I'm also not sure who that
"somebody else" might be, whose connection I'm about to re-use?
Curl caches connections the servers are fine with keeping alive, and I
cache curl handles grouped by the id you pass to the client constructor
and the authority of the url, that's all, nothing spooky.
--
Regards,
Mike
Hi!
Are you saying performance is not the reason we use persistent handles?
It is, for databases where connection setup is expensive. Even then
persistent handles are not always the best solution. But with DB, you
routinely connect to one service, with one set of credentials, and need
this connection constantly. With HTTP, it is rarely the case that you
want to maintain the connection to the same service for an extended time
(like hours or even days).
Stas, I really don't understand what's the issue here for you.
The issue is that I think maintaining long-term persistent HTTP connections (I do not mean a keepalive connection that serves a number of requests within the context of one workload, like a browser does) is not a good idea; in fact it looks suspiciously like a DOS, since many HTTP servers, including Apache, are not properly equipped to handle such a model. While there may be corner cases where it may be useful, encouraging the practice looks like a mistake to me.
Youself said that HTTP is a stateless protocol, so how would a
connection in an "unclean state" look like in your opinion?
Connection and protocol are different things. In a connection, you could
be in the middle of the protocol - i.e. sending headers, sending body,
receiving headers, receiving body - and these are different states.
Curl caches connections the servers are fine with keeping alive, and I
cache curl handles grouped by the id you pass to the client constructor
and the authority of the url, that's all, nothing spooky.
Caching connections within the same request and reusing them is not
spooky, but caching them long term, across requests, across security
domains, for extended time - is spooky.
Stas Malyshev
smalyshev@gmail.com
Hi!
Are you saying performance is not the reason we use persistent handles?
It is, for databases where connection setup is expensive. Even then
persistent handles are not always the best solution. But with DB, you
routinely connect to one service, with one set of credentials, and need
this connection constantly. With HTTP, it is rarely the case that you
want to maintain the connection to the same service for an extended time
(like hours or even days).
The server dictates this to you anyway. It's only leveraging Keep-Alive
and it is the server that's ultimately deciding whether it allows a
connection to stay open for a limited amount of time.
Stas, I really don't understand what's the issue here for you.
The issue is that I think maintaining long-time persistent HTTP
connections (I do not mean keepalive connection that serves a number of
requests within the context of one workload, like browser does) is not a
good idea, in fact it looks suspiciously like a DOS since many HTTP
servers, including Apache, are not equipped properly to handle such
model. While there may be corner cases where it may be useful,
encouraging the practice looks like a mistake to me.
There won't be any forced-open connections. Connections won't be cached
if the server doesn't support or allow Keep-Alive. Utilizing Keep-Alive
is not encouraged per-se, it's just an offer, because it is not enabled
by default.
--
Regards,
Mike
Youself said that HTTP is a stateless protocol, so how would a
connection in an "unclean state" look like in your opinion?Connection and protocol are different things. In connection, you could
be in the middle of the protocol - i.e. sending headers, sending body,
receiving headers, receiving body - and these are different states.
Curl caches connections the servers are fine with keeping alive, and I
cache curl handles grouped by the id you pass to the client constructor
and the authority of the url, that's all, nothing spooky.
Caching connections within the same request and reusing them is not
spooky, but caching them long term, across requests, across security
domains, for extended time - is spooky.
Everything you said also applies to database connections; there are even
database services that work over HTTP.
I'm still having a fairly hard time grokking the issue at hand, sorry.
--
Regards,
Mike
Caching connections within the same request and reusing them is not
spooky, but caching them long term, across requests, across security
domains, for extended time - is spooky.
This is exactly what reverse proxies like Nginx and the Akamai CDN do:
reuse the connection between the proxy and origin even after the
browser endpoint has disconnected from the proxy.
Like Mike said, it isn't spooky, since the reuse of an HTTP persistent
connection makes no claim about HTTP state. Heck, Firefox could take
over Chrome's HTTP connections and it would still be to-spec,
regardless of whether they shared credentials. Of course it is
incumbent on the remote server to keep things stateless above the TCP
level, but if it can't do that it shouldn't advertise persistent
connections.
-- Sandy