Greetings,
I recently upgraded to the latest PHP 5.3 snapshot and I found the following SoapClient bug:
<?php
ini_set("soap.wsdl_cache_enabled", 0);
new SoapClient("http://localhost/ws/catalog?wsdl");
?>
Fatal error: Uncaught SoapFault exception: [WSDL] SOAP-ERROR: Parsing WSDL: Couldn't load from 'http://localhost/ws/catalog?wsdl' : Start tag expected, '<' not found
The problem turned out to be an invalid interpretation of the HTTP/1.1 protocol with "Transfer-Encoding: chunked" by the HTTP stream context, which caused get_sdl() to parse a WSDL including the chunk tags (hex numbers).
Chunked encoding is used by Apache 2.0 when "Content-Length" is unavailable, the data content being sent is large enough, and the protocol is HTTP/1.1.
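To make the failure concrete, here is a minimal sketch of what a chunked body looks like on the wire. The helper name chunk_encode() and the sample payload are hypothetical and only for illustration; chunk extensions and trailers are left out.

```php
<?php
// Hypothetical helper: frame $data as an HTTP/1.1 chunked body,
// using chunks of at most $size bytes (no extensions, no trailers).
function chunk_encode($data, $size)
{
    $out = '';
    if ($data !== '') {
        foreach (str_split($data, $size) as $piece) {
            // Each chunk: length in hex, CRLF, payload, CRLF.
            $out .= dechex(strlen($piece)) . "\r\n" . $piece . "\r\n";
        }
    }
    // A zero-length chunk followed by an empty line ends the body.
    return $out . "0\r\n\r\n";
}

$wire = chunk_encode('<definitions/>', 8);
// $wire starts with the hex size of the first chunk, not with "<",
// which is what an XML parser fed the raw body would trip over.
```

A client that hands $wire to the WSDL parser unmodified sees the hex size line first, which matches the "Start tag expected" error above.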
I initially tried using readfile() but didn't get the same problem. Eventually I was able to reproduce the bug with the following script:
<?php
$opts = array('http' => array('method' => "GET", 'header' => "Accept-language: en\r\nConnection: close\r\n"));
$context = stream_context_create($opts);
stream_context_set_option($context, "http", "protocol_version", 1.1);
fpassthru(fopen('http://localhost/ws/catalog?wsdl', 'r', false, $context));
?>
I notice several problems here:
- All the chunk tags are left in place and the extra newlines are not stripped, leading to corrupted data.
- Without the "Connection: close" header the stream blocks until the HTTP timeout. It should instead detect the zero-length chunk and return from fpassthru().
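The expected behaviour can be sketched in userland as follows. This is only an illustration, not how the HTTP wrapper is implemented; the function name read_chunked_body() is hypothetical, and error handling is kept to a minimum.

```php
<?php
// Hypothetical helper: read one chunked body from $fp and return the
// decoded payload. It stops at the zero-length chunk instead of
// waiting for EOF or a timeout.
function read_chunked_body($fp)
{
    $body = '';
    while (($line = fgets($fp)) !== false) {
        // Size line: hex digits, optionally followed by ";extension".
        $parts = explode(';', $line, 2);
        $size = hexdec(trim($parts[0]));
        if ($size === 0) {
            // Last chunk: skip optional trailer lines up to the blank line.
            while (($t = fgets($fp)) !== false && rtrim($t, "\r\n") !== '') {
                // trailer header ignored in this sketch
            }
            return $body;
        }
        // fread() may return fewer bytes than requested, so loop.
        $left = $size;
        while ($left > 0 && !feof($fp)) {
            $part = fread($fp, $left);
            if ($part === false || $part === '') {
                break;
            }
            $body .= $part;
            $left -= strlen($part);
        }
        fgets($fp); // consume the CRLF that follows the chunk data
    }
    return $body; // stream ended before the terminating chunk
}
```

Because the function returns as soon as it has read the terminating chunk, any bytes that follow on a kept-alive connection stay unread in the stream.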
In the meantime, the following patch works around the problem I had: it restores the default HTTP/1.0.
Side note: Shouldn't the last smart_str_appendl() call also contain an EOL?
Side note #2: Is there any way to avoid repeating the same string twice? It's very common in the soap extension and I think it's really error prone.
Side note #3: Is it possible to create a test for this bug? Like a raw HTTP/1.1 response stored in a text file with chunk encoding and a script that loads that data...?
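As a rough idea of what the fixture in side note #3 could look like: the sketch below stores a canned HTTP/1.1 response in a temporary file, loads it back, and decodes the body. The fixture content and variable names are hypothetical. It assumes the running PHP lists a "dechunk" stream filter in stream_get_filters() and skips otherwise. It does not exercise the http:// wrapper itself; a real regression test would have to serve the fixture over a socket.

```php
<?php
// Hypothetical fixture: a canned HTTP/1.1 response with a chunked body.
$raw = "HTTP/1.1 200 OK\r\n"
     . "Content-Type: text/xml\r\n"
     . "Transfer-Encoding: chunked\r\n"
     . "\r\n"
     . "8\r\n<definit\r\n6\r\nions/>\r\n0\r\n\r\n";
$fixture = tempnam(sys_get_temp_dir(), 'chunked');
file_put_contents($fixture, $raw);

// Load the fixture and split the headers from the body.
list($head, $body) = explode("\r\n\r\n", file_get_contents($fixture), 2);
unlink($fixture);

// Assumption: this build provides a "dechunk" stream filter.
if (in_array('dechunk', stream_get_filters())) {
    $fp = fopen('php://temp', 'w+');
    fwrite($fp, $body);
    rewind($fp);
    stream_filter_append($fp, 'dechunk', STREAM_FILTER_READ);
    $decoded = stream_get_contents($fp);
    fclose($fp);
    echo $decoded === '<definitions/>' ? "PASS\n" : "FAIL\n";
} else {
    echo "SKIP: no dechunk filter\n";
}
```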
--- ext/soap/php_sdl.c.orig 2008-12-31 12:37:12.000000000 +0100
+++ ext/soap/php_sdl.c 2009-01-25 22:09:14.000000000 +0100
@@ -3192,14 +3192,16 @@
     basic_authentication(this_ptr, &headers TSRMLS_CC);
 
     /* Use HTTP/1.1 with "Connection: close" by default */
+#if 0
     if (php_stream_context_get_option(context, "http", "protocol_version", &tmp) == FAILURE) {
-        zval *http_version;
+        zval *http_version;
         MAKE_STD_ZVAL(http_version);
         ZVAL_DOUBLE(http_version, 1.1);
         php_stream_context_set_option(context, "http", "protocol_version", http_version);
         zval_ptr_dtor(&http_version);
         smart_str_appendl(&headers, "Connection: close", sizeof("Connection: close")-1);
     }
+#endif
 
     if (headers.len > 0) {
         zval *str_headers;
Regards
--
Giovanni Giacobbi
> The problem turned out to be an invalid interpretation of the HTTP/1.1 protocol with "Transfer-Encoding: chunked" by the HTTP stream context, which caused get_sdl() to parse a WSDL including the chunk tags (hex numbers).
> Chunked encoding is used by Apache 2.0 when "Content-Length" is unavailable, the data content being sent is large enough, and the protocol is HTTP/1.1.
PHP streams do not support chunked encoding, and I assume SOAP doesn't
either (although it does reinvent a bunch of things, I don't think
chunked encoding was added). I thought the docs were quite explicit on
the subject; if not, please open a doc bug.
> <?php
> $opts = array('http' => array('method' => "GET", 'header' => "Accept-language: en\r\nConnection: close\r\n"));
> $context = stream_context_create($opts);
> stream_context_set_option($context, "http", "protocol_version", 1.1);
> fpassthru(fopen('http://localhost/ws/catalog?wsdl', 'r', false, $context));
> ?>
Changing the protocol version is "at your own risk".
If you want full, real HTTP support, you have to use the pecl/http extension.
> - Without the "Connection: close" header the stream blocks until the HTTP timeout. It should instead detect the zero-length chunk and return from fpassthru().
That, however, is a valid bug, which I have actually seen a couple of
times myself. I never debugged it and just assumed I did something
weird :]
-Hannes
Hannes Magnusson wrote:
>> The problem turned out to be an invalid interpretation of the HTTP/1.1 protocol with "Transfer-Encoding: chunked" by the HTTP stream context, which caused get_sdl() to parse a WSDL including the chunk tags (hex numbers).
>> Chunked encoding is used by Apache 2.0 when "Content-Length" is unavailable, the data content being sent is large enough, and the protocol is HTTP/1.1.
> PHP streams do not support chunked encoding, and I assume SOAP doesn't
> either (although it does reinvent a bunch of things, I don't think
> chunked encoding was added). I thought the docs were quite explicit on
> the subject; if not, please open a doc bug.
Then there is a problem: HTTP/1.1 mandates support for chunked encoding.
In Apache's case, the stream should be dechunked before being passed to cgi,
fcgid or apache2handler. The appropriate EOF state should be signalled when
the chunked stream is exhausted. So how you are seeing the chunk headers
is beyond me. Perhaps this hop-by-hop header should be eaten in apache
before passing headers to the application?
>> <?php
>> $opts = array('http' => array('method' => "GET", 'header' => "Accept-language: en\r\nConnection: close\r\n"));
>> $context = stream_context_create($opts);
>> stream_context_set_option($context, "http", "protocol_version", 1.1);
>> fpassthru(fopen('http://localhost/ws/catalog?wsdl', 'r', false, $context));
>> ?>
> Changing the protocol version is "at your own risk".
> If you want full, real HTTP support, you have to use the pecl/http extension.
Please note that the code above is only a proof of concept: it reproduces what the get_sdl() function currently does at php_sdl.c:3193.
Actually, after further investigation I found that this bug was introduced in revision 1.114 by Dmitry, which fixed bug #43069.
So, if I understood correctly, the situation is the following:
1) The HTTP stream context does NOT support chunked encoding, so you don't consider it a bug.
2) SoapClient currently uses the HTTP stream context and forces HTTP/1.1, which makes a chunked response possible.
3) #43069 should then be reopened, pending a different solution.
3bis) The workaround proposed in the last comment of the bug report works, but since SoapClient is using an unsupported feature, it should be the other way around: default to 1.0 (as in PHP 5.2), with the option to force 1.1 from the PHP side.
IMHO the only clean solution I can see is to implement chunked encoding in the HTTP stream context.
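Until something like that exists, the "default to 1.0" idea from 3bis can be approximated from the PHP side. The sketch below is hedged: the helper name http10_context() is hypothetical, and it assumes that SoapClient honours a 'stream_context' option when fetching the WSDL and that get_sdl() only applies its 1.1 default when "protocol_version" is not already set (which is what the check in the patch above suggests).

```php
<?php
// Hypothetical helper: build a stream context that pins the HTTP
// wrapper to HTTP/1.0, so the server has no reason to reply in chunks.
function http10_context()
{
    return stream_context_create(array(
        'http' => array('protocol_version' => 1.0),
    ));
}

$context = http10_context();
$options = stream_context_get_options($context);
// $options['http']['protocol_version'] is now set explicitly, so code
// that only fills in a default when the option is missing leaves it alone.

// Assumption: SoapClient accepts the context via 'stream_context'.
// Not run here because it needs the service from the report:
// $client = new SoapClient('http://localhost/ws/catalog?wsdl',
//                          array('stream_context' => $context));
```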
@William: I'm not sure I understand your reply: Apache#1 (the one running my test code) has nothing to do with any of this, because PHP is opening a raw socket and communicating on its own with Apache#2 (the one outputting the WSDL with chunked encoding). Thus PHP has to deal with this by itself; you cannot consider it a misconfiguration of Apache#2, because it is legitimately using the HTTP protocol.
--
Giovanni Giacobbi
Giovanni Giacobbi wrote:
> @William: I'm not sure I understand your reply: Apache#1 (the one running my test code) has nothing to do with any of this, because PHP is opening a raw socket and communicating on its own with Apache#2 (the one outputting the WSDL with chunked encoding). Thus PHP has to deal with this by itself; you cannot consider it a misconfiguration of Apache#2, because it is legitimately using the HTTP protocol.
And further, if the raw socket is processing HTTP/1.1, it must comply with
HTTP/1.1 and be willing to accept chunked encoding. Referring to RFC 2616,
you'll observe it is not optional for either clients or servers.
Thanks for clarifying.
Hi,
> The problem turned out to be an invalid interpretation of the HTTP/1.1
> protocol with "Transfer-Encoding: chunked" by the HTTP stream context,
FYI, a related bug report: http://bugs.php.net/bug.php?id=47021
regards dtg
--
                           _
ASCII ribbon campaign     ( )
 against HTML e-mail       X
                          / \