From: chregu@bitflux.ch (Christian Stocker)
To: Joe Orton
Cc: internals@lists.php.net
Date: Thu, 17 Feb 2005 12:40:35 +0100
Subject: Re: [PHP-DEV] [PATCH] ext/xml/compat.c fix for #32001
Message-ID: <421482B3.20504@bitflux.ch>
In-Reply-To: <20050217112613.GA30445@redhat.com>
References: <20050217112613.GA30445@redhat.com>
Newsgroups: php.internals

On 17.2.2005 12:26, Joe Orton wrote:
> libxml2's charset encoding auto-detection mode is broken with the push
> parser in current versions of libxml2, I found that recently:
>
> http://bugzilla.gnome.org/show_bug.cgi?id=162613
>
> but trying to force it can trigger infinite loops in libxml2, which is
> what happens in http://bugs.php.net/?id=32001
>
> So I think it's best to not force this mode. Future versions of libxml2
> will set parser->charset to XML_CHAR_ENCODING_NONE by default with the
> push parser and will hence work as desired with no explicit setting of
> parser->charset required.

Any BC breaks with that? Do I now have to know the encoding of the XML
document before I can use the push parser?

But reading your bug report at gnome.org, I assume it just defaults to
UTF-8, right?

chregu

>
> Is this patch OK?
>
> http://www.apache.org/~jorton/php_xmlenc.diff
>
> Index: ext/xml/compat.c
> ===================================================================
> RCS file: /repository/php-src/ext/xml/compat.c,v
> retrieving revision 1.32.2.7
> diff -u -r1.32.2.7 compat.c
> --- ext/xml/compat.c 17 Dec 2004 12:21:34 -0000 1.32.2.7
> +++ ext/xml/compat.c 17 Feb 2005 11:12:08 -0000
> @@ -379,8 +379,6 @@
>      }
>      if (encoding != NULL) {
>          parser->parser->encoding = xmlStrdup(encoding);
> -    } else {
> -        parser->parser->charset = XML_CHAR_ENCODING_NONE;
>      }
>      parser->parser->replaceEntities = 1;
>      parser->parser->wellFormed = 0;
> Index: ext/xml/tests/bug32001.phpt
> ===================================================================
> RCS file: ext/xml/tests/bug32001.phpt
> diff -N ext/xml/tests/bug32001.phpt
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ ext/xml/tests/bug32001.phpt 17 Feb 2005 11:12:08 -0000
> @@ -0,0 +1,40 @@
> +--TEST--
> +Bug #32001 (infinite loop in libxml character encoding detection)
> +--FILE--
> +<?php
> +$myparser = xml_parser_create('');
> +$simple = "<para><note>simple note</note></para>";
> +xml_parse_into_struct($myparser, $simple, $myvals, $mytags);
> +var_dump($myvals);
> +--EXPECT--
> +array(3) {
> +  [0]=>
> +  array(3) {
> +    ["tag"]=>
> +    string(4) "PARA"
> +    ["type"]=>
> +    string(4) "open"
> +    ["level"]=>
> +    int(1)
> +  }
> +  [1]=>
> +  array(4) {
> +    ["tag"]=>
> +    string(4) "NOTE"
> +    ["type"]=>
> +    string(8) "complete"
> +    ["level"]=>
> +    int(2)
> +    ["value"]=>
> +    string(11) "simple note"
> +  }
> +  [2]=>
> +  array(3) {
> +    ["tag"]=>
> +    string(4) "PARA"
> +    ["type"]=>
> +    string(5) "close"
> +    ["level"]=>
> +    int(1)
> +  }
> +}
>

-- 
christian stocker | Bitflux GmbH | schoeneggstrasse 5 | ch-8004 zurich
phone +41 1 240 56 70 | mobile +41 76 561 88 60 | fax +41 1 240 56 71
http://www.bitflux.ch | chregu@bitflux.ch | gnupg-keyid 0x5CE1DECB