Newsgroups: php.internals,php.xml.dev
Path: news.php.net
Xref: news.php.net php.internals:1082 php.xml.dev:105
Return-Path:
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Delivered-To: mailing list internals@lists.php.net
Received: (qmail 8806 invoked from network); 28 Apr 2003 20:25:06 -0000
Received: from unknown (HELO gamma) (62.242.117.189) by pb1.pair.com with SMTP; 28 Apr 2003 20:25:06 -0000
Received: from adam by gamma with local (Exim 3.35 #1 (Debian)) id 19AFBQ-0007sq-00; Mon, 28 Apr 2003 22:25:04 +0200
Date: Mon, 28 Apr 2003 22:25:03 +0200
To: Sterling Hughes
Cc: internals@lists.php.net, php-xml-dev@lists.php.net, php5-dev@lists.php.net
Message-ID: <20030428202503.GA30225@indexdata.com>
References: <1051482765.25677.15.camel@hasele> <3EAD180D.1040908@caraveo.com> <1051540676.25677.81.camel@hasele>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1051540676.25677.81.camel@hasele>
User-Agent: Mutt/1.3.28i
Subject: Re: [PHP-DEV] Re: [PHP5-DEV] Replacing expat with libxml
From: adam@indexdata.dk (Adam Dickmeiss)

BC is important. But don't repeat the mistake of letting the code ignore
encoding="..." in an XML document. Bug #23292 :) An XML parser is
supposed to deal with that and produce a desired output encoding.
AFAIK libxml_2_ does the right thing.

While we're at it. How about libxml2's sibling libxslt? Could that
replace Sablotron?

-- Adam

On Mon, Apr 28, 2003 at 10:37:56AM -0400, Sterling Hughes wrote:
> Shane,
>
> As for BC: again, I agree 100%. If it breaks BC in any way, other than a
> few error codes/error messages changing, then we can fix it, or revert
> back to expat. I've tested it with pear and pres2 (including my xml and
> php slides), and everything seems to be in working order. However, once
> I commit it, other people can start testing.
>
> PHP5 is going to be unstable, things will break, but we will have a very
> long QA period, and a lot of people testing their apps against it to
> make sure that there are no BC breaks. Couple that with the fact that
> libxml2 tries to be expat "compliant," I'm pretty confident we can
> squash any and all BC breaks before a PHP5 release.
>
> As for the other features, well, that's a discussion for a different
> thread. My main concern in this one is just switching the underlying
> library, that way we're shipping with a robust XML library bundled as
> default. This will allow extension developers (hopefully) to work on
> more XML related technologies in extension space.
>
> -Sterling
>
> On Mon, 2003-04-28 at 08:01, Shane Caraveo wrote:
> > > After some discussions with various people at the PHP-Con, I decided it
> > > was important that we (at least) have libxml integrated with PHP by
> > > PHP5. When it comes to XML processing, expat is a legacy library and it
> > > doesn't support nearly what is required for processing XML by today's
> > > standards.
> >
> > > The current version of expat makes processing SOAP documents (for
> > > example) very hard, because XML schema is not available.
> >
> > I'd like to talk with you about what features will be implemented in
> > time for php5, as there are some specific features that would be good
> > for soap (ie. pull parsing). I'm at the airport now, so later this week.
> >
> > > - FTP and HTTP transports (as well as an IO wrapper library like
> > > Streams)
> >
> > Can the wrapper be made to work with the php streams? It would be nice
> > for instance, to simply be able to pass php://stdin to the library and
> > have it handle input. A call back to handle protocol headers, or
> > alternate encodings such as dime would be necessary. Anyway, those are
> > some things I'd like to discuss.
> >
> > > I've currently completed the first two steps of the integration.
> > > I've removed the expat library from ext/xml and replaced it with libxml.
> > > I've also ported the XML extension to use libxml as the underlying
> > > processing engine.
> > >
> > > The following incompatibilities exist:
> > >
> > > 1) some XML_ERROR_* constants are irrelevant (they are still defined,
> > > but they have no meaning for libxml).
> >
> > Are they mappable in any way, or simply library specific errors?
> >
> > > 2) xml_error_get_string() just returns a blank string. This can be
> > > changed in the next couple of days, I just need to implement error
> > > strings on top of libxml error codes.
> > >
> > > Having this library bundled internally will allow people to develop
> > > other extensions which use libxml features, and will allow for future
> > > extensions (for example, a fully compliant DOM extension) to easily be
> > > added, without requiring extra bundling.
> > >
> > > Thoughts?
> >
> > My concern is of course BC. I think a little bit of breakage is ok, but
> > by little I'm thinking < 1%. Otherwise, there needs to be some way to
> > load the expat xml extension. Perhaps the expat extension functions
> > could be renamed to expat_xml_*, libxml would be libxml_*, and a utility
> > function such as xml_use_library(XML_EXPAT) would map xml_* functions to
> > the expat library (or libxml). The default mapping would be to libxml.
> >
> > Shane
> --
> "Whether you think you can or think you can't -- you are right."
> - Henry Ford
>
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php

--
Adam Dickmeiss  mailto:adam@indexdata.dk  http://www.indexdata.dk
Index Data      T: +45 33410100           Mob.: 212 212 66