Newsgroups: php.internals,php.xml.dev
To: internals@lists.php.net
Cc: php-xml-dev@lists.php.net, php5-dev@lists.php.net, "Stig S. Bakken", "Thies C. Arntzen"
Message-ID: <1051482765.25677.15.camel@hasele>
Date: 27 Apr 2003 18:32:45 -0400
Subject: Replacing expat with libxml
From: sterling@bumblebury.com (Sterling Hughes)

Hi,

After some discussions with various people at PHP-Con, I decided it was important that we (at least) have libxml integrated with PHP by PHP5. When it comes to XML processing, expat is a legacy library, and it doesn't support nearly what is required for processing XML by today's standards. The current version of expat makes processing SOAP documents (for example) very hard, because XML Schema is not available.

On the other hand, libxml is a very robust XML processor, providing support for:

- XML Schema
- DOM
- Validation (against a DTD or Schema)
- SAX
- XML Catalog
- Docbook and HTML Parsers
- XPath, XPointer, XInclude
- Base XML datatypes (XML Schema Part 2)
- FTP and HTTP transports (as well as an IO wrapper library like Streams)

I've currently completed the first two steps of the integration. I've removed the expat library from ext/xml and replaced it with libxml. I've also ported the XML extension to use libxml as the underlying processing engine.
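
As a rough illustration of what the port is meant to keep working, here is a minimal sketch of script code written against the existing ext/xml API. The sample document and the handler names are made up for the example; only the xml_* functions are the extension's own. Code like this should not need to care which library does the parsing underneath.

```php
<?php
// Hypothetical sample document and handler names, for illustration only.
$doc  = '<list><item>one</item><item>two</item></list>';
$seen = array();

// Called for every opening tag; records the element name.
function start_handler($parser, $name, $attrs) {
    global $seen;
    $seen[] = $name; // names arrive upper-cased because case folding is on by default
}

// Called for every closing tag; nothing to do in this sketch.
function end_handler($parser, $name) {
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_handler', 'end_handler');

// Parse the whole document in one call (third argument marks the final chunk).
if (!xml_parse($parser, $doc, true)) {
    // Report the numeric error code and the line where parsing stopped.
    printf("parse error %d at line %d\n",
           xml_get_error_code($parser),
           xml_get_current_line_number($parser));
}

echo implode(',', $seen), "\n";
```

The sketch reports only the numeric error code and the line number on failure, so it does not depend on the wording of any error message.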

The following incompatibilities exist:

1) Some XML_ERROR_* constants are irrelevant (they are still defined, but they have no meaning for libxml).
2) xml_error_string() just returns a blank string. This can be changed in the next couple of days; I just need to implement error strings on top of the libxml error codes.

Having this library bundled internally will allow people to develop other extensions which use libxml features, and will allow future extensions (for example, a fully compliant DOM extension) to be added easily, without requiring extra bundling.

Thoughts?

-Sterling

-- 
"The computer programmer is a creator of universes for which he alone is responsible. Universes of virtually unlimited complexity can be created in the form of computer programs."
  - Joseph Weizenbaum