Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:1237
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Message-ID: <20030504200015.23721.qmail@pb1.pair.com>
To: internals@lists.php.net
Reply-To: "Dmitri Dmitrienko" <dd@cron.ru>
References: <002401c31263$364d1730$3d01a8c0@vdmitri> <Pine.LNX.4.44.0305041948360.12627-100000@bambi.chregu.tv>
Date: Mon, 5 May 2003 00:00:12 +0400
Lines: 55
Subject: Re: [PHP-DEV] Re: Bundling libxml2 and expat compatibility layer
From: dd@cron.ru ("Dmitri Dmitrienko")

Christian,

>xmlParseFile does build a DOM-Tree out of your XML-Document, which is of
>course slower than the SAX-parsing expat is doing..

It is not an obvious conclusion that DOM parsing must be slower.
Moreover, it is completely wrong when you compare libxml2 with expat.

If you think about it, you'll see that a DOM parser only allocates nodes and
links them into lists.
Should that be so much slower? Are you sure that allocating nodes should
slow everything down by a factor of 14?
I believe it should not.
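
To make the point concrete, here is a minimal sketch in plain C of what
building a tree costs per node: one allocation and a few pointer writes.
This is not libxml2's code. The Node struct and the node_new,
node_append_child and node_count_children names are made up for
illustration, and a real parser also has to store names, attributes and text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal tree node; this is NOT libxml2's xmlNode. */
typedef struct Node {
    int id;
    struct Node *parent;
    struct Node *first_child;
    struct Node *last_child;   /* kept so that appending is O(1) */
    struct Node *next;         /* next sibling */
} Node;

/* Allocate one node; returns NULL if malloc fails. */
Node *node_new(int id) {
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->id = id;
    n->parent = NULL;
    n->first_child = NULL;
    n->last_child = NULL;
    n->next = NULL;
    return n;
}

/* Link child at the end of parent's child list in constant time. */
void node_append_child(Node *parent, Node *child) {
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    if (parent->last_child != NULL)
        parent->last_child->next = child;
    else
        parent->first_child = child;
    parent->last_child = child;
}

/* Walk the sibling list and count the direct children. */
size_t node_count_children(const Node *parent) {
    size_t count = 0;
    for (const Node *c = parent->first_child; c != NULL; c = c->next)
        count++;
    return count;
}
```

Because the parent keeps a last_child pointer, appending never walks the
sibling list, so building n nodes takes time linear in n.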

Also, I do not see any reasonable explanation for why libxml disposes of a
document so slowly.
It does not need to verify, it does not need to parse, it only frees nodes.
NOTHING MORE.
Yet it takes nearly the same time as allocating, parsing and verifying.
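
As an illustration of what "only frees nodes" can look like, here is a
sketch of a single-pass tree free in plain C. It is an assumption about how
such a routine could be written, not a copy of what libxml2 does; TreeNode,
tree_node_new, tree_add_child and tree_free are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal tree node; this is NOT libxml2's xmlNode. */
typedef struct TreeNode {
    struct TreeNode *parent;
    struct TreeNode *first_child;
    struct TreeNode *next;      /* next sibling */
} TreeNode;

/* Allocate one unlinked node; returns NULL if malloc fails. */
TreeNode *tree_node_new(void) {
    TreeNode *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->parent = NULL;
    n->first_child = NULL;
    n->next = NULL;
    return n;
}

/* Prepend child to parent's child list in constant time. */
void tree_add_child(TreeNode *parent, TreeNode *child) {
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->next = parent->first_child;
    parent->first_child = child;
}

/*
 * Free the whole subtree under root without recursion.
 * Each node is entered once, revisited once per child, and freed once,
 * so the total work is linear in the number of nodes.
 * Returns the number of nodes freed.
 */
size_t tree_free(TreeNode *root) {
    size_t freed = 0;
    TreeNode *cur = root;
    while (cur != NULL) {
        if (cur->first_child != NULL) {
            /* Unlink the first child and descend into it. */
            TreeNode *child = cur->first_child;
            cur->first_child = child->next;
            cur = child;
        } else {
            /* Leaf (or all children already freed): free and go back up. */
            TreeNode *up = (cur == root) ? NULL : cur->parent;
            free(cur);
            freed++;
            cur = up;
        }
    }
    return freed;
}
```

Nothing here validates or parses anything, so one would expect a routine of
this shape to cost well below the parse itself.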

The only obvious conclusion is that the algorithms are written
inefficiently.

I'm not against XML DOM and its benefits. I'm against wrong and inefficient
algorithms.
People, why don't you read Donald Knuth's books?


>_But_ libxml2 can parse your XML document in SAX-style only without
>building an DOM-Tree. Without looking at Sterling's code, I assume,
>that's what he did and you should compare this code to expat and _not_
>xmlParseFile..

libxml2-2.5.7,parser.c:10670

xmlDocPtr
xmlParseDoc(xmlChar *cur) {
    return(xmlSAXParseDoc(NULL, cur, 0));
}

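I think the code excerpt shown above is a good answer to your argument: the
DOM entry point is just a call into the SAX parser with a NULL handler, so
both paths go through the same parsing code.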

> libxml2 is certainly not slow, it has a well known reputation as being
> very fast for what it does.

I will not discuss reputation. It's a completely different thing.
As for performance, I still insist that libxml2 a) has a pretty slow parser
due to inefficient algorithms and b) is pretty slow in some other respects,
including freeing documents, once again due to inefficient algorithms.


-Dmitri.