Newsgroups: php.internals
To: internals@lists.php.net
Subject: Re: [PHP-DEV] xmldoc() takes ages
From: waboring@3gstech.com (Walt Boring)
Date: 01 May 2003 16:37:45 -0700
Message-ID: <1051832265.2944.12.camel@hemna.uh.nu>
In-Reply-To: <1051831866.2944.10.camel@hemna.uh.nu>
References: <1051831866.2944.10.camel@hemna.uh.nu>

BTW tst.xml and foo_3.xml are the same file.

Walt

On Thu, 2003-05-01 at 16:31, Walt Boring wrote:
> Howdy,
>   I have a 12Meg xml file that I am trying to get PHP 4.3.1 to parse
> using libxml2 with xmldoc(), and it takes ~ 11 minutes to do this on a
> P4 1.8GHz box.
>
> The same operation in libxml2's python interface takes 6 seconds.
>
> here is my code.  Any ideas why it takes 11 minutes?  12Megs of valid
> xml is large, but not 11 minutes worth of large.
>
> <?php
> //my includes here
> set_time_limit( 345600 );
>
> $xml = implode('', file("foo_3.xml"));
> debug_message("parsing xml file of len = ".strlen($xml));
>
> $doc = xmldoc( $xml );
> //it takes 11 minutes to get here!
> debug_message("GOT THE DOC");
> exit;
>
> ?>
>
> The output
> [waboring@hemna bin]$ time php -f xml.php
> PHP Notice:
> DEBUG:(16:04:00):/home/waboring/devel/freemap/bin/xml.php::(11) parsing
> xml file of len = 11223576
>  in /home/waboring/devel/freemap/lib/util.inc on line 784
> PHP Notice:
> DEBUG:(16:13:55):/home/waboring/devel/freemap/bin/xml.php::(16) GOT THE
> DOC
>  in /home/waboring/devel/freemap/lib/util.inc on line 784
>
TOTAL EXECUTION TIME: 595.5
>
> real    9m56.163s
> user    8m57.790s
> sys     0m4.280s
>
>
>
> python tst.py that comes with libxml2-python package.
>
>
> #!/usr/bin/python -u
> import sys
> import libxml2
>
> # Memory debug specific
> libxml2.debugMemory(1)
>
> doc = libxml2.parseFile("tst.xml")
> if doc.name != "tst.xml":
>     print "doc.name failed"
>     sys.exit(1)
> root = doc.children
> print "doc.name = "+root.name
>
> #if root.name != "doc":
> #    print "root.name failed"
> #    sys.exit(1)
> child = root.children
> #if child.name != "foo":
> #    print "child.name failed"
> #    sys.exit(1)
> doc.freeDoc()
>
> # Memory debug specific
> libxml2.cleanupParser()
> if libxml2.debugMemory(1) == 0:
>     print "OK"
> else:
>     print "Memory leak %d bytes" % (libxml2.debugMemory(1))
>     libxml2.dumpMemory()
>
> output of
> [waboring@hemna bin]$ time ./tst.py
> doc.name = MAP
> OK
>
> real    0m3.413s
> user    0m2.860s
> sys     0m0.390s
>
> Ok, so I'm not 100% sure that the python libxml2.parseFile() does
> EXACTLY the same as php's xmldoc(), but I don't think xmldoc() should take
> 11 minutes.  It also sucks up a ton of ram.
>
> Walt
>
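
One caveat on the comparison above: the PHP script parses a string already held in memory, while the python script parses straight from a file. A rough sketch of how the parse step alone could be timed from an in-memory string follows. It assumes Python 3 and uses the standard library's xml.dom.minidom purely as a stand-in parser. The names time_parse_from_string and make_sample are made up for illustration, and the sample data is synthetic, not the real foo_3.xml. With the libxml2 bindings, the in-memory counterpart of parseFile() should be libxml2.parseDoc(), which could be passed in as the parse argument instead.

```python
import time
import xml.dom.minidom


def time_parse_from_string(xml_text, parse=xml.dom.minidom.parseString):
    """Parse an in-memory XML string; return (document, seconds elapsed).

    `parse` is any callable that takes the XML text and returns a document.
    minidom is only a stand-in here, not the parser used in the post.
    """
    start = time.perf_counter()
    doc = parse(xml_text)
    elapsed = time.perf_counter() - start
    return doc, elapsed


def make_sample(n_items):
    """Build a small well-formed document with n_items children (synthetic data)."""
    items = "".join('<item id="%d">value</item>' % i for i in range(n_items))
    return "<MAP>" + items + "</MAP>"


if __name__ == "__main__":
    sample = make_sample(10000)
    doc, seconds = time_parse_from_string(sample)
    print("root = %s, len = %d, parse time = %.3fs"
          % (doc.documentElement.tagName, len(sample), seconds))
```

Timing only the parse call keeps the file read (implode('', file(...)) on the PHP side) out of the measurement, so the two numbers describe the same step.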