Newsgroups: php.internals
From: waboring@3gstech.com (Walt Boring)
To: internals@lists.php.net
Subject: xmldoc() takes ages
Date: 01 May 2003 16:31:06 -0700
Message-ID: <1051831866.2944.10.camel@hemna.uh.nu>

Howdy,

I have a 12 MB XML file that I am trying to get PHP 4.3.1 to parse using libxml2 with xmldoc(), and it takes ~11 minutes to do this on a P4 1.8 GHz box. The same operation in libxml2's Python interface takes 6 seconds.

Here is my code. Any ideas why it takes 11 minutes? 12 MB of valid XML is large, but not 11 minutes' worth of large.

The output:

[waboring@hemna bin]$ time php -f xml.php
PHP Notice: DEBUG:(16:04:00):/home/waboring/devel/freemap/bin/xml.php::(11) parsing xml file of len = 11223576 in /home/waboring/devel/freemap/lib/util.inc on line 784
PHP Notice: DEBUG:(16:13:55):/home/waboring/devel/freemap/bin/xml.php::(16) GOT THE DOC in /home/waboring/devel/freemap/lib/util.inc on line 784
TOTAL EXECUTION TIME: 595.5

real 9m56.163s
user 8m57.790s
sys 0m4.280s

The Python test is tst.py, which comes with the libxml2-python package:

#!/usr/bin/python -u
import sys
import libxml2

# Memory debug specific
libxml2.debugMemory(1)

doc = libxml2.parseFile("tst.xml")
if doc.name != "tst.xml":
    print "doc.name failed"
    sys.exit(1)
root = doc.children
print "doc.name = "+root.name
#if root.name != "doc":
#    print "root.name failed"
#    sys.exit(1)
child = root.children
#if child.name != "foo":
#    print "child.name failed"
#    sys.exit(1)
doc.freeDoc()

# Memory debug specific
libxml2.cleanupParser()
if libxml2.debugMemory(1) == 0:
    print "OK"
else:
    print "Memory leak %d bytes" % (libxml2.debugMemory(1))
    libxml2.dumpMemory()

Output:

[waboring@hemna bin]$ time ./tst.py
doc.name = MAP
OK

real 0m3.413s
user 0m2.860s
sys 0m0.390s

OK, so I'm not 100% sure that Python's libxml2.parseFile() does EXACTLY the same thing as PHP's xmldoc(), but I don't think xmldoc() should take 11 minutes. It also sucks up a ton of RAM.

Walt
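
P.S. The `time` figures above cover the whole process, so one way to compare only the parse step is to time that single call inside the script. Below is a minimal sketch of the idea, not the script that was actually run: `time_call` is a hypothetical helper, and `xml.dom.minidom` with a small generated document is only a standard-library stand-in so the sketch runs anywhere. For the real measurement, `libxml2.parseFile` and the path to the large file would presumably be passed instead.

```python
import time
from xml.dom import minidom  # stdlib stand-in, NOT the libxml2 bindings


def time_call(func, *args):
    """Call func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start


# Small generated document (an assumption for illustration only).
# For the real test, pass libxml2.parseFile and the 12 MB file's path.
sample = "<MAP>" + "<node id='1'/>" * 1000 + "</MAP>"

doc, elapsed = time_call(minidom.parseString, sample)
root_name = doc.documentElement.tagName
print("root = %s, parse took %.3f s" % (root_name, elapsed))
```

Timing just the call keeps interpreter start-up and any debug logging out of the number, which makes the PHP and Python figures easier to compare.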