Newsgroups: php.internals
Message-ID: <40598D0D.2020700@nutextonline.com>
Date: Thu, 18 Mar 2004 06:50:37 -0500
From: lateralus@nutextonline.com (Derek Ford)
To: internals@lists.php.net
Subject: [Fwd: Strange SimpleXML Behavior]
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------010505000505080802000704"

--------------010505000505080802000704
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Forwarding a bug report I received.

--------------010505000505080802000704
Content-Type: message/rfc822; name="Strange SimpleXML Behavior"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="Strange SimpleXML Behavior"

Subject: Strange SimpleXML Behavior
From: Alek Andreev
To: derek@php.net
Date: Tue, 23 Mar 2004 00:35:33 +0200
Message-Id: <1079994933.6285.3.camel@localhost>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

Hello!

I wrote an RSS parser using PHP 5.0RC1 and SimpleXML and noticed a lot of irregularities (bugs). I'll summarize them here.

1) There is no way to know if a tag actually exists or not.

$xml = simplexml_load_string('123123');
echo(count($xml->nonExistantElement)); // outputs 1
echo(isset($xml->nonExistantElement)); // outputs 1
echo(!empty($xml->nonExistantElement)); // outputs 1
var_dump($xml->nonExistantElement);
/* outputs:
object(simplexml_element)#2 (0) {
}
*/

2) XPath doesn't work at all for files with default namespaces.

$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/atom.xml');
/* Atom feed with a default namespace, the first line is: */
echo(count($xml->xpath('/feed'))); // outputs 0

$xml = simplexml_load_file('atom.xml');
/* The same feed with a removed default namespace, the first line is: */
echo(count($xml->xpath('/feed'))); // outputs 1

$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.xml');
/* RSS 0.91 feed with no namespaces defined, is the root element */
echo(count($xml->xpath('/rss'))); // outputs 1

$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.rdf');
/* RSS 1.0 (RDF) with no default namespace, the first line is: */
var_dump($xml->xpath('/RDF')); // empty array
var_dump($xml->xpath('/rdf:RDF')); // works as expected

Unfortunately, using namespace prefixes in XPath is not an acceptable option either: if the prefix is not defined in the document, the XPath engine raises a PHP warning. Suppressing the warning with an @ in front of the function call results in the function always returning true.

3) Tags with namespace prefixes are not referenceable.

$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.rdf');
/* Tags of interest: ...2004-03-20T20:25:55-08:00... */
var_dump($xml);
/* ...
["item"]=>
array(15) {
  [0]=>
  object(simplexml_element)#3 (6) {
    ["title"]=>
    string(24) "Slashdot feature request"
    ...
    ["date"]=> (that's the )
    string(25) "2004-03-20T22:51:00-08:00"
  }
  ...
*/
echo($xml->item[0]->title); // "Slashdot feature request"
echo($xml->item[0]->date); // nothing
echo($xml->item[0]->{'dc:date'}); // nothing
var_dump($xml->item[0]->xpath('date')); // empty array
var_dump($xml->item[0]->xpath('dc:date')); // works as expected

I don't know C well enough to fix these bugs, so could the SimpleXML maintainer take a look?

Regards,
Alek Andreev
alek@zvuk.net

--------------010505000505080802000704--
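
For reference, the three points in the report can be probed with a small self-contained script. This is only a sketch under stated assumptions: the inline XML strings, the "a" prefix, and the http://example.org/ns URI are made up for illustration and are not the feeds from the report. It also relies on SimpleXMLElement::registerXPathNamespace() and SimpleXMLElement::children() with a namespace URI, which exist in later PHP releases and may well be missing or behave differently in 5.0RC1.

```php
<?php
// Sketch only: all XML below is hypothetical sample data, not the reported feeds.

// 1) Element existence: isset() and count() on a child that is not there.
$doc = simplexml_load_string('<root><a>123</a><b>123</b></root>');
$hasA = isset($doc->a);                // child element that exists
$hasMissing = isset($doc->missing);    // child element that does not exist
$missingCount = count($doc->missing);  // number of matching children

// 2) XPath against a document with a default namespace.
$feed = simplexml_load_string(
    '<feed xmlns="http://example.org/ns"><title>Example</title></feed>'
);
// In XPath 1.0 an unprefixed name only matches elements in no namespace.
$unprefixed = $feed->xpath('/feed');
// Bind an arbitrary prefix to the default namespace URI and query with it.
$feed->registerXPathNamespace('a', 'http://example.org/ns');
$prefixed = $feed->xpath('/a:feed');

// 3) Prefixed child elements, reached through their namespace URI.
$dcUri = 'http://purl.org/dc/elements/1.1/';
$rdf = simplexml_load_string(
    '<root xmlns:dc="' . $dcUri . '">'
    . '<item><title>Slashdot feature request</title>'
    . '<dc:date>2004-03-20T22:51:00-08:00</dc:date></item></root>'
);
$item = $rdf->item[0];
$date = (string) $item->children($dcUri)->date;

var_dump($hasA, $hasMissing, $missingCount);
var_dump(count($unprefixed), count($prefixed));
var_dump($date);
```

Binding a prefix explicitly keeps the query independent of whatever prefixes the feed author chose, which seems to be the part that the report finds unworkable when the prefix is not declared in the document.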