Newsgroups: php.internals
Date: Fri, 13 May 2011 19:43:35 -0400
From: andrew@andrewcurioso.com (Andrew Curioso)
To: internals@lists.php.net
Subject: SimpleXML bug with the properties hash

I'm looking for feedback since this is my first commit to PHP and it changes some behavior of SimpleXMLElement. If no one has any objections I'll go ahead and commit the code.

First, here is the original bug:

-------BEGIN------
$string = '<?xml version="1.0"?>
<foo><bar>
   <p>Blah 1</p>
   <p>Blah 2</p>
   <p>Blah 3</p>
   <tt>Blah 4</tt>
</bar></foo>
';

$foo = simplexml_load_string($string);
$p = $foo->bar->p;
echo count($p);
$p = (array)$foo->bar->p;
echo count($p);
--------END--------

The output should be 33 (a count of 3, printed twice) but is 31 instead: after the array cast, only one element is left.

If you do a var_dump() of $p you get this:

-------BEGIN------
array(1) {
  [0]=>
  string(6) "Blah 1"
}
--------END--------

With my updated code you get this:

-------BEGIN------
array(3) {
  [0]=>
  string(6) "Blah 1"
  [1]=>
  string(6) "Blah 2"
  [2]=>
  string(6) "Blah 3"
}
--------END--------

The same also applies if you do a var_dump() of "->p" directly (without the cast). In the current releases, the dump contains only the first child node and not the second and third. With my fix in place it contains all child nodes.

All other behavior is unchanged. The code did break one test: the test for bug #51615 expected the var_dump() to contain only one child node, whereas the new code outputs (correctly, I think) all child nodes. So I also changed that test to expect the new output.

This does change the behavior of var_dump() on SimpleXMLElement, which may break code that depends on this bug (if there is any). However, I believe the new behavior is correct, since in the current release code var_dump() excludes valid object properties.

The way I fixed it was to detect whether the node is part of a node iterator (for lack of a better word), which is meant to loop over some but not all children of a node. If it is, then I use the SimpleXML iterator functions for traversal when getting the properties hash. If it isn't, then I just use the trusty "ptr->next" method. In the iterator case I take care to store and then restore the existing iterator data so as not to break any outer loops.

I attached a diff of my changes.

Any thoughts?

-- Andrew
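To make the traversal described above easier to discuss, here is a minimal, self-contained sketch of the idea in C. It is not the attached diff and does not use PHP's or libxml2's real structures: `node`, `iter_state`, `iter_first`, `iter_next` and `collect_props` are all hypothetical names invented for illustration. It only shows the shape of the approach: when the element is a filtered iterator, walk it with the iterator functions and put the saved iterator position back afterwards; otherwise follow the plain sibling pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an XML node (NOT libxml2's xmlNode). */
typedef struct node {
    const char *name;
    const char *text;
    struct node *next;      /* next sibling */
} node;

/* Hypothetical iterator state kept on the element object. */
typedef struct {
    const char *filter;     /* element name to match, or NULL for "no iterator" */
    node *current;          /* position of a loop that may already be running */
} iter_state;

/* Advance p to the first sibling whose name matches the filter. */
static node *skip_to_match(const iter_state *it, node *p)
{
    while (p != NULL && strcmp(p->name, it->filter) != 0)
        p = p->next;
    return p;
}

/* Reset the iterator to the first matching sibling starting at head. */
static node *iter_first(iter_state *it, node *head)
{
    it->current = skip_to_match(it, head);
    return it->current;
}

/* Move the iterator to the next matching sibling. */
static node *iter_next(iter_state *it)
{
    if (it->current != NULL)
        it->current = skip_to_match(it, it->current->next);
    return it->current;
}

/* Collect the text of every node that should appear in the properties list.
 * Returns the number of entries written to out (at most cap). */
size_t collect_props(iter_state *it, node *head, const char **out, size_t cap)
{
    size_t n = 0;
    assert(it != NULL && out != NULL);

    if (it->filter != NULL) {
        /* Filtered iterator: store the outer loop's position first. */
        node *saved = it->current;
        for (node *p = iter_first(it, head); p != NULL && n < cap; p = iter_next(it))
            out[n++] = p->text;
        /* Restore it so an enclosing loop continues where it left off. */
        it->current = saved;
    } else {
        /* Not an iterator: plain walk along the sibling pointer. */
        for (node *p = head; p != NULL && n < cap; p = p->next)
            out[n++] = p->text;
    }
    return n;
}
```

With the four siblings from the example (three `p` nodes followed by one `tt`), a state filtered on "p" would collect three entries and leave `current` untouched, while an unfiltered state would collect all four siblings.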