Hello,
If I have an XML document such as:
.
<para>
This is some <i>test</i> text.
</para>
.
When I use simplexml_load_string and then do:
print( $obj->para );
The result is:
"This is some text."
Note that the embedded <i></i> text is missing. Internally, the cast is
made using the xmlNodeListGetString function. Is there any reason that
it makes more sense to do that rather than use the xmlNodeGetContent
function? The latter function returns the complete text. I am a newbie
to most of this, so there may be many reasons for the choice.
If the former method is preferred, would it be possible to add another
method for retrieving the text of a node using the xmlNodeGetContent
function? I have added such a method, called toString(), to my beta 3
installation and it works fine. I would probably rename it for general
redistribution.
Thanks in advance!
Blake Schwendiman
Software Development
:: http://www.lulu.com/intechrabooks
:: In-depth software development books priced right
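For readers following along, here is a minimal sketch of the behavior described in the message above. It assumes a current PHP build with the SimpleXML and DOM extensions enabled. The wrapping <doc> root element and the variable names are illustrative and not from the original post, and the two workarounds shown are not the toString() patch discussed in this thread.

```php
<?php
// Hypothetical wrapper document around the <para> element from the post.
$xml = '<doc><para>This is some <i>test</i> text.</para></doc>';
$doc = simplexml_load_string($xml);

// Casting the element to string yields only its own direct text nodes,
// so the text inside <i> is left out.
$direct = (string) $doc->para[0];

// Workaround 1: go through DOM, whose textContent includes descendant text.
$full = dom_import_simplexml($doc->para[0])->textContent;

// Workaround 2: serialize the element and strip the markup afterwards.
$stripped = strip_tags($doc->para[0]->asXML());

echo $direct, "\n";
echo $full, "\n";
echo $stripped, "\n";
```

Both workarounds return the text of the element together with the text of its children, which is what xmlNodeGetContent is described as doing above.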
> Note that the embedded <i></i> text is missing. Internally, the cast is
> made using the xmlNodeListGetString function. Is there any reason that
> it makes more sense to do that rather than use the xmlNodeGetContent
> function? The latter function returns the complete text. I am a newbie
> to most of this, so there may be many reasons for the choice.
I would say this is a bug and we should switch to xmlNodeGetContent
instead.
-adam
Hello Adam,
Wednesday, January 7, 2004, 10:54:24 PM, you wrote:
>> Note that the embedded <i></i> text is missing. Internally, the cast is
>> made using the xmlNodeListGetString function. Is there any reason that
>> it makes more sense to do that rather than use the xmlNodeGetContent
>> function? The latter function returns the complete text. I am a newbie
>> to most of this, so there may be many reasons for the choice.
> I would say this is a bug and we should switch to xmlNodeGetContent
> instead.
In my opinion the current behavior is perfect, because I see SimpleXML from
an XML developer's side and not from an HTML developer's side. The former must
typically know exactly where his strings come from, while the latter has
only text to deal with and often has situations where he needs to filter out
formatting tags like the <i> in the example.
So I'd say let us add a method for returning the complete content. Adam,
could you do that?
Best regards,
Marcus mailto:helly@php.net
> In my opinion the current behavior is perfect, because I see SimpleXML from
> an XML developer's side and not from an HTML developer's side. The former must
> typically know exactly where his strings come from, while the latter has
> only text to deal with and often has situations where he needs to filter out
> formatting tags like the <i> in the example.
All of my SimpleXML work is strictly XML, too. However, my thought was
that I could always call strip_tags() to eliminate the information I
didn't want, but there was no apply_tags() function to do the
reverse. :) Therefore, it was better to use the other method.
> So I'd say let us add a method for returning the complete content. Adam,
> could you do that?
That wouldn't be too difficult (although I am busy for the next day or
two). However, as much as I loathe toggles, I'm wondering if it
wouldn't be better to make this an object-wide setting. My thought is
that on an object-by-object basis, you either always want tags or
never want them.
Something like:
$sxe = simplexml_load_file('doc.xml');
$sxe->displayTags = true;
This would keep the interface clean. Or would that just confuse things
with more magic?
Also, what should the default behavior be? I can argue both
sides of the issue right now. :)
-adam
Hello Adam,
Thursday, January 8, 2004, 12:59:19 AM, you wrote:
>> In my opinion the current behavior is perfect, because I see SimpleXML from
>> an XML developer's side and not from an HTML developer's side. The former must
>> typically know exactly where his strings come from, while the latter has
>> only text to deal with and often has situations where he needs to filter out
>> formatting tags like the <i> in the example.
> All of my SimpleXML work is strictly XML, too. However, my thought was
> that I could always call strip_tags() to eliminate the information I
> didn't want, but there was no apply_tags() function to do the
> reverse. :) Therefore, it was better to use the other method.
Well, there's ext/SPL in PECL. A SimpleXML object is also an Iterator, and
it is a RecursiveIterator if ext/SPL is built in. SPL also offers a
RecursiveIteratorIterator that takes a RecursiveIterator, so it is just a
foreach four-liner:
$text = '';
foreach (new RecursiveIteratorIterator($sxe) as $el) {
    $text .= $el;
}
>> So I'd say let us add a method for returning the complete content. Adam,
>> could you do that?
> That wouldn't be too difficult (although I am busy for the next day or
> two). However, as much as I loathe toggles, I'm wondering if it
> wouldn't be better to make this an object-wide setting. My thought is
> that on an object-by-object basis, you either always want tags or
> never want them.
> Something like:
> $sxe = simplexml_load_file('doc.xml');
> $sxe->displayTags = true;
> This would keep the interface clean. Or would that just confuse things
> with more magic?
> Also, what should the default behavior be? I can argue both
> sides of the issue right now. :)
The idea is really nice, but you mentioned the problem yourself: it
increases the WTF factor. Maybe we implement both getContent and getText
now and do this kind of magic in the next version, aka 5.1?
--
Best regards,
Marcus mailto:helly@php.net
BTW, I can send my source implementation of my toString() method which
returns the full content. Then you can do what you need in terms of
renaming, etc.
Thanks!
Blake Schwendiman
----- Original Message -----
From: "Adam Maccabee Trachtenberg" adam@trachtenberg.com
To: "Marcus Boerger" helly@php.net
Cc: "Blake Schwendiman" blake@mediafence.com; internals@lists.php.net
Sent: Wednesday, January 07, 2004 4:59 PM
Subject: Re: [PHP-DEV] SimpleXML and Default Cast To String
Would be glad for the code. Send it to me as an attachment, because the
list can be finicky. :)
(Or in other words, it needs to be plain text and end in .txt.)
-adam
--
adam@trachtenberg.com
author of o'reilly's php cookbook
avoid the holiday rush, buy your copy today!