Simple XML Parser Question

liunx

Guest
Hi all --

I'm having a problem with the XML parser... if there is a single (or double, I believe) quote between <tag> and </tag> it's crashing the parser :X Does anyone know of a workaround?

Thanks

Well, double quotes are usually a no-no in XML/XHTML; try using &quot; instead.

http://www.w3.org/TR/2004/REC-xml-20040204/#syntax

Originally posted by goldbug
well, double quotes are usually a no-no in XML/XHTML, try using &quot; instead.

http://www.w3.org/TR/2004/REC-xml-20040204/#syntax

Yeah -- I know they're typically a no-no. The problem is that this is an XML feed coming from an application that a large number of people enter data into constantly. I'm trying to figure out whether there is a way to keep it from crashing when PHP doesn't like a chunk of content.
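On the "avoid crashing" part, here is a minimal sketch of one way to make a bad chunk fail quietly. It assumes PHP 7.1 or later with the libxml extension; `parse_feed_safely` and the `$errors` parameter are made-up names for illustration, not part of any library.

```php
<?php
// Hypothetical helper: returns the parsed document, or null if the XML is malformed.
// Parser messages are collected in $errors instead of being raised as PHP warnings.
function parse_feed_safely(string $xml, array &$errors = []): ?SimpleXMLElement
{
    // Ask libxml to store errors internally; remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $doc = simplexml_load_string($xml);

    if ($doc === false) {
        foreach (libxml_get_errors() as $error) {
            // Each $error is a LibXMLError with ->line and ->message.
            $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
        }
        $doc = null;
    }

    // Clean up so later parses start with an empty error list.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

$errors = [];
$good = parse_feed_safely('<root><node>"hey there", he said</node></root>', $errors);
$bad  = parse_feed_safely('<root><node>bonny & clyde</node></root>', $errors);
```

With this, a malformed record can be logged and skipped while the rest of the import carries on.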

One workaround I thought of would be to have it read the XML feed line by line and just split() by the tags, but I would REALLY prefer not to be sloppy and do it that way :X

If that is the case, first, why don't you check the user data and, if there are invalid characters such as quotes, replace them with the proper character entity?

Second, why don't you read the file (this should be a one-off, of course), replace the characters that are *crashing the parser*, and then parse the XML from that string? You should also overwrite the current file with the new string so you don't have to keep reading the entire file.

I tried this code:

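<?php
// Double quotes inside element content are ordinary character data.
$string = '<root><node>"hey there", he said</node></root>';
$xml = simplexml_load_string($string);
// xpath() returns an array of matching SimpleXMLElement objects.
$result = $xml->xpath('/root/node');
echo $result[0];
?>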


and it displays

"hey there", he said

Originally posted by odel
I tried this code:

<?php
$string = '<root><node>"hey there", he said</node></root>';
$xml = simplexml_load_string($string);
$result = $xml->xpath('/root/node');
echo $result[0];
?>

and it displays

"hey there", he said

Thank you! I will try that!

The problem is, I don't have any control over what gets displayed. This is for a RETS based IDX solution (allows realtors to display MLS listings on their web site). The realtors add a variety of different data in through their MLS program, and then a query is sent to the RETS server and it returns the data in XML -- so there's no way to control what goes in or modify what gets parsed ahead of time. Everything needs to be done on-the-fly.

Will try the script you shared out with single quotes.

Thank you!

Although I found that the & character crashed the parser, because it thinks it's the beginning of an entity reference.

Therefore &quot; will display " whereas bonny & clyde will just crash.
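One possible workaround for the bare ampersand, sketched under the assumption that the feed can be pre-processed as a string before parsing: escape every & that does not already start an entity or character reference. `escape_bare_ampersands` is a hypothetical helper name, and the sketch assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: turn a bare "&" into "&amp;" while leaving
// existing references such as &quot;, &#39; and &#x27; untouched.
function escape_bare_ampersands(string $xml): string
{
    // Negative lookahead: skip "&" when a name or numeric reference plus ";" follows.
    $pattern = '/&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/';

    // preg_replace() returns null on a regex error; fall back to the input then.
    return preg_replace($pattern, '&amp;', $xml) ?? $xml;
}

$raw   = '<root><node>bonny & clyde said &quot;hi&quot;</node></root>';
$fixed = escape_bare_ampersands($raw);
$doc   = simplexml_load_string($fixed);

echo $doc->node, "\n"; // bonny & clyde said "hi"
```

Two caveats: this would also rewrite ampersands inside CDATA sections, where they are legal as-is, and a named entity the parser doesn't know (for example &nbsp; in plain XML) would still be rejected.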

There is an option in simplexml_load_string (LIBXML_NOENT), but it hasn't worked here and I'm not even sure it's been implemented yet.

I think I would just forget about simplexml and use DOM instead.

Hi there,
I'm also working on a RETS solution and wondering how you made out with this. I'm using xml_parse() (PHP 4), and while the parser doesn't 'crash', it does see the & characters and 'break' the string. Example:
array ListData: Array
(
    [0] => xlistingdata Object
        (
            [ListDate] => 2005-05-05
            [ListPrice] => 1200000.00
            [ExpirationDate] =>
            [Commission] =>
            [PublicRemarks] => Home to the Henry
        )

    [1] => stdClass Object
        (
            [PublicRemarks] => '
        )

    [2] => stdClass Object
        (
            [PublicRemarks] => s Fork of the Snake River, Island Park Reservoir
        )
<snip>

from the data string:
Home to the Henry&apos;s Fork of the Snake River, Island Park Reservoir... <snip>
creating new elements in my array. My client is disinclined to upgrade to PHP 5, so I can't test whether simplexml_load_string($string) would solve the problem.

I've heard of including the source for the function to make it available in PHP 4 but have never tried this....
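The split values above look like the character data handler being called more than once for a single element, which the event-based parser is allowed to do (an entity such as &apos; is a typical place for a break). A minimal sketch of the usual fix is to append each chunk to the current field instead of creating a new entry. It is written with closures for a current PHP version; on PHP 4 the same idea would need named functions and globals. The `$fields` and `$current` variables and the sample `<Listing>` document are assumptions for illustration.

```php
<?php
$fields  = [];   // element name => full text collected so far
$current = null; // name of the element currently being read

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Start tag: begin an empty buffer for this element.
    function ($p, $name, $attrs) use (&$fields, &$current) {
        $current = $name;
        $fields[$name] = '';
    },
    // End tag: stop collecting text.
    function ($p, $name) use (&$current) {
        $current = null;
    }
);

xml_set_character_data_handler(
    $parser,
    function ($p, $data) use (&$fields, &$current) {
        if ($current !== null) {
            // Append, because one text node may arrive in several pieces.
            $fields[$current] .= $data;
        }
    }
);

$xml = '<Listing><PublicRemarks>Home to the Henry&apos;s Fork of the Snake River</PublicRemarks></Listing>';
xml_parse($parser, $xml, true);

echo $fields['PublicRemarks'], "\n";
```

This flat name-to-text map is only enough for a single record; a real feed with repeated listings would need one buffer per record.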

Any thoughts?

I'm starting a project that also queries a RETS server. Right now I'm simulating the process locally. If anyone has a sample XML response (and the PHP that parses it, if you're really nice) from a RETS server, I would much appreciate taking a look at it.

Thanks!
 