Help requested with Dom XML

liunx

Guest
sample xml file:


<?xml version="1.0"?>
<newsIndex>
<Item>
<Date>20051019</Date>
<Title>Massive stars can grow near black holes </Title>
</Item>
<Item>
<Date>20050808</Date>
<Title>Researchers Study Changes in Hurricanes</Title>
</Item>
</newsIndex>


I would like to delete the first 'Item', the one with the date '20051019', and I have been having problems sorting this out.

Is this the right sequence to go about removing that node:

loop over the Items' length:
- check the 'Date' of each particular 'Item' for a match
- if a match is found: create a numerical pointer variable based on where the loop is

removeChild->getElementByTagName('Item')->item(variablewejustmade);

$dom->saveXML();

I am a novice with DOM, mostly using it from JavaScript, and haven't really taken a good look at XPath... This particular problem has been stumping me for quite some time; I keep returning to it in my spare time trying to figure it out, so any help is much appreciated...

snx

There is no "getElementByTagName". There's a "getElementsByTagName" which returns a list of all the elements with the given tag name. You still need to go through that list.

E.g.,

$items = $dom->documentElement->getElementsByTagName('Item');
foreach ($items as $item) {
    if ($item->firstChild->textContent == '20051019') {
        $item->parentNode->removeChild($item);
    }
}

Jeesh, you make it seem so easy - I'm jealous! ;)

especially since that makes totally logical sense and it's not working for me...

your code:

$items = $dom->documentElement->getElementsByTagName('Item');
foreach ($items as $item) {
    if ($item->firstChild->textContent == '20051019') {
        $item->parentNode->removeChild($item);
    }
}


doesn't seem to work, so when I debug using just a
print $item->firstChild->textContent;
I get nothing; however a
print $item->textContent;
gives me each item's nodes' text value...

weird...
thanks for your time weedpacket!
snx

BTW the getElementsByTagName thingy was a typo!

Eh; I had to guess what exactly was in your $dom - I simply loaded mine up with the XML you provided; but $item->firstChild is supposed to be referring to the first child of the Item element - i.e., the Date element, which is the one with the date text in it.

It is 'Date', but even if it wasn't, for some reason it still doesn't do a print or echo... weird...


$items = $dom->documentElement->getElementsByTagName('Item');
foreach ($items as $item) {

    print $item->firstChild->textContent . '<br />';

    if ($item->firstChild->textContent == '20051013') {
        $item->parentNode->removeChild($item);
    }
}


at first I thought it might be encoding, but then why would it print if I take the firstChild away... I'm gonna keep working on it...
thx

so if I take away the 'firstChild' reference then I get this:

20051019 Massive stars can grow near black holes
20050808 Researchers Study Changes in Hurricanes


if I try to use firstChild I get nothing...


and if I print using firstChild without the 'textContent',
then I get the object ids:
Object id #6
Object id #4

snx

tried a bunch of things, here's what I have:

test.xml:

<?xml version="1.0"?>
<Index>
<Item>
<Date>20051013</Date>
<Title>Massive Stars Can Grow Near Black Holes</Title>
</Item>
<Item>
<Date>20051019</Date>
<Title>Researchers Study Changes in Hurricanes</Title>
</Item>
</Index>


remove.php:

<?php
$dom = new DOMDocument;
$dom->load('test.xml');

$Items = $dom->documentElement->getElementsByTagName('Item');

foreach ($Items as $item) {
    print $item->firstChild->textContent;
    print '<br />';
    if ($item->firstChild->textContent == '20051019') {
        //$item->parentNode->removeChild($item);
    }
}
//$dom->saveXML();
print '<br /> done.';
?>


not sure why i can get the element id's, but not the textContent of the same element.

any help would be solid!
snx

Welp, here's what I wrote:
<?php
$xml = <<<EOX
<?xml version="1.0"?>
<newsIndex>
<Item>
<Date>20051019</Date>
<Title>Massive stars can grow near black holes </Title>
</Item>
<Item>
<Date>20050808</Date>
<Title>Researchers Study Changes in Hurricanes</Title>
</Item>
</newsIndex>
EOX;
$dom = new DOMDocument;
// Set before loading: drops the whitespace-only text nodes between elements,
// so that $item->firstChild really is the <Date> element.
$dom->preserveWhiteSpace = false;
$dom->loadXML($xml);

$items = $dom->documentElement->getElementsByTagName('Item');
foreach ($items as $item) {
    if ($item->firstChild->textContent == '20051019') {
        $item->parentNode->removeChild($item);
    }
}
// saveXML() returns the document as a string; it has to be echoed or stored.
echo $dom->saveXML();

Output:

<?xml version="1.0"?>
<newsIndex><Item><Date>20050808</Date><Title>Researchers Study Changes in Hurricanes</Title></Item></newsIndex>

...and the 20051019 item is gone.

yo weeps:
thanks for the heads up on that preserve whitespace... that was definitely the problem! I have been trying to resolve that on and off for over a month!!



cheers
snx

Ah-ha - of course! When the XML standard changed to remove confusion about whitespace rules, preserve-whitespace became default behaviour - so the first child of the Item element ended up being a text node containing the whitespace between <Item> and <Date>!
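
For anyone who lands here with the same symptom, here is a minimal sketch of the diagnosis plus an alternative removal that does not depend on firstChild at all. It is only one way to do it, not what was used in the thread: the XML is the thread's test.xml inlined (indented here, as an assumption, so the whitespace nodes are obvious), and the XPath expression //Item[Date="20051019"] is my own illustration. The matches are copied into an array before anything is removed, because getElementsByTagName returns a live list and removing nodes while looping over a live list can skip elements.

```php
<?php
// Assumption: same structure as the thread's test.xml, inlined so the sketch is self-contained.
$xml = <<<EOX
<?xml version="1.0"?>
<Index>
  <Item>
    <Date>20051013</Date>
    <Title>Massive Stars Can Grow Near Black Holes</Title>
  </Item>
  <Item>
    <Date>20051019</Date>
    <Title>Researchers Study Changes in Hurricanes</Title>
  </Item>
</Index>
EOX;

$dom = new DOMDocument;
$dom->loadXML($xml); // preserveWhiteSpace is left at its default (true)

// Diagnosis: with whitespace preserved, the first child of <Item> is a
// whitespace-only text node, not <Date>, so its textContent looks empty.
$firstItem = $dom->getElementsByTagName('Item')->item(0);
echo $firstItem->firstChild->nodeName, "\n"; // #text

// Alternative removal: select Items by the value of their Date child,
// so it does not matter which node happens to come first.
$xpath = new DOMXPath($dom);
$matches = iterator_to_array($xpath->query('//Item[Date="20051019"]'));
foreach ($matches as $item) {
    $item->parentNode->removeChild($item);
}

echo $dom->saveXML();
```

Printing nodeName (or nodeType) instead of textContent is a quick way to see what firstChild actually points at, since a whitespace-only text node is invisible once the browser renders it.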
 