Trying to pull content with tags from XML with PHP

We use Acalog at our institution and want to use their (unsupported) API to pull catalog content from their site into ours. I can access their files and extract the information, but the formatting (paragraphs, bold, italics, line breaks) is expressed as nodes (h:p, h:b, h:i, h:br). Unfortunately, the text I get by querying for a:content is plain text only and does not include the formatting nodes. How can I retrieve the nodes along with the text? Where am I going wrong?

The start of the XML (I cut it off at about the halfway mark):

\[code\]
<catalog xmlns="http://acalog.com/catalog/1.0" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:a="http://www.w3.org/2005/Atom" xmlns:xi="http://www.w3.org/2001/XInclude" id="acalog-catalog-6">
<hierarchy>
  <legend>
    <key id="acalog-entity-type-5">
      <name>Department</name>
      <localname>Department</localname>
    </key>
  </legend>
  <entity id="acalog-entity-239">
    <type xmlns:xi="http://www.w3.org/2001/XInclude">
      <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" xi:xpointer="xmlns(c=http://acalog.com/catalog/1.0) xpointer((//c:key[@id='acalog-entity-type-5'])[1])"/>
    </type>
    <a:title xmlns:a="http://www.w3.org/2005/Atom">American Studies</a:title>
    <code/>
    <a:content xmlns:a="http://www.w3.org/2005/Atom" xmlns:h="http://www.w3.org/1999/xhtml">
      <h:p xmlns:h="http://www.w3.org/1999/xhtml">
        <h:span class="dept_intro"> <h:i>Chair of the Department of American Studies: </h:i> </h:span>
        <h:span class="dept_intro">John Smith</h:span> <h:br/>
        <h:span class="dept_intro"> <h:br/>
Professors: Jane Smith; Sarah Smith, <h:i class="dept_intro">The Douglas Family Chair in American Culture, History, and Literary and Interdisciplinary Studies</h:i> <h:br/><h:br/>
Associate Professor: Michael Smith </h:span>
        <h:span class="dept_intro"><h:br/></h:span>
      </h:p>
      <h:p xmlns:h="http://www.w3.org/1999/xhtml"> <h:span class="dept_intro">Assistant Professor: Rebecca Smith</h:span> </h:p>
      <h:p xmlns:h="http://www.w3.org/1999/xhtml"> <h:span class="dept_intro">Lecturer: * Leonard Smith</h:span></h:p>
      <h:p xmlns:h="http://www.w3.org/1999/xhtml"> <h:span class="dept_intro">Visiting Lecturer: * Robert Smith<h:br/><h:br/><h:br/><h:br/></h:span><h:strong>Department Overview</h:strong></h:p>
      <h:p xmlns:h="http://www.w3.org/1999/xhtml" class="MsoNormal">American studies is an interdiscipl
\[/code\]

Here's the code I've written thus far:

\[code\]
$xml = file_get_contents($url);
if ($xml === false) {
    return false;
} else {
    // Create an empty DOMDocument object to hold our service response
    $dom = new DOMDocument('1.0', 'UTF-8');
    // Load the XML
    $dom->loadXML($xml);
    // Create an XPath object
    $xpath = new DOMXPath($dom);
    // Register the namespaces; "c" is the catalog namespace used by the status query below
    $xpath->registerNamespace('c', 'http://acalog.com/catalog/1.0');
    $xpath->registerNamespace('h', 'http://www.w3.org/1999/xhtml');
    $xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');
    $xpath->registerNamespace('xi', 'http://www.w3.org/2001/XInclude');
    // Check for error
    $status_elements = $xpath->query('//c:status[text() != "success"]');
    if ($status_elements->length > 0) {
        // An error occurred
        return false;
    }
    $x = $dom->documentElement;
    foreach ($x->childNodes as $item) {
        //echo $item->nodeName . " = " . $item->nodeValue . "<br/><br />";
    }
    // Retrieve all content elements
    $pageText = $xpath->query('//a:content');
    if ($pageText->length == 0) {
        // No text found
        return false;
    }
    foreach ($pageText as $item) {
        $txt = (string) $item->nodeValue;
        $txt = str_replace('<h:i>', '<i>', $txt);
        $txt = str_replace('</h:i>', '</i>', $txt);
        $txt = str_replace('<h:span class="dept_intro">', '<p>', $txt);
        $txt = str_replace('</h:span>', '</p>', $txt);
        // Compare against false explicitly so a match at position 0 still counts
        if (strpos($txt, 'Department Overview') !== false) {
            echo '<p>' . str_replace('Department Overview', '', $txt) . '</p>';
            break;
        } else {
            echo '<p>' . $txt . '</p>';
        }
        //echo $pageText->nodeValue;
    }
}
\[/code\]

The line $pageText = $xpath->query('//a:content'); pulls the content, but not the tags.
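
One possible direction, offered as a sketch rather than a confirmed fix: the XPath query itself does return the a:content element with all of its child nodes, but nodeValue only gives back the concatenated text of those children, so the str_replace calls never see any `<h:i>` tags to replace. Serializing each child with DOMDocument::saveXML() keeps the markup. The helper name innerXhtml and the shortened sample XML below are made up for illustration, and the regex-based prefix stripping assumes the markup only ever uses the h: prefix shown in the excerpt.

```php
<?php
// Hypothetical helper (not part of Acalog or PHP): returns the inner markup
// of a node as a string, with the "h:" prefix removed from the tags.
function innerXhtml(DOMNode $node): string
{
    $markup = '';
    foreach ($node->childNodes as $child) {
        // saveXML($child) serializes the child node including its tags,
        // unlike nodeValue, which returns text content only.
        $markup .= $node->ownerDocument->saveXML($child);
    }
    // Turn <h:p> / </h:p> into <p> / </p>.
    $markup = preg_replace('~<(/?)h:~', '<$1', $markup);
    // Drop the xmlns:h declarations repeated on the elements.
    $markup = preg_replace('~\s+xmlns:h="[^"]*"~', '', $markup);
    return trim($markup);
}

// Shortened sample in the shape of the excerpt above (an assumption, not real API output).
$xml = <<<'XML'
<catalog xmlns="http://acalog.com/catalog/1.0" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:a="http://www.w3.org/2005/Atom">
  <a:content><h:p xmlns:h="http://www.w3.org/1999/xhtml"><h:i>Chair: </h:i>John Smith<h:br/></h:p></a:content>
</catalog>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');

foreach ($xpath->query('//a:content') as $item) {
    // Prints the content with its formatting tags preserved.
    echo innerXhtml($item), "\n";
}
```

In the original loop this would mean replacing `$txt = (string) $item->nodeValue;` with something like `$txt = innerXhtml($item);`, after which the string replacements have actual tags to work on. Span-to-paragraph conversions and similar clean-up would still need to be handled separately.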
 