PHP: Fetch content from an HTML page using xpath()

skastefoerfah · Sep 13, 2012

I'm trying to fetch the content of a div in an HTML page using XPath and DOMDocument. This is the structure of the page:

\[code\]<div id="content"><div class="div1"></div><span class="span1"></span><p></p><p></p><p></p><p></p><p></p><div class="div2"></div></div>\[/code\]

I want to get only the content of the p elements, not the spans and divs. I came up with the XPath expression .//*[@id='content']/p, but something isn't right because I'm only getting the first p. I also tried other expressions using following-sibling and node(), but they all return only the first p.

\[code\].//*[@id='content']/span/following-sibling:

.//*[@id='content']/node()[self:

]\[/code\]

This is how I use XPath:

\[code\]
$domDocument = new DOMDocument();
$domDocument->encoding = 'UTF-8';
$domDocument->loadHTML($page);
$domXPath = new DOMXPath($domDocument);
$domNodeList = $domXPath->query($this->xpath);
$content = $this->GetHTMLFromDom($domNodeList);
\[/code\]

And this is how I get the HTML from the nodes:

\[code\]
private function GetHTMLFromDom($domNodeList)
{
    $domDocument = new DOMDocument();
    // Only the first matched node is read here.
    $node = $domNodeList->item(0);
    foreach ($node->childNodes as $childNode) {
        $domDocument->appendChild($domDocument->importNode($childNode, true));
    }
    return $domDocument->saveHTML();
}
\[/code\]
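A possible explanation, offered as a sketch rather than a confirmed diagnosis: the expression .//*[@id='content']/p should match every p child, but GetHTMLFromDom only reads `item(0)` from the node list, so only the first match is ever serialized. The example below loops over the whole list instead. The function name `getInnerHtmlOfEach` and the sample `$page` markup (with text added inside the paragraphs) are hypothetical and exist only for illustration. It uses `DOMDocument::saveHTML()` with a node argument, which needs PHP 5.3.6 or later.

```php
<?php
// Hypothetical helper: returns the inner HTML of every node in the list,
// instead of looking only at item(0).
function getInnerHtmlOfEach(DOMNodeList $nodes): array
{
    $parts = [];
    foreach ($nodes as $node) {
        $inner = '';
        foreach ($node->childNodes as $child) {
            // Serialize each child with the document that owns it.
            $inner .= $node->ownerDocument->saveHTML($child);
        }
        $parts[] = $inner;
    }
    return $parts;
}

// Sample markup modelled on the structure in the question (assumed content).
$page = '<div id="content"><div class="div1"></div><span class="span1"></span>'
      . '<p>one</p><p>two <b>2</b></p><p>three</p><div class="div2"></div></div>';

$doc = new DOMDocument();
$doc->loadHTML($page);

$xpath = new DOMXPath($doc);
$paragraphs = $xpath->query(".//*[@id='content']/p");

// One entry per matched p element.
$parts = getInnerHtmlOfEach($paragraphs);
echo $paragraphs->length, " paragraphs matched\n";
echo implode("\n", $parts), "\n";
```

Returning one string per paragraph keeps the matches separate; joining them with `implode()` gives a single string if that is what the caller expects.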
