Accessing child divs using DOMDocument and XPath

7331

New Member
I'm building a basic screen scraper for personal use and learning purposes, so please do not post comments like "You need to ask permission" etc. The data I'm trying to access is structured as follows:

\[code\]
<tr>
  <td>
    <div class="wrapper">
      <div class="randomDiv">
        <div class="divContent">
          <div class="event">asd</div>
          <div class="date">asd</div>
          <div class="venue">asd</div>
          <div class="state">asd</div>
        </div>
      </div>
    </div>
  </td>
</tr>
\[/code\]

I'm attempting to gather all of this data (there are about 20 rows on the given page). Using the following code, I have managed to retrieve the data I need:

\[code\]
$remote = file_get_contents("linktoURL");
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
// @ suppresses the warnings loadHTML() raises on malformed markup
@$doc->loadHTML($remote);
$xp = new DOMXpath($doc);
// initialize variables
$rows = array();
foreach ($xp->query('//*[contains(@class, \'wrapper\')]', $doc) as $found) {
    echo "<pre>";
    print_r($found->nodeValue);
    echo "</pre>";
}
\[/code\]

Now my question is: how would I go about storing all this data in an array of associative arrays like the one below?

\[code\]
Array (
    [0] => Array ( [Event] => Name [Date] => 12/12/12 [Venue] => NameOfPlace [state] => state )
    [1] => Array ( [Event] => Name [Date] => 12/12/12 [Venue] => NameOfPlace [state] => state )
    [2] => Array ( [Event] => Name [Date] => 12/12/12 [Venue] => NameOfPlace [state] => state )
)
\[/code\]

Right now, the only solution that comes to mind is to run a separate XPath query for each class name, \[code\]//*[contains(@class, \'className\')]\[/code\], inside the foreach loop.

Is there a more idiomatic way, via DOMDocument and XPath, to build an associative array from the above data?

Edit: I'm not limited to DOMDocument and XPath. If there are other solutions that might be easier, please post them.
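One possible approach, as a minimal sketch rather than a definitive answer: query each divContent node once, then run short relative XPath queries with that node as the context node, so the field lookups stay scoped to a single row. The function name extractRows, the $fields mapping, and the inline sample markup are hypothetical and only illustrate the structure described in the question; a real run would pass in the string returned by file_get_contents().

```php
<?php
// Sample markup modeled on the structure in the question (two rows only).
$html = <<<HTML
<table>
<tr><td><div class="wrapper"><div class="randomDiv"><div class="divContent">
<div class="event">Name A</div><div class="date">12/12/12</div>
<div class="venue">Place A</div><div class="state">State A</div>
</div></div></div></td></tr>
<tr><td><div class="wrapper"><div class="randomDiv"><div class="divContent">
<div class="event">Name B</div><div class="date">01/01/13</div>
<div class="venue">Place B</div><div class="state">State B</div>
</div></div></div></td></tr>
</table>
HTML;

// Hypothetical helper: returns one associative array per divContent block.
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of using the @ operator.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($doc);

    // class name in the markup => key in the result (keys follow the question).
    $fields = ['event' => 'Event', 'date' => 'Date', 'venue' => 'Venue', 'state' => 'state'];

    $rows = [];
    // Match "divContent" as a whole class token, not as a substring.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " divContent ")]';
    foreach ($xp->query($query) as $content) {
        $row = [];
        foreach ($fields as $class => $key) {
            // Relative query: the second argument makes $content the context node.
            $node = $xp->query('./div[@class="' . $class . '"]', $content)->item(0);
            $row[$key] = $node !== null ? trim($node->textContent) : null;
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = extractRows($html);
print_r($rows);
```

The relative queries still run once per field, but each one only searches the children of the current divContent node, so the values of one row cannot get mixed up with another's. A missing field is stored as null here, which is an assumption about the desired behavior.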
 