Extract info from HTML?

First of all, I've seen a good deal of similar questions. I know regex or DOM can be used, but I can't find any good examples of DOM, and regex makes me pull my hair out. In addition, I need to pull multiple values out of the HTML source: some are simple element contents, some are attributes.

Here is an example of the HTML I need to get info from:

\[code\]
<div class="log">
  <div class="message">
    <abbr class="dt" title="time string"> DATA_1 </abbr> :
    <cite class="user">
      <a class="tel" href="http://stackoverflow.com/questions/11322623/tel:+xxxx">
        <abbr class="fn" title="DATA_2"> Me </abbr>
      </a>
    </cite> :
    <q> DATA_3 </q>
  </div>
</div>
\[/code\]

The "message" block may occur once or hundreds of times. I am trying to end up with data like this:

\[code\]
array(4) {
  [0] => array(3) {
    ["time"] => "DATA_1"
    ["name"] => "DATA_2"
    ["message"] => "DATA_3"
  }
  [1] => array(3) {
    ["time"] => "DATA_1"
    ["name"] => "DATA_2"
    ["message"] => "DATA_3"
  }
  [2] => array(3) {
    ["time"] => "DATA_1"
    ["name"] => "DATA_2"
    ["message"] => "DATA_3"
  }
  [3] => array(3) {
    ["time"] => "DATA_1"
    ["name"] => "DATA_2"
    ["message"] => "DATA_3"
  }
}
\[/code\]

I tried using SimpleXML, but it only seems to work on very simple HTML pages. Could someone link me to some examples? I get really confused because DATA_1 and DATA_3 are element contents, while DATA_2 has to come from a title attribute. What do you think is the best way to extract this data? It seems very similar to XML extraction, which I have done before, but I need to use some other method.
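For reference, here is a minimal sketch of the DOM route, assuming PHP with its bundled DOM extension (`DOMDocument` and `DOMXPath`). The function name `extractMessages` and the `$html` variable are hypothetical, and the sample markup is the block from the question repeated twice. It is one possible approach under those assumptions, not the only or definitive one.

```php
<?php
// Sample markup from the question, repeated twice to stand in for many messages.
$html = <<<HTML
<div class="log">
  <div class="message">
    <abbr class="dt" title="time string"> DATA_1 </abbr> :
    <cite class="user"><a class="tel" href="http://stackoverflow.com/questions/11322623/tel:+xxxx"><abbr class="fn" title="DATA_2"> Me </abbr></a></cite> :
    <q> DATA_3 </q>
  </div>
  <div class="message">
    <abbr class="dt" title="time string"> DATA_1 </abbr> :
    <cite class="user"><a class="tel" href="http://stackoverflow.com/questions/11322623/tel:+xxxx"><abbr class="fn" title="DATA_2"> Me </abbr></a></cite> :
    <q> DATA_3 </q>
  </div>
</div>
HTML;

// Hypothetical helper: returns one ['time', 'name', 'message'] row per message block.
function extractMessages(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Each <div class="message"> becomes the context node for the relative queries below.
    foreach ($xpath->query('//div[@class="message"]') as $message) {
        $rows[] = [
            // Text content of the first <abbr class="dt"> inside this message.
            'time'    => trim($xpath->evaluate('string(.//abbr[@class="dt"])', $message)),
            // Value of the title attribute, not the element text.
            'name'    => $xpath->evaluate('string(.//abbr[@class="fn"]/@title)', $message),
            // Text content of the <q> element.
            'message' => trim($xpath->evaluate('string(.//q)', $message)),
        ];
    }

    return $rows;
}

$messages = extractMessages($html);
print_r($messages);
```

The exact-match test `@class="message"` only works when the class attribute holds that single value; if an element carries several classes, a `contains()`-based predicate would be needed instead. Attribute values and element text are read the same way here, with `string()` applied to either `@title` or the element itself.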
 