Data Scraping Problem

nanoaxl

New Member
I am scraping the wall posts from a Facebook page. Here is the URL:

http://www.facebook.com/GMHTheBook?v=wall&ref=ts#!/GMHTheBook?v=wall&ref=ts

I successfully scraped all the visible wall posts using cURL.

Problem: At the end of the visible wall posts there is an "Older Posts" link, which shows more wall posts once you click it. How can I programmatically "click" that link to load more wall posts and scrape those as well? I am using cURL, but I am open to any method that handles this situation.

Update: I am now using the code below to get the data, find the next link, fetch the data for that URL, and so on:

\[code\]
ini_set('display_errors', true);
error_reporting(E_ALL);

$data = json_decode(file_get_contents(($url)), true);

$names = array();
$stories = array();

foreach($data['data'] as $post)
{
    $names[] = $post['from']['name'];
    $stories[] = $post['message'];
}

$url = $data['paging']['next'];

// this is meant to scrape data recursively from the next links
while($url !== '')
{
    $url = $data['paging']['next'];
    $data = json_decode(file_get_contents(($url)), true);

    foreach($data['data'] as $post)
    {
        $names[] = $post['from']['name'];
        $stories[] = $post['message'];
    }

    $url = urldecode($data['paging']['next']);
    echo $url . '<br />';
}

for($j = 0; $j < count($names); $j++)
{
    $data .= $names[$j] . '|' . $stories[$j] . "\n";
}

$h = fopen("data.txt", "a+");
fwrite($h, $data);
fclose($h);
\[/code\]

The problem is that the script keeps running with no output at all, and no file is created. I have already raised the script's time limit, and \[code\]allow_url_fopen\[/code\] is set to on. Is there anything wrong in the script, or am I not following the next links in the right way? Is there any solution or alternative to this?
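For comparison, here is a minimal sketch of the same "follow the next link" loop with explicit stop conditions. It assumes, as the code above implies, that each page decodes to an array with a `data` list and an optional `paging.next` URL; whether the real responses always look like that is not confirmed here. The names `collectPosts`, `$fetchPage` and `$fetchJson` are hypothetical and exist only for this illustration. The loop stops when there is no next link, when a page is empty or fails to load, when a URL repeats, or after a maximum number of pages. It also builds the output rows in a separate array, because appending a string to `$data` while it still holds the decoded array would not produce the intended text.

```php
<?php
// Hypothetical helper: follows 'paging.next' links until they run out.
// $fetchPage is an injected callable (URL -> decoded array or null), so the
// loop can be exercised without network access.
function collectPosts(string $startUrl, callable $fetchPage, int $maxPages = 50): array
{
    $rows = [];
    $seen = [];
    $url = $startUrl;

    for ($page = 0; $page < $maxPages && $url !== ''; $page++) {
        if (isset($seen[$url])) {
            break; // same URL again: stop instead of looping forever
        }
        $seen[$url] = true;

        $data = $fetchPage($url);
        if (!is_array($data) || empty($data['data'])) {
            break; // request failed or the page has no posts
        }

        foreach ($data['data'] as $post) {
            // Assumption: some posts may lack these keys, so fall back to ''.
            $name = $post['from']['name'] ?? '';
            $story = $post['message'] ?? '';
            $rows[] = $name . '|' . $story;
        }

        // A missing 'next' link yields '' and ends the loop.
        $url = $data['paging']['next'] ?? '';
    }

    return $rows;
}

// Default fetcher built on the same calls as the original script.
$fetchJson = function (string $url): ?array {
    $body = @file_get_contents($url);
    if ($body === false) {
        return null;
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? $decoded : null;
};
```

With a starting URL in `$url`, this could be used as `$rows = collectPosts($url, $fetchJson);` followed by `file_put_contents('data.txt', implode("\n", $rows) . "\n", FILE_APPEND);`. Passing the fetcher in as a parameter is a design choice for the sketch: it keeps the paging logic separate from the HTTP call, so a cURL-based fetcher could be swapped in without touching the loop.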
 