Unstructured text extraction from website

Lercignee

New Member
I need to extract the addresses of restaurants in India from different websites. The addresses are typically unstructured, so I could not come up with a regular expression that covers them. I also want the extraction method to generalize across websites so that it is site-independent. Some of the websites I am currently focusing on are:
http://www.zomato.com
http://www.indianfoodforever.com/eating-out/restaurants/
etc. I am looking for an approach to solving this problem.
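
To make the problem concrete, here is a minimal sketch of one possible site-independent starting point: instead of a single regular expression for a whole address, score each text block on a page by loose address cues. It assumes that many Indian addresses contain a six-digit PIN code and locality words such as "Road" or "Nagar". The cue list, the scoring weights, the threshold, and the names `TextBlocks`, `address_score` and `candidate_addresses` are all illustrative assumptions, not a tested method.

```python
import re
from html.parser import HTMLParser

# Assumed cues, not an exhaustive list: a six-digit PIN code
# (optionally written as "110 001") and common locality words.
PIN_RE = re.compile(r"\b[1-9]\d{2}\s?\d{3}\b")
CUE_RE = re.compile(
    r"\b(road|rd|street|st|marg|nagar|sector|lane|market|near|opp|floor|plot|shop)\b",
    re.IGNORECASE,
)


class TextBlocks(HTMLParser):
    """Collect the visible text of a page, one string per text node."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self._skip_depth:
            self.blocks.append(text)


def address_score(text):
    """One point per distinct cue word, two extra points for a PIN code."""
    score = len({m.lower() for m in CUE_RE.findall(text)})
    if PIN_RE.search(text):
        score += 2
    return score


def candidate_addresses(html, threshold=3):
    """Return the text blocks whose score reaches the (arbitrary) threshold."""
    parser = TextBlocks()
    parser.feed(html)
    parser.close()
    return [b for b in parser.blocks if address_score(b) >= threshold]
```

Because the scoring looks only at the text and not at any site's markup, the same function can be run on pages from different sites. A known limitation of this sketch is that an address split across several tags (for example by line breaks) arrives as separate blocks and may score too low; a trained sequence labeller or a gazetteer of city names would likely be needed for better recall.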
 