How to parse HTML, XHTML, or XML with Python in an efficient way?

Lillian_Nevada · Jul 20, 2012

My Python environment is 2.7.

I know this is an old question, but I've lost my mind searching and reading other people's questions and answers. Some of them are really out of date, like the code below:

\[code\]
import lxml  # wrong
import xml  # correct
\[/code\]

Since I'm a newbie to Python and know nothing about its history, I want to make a few things clear. What is the standard XML parser module in Python now? What can I do when I need to parse some HTML using XPath syntax? If I have malformed HTML source, how can I handle it without using BeautifulSoup or something similar?

If you can brief me on any of this, I'd much appreciate it.

OK, all in all, I have just one question: how can I parse malformed HTML using only standard modules in Python 2.7?
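For reference, here is a minimal sketch of one way this might be approached with only the standard library. It assumes the event-based `HTMLParser` class (module `HTMLParser` in Python 2.7, `html.parser` in Python 3) is tolerant enough for the input at hand, which is more likely on later 2.7 releases. The `LinkCollector` class, the `collect_links` helper, and the sample strings are hypothetical names made up for illustration. `xml.etree.ElementTree` is shown only for the limited XPath subset it supports, and it requires well-formed XML, so it is not a fix for broken HTML.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2.7

import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> start tags."""

    def __init__(self):
        # Explicit base-class call, because HTMLParser is an old-style
        # class in Python 2 and super() would not work there.
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names are lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_links(html_text):
    """Feed possibly malformed HTML to the parser and return the links found."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Malformed input: unclosed <p>, <a>, and <div>, and no closing </html>.
broken_html = ('<html><body><p>Unclosed paragraph<a href="/one">one'
               '<div><a href="/two">two</body>')
links = collect_links(broken_html)

# Limited XPath support, well-formed XML only.
well_formed = '<root><item id="1">a</item><item id="2">b</item></root>'
root = ET.fromstring(well_formed)
second = root.find(".//item[@id='2']")
```

Because `HTMLParser` only reports start tags, end tags, and text as events, it never builds a tree, so unclosed tags do not stop it. The trade-off is that there is no XPath over the HTML itself; that would need a tree builder on top of these events or a third-party library.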
