PeterdeVis
I need to parse a given file into a list of strings. The file looks like this:

[code]<DOC><DOCNUM> NUMBER </DOCNUM><DOCTYPE> TYPE </DOCTYPE><HEADER>&SOMETHING</HEADER><BODY><HEADLINE>SOME TEXT</HEADLINE>TEXTTEXT TEXT <TEXT><P>INPUT TEXT1</P><P>INPUT TEXT2</P>...</TEXT></BODY></DOC>[/code]

I need to build a list of all the INPUT TEXTi strings that appear inside the P tags.

I tried doing this with the lxml XML parser, but it failed because &SOMETHING is not valid XML. I also tried an HTML parser, but I couldn't figure out how to make it work.

Does anyone know a good way to get the list I need?
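For reference, here is a minimal sketch of one way this might be done. It uses Python's standard-library html.parser instead of lxml, since HTMLParser is lenient about stray ampersands such as &SOMETHING. The names ParagraphCollector and extract_paragraphs and the sample string are made up for illustration, and the sketch assumes that P tags are never nested and that each one is closed.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text found inside every <P>...</P> element (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []   # finished paragraph strings
        self._buffer = None    # text chunks while inside a <P>, otherwise None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <P> arrives here as "p".
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while a <P> is open.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.paragraphs.append("".join(self._buffer).strip())
            self._buffer = None


def extract_paragraphs(text):
    """Return the list of strings enclosed in P tags (assumes non-nested P tags)."""
    parser = ParagraphCollector()
    parser.feed(text)
    parser.close()
    return parser.paragraphs


# Sample input modelled on the format in the question (placeholder values).
sample = (
    "<DOC><DOCNUM> 1 </DOCNUM><DOCTYPE> TYPE </DOCTYPE>"
    "<HEADER>&SOMETHING</HEADER><BODY><HEADLINE>SOME TEXT</HEADLINE>"
    "<TEXT><P>INPUT TEXT1</P><P>INPUT TEXT2</P></TEXT></BODY></DOC>"
)
print(extract_paragraphs(sample))
```

For a real file, the string passed to extract_paragraphs would be the file's contents, for example read with open(path).read(). If the file holds many DOC blocks, the same call returns the P strings from all of them in document order.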