How to parse XML containing prefixes but no namespace declarations with lxml?

attaterchal

I have a bunch of XML files that use prefixes without the corresponding namespace declarations. Stuff like:

\[code\]
<tal:block tal:condition="foo">...</tal:block>
\[/code\]

or:

\[code\]
<div i18n:domain="my-app">...
\[/code\]

I know where those prefixes come from, and I tried the following, but without success:

\[code\]
from lxml import etree as ElementTree

ElementTree.register_namespace("i18n", "http://namespaces.zope.org")
ElementTree.register_namespace("tal", "http://xml.zope.org/namespaces/tal")

with open(path) as fp:
    tree = ElementTree.parse(fp)
\[/code\]

but lxml still chokes with:

\[code\]
lxml.etree.XMLSyntaxError: Namespace prefix i18n for domain on div is not defined, line 4, column 20
\[/code\]

I know I can use \[code\]ElementTree.XMLParser(recover=True)\[/code\], but I would like to keep the prefixes, which this method doesn't do. Any ideas?
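For what it's worth, register_namespace only influences which prefixes are used when a tree is serialized, so it can't help the parser here. One possible workaround, offered as a rough sketch rather than a definitive answer, is to add the missing xmlns declarations to the root tag as plain text before handing the document to lxml. The helper names (declare_namespaces, parse_with_prefixes) are made up for illustration, the URIs are simply the ones from the post, and the sketch assumes the files are UTF-8, that the root tag does not already declare these prefixes, and that nothing before the root element (a comment, say) contains text that looks like a start tag.

```python
import re

# Prefix-to-URI mapping copied from the post; treat the URIs as assumptions.
NSMAP = {
    "i18n": "http://namespaces.zope.org",
    "tal": "http://xml.zope.org/namespaces/tal",
}

# First "<" followed by a name character. This does not match the openers
# "<?xml", "<!DOCTYPE" or "<!--", but it is not comment-aware: tag-like text
# inside a comment placed before the root element would be matched first.
_ROOT_TAG = re.compile(r"<[A-Za-z_][\w:.\-]*")


def declare_namespaces(xml_text, nsmap=NSMAP):
    """Return xml_text with xmlns:prefix attributes added to the first start tag."""
    declarations = "".join(
        f' xmlns:{prefix}="{uri}"' for prefix, uri in nsmap.items()
    )
    # count=1 so that only the first start tag (the root) is modified.
    return _ROOT_TAG.sub(lambda m: m.group(0) + declarations, xml_text, count=1)


def parse_with_prefixes(path, nsmap=NSMAP):
    """Hypothetical helper: read a file, patch its root tag, parse with lxml."""
    from lxml import etree  # imported here because only this function needs lxml

    with open(path, encoding="utf-8") as fp:  # assumes UTF-8 files
        patched = declare_namespaces(fp.read(), nsmap)
    # Pass bytes: lxml refuses str input that carries an encoding declaration.
    return etree.fromstring(patched.encode("utf-8"))
```

With the declarations in place the parser should no longer need recover=True, and the prefixes should come back out when the tree is serialized again, e.g. with etree.tostring(root). Note that the serialized output will then contain the added xmlns attributes on the root element.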
 