Parsing invalid XML using JAXB - can the parser be more lenient?

jonn

New Member
I've been using JAXB for a while now to parse XML that looks roughly like this:
\[code\]
<report>        <-- corresponds to a "wrapper" object that holds some properties and two lists: a list of A's and a list of B's
    <some tags with> general <info/>
    ...
    <A>         <-- corresponds to an "A" object with some properties
        <some tags with> info related to the <A> tag <bla/>
        ...
    </A>
    <B>         <-- corresponds to a "B" object with some properties
        <some tags with> info related to the <B> tag <bla/>
        ...
    </B>
</report>
\[/code\]
The side responsible for marshalling the XML is terrible, but it is out of my control.
It often sends invalid XML characters and/or malformed XML.
I talked to the side responsible and got lots of errors fixed, but some they just can't seem to fix.
I want my parser to be as forgiving as possible of these errors and, when that's not possible, to get as much info as possible from the XML that contains them.
So if the xml contains 100 A's and one has a problem, I would still like to be able to keep the other 99.
These are my most common problems:
\[code\]
1. Some info tag inner value contains invalid chars
   <bla> invalid chars here, either control chars or just &>< </bla>
2. The root entity is missing a closing tag
   <report> ..... stuff here .... NO </report> at the end!
3. An inner entity (A/B) is missing its closing tag, or it's somehow malformed.
   <A> ...stuff here... <somethingMalformed_blabla_A/>
   OR
   <A> ... Something malformed here...</A>
\[/code\]
I hope I explained myself well.
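For problem 1, one option might be to clean the raw text before any parser sees it. The sketch below is only an illustration under stated assumptions: the class name `LenientXmlSanitizer` and its methods are hypothetical, and it assumes the whole document fits in memory as a `String`. It drops code points that XML 1.0 does not allow and escapes a bare `&` that does not start an entity or character reference. It does not try to repair a stray `<` in text, since that cannot be told apart from real markup reliably. A `>` in text is already legal XML, so it is left alone.

```java
import java.util.regex.Pattern;

final class LenientXmlSanitizer {
    // Matches '&' that does NOT start a predefined entity or a numeric character reference.
    private static final Pattern BARE_AMPERSAND =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9a-fA-F]+);)");

    /** True if the code point is allowed by the XML 1.0 "Char" production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Removes illegal characters, then escapes bare ampersands. */
    static String sanitize(String raw) {
        StringBuilder sb = new StringBuilder(raw.length());
        raw.codePoints()
           .filter(LenientXmlSanitizer::isXml10Char)
           .forEach(sb::appendCodePoint);
        return BARE_AMPERSAND.matcher(sb).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        String dirty = "<bla>a" + (char) 1 + "b & c &amp; d</bla>";
        System.out.println(sanitize(dirty));
    }
}
```

The cleaned string can then be handed to whatever does the real parsing.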
I really want to get as much info as possible from these XML documents, even when they have problems.
I guess I need to employ some strategy that uses stax/sax along with JAXB but I'm not sure how.
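One possible shape for such a strategy, sketched under assumptions: a streaming parser stops at the first fatal error, so instead of parsing the whole report, the raw text could be cut into one chunk per `<A>` (or `<B>`) and each chunk checked on its own with StAX. The class `FragmentSplitter` is hypothetical, and it assumes the elements carry no attributes and are never nested inside one another. Because the root element is never parsed, a missing `</report>` does not matter here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class FragmentSplitter {
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();
    static {
        // DTDs are not needed for this kind of data, so turn them off.
        FACTORY.setProperty(XMLInputFactory.SUPPORT_DTD, false);
    }

    /** Returns every well-formed <tag>...</tag> fragment; broken ones are skipped. */
    static List<String> wellFormedFragments(String raw, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        List<String> good = new ArrayList<>();
        int start = raw.indexOf(open);
        while (start >= 0) {
            int next = raw.indexOf(open, start + open.length());
            // A chunk ends where the next opening tag begins, so an unclosed
            // element cannot swallow the element that follows it.
            String chunk = raw.substring(start, next >= 0 ? next : raw.length());
            int end = chunk.indexOf(close);
            if (end >= 0) {
                String fragment = chunk.substring(0, end + close.length());
                if (isWellFormed(fragment)) {
                    good.add(fragment);
                }
            }
            start = next;
        }
        return good;
    }

    /** Pulls every event from the fragment; any parse error means "not well formed". */
    static boolean isWellFormed(String fragment) {
        try {
            XMLStreamReader reader = FACTORY.createXMLStreamReader(new StringReader(fragment));
            try {
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }
}
```

Each surviving fragment could then be given to JAXB by itself, for example through `Unmarshaller.unmarshal(Source, Class)` with a `StreamSource` wrapping a `StringReader`, so one bad A costs only that A.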
If one A out of 100 has an XML problem, I don't mind throwing away just that A.
It would be much better, though, if I could get an A object with as much data as could be parsed before the error.
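To illustrate the "as much data as possible" idea: a pull parser such as StAX hands over events one at a time, so whatever was read before the error can be kept. The following is a rough sketch, not a JAXB solution. `PartialRecordReader` and `Result` are hypothetical names, and it assumes the direct children of the element are simple text-only tags. A child that contains nested elements would make `getElementText()` throw, and reading would stop there.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class PartialRecordReader {
    /** Fields read from one fragment, plus whether the fragment parsed to the end. */
    static final class Result {
        final Map<String, String> fields = new LinkedHashMap<>();
        boolean complete;
    }

    static Result read(String fragment) {
        Result result = new Result();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = null;
        try {
            reader = factory.createXMLStreamReader(new StringReader(fragment));
            boolean rootSeen = false;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                if (!rootSeen) {
                    rootSeen = true; // the first start tag is the wrapper, e.g. <A>
                } else {
                    String name = reader.getLocalName();
                    // Reads the text and moves past the child's end tag.
                    result.fields.put(name, reader.getElementText());
                }
            }
            result.complete = true;
        } catch (XMLStreamException e) {
            // Parsing stopped at the first error; the fields read so far are kept.
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // nothing useful to do if closing fails
                }
            }
        }
        return result;
    }
}
```

The collected map could then be copied into an A object by hand, with `complete` showing whether the record was cut short.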
 