Resumable XML parsing

gareb

New Member
I have a data import tool which parses huge XML files (it uses a SAX parser, but adapting it is the least of my problems). Failures, new deployments, and system restarts happen, and I don't want to start over from the beginning each time, so I need to save the parser state (we can call it an XML cursor if we want) from time to time.

Are there any parsers out there capable of saving their state and resuming from it (obviously I also have to seek into the file when resuming)?

I haven't found such a parser and doubt that one exists, so here's my second question: do you have any suggestions on how I should start implementing it? Should I take an existing SAX parser implementation and dig in, or would I be better off starting from scratch?

If it matters, I need XML namespaces, but no schema/DTD validation.

The cursors could also come in handy for pre-parsing the XML and distributing the work for parallel processing.
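To make the idea of a cursor concrete, here is a minimal sketch in Python rather than a finished design. It assumes the file is one root element whose direct children are the records, and that the input is a seekable binary stream. `Cursor`, `parse_records`, and `on_record` are hypothetical names made up for illustration. The sketch relies on the `CurrentByteIndex` attribute of Python's expat binding to learn where each record starts, and it resumes by replaying the raw bytes that precede the first record (XML declaration plus root start tag, which carries the namespace declarations) before seeking to the saved offset.

```python
import io
import xml.parsers.expat
from dataclasses import dataclass


@dataclass
class Cursor:
    header: bytes  # raw bytes before the first record (XML declaration, root start tag)
    offset: int    # file offset of the start tag of the next unprocessed record


def parse_records(stream, on_record, cursor=None, chunk_size=65536):
    """Parse a seekable binary stream, calling on_record(name, attrs, cursor)
    at the start tag of every direct child of the root element."""
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    header = cursor.header if cursor else None
    # After a resume the parser sees header + tail, so its byte index has to
    # be shifted to get back to real file offsets.
    shift = cursor.offset - len(cursor.header) if cursor else 0
    depth = 0

    def start(name, attrs):
        nonlocal depth, header
        depth += 1
        if depth != 2:  # only direct children of the root count as records
            return
        offset = parser.CurrentByteIndex + shift
        if header is None:
            # First record of a fresh run: everything before it is the header.
            pos = stream.tell()
            stream.seek(0)
            header = stream.read(offset)
            stream.seek(pos)
        on_record(name, attrs, Cursor(header, offset))

    def end(name):
        nonlocal depth
        depth -= 1

    parser.StartElementHandler = start
    parser.EndElementHandler = end

    if cursor:
        parser.Parse(cursor.header, False)  # reopen the root with its namespaces
        stream.seek(cursor.offset)
    while chunk := stream.read(chunk_size):
        parser.Parse(chunk, False)
    parser.Parse(b"", True)  # signal end of input
```

The caller would persist the `Cursor` it receives (for example after committing the previous record) and pass it back as `cursor=` on restart. Because the cursor points at the start of a record, that record is processed again after a resume, so the import step would need to tolerate a repeat. Deeper nesting would require storing the whole chain of open ancestor elements instead of a single header.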
 