Extraction of multiple XMLs from a 500 MB file

KelliCarletta

New Member
I need to extract XML documents from a huge file (around 500 MB), but I am on a 32-bit JVM, which always runs out of heap space. I have written a program that extracts the XMLs, but it needs the whole file read into memory first. I cannot process the file in chunks of, say, 100 lines per iteration, because I cannot be sure that the 100th line is the end of an XML document. So how do I do it?

My program for extraction:
[code]
import java.util.ArrayList;
import java.util.Arrays;

private static ArrayList<String> extractXml(String xml) {
    // Every document starts with "<?", so split the input on that marker.
    ArrayList<String> xmlList = new ArrayList<String>(Arrays.asList(xml.split("<\\?")));

    // Drop fragments that do not hold an XML declaration. Iterate backwards so
    // that removing an element does not skip the one that follows it.
    for (int i = xmlList.size() - 1; i >= 0; i--) {
        if (!xmlList.get(i).contains("xml version=\"1.0\" encoding=\"UTF-8\"")) {
            xmlList.remove(i);
        }
    }

    boolean hasHash = xml.contains("#");
    for (int j = 0; j < xmlList.size(); j++) {
        if (hasHash) {
            // Restore the "<?" consumed by split() and keep only the text before the first '#'.
            xmlList.set(j, ("<?" + xmlList.get(j)).split("#")[0]);
        } else {
            // Restore the "<?" consumed by split() and strip surrounding whitespace.
            xmlList.set(j, "<?" + xmlList.get(j).trim());
            System.out.println(xmlList.get(j));
        }
    }
    return xmlList;
}
[/code]
The XMLs also have a header (a JMSStream header, like a wrapper around the XML), which I have been removing successfully with the logic above.
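One possible direction is to read the file line by line and cut on the XML declaration itself rather than on a fixed line count, so that only one document is held in memory at a time. This is a minimal sketch, not a definitive answer. It assumes that every document starts with `<?xml`, that the file contains line breaks (a single 500 MB line would still be loaded whole), and that anything after the first `#` in a fragment belongs to the next header, as in the logic above. The class name `XmlStreamSplitter` and the `DocumentHandler` callback are hypothetical names introduced for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class XmlStreamSplitter {

    /** Hypothetical callback: receives one extracted document at a time. */
    public interface DocumentHandler {
        void handle(String xmlDocument);
    }

    // Assumption: every document begins with this marker.
    private static final String MARKER = "<?xml";

    public static void split(Reader source, DocumentHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        StringBuilder current = null; // stays null until the first marker is seen
        String line;
        while ((line = reader.readLine()) != null) {
            int from = 0;
            int at;
            // A line may contain zero, one, or several markers.
            while ((at = line.indexOf(MARKER, from)) >= 0) {
                if (current != null) {
                    // Text before the marker still belongs to the previous fragment.
                    current.append(line, from, at);
                    emit(current, handler);
                }
                current = new StringBuilder(MARKER);
                from = at + MARKER.length();
            }
            if (current != null) {
                current.append(line, from, line.length()).append('\n');
            }
        }
        if (current != null) {
            emit(current, handler);
        }
    }

    // Drops everything from the first '#' onwards (assumed to be the next header),
    // trims whitespace, and hands the document to the callback.
    private static void emit(StringBuilder fragment, DocumentHandler handler) {
        String text = fragment.toString();
        int hash = text.indexOf('#');
        if (hash >= 0) {
            text = text.substring(0, hash);
        }
        text = text.trim();
        if (!text.isEmpty()) {
            handler.handle(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory sample; for the real file, pass a FileReader instead.
        String sample = "HDR<?xml version=\"1.0\" encoding=\"UTF-8\"?><a/>#HDR"
                + "<?xml version=\"1.0\" encoding=\"UTF-8\"?><b/>\n";
        List<String> docs = new ArrayList<String>();
        split(new StringReader(sample), d -> docs.add(d));
        for (String doc : docs) {
            System.out.println(doc);
        }
    }
}
```

In a real run the handler would write each document to its own file or process it immediately instead of collecting it in a list, since collecting everything would bring the heap problem back. Like the original logic, this would also truncate a document whose content itself contains a `#`.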
 