What would make my HTML parsing code more efficient?

leroylim

New Member
This morning I decided to work on a little project that parses the gas prices for all Maverik gas stations into an array. I got most of it working fairly easily; the only part that feels "dirty" is the actual parsing of the HTML into variables. I'm using indexOf and substring to get to the data I want, and I feel there has to be a cleaner way to do it. Anyway, here is my code. It compiles and works great, it's just not as clean as I'd like.

maverik.java contains the main method and the bulk of the code for the project.
maverikObj.java contains the getters and setters, the constructor and the toString method.

To change the gas station you are getting console data from, simply change the number in the array println on line 90 of maverik.java. Future revisions will have methods to control what data is displayed based on user requests.

Here is an example HTML with prices:

html4 = "<b>Maverik Store 4</b><br/>5200 Chinden Blvd<br>Boise, ID<br>208-376-0532<br><center><b></b></center><br /><font color=red>Fuel Prices -- Updated every 30 minutes</font><br /><div><div style=\"float: left; width: 70%; text-align:right;\">Adventure Club Card</div><div style=\"float: right; width: 30%; text-align:center;\">Retail</div><br /><div style=\"float: left;width: 30%;\">Unleaded:</div><div style=\"float: left; width: 30%; text-align:center;\"> 3.379</div><div style=\"float: right; width: 30%; text-align:center;\"> 3.399</div><br /><div style=\"float: left;width: 30%;\">Blend 89:</div><div style=\"float: left; width: 30%; text-align:center;\"> 3.469</div><div style=\"float: right; width: 30%; text-align:center;\"> 3.499</div><br /><div style=\"float: left;width: 30%;\">Blend 90:</div><div style=\"float: left; width: 30%; text-align:center;\"> 3.549</div><div style=\"float: right; width: 30%; text-align:center;\"> 3.579</div><br /><div style=\"float: left;width: 30%;\">Premium:</div><div style=\"float: left; width: 30%; text-align:center;\"> 3.599</div><div style=\"float: right; width: 30%; text-align:center;\"> 3.639</div><br /><div style=\"float: left;width: 30%;\">Diesel:</div><div style=\"float: left; width: 30%; text-align:center;\"> 4.039</div><div style=\"float: right; width: 30%; text-align:center;\"> 4.059</div>";

Currently I'm parsing the address, city, state, phone number and all 8 of the gas types possible at each station (Unleaded, Blend 87,88,89,99, Premium, Diesel). It gets a bit trickier, though, because some of the HTML entries do not list all 8 of them; most only have 4 or 5 of the 8 possible fuel types. So to parse this data I used two methods.

Address, city, state and phone number are parsed using:

if(line.contains(" = \"<b>Maverik Store")&&!line.contains("Coming Soon!")){
    address=splitLine[3].substring(0,splitLine[3].length()-3).replace(" ", " ");
    city=splitLine[4].substring(0,splitLine[4].length()-7);
    state=splitLine[4].substring(splitLine[4].length()-5,splitLine[4].length()-3);
    phone=splitLine[5].substring(0,splitLine[5].length()-3);

Fuel types are parsed using if/else statements: the if branch records the value when it's present, and the else branch records a 0.0 double, since my constructor requires every fuel type to have some value.

if(line.indexOf("Unleaded:")>0){
    unleaded=Double.parseDouble(line.substring(line.indexOf("Unleaded:")+147, line.indexOf("Unleaded:")+152));
}else{
    unleaded=0.0;
}

As you can see, I use a lot of substring and indexOf calls to get the data I want. My fear is that this is an extremely static way of getting the data, since it depends on fixed character offsets, and so it feels like a really dirty way of doing things. Any tips on how I can clean up my code are appreciated! =)
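One less offset-dependent direction might be a regular expression that matches each fuel label together with the two price cells that follow it, so the +147/+152 arithmetic goes away and missing fuel types simply never match. This is only a rough sketch, assuming every station keeps the label div / club price div / retail price div order seen in the sample above; FuelPriceSketch, FUEL and parsePrices are hypothetical names, not part of maverik.java.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of maverik.java.
final class FuelPriceSketch {

    // A label such as "Unleaded:" closing its div, followed by two price divs.
    // Assumption: the first price is the Adventure Club Card price and the
    // second is the retail price, as in the sample HTML.
    private static final Pattern FUEL = Pattern.compile(
        ">([A-Za-z0-9 ]+):</div>"
        + "\\s*<div[^>]*>\\s*(\\d+(?:\\.\\d+)?)\\s*</div>"
        + "\\s*<div[^>]*>\\s*(\\d+(?:\\.\\d+)?)\\s*</div>");

    // Returns fuel name -> {club price, retail price}, in page order.
    static Map<String, double[]> parsePrices(String html) {
        Map<String, double[]> prices = new LinkedHashMap<>();
        Matcher m = FUEL.matcher(html);
        while (m.find()) {
            prices.put(m.group(1).trim(), new double[] {
                Double.parseDouble(m.group(2)),
                Double.parseDouble(m.group(3))
            });
        }
        return prices;
    }

    public static void main(String[] args) {
        // Shortened copy of the sample markup with only two fuel types.
        String html =
            "<div style=\"float: left;width: 30%;\">Unleaded:</div>"
            + "<div style=\"float: left; width: 30%; text-align:center;\"> 3.379</div>"
            + "<div style=\"float: right; width: 30%; text-align:center;\"> 3.399</div><br />"
            + "<div style=\"float: left;width: 30%;\">Diesel:</div>"
            + "<div style=\"float: left; width: 30%; text-align:center;\"> 4.039</div>"
            + "<div style=\"float: right; width: 30%; text-align:center;\"> 4.059</div>";

        Map<String, double[]> prices = parsePrices(html);

        // Fuel types the station does not list fall back to 0.0,
        // mirroring the else branches in the original code.
        double[] none = {0.0, 0.0};
        double unleaded = prices.getOrDefault("Unleaded", none)[1];
        double premium = prices.getOrDefault("Premium", none)[1];
        System.out.println("Unleaded retail: " + unleaded);
        System.out.println("Premium retail: " + premium);
    }
}
```

With a map like this, the per-fuel if/else blocks could collapse into one getOrDefault call per fuel type. A real HTML parser library would likely be more robust still if the markup changes, but that would add a dependency the project does not currently have.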
 