How to programmatically measure the elements' sizes in HTML source code using python?

prash87

New Member
I'm doing webpage layout analysis in python. A fundamental task is to programmatically measure the elements' sizes given HTML source codes, so that we could obtain statistical data of content/ad ratio, ad block position, ad block size for the webpage corpus. An obvious approach is to use the width/height attributes, but they're not always available. Besides, things like \[code\]width: 50%\[/code\] needs to be calculated after loading into DOM. So I guess loading the HTML source code into a window-size-predefined-browser (like mechanize although I'm not sure if window's size could be set) is a good way to try, but mechanize doesn't support the return of an element size anyway.Is there any universal way (without width/height attributes) to do it in python, preferably with some library?Thanks!
 
Back
Top