I have a fairly large website stored locally (mirrored from the server using httrack). Its directory structure has several folders and subfolders and a large number of HTML files. I would like to know if there are any tools (it really can be anything: scripts, C/C++ code, etc.) that would let me generate a single word-frequency table across all of the HTML files. The trick is that I am only interested in counting actual content words, i.e., not HTML markup (although markup tokens could be filtered out afterwards if necessary). Any suggestions are much appreciated!
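
To make the goal concrete, a minimal sketch of such a tool might look like the following Python script, which uses only the standard library. It rests on several assumptions that may not hold for a given site: files are matched by a `.html` or `.htm` extension, they are read as UTF-8 (undecodable bytes are replaced), the contents of `<script>` and `<style>` are not treated as content, and a "word" is a run of letters with optional internal apostrophes, lowercased. The names `TextExtractor`, `count_words_in_html`, and `count_words_in_tree` are made up for illustration.

```python
import re
import sys
from collections import Counter
from html.parser import HTMLParser
from pathlib import Path

# Assumption: a word is a run of letters, optionally with internal apostrophes.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words_in_html(html):
    """Return a Counter of lowercased content words in one HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps words in adjacent blocks apart, at the cost
    # of splitting a word that is broken up by inline tags.
    text = " ".join(parser.chunks)
    return Counter(word.lower() for word in WORD_RE.findall(text))


def count_words_in_tree(root):
    """Sum word counts over every .html/.htm file below the directory root."""
    total = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            html = path.read_text(encoding="utf-8", errors="replace")
            total.update(count_words_in_html(html))
    return total


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print the 50 most frequent words as "count<TAB>word".
    for word, n in count_words_in_tree(sys.argv[1]).most_common(50):
        print(f"{n}\t{word}")
```

Saved under a hypothetical name such as `wordfreq.py`, it would be run as `python wordfreq.py path/to/mirror`. Navigation menus, footers, and other text repeated on every page would still be counted, so a real solution would probably need some site-specific filtering on top of this.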