AdSense code and 404 errors

gurevich29

Recently I have been getting hundreds and hundreds of 404 errors on my site. They have the format:

domain/directory/directory/FFCC33
or
domain/directory/directory/008800

where the numbers and letters at the end of the request are what generate the 404s. It doesn't happen continually, but every few days for an hour or so. Those hex values happen to be part of the AdSense code, specifically the piece of code that controls the font colour of the URL shown in the ad:

Code: [code block not preserved]

Problem has been solved. For the record, the errors were caused by an offline browser used to download entire sites. It's called HTTrack 3.0. Why would someone download my entire site (2,000 pages' worth)? Spooky.

To steal content, or just to browse it offline. It could be one of many reasons.

How would I exclude this critter? It has been active again in the last hour or two and is using up a serious amount of bandwidth. I believe it can ignore the robots.txt file.

robots.txt won't block anything. It only works if search engines and bots are willing to abide by the REQUESTS contained in that file, and the kinds you want to block generally don't. You can, however, block specific user-agents using a .htaccess file, but HTTrack allows you to forge the user-agent to pretend to be anything you want.

As Axe said, you can use a .htaccess file to block many of these, and usually it will do a sufficient job. However, as Axe also pointed out, they can easily get around that if they are determined. If you want to try it, though, here is an example of how you might set up your .htaccess file. I have included many others besides HTTrack to attempt to block:

Code: [code block not preserved]

I'll give that a try. Thanks for your time.

Hey Bigwebmaster,

Thanks a ton for this. I have been looking for this for a very long time. Once again, thanks a lot.

Cheers
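Since the .htaccess listing from the thread did not survive, here is a minimal sketch of the user-agent blocking idea discussed above. It is not the original poster's file. It assumes an Apache server with mod_rewrite enabled and .htaccess overrides allowed, and it matches only the string "HTTrack"; the other agents mentioned in the thread are unknown and are not reproduced here.

```apache
# Sketch only: assumes mod_rewrite is available and AllowOverride permits it.
RewriteEngine On

# Match any request whose User-Agent header contains "HTTrack"
# ([NC] makes the comparison case-insensitive).
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC]

# Leave the URL unchanged ("-") and answer with 403 Forbidden ([F]).
RewriteRule ^ - [F]
```

Additional agents could be added as further RewriteCond lines joined with the [OR] flag. As noted in the thread, this only stops clients that send an honest User-Agent header; a copier configured to forge its user-agent will pass straight through.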
 