How To Ban Spambot(s) and/or Bad Bots

Hoxxy

New Member
To reduce bandwidth theft and website abuse from spambots, inline linking and scrapers on your Apache web server, you first have to be able to track their activity and identify them. You can use AWStats to see how much bandwidth is being used by each User-Agent. A 'robots.txt' file works for search bots that follow its instructions, but most bad bots ignore it, or use its list to find exactly what you do not want displayed. The method described here is not the best method, and neither is adding robots.txt.
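
Identifying bots from the logs only works if the server records the User-Agent header. A minimal sketch, assuming you can edit the main server or virtual-host configuration (these two directives are not allowed in .htaccess) and that the log path is only a placeholder:

```apache
# Server or virtual-host configuration, not .htaccess.
# Standard "combined" format: it adds the Referer and User-Agent headers to each log line.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Placeholder path; use whatever location your installation already logs to.
CustomLog "logs/access_log" combined
```

With the User-Agent in every log line, a log analyzer can group requests and bandwidth by agent.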

Identify troublesome USER_AGENTS
Mail address harvesting bot:
the spider visits your board and searches for the symbol [@], the (X)HTML character entity [&#064;] or the string [mailto:] to collect mail addresses to send spam to later.
Misuse of robots.txt:
the bot reads /robots.txt and then deliberately jumps right into the Disallow:ed directory.
Scraper bot:
these bots duplicate the entire contents of your site, then set it up somewhere else and place AdWords on it. Using a bot-trap will not protect your site against them, because they usually follow robots.txt.
Ignoring robots.txt:
the bot reads /robots.txt but then ignores the Disallow: directive while spidering.
Not looking at robots.txt:
the bot starts spidering the site without even looking for a /robots.txt.
Corporate tattletales:
these bots are harmless for most sites, namely those that do not violate a corporation's trademarks or copyrights, openly criticize it, and so forth.
Chinese spambot:
some of the dumber spam bots, which use UA strings such as "Indy Library" or "Internet Explore 5.x".
Guestbook harvester:
the bot spiders guestbook pages only, often very aggressively (several pages per second), and of course ignores /robots.txt.
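
Before banning anything, it can help to watch suspected agents for a while. A rough sketch, assuming mod_setenvif is loaded and that you can edit the server or virtual-host configuration (CustomLog is not allowed in .htaccess). The variable name suspect_bot and the log path are made up for this example; the two agent strings come from this post:

```apache
# Log format that includes the Referer and User-Agent headers.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Tag requests whose User-Agent contains one of these strings (case-insensitive).
# "suspect_bot" is an arbitrary variable name chosen for this example.
SetEnvIfNoCase User-Agent "Indy Library" suspect_bot
SetEnvIfNoCase User-Agent "EmailSiphon" suspect_bot

# Write only the tagged requests to their own log file (placeholder path).
CustomLog "logs/suspect_bots.log" combined env=suspect_bot
```

The separate log shows which pages a tagged agent requests and how fast, which is what distinguishes the categories listed above.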

Create or add to an existing .htaccess file
How to create a .htaccess file
You do not need to download or install any software or hardware to create an .htaccess file. All you need is your system's default text editor, such as 'Notepad'.

The only difficult part of saving a file named .htaccess is whether your system will accept a filename that has no name, only a long extension:
".htaccess"

# Open the text editor 'Notepad'
# Add your mod_rewrite code (here, the bot-blocking rules given below)
# From the 'Notepad' menu bar choose File - Save As...
# File name: ".htaccess" - Note: put double quotes around the whole name
# Save as type: All Files
# Click [Save]
# Upload the file to your site's files directory.

If this does not work on your PC, you can upload the file with an FTP program as ".htaccess.txt" and then use FTP to rename it to ".htaccess".
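
An .htaccess file only has an effect if the server is set up to read it. If you run the server yourself, the setting involved looks roughly like the sketch below; on shared hosting it is normally controlled by the provider. The directory path is a placeholder, not something taken from this post:

```apache
# Server or virtual-host configuration, not .htaccess.
# "/var/www/html" is a placeholder for the directory that holds your .htaccess file.
<Directory "/var/www/html">
    # FileInfo permits the mod_rewrite directives in .htaccess;
    # Options permits a line such as "Options +FollowSymlinks" there.
    AllowOverride FileInfo Options
</Directory>
```

If AllowOverride is set to None for that directory, the rules in .htaccess are silently ignored.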

Add the following to the .htaccess file:
Code:
#  Block bad_bots, spiders, crawlers and harvesters
RewriteEngine on
Options +FollowSymlinks
RewriteBase /
#  Block bad_bots by their IP (Internet Protocol) address
RewriteCond %{REMOTE_ADDR} ^209\.189\.115\. [OR]
#  Bad_bots list
RewriteCond %{HTTP_USER_AGENT} ^AESOP_com_SpiderMan [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR] 
RewriteCond %{HTTP_USER_AGENT} Anonymouse.org [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^asterias [OR] 
RewriteCond %{HTTP_USER_AGENT} ^attach [OR] 
RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Baiduspider [OR] 
RewriteCond %{HTTP_USER_AGENT} Bandit [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Bigfoot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Black.Hole [OR] 
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlowFish [OR] 
#  The address after "mailto:" is masked in this listing, so match on the prefix only
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto: [OR]
RewriteCond %{HTTP_USER_AGENT} ^BotALot [OR] 
RewriteCond %{HTTP_USER_AGENT} Buddy [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^BuiltBotTough [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Bullseye [OR] 
RewriteCond %{HTTP_USER_AGENT} ^BunnySlippers [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Cegbfeieh [OR] 
RewriteCond %{HTTP_USER_AGENT} ^CheeseBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} Collector [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Copier [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^CopyRightCheck [OR] 
RewriteCond %{HTTP_USER_AGENT} ^cosmos [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Curl [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR] 
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DittoSpyder [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Download [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Devil [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download.*Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Downloader [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^dragonfly [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EasyDL [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ebingbong [OR] 
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^EroCrawler [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Exabot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} Extractor [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR] 
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^flunky [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Foobot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^FrontPage [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR] 
RewriteCond %{HTTP_USER_AGENT} Grabber [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^Harvest [OR] 
RewriteCond %{HTTP_USER_AGENT} ^hloader [OR] 
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^httplib [OR] 
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR] 
RewriteCond %{HTTP_USER_AGENT} ^humanlinks [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]  
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InfoNaviRobot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^InfoTekies [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Intelliseek [OR] 
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Jakarta [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JennyBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Jyxobot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Kenjin.Spider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Keyword.Density [OR] 
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR] 
RewriteCond %{HTTP_USER_AGENT} ^libWeb/clsHTTP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^likse [OR] 
RewriteCond %{HTTP_USER_AGENT} ^LinkextractorPro [OR] 
RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a.Unix [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^LNSpiderguy [OR]
RewriteCond %{HTTP_USER_AGENT} ^LWP::Simple [OR] 
RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR] 
RewriteCond %{HTTP_USER_AGENT} ^MarkWatch [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mata.Hari [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control [OR] 
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIIxpc [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Missigua\ Locator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister.PiX [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^moget [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NAMEPROTECT [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^Netcraft [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NextGenSearchBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NG [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NimbleCrawler [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NPbot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline.Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR] 
RewriteCond %{HTTP_USER_AGENT} ^OutfoxBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^PHP\ version\ tracker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ProPowerBot/2.14 [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR] 
RewriteCond %{HTTP_USER_AGENT} ^QueryN.Metasearch [OR] 
RewriteCond %{HTTP_USER_AGENT} Recorder [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} Reaper [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey [OR] 
RewriteCond %{HTTP_USER_AGENT} ^RMA [OR] 
RewriteCond %{HTTP_USER_AGENT} sitecheck.internetseer.com [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} Siphon [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Snapbot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Snoopy [OR] 
RewriteCond %{HTTP_USER_AGENT} ^sogou [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SpankBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^spanner [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Sqworm [OR] 
RewriteCond %{HTTP_USER_AGENT} Stripper [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Sucker [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^suzuran [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Szukacz/1.4 [OR] 
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR] 
RewriteCond %{HTTP_USER_AGENT} ^The.Intraformant [OR] 
RewriteCond %{HTTP_USER_AGENT} ^TheNomad [OR] 
RewriteCond %{HTTP_USER_AGENT} ^TightTwatBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Titan [OR] 
RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher [OR] 
RewriteCond %{HTTP_USER_AGENT} ^True_Robot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^turingos [OR] 
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot/1.5 [OR] 
RewriteCond %{HTTP_USER_AGENT} ^URLy.Warning [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR] 
RewriteCond %{HTTP_USER_AGENT} ^VCI [OR] 
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.Image.Collector [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Webclipping.com [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebEnhancer [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebmasterWorldForumBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSite [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website.Quester [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster.Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WISENutbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WWW-Collector-E [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu's [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zyborg
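
The list mixes several kinds of pattern, and they do not all match the same way. A short sketch of how Apache reads them, using names from the list (the alternation in the third condition is a suggested shorthand, not part of the original list):

```apache
RewriteEngine On

# Anchored and case-sensitive: matches only agents that START with "Wget".
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
# Unanchored with [NC]: matches "grabber" anywhere in the string, in any letter case.
RewriteCond %{HTTP_USER_AGENT} Grabber [NC,OR]
# Alternation folds several names into one condition.
# The last condition carries no [OR], which ends the list.
RewriteCond %{HTTP_USER_AGENT} ^(WebZIP|WebCopier|WebStripper) [NC]

# The action that applies when any of the conditions matched.
RewriteRule ^ - [F]
```

A space inside a name has to be escaped as "\ ", and the flags must be separated from the pattern by a space, otherwise Apache treats them as part of the pattern.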

Select the final action:
Code:
# Pick ONE of the actions below and place it directly after the last RewriteCond.
# The condition list applies only to the first RewriteRule that follows it.

# Option 1: answer with the '403 Forbidden' error page once a match is made
RewriteRule ^ - [F]

# Option 2: redirect to any fake site or custom html page
# RewriteRule ^ http://fake_link.com/ [R,L]

# Option 3: redirect to a custom html page on your own domain
# RewriteRule ^ http://www.yourdomainname.com/ [R,L]

# Option 4: redirect to a custom html page or back to the origin spam URL
# RewriteRule ^ http://www.spam_site-or-custom_page.com/ [R,L]

# Separate rule, independent of the list above: forbid requests referred by a spam site
# RewriteCond %{HTTP_REFERER} ^http://www\.spam_back\.com$
# RewriteRule ^ - [F]
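
Because a list of RewriteCond lines only applies to the single RewriteRule that follows it, the finished file holds the condition list followed by exactly one action. A shortened sketch of the whole file, using three entries from the list above and the 403 action (the IfModule wrapper is an optional addition that keeps the site working if mod_rewrite is not loaded):

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    Options +FollowSymlinks
    RewriteBase /

    # Condition list: every line except the last ends in [OR].
    RewriteCond %{REMOTE_ADDR} ^209\.189\.115\. [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zyborg

    # Exactly one action, immediately after the conditions: 403 Forbidden.
    RewriteRule ^ - [F]
</IfModule>
```

Any other RewriteRule placed after this one runs without the bot conditions, so existing rewrites in the same file are not affected by the list.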
 

Hoxxy

New Member
But what should I do if I already have a vBSEO .htaccess file?
I currently have this code, vBSEO, YSlow and some other custom rewrites in my .htaccess file and my forum is running fine :)
 

bouncer

New Member
Hoxxy,

I have an empty .htaccess file in my /public-html directory.

Are you suggesting I overwrite it with the code mentioned in the OP? (I think so)

Thanks!
 