September 2008 Archives
Basically, with a robots.txt file you can instruct a search engine not to index certain parts of your site, or disallow some spiders from accessing your site entirely.
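As a rough illustration, a robots.txt file covering both cases might look something like this. The `/private/` path and the `ExampleBot` name are placeholders made up for the example, not anything from a real site:

```text
# Keep every crawler out of one part of the site (hypothetical path)
User-agent: *
Disallow: /private/

# Shut one particular crawler out of the whole site (hypothetical bot name)
User-agent: ExampleBot
Disallow: /
```

The file lives at the root of the site, and it only works if the crawler chooses to read and obey it.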
Unfortunately, some search engine spiders are either badly written or intentionally evil and totally ignore any commands you might try to pass them via robots.txt.
One such robot is Voila.
Voila identifies itself with the User-Agent string:
VoilaBot BETA 1.2
Depending on the type of site you have, you're probably best advised to block it entirely.
If you have access to iptables, you can simply issue a series of commands similar to this one:
iptables -I INPUT -s 81.52.143.15 -j DROP
I'm trying to get a full list of the IP ranges used by Voila, but so far I've found two which you could block. They are:
193.252.148.0/23
81.52.142.0/23
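The single address in the iptables command above already falls inside the second range (81.52.142.0/23 covers 81.52.142.0 through 81.52.143.255), so blocking the two ranges directly is probably simpler than adding addresses one at a time. Here is a minimal sketch, assuming a POSIX shell and root access. The `VOILA_RANGES` variable, the `DRY_RUN` switch and the `block_ranges` function are names invented for this example:

```shell
#!/bin/sh
# Ranges listed in this post; add more here as they are identified.
VOILA_RANGES="193.252.148.0/23 81.52.142.0/23"

# By default the script only prints the commands it would run.
# Set DRY_RUN=0 to actually insert the rules (needs root).
DRY_RUN="${DRY_RUN:-1}"

block_ranges() {
    for range in $VOILA_RANGES; do
        if [ "$DRY_RUN" = "1" ]; then
            # Show the command without touching the firewall
            echo "iptables -I INPUT -s $range -j DROP"
        else
            # Insert a rule at the top of INPUT dropping traffic from this range
            iptables -I INPUT -s "$range" -j DROP
        fi
    done
}

block_ranges
```

Run it once as-is to check the output, then again with `DRY_RUN=0` to apply the rules. Note that rules added this way do not survive a reboot unless you save them with whatever mechanism your distribution provides.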
On one server, VoilaBot had caused the sites to become completely unresponsive, with the load average climbing constantly!
