[CRS_META] Privacy of archives

Tim Tyler seemysig at googlemail.com
Fri Feb 15 08:15:36 EST 2008


Re: "It was my fault (not Google's).  The robots.txt
file could not be read because I forgot to make an
entry in the .htaccess file (Brian will understand
this).  I fixed the problem and submitted requests to
Google to have all the archived pages removed from
Google."

Of course, the Robots Exclusion Standard is a
*voluntary* protocol for "well-behaved" robots.

Email address harvesting spiders run by spammers
are /extremely/ unlikely to be well-behaved:

"almost all bad robots ignore /robots.txt"

 - http://www.robotstxt.org/faq/blockjustbad.html
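
To illustrate what "voluntary" means here, below is a rough
sketch in Python of the check a well-behaved robot makes before
fetching a page.  The robots.txt contents, the "ExampleBot"
name and the example.org URLs are made up for illustration;
they are not taken from the actual list archive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every robot to stay out of /archives/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
"""


def polite_fetch_allowed(robots_txt, user_agent, url):
    """Return True if a robot that honours robots.txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def rude_fetch_allowed(robots_txt, user_agent, url):
    """A badly-behaved robot never consults the file, so nothing stops it."""
    return True


if __name__ == "__main__":
    archived = "http://example.org/archives/msg00001.html"
    front_page = "http://example.org/index.html"
    print(polite_fetch_allowed(ROBOTS_TXT, "ExampleBot", archived))
    print(polite_fetch_allowed(ROBOTS_TXT, "ExampleBot", front_page))
    print(rude_fetch_allowed(ROBOTS_TXT, "ExampleBot", archived))
```

The point is that the check happens entirely inside the robot:
the server only publishes the file, and a harvester that skips
the can_fetch() step sees exactly the same pages as before.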
-- 
__________
 |im |yler  http://timtyler.org/  tim at tt1lock.org  Remove lock to reply.


