[Sks-devel] robots.txt, grub-client
From: Jason Harris
Subject: [Sks-devel] robots.txt, grub-client
Date: Sat, 18 Dec 2004 15:52:32 -0500
User-agent: Mutt/1.4.2.1i

Is anyone (else) serving robots.txt from pks and SKS and watching the
User-Agent: headers on incoming requests? I've noticed a lot (30 and
counting, since yesterday afternoon) of requests from grub-client-2.3
to my pks server, which is wrong because I've been serving robots.txt
containing:

    User-agent: *
    Disallow: /

for quite some time now. grub[.org] seems to be the newest search engine
that doesn't respect robots.txt, and it is hard to block because it is a
distributed system. Still, 64.241.242.18 (sv-fw.looksmart.com) is the
main offender and can be blocked by IP.
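
For anyone who wants to confirm what that robots.txt tells a crawler, here
is a minimal sketch using Python's standard urllib.robotparser. The helper
name is_allowed and the /pks/lookup URL are only illustrative assumptions;
the robots.txt body and the User-Agent string are the ones quoted above.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt body quoted in this message.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of fetching them from a live server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative request URL; the path is an assumption, not taken from logs.
    url = "http://skylane.kjsl.com:11371/pks/lookup?op=index&search=example"
    print(is_allowed("grub-client-2.3", url))
```

A crawler that honors the file gets a "not allowed" answer for every path,
so any request it still makes is a robots.txt violation.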
Of course, Microsoft in 65.52.0.0/14 and 207.68.128.0 - 207.68.207.255 and
Yahoo/Inktomi in 66.196.64.0/18 are also blocked by IP because their web
crawlers are overzealous and/or do not respect robots.txt.
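
As a rough sketch of how such a blocklist could be checked in software
(for example when filtering server logs), the ranges above can be loaded
with Python's standard ipaddress module. The names build_blocklist and
is_blocked are hypothetical; the addresses are the ones listed in this
message. The 207.68.128.0 - 207.68.207.255 range is not a single CIDR
block, so it is expanded with summarize_address_range.

```python
import ipaddress


def build_blocklist():
    """Return the blocked networks named in this message as ip_network objects."""
    networks = [
        ipaddress.ip_network("64.241.242.18/32"),  # sv-fw.looksmart.com (grub)
        ipaddress.ip_network("65.52.0.0/14"),      # Microsoft
        ipaddress.ip_network("66.196.64.0/18"),    # Yahoo/Inktomi
    ]
    # Microsoft's second range is given as first/last addresses, so split
    # it into the smallest set of CIDR blocks that covers it exactly.
    networks.extend(
        ipaddress.summarize_address_range(
            ipaddress.ip_address("207.68.128.0"),
            ipaddress.ip_address("207.68.207.255"),
        )
    )
    return networks


BLOCKLIST = build_blocklist()


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any blocked network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKLIST)


if __name__ == "__main__":
    for candidate in ("64.241.242.18", "207.68.200.1", "192.0.2.1"):
        print(candidate, is_blocked(candidate))
```

This only covers fixed ranges; it does nothing about the distributed grub
clients, which can come from any address.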
Most of the grub requests have been for "Host: skylane.kjsl.com:11371".
The few for "Host: wwwkeys.pgp.net:11371" are understandable because
that name is a DNS round robin (RR), but I imagine the remaining servers
in wwwkeys.pgp.net (and other DNS RRs) that don't block these crawlers
will eventually see their bot-induced load rise to unacceptable levels.
--
Jason Harris | NIC: JH329, PGP: This _is_ PGP-signed, isn't it?
address@hidden _|_ web: http://keyserver.kjsl.com/~jharris/
Got photons? (TM), (C) 2004