[tech] robots.txt on secure.ucc

Daniel Axtens danielax at gmail.com
Mon Jan 30 21:04:57 WST 2012


Hi all,

I was watching Apache's error.log today while debugging a PHP script, and realised that Googlebot was attempting to crawl our secure services. Unsurprisingly, it wasn't getting very far, but it was making for messy logs and quite severe load (one Apache process was sitting at 100% trying to handle hits on all our different, complicated secure services).

Interestingly, there are several services on secure whose front pages Google has indexed: see http://www.google.com.au/search?q=site:secure.ucc.asn.au . As it doesn't help us - or anyone else on the internet - to have these googleable, I have blocked all the webmails, the OpenID server and some management-y stuff. Dropbear remains untouched.

The full file, accessible at https://secure.ucc.asn.au/robots.txt , is below.

All the best,
[DJA]

== mussel:/var/www/robots.txt ==
User-agent: *
# Don't allow any of our webmails
Disallow: /horde3
Disallow: /rcube
Disallow: /SOGo

# No point in indexing an OpenID server, either
Disallow: /openid

# Or any of our internal services
Disallow: /phppgadmin
Disallow: /glpi
Disallow: /ocsreports
