From zanchey at ucc.gu.uwa.edu.au Tue Jan 3 13:00:02 2012
From: zanchey at ucc.gu.uwa.edu.au (David Adam)
Date: Tue, 3 Jan 2012 13:00:02 +0800 (WST)
Subject: [tech] Problems with Mylah's network connection
Message-ID:

For as long as I can remember, people have been complaining that the network is slow. "Whatever", I usually think, and go back to IRC.

Then I wrote a backup script that uses SSH and discovered that it would die occasionally with "Corrupted MAC on input. Disconnecting: Packet corrupt". Oops.

Further investigation reveals that an `ssh root@murasoi cat /dev/zero | pv > /dev/null` pipe will eventually die with the same error message (i.e. the stream gets corrupted at some point), but it usually takes hours and up to twenty gigabytes of traffic. It happens between Mylah and Motsugo, too, but it doesn't seem to happen between Murasoi and anything else (including anything on the same switch as Mylah) - over 3TB transferred without a problem.

Interestingly, even though both Mylah and Murasoi are on gigabit connections, the maximum throughput is more like 200-300 megabit as measured by pv(1); on other gigabit-enabled hosts in the machine room it is more like 800 megabit. The throughput also drops every few minutes to basically zero. iperf(1) shows similar information.

A little bit of analysis with tcpdump(8) shows that captures on Mylah show significant packet loss interrupting the TCP stream - lots of missed ACKs and retransmissions. I suspect this is causing the throughput limitations and occasional pauses, but I'm not sure it is responsible for the corrupted packets.

I'm not really sure where to go from here. iperf is supposed to give some in-depth indication of TCP performance or dropped datagrams in UDP mode but does neither. The tcpdump traces are not particularly enlightening; not much is changing quickly in `netstat -s`.
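[Editor's sketch, not part of the original message.] One way to put numbers on "not much is changing quickly" is to sample the kernel's TCP counters twice and compare them. The Python below is a minimal sketch under stated assumptions: it reads the `Tcp:` header/value pair that Linux exposes in /proc/net/snmp, and the choice of OutSegs, RetransSegs and InErrs as the counters of interest is mine, as are the helper names `parse_tcp_counters`, `counter_delta` and `sample_live`. It is not the method used in this thread.

```python
import time


def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp-style text as a dict.

    The file holds pairs of lines per protocol: a header line of counter
    names and a line of values, both prefixed with "Tcp:".
    """
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp header/value pair found")
    names, values = rows[0], rows[1]
    return dict(zip(names, (int(v) for v in values)))


def counter_delta(before, after, keys=("OutSegs", "RetransSegs", "InErrs")):
    """Return how much each selected counter grew between two samples."""
    return {key: after[key] - before[key] for key in keys}


def sample_live(interval=10, path="/proc/net/snmp"):
    """Sample the counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_tcp_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_tcp_counters(f.read())
    return counter_delta(before, after)
```

Calling `sample_live(10)` on the affected host while the `ssh ... | pv` pipe is running would show how many segments were sent and retransmitted in that window; a RetransSegs delta that is a noticeable fraction of OutSegs would be consistent with the loss seen in the tcpdump captures.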
I wonder about the performance of a 32-bit 33 MHz gigabit network card but have no idea how to measure the PCI utilisation or interrupt frequency on Linux. (pcitop looked promising but only works on HP IA-64 machines.)

Anyone have any thoughts?

David

From tpg at ucc.gu.uwa.edu.au Thu Jan 19 21:41:39 2012
From: tpg at ucc.gu.uwa.edu.au (John Hodge)
Date: Thu, 19 Jan 2012 21:41:39 +0800 (WST)
Subject: [tech] Madako turned off
Message-ID:

Tech:
Since murasoi has taken over all services once done by madako, I have turned it off to reduce heat load in the machine room (especially in the righthand rack, which doesn't really get much cool air). Hopefully this will stop poor murasoi overheating so often (and stop my inbox being flooded by overheat messages) :)

Wheel:
I've taken a backup of /root /etc /usr /home /tftpboot and /opt from madako and put it in a tarball in murasoi:/root, just in case we need to get access to that data.

John Hodge [TPG]

From danielax at gmail.com Mon Jan 30 21:04:57 2012
From: danielax at gmail.com (Daniel Axtens)
Date: Mon, 30 Jan 2012 21:04:57 +0800
Subject: [tech] robots.txt on secure.ucc
Message-ID:

Hi all,

I was watching apache's error.log today while debugging a php script, and realised that the google-bot was attempting to crawl our secure services. Unsurprisingly, it wasn't getting very far, but it was making for messy logs and quite severe load (one apache process was sitting at 100% trying to handle hits on all our different complicated secure services).

Interesting, there are several services on secure for which google has indexed the front page: see http://www.google.com.au/search?q=site:secure.ucc.asn.au . As it doesn't help us - or anyone else on the internet - to have these googleable, I have blocked all the webmails, the openid server and some management-y stuff. Dropbear remains untouched.

The full file, accessible at https://secure.ucc.asn.au/robots.txt , is below.
All the best,
[DJA]

== mussel:/var/www/robots.txt ==
User-agent: *
# Don't allow any of our webmails
Disallow: /horde3
Disallow: /rcube
Disallow: /SOGo

# No point in indexing an OpenID server, either
Disallow: /openid

# Or any of our internal services
Disallow: /phppgadmin
Disallow: /glpi
Disallow: /ocsreports

From matt at ucc.asn.au Mon Jan 30 21:11:44 2012
From: matt at ucc.asn.au (Matt Johnston)
Date: Mon, 30 Jan 2012 21:11:44 +0800
Subject: [tech] robots.txt on secure.ucc
In-Reply-To:
References:
Message-ID: <20120130131144.GB6433@ucc.gu.uwa.edu.au>

Huh, that's pretty strange googlebot behaviour, copied below.

Also, the hg server there is intended to be used by anyone, not just for Dropbear. Send an email to wheel at ucc with a hg repo directory in your homedir and we'll add it.

Matt

66.249.67.105 66.249.67.105 secure.ucc.asn.au - - [29/Jan/2012:18:06:43 +0800] "GET /horde3/imp/redirect.php?Horde=o9jghg22ma665b8iqbdar5a0t4&imapuser=$(_imapuser)&pass=$(_pass)&server=$(_server)&new_lang=$(_new_lang)&url=/horde3/index.php& HTTP/1.1" 200 1442 "-" "SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" "-" 1003 3429 "-" TLSv1 RC4-SHA
66.249.67.105 66.249.67.105 secure.ucc.asn.au - - [29/Jan/2012:18:09:25 +0800] "GET /horde3/imp/redirect.php?Horde=uahvjbhen3ghoe1aqoso5qo983&imapuser=$(_imapuser)&pass=$(_pass)&server=$(_server)&new_lang=$(_new_lang)&url=/horde3/index.php& HTTP/1.1" 200 1442 "-" "SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" "-" 1003 3429 "-" TLSv1 RC4-SHA
66.249.67.105 66.249.67.105 secure.ucc.asn.au - - [29/Jan/2012:18:12:06 +0800] "GET /horde3/imp/redirect.php?Horde=jfd5emnovmsjs0u0ubl0d8jpu6&imapuser=$(_imapuser)&pass=$(_pass)&server=$(_server)&new_lang=$(_new_lang)&url=/horde3/index.php& HTTP/1.1" 200 1443 "-" "SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" "-" 1003 3430 "-" TLSv1 RC4-SHA

On Mon, Jan 30, 2012 at 09:04:57PM +0800, Daniel Axtens wrote:
> Hi all,
>
> I was watching apache's error.log today while debugging a php script, and realised that the google-bot was attempting to crawl our secure services. Unsurprisingly, it wasn't getting very far, but it was making for messy logs and quite severe load (one apache process was sitting at 100% trying to handle hits on all our different complicated secure services).
>
> Interesting, there are several services on secure for which google has indexed the front page: see http://www.google.com.au/search?q=site:secure.ucc.asn.au . As it doesn't help us - or anyone else on the internet - to have these googleable, I have blocked all the webmails, the openid server and some management-y stuff. Dropbear remains untouched.
>
> The full file, accessible at https://secure.ucc.asn.au/robots.txt , is below.
>
> All the best,
> [DJA]
>
> == mussel:/var/www/robots.txt ==
> User-agent: *
> # Don't allow any of our webmails
> Disallow: /horde3
> Disallow: /rcube
> Disallow: /SOGo
>
> # No point in indexing an OpenID server, either
> Disallow: /openid
>
> # Or any of our internal services
> Disallow: /phppgadmin
> Disallow: /glpi
> Disallow: /ocsreports
>

From zanchey at ucc.gu.uwa.edu.au Tue Jan 31 09:13:00 2012
From: zanchey at ucc.gu.uwa.edu.au (David Adam)
Date: Tue, 31 Jan 2012 09:13:00 +0800 (WST)
Subject: [tech] Martello down
Message-ID:

Martello crashed last night (2012 Jan 30 23:29:52), possibly from a tg3 network driver bug. I upgraded it to Debian 6.0.4 earlier in the day but it hadn't rebooted for a new kernel yet. The new version does have a bunch of tg3 fixes, though.

There's nothing particularly important service-wise running on Martello; most clients are pointed at it for LDAP, but LDAP has also been running happily on Motsugo for a few months now, so we can probably change them over (to point at ldap-slave rather than directly at motsugo).

[DAA]