[tech] Molmol reboot and fallout
David Adam
zanchey at ucc.gu.uwa.edu.au
Fri Nov 6 21:29:45 AWST 2015
Yesterday the NFS server on Molmol was acting up - nlockmgr/rpc.lockd was
wedged and lots of operations were failing. We decided to reboot it.
Unfortunately, Mussel's disk image was hosted on the NFS server, and for
some reason its superblock got corrupted. That was unexpected: usually all
the VMs work just fine when the underlying storage disappears temporarily.
I restored a bunch of stuff from backups and used debsums to check the
installed package files on most of the system against their checksums.
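For anyone repeating this kind of check, a rough sketch follows. It is not
a record of the commands actually run on Mussel. It assumes a Debian system
with the debsums package installed, and the function names
`owning_packages` and `damaged_packages` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes a Debian system with debsums installed.

# Reduce `dpkg -S` output ("package: /path" per line) read from stdin
# to a sorted, de-duplicated list of package names.
# Diversion notices are skipped; "pkg1, pkg2: /path" lines are split.
owning_packages() {
    grep -v '^diversion ' \
        | sed -n 's|: /.*$||p' \
        | tr ',' '\n' \
        | sed 's/^ *//' \
        | sort -u
}

# List the packages that own files whose checksums no longer match.
# `debsums -c` prints the paths of changed files, one per line.
damaged_packages() {
    debsums -c 2>/dev/null | while IFS= read -r path; do
        dpkg -S "$path" 2>/dev/null
    done | owning_packages
}
```

Each package printed by `damaged_packages` could then be reinstalled with
`apt-get install --reinstall`. By default debsums skips configuration
files; `debsums -a` includes them.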
The sticking points were PostgreSQL and MySQL.
Postgres refused to start until the transaction logs were flushed; as far
as I can tell no data was lost.
MySQL refused to start because a configuration file was missing;
`dpkg-reconfigure mysql` fixed that, but then it silently dropped a whole
bunch of databases. I restored the ones that were missing from the backup.
There's a small chance of data loss, but most of the affected DBs didn't
appear to be terribly high traffic.
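One way to work out which databases have gone missing is to compare the
list of databases in the backup with the list the running server reports.
The sketch below assumes two text files with one database name per line.
The function name `missing_databases` and the backup path in the comments
are hypothetical, not a description of how UCC's backups are laid out.

```shell
#!/bin/sh
# Sketch only: file names and backup layout are assumptions.

# Print the names that appear in the first list but not in the second.
# $1: file listing databases present in the backup
# $2: file listing databases present on the live server
missing_databases() {
    backup_sorted=$(mktemp)
    live_sorted=$(mktemp)
    sort -u "$1" > "$backup_sorted"
    sort -u "$2" > "$live_sorted"
    # comm -23 keeps only the lines unique to the first file.
    comm -23 "$backup_sorted" "$live_sorted"
    rm -f "$backup_sorted" "$live_sorted"
}

# Example use (hypothetical paths, one dump file per database):
#   ls /path/to/backup | sed 's/\.sql$//' > backup.txt
#   mysql -N -B -e 'SHOW DATABASES' > live.txt
#   missing_databases backup.txt live.txt
```

Anything it prints exists in the backup but not on the server, so it is a
candidate for restoring.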
[DAA]
zanchey@