[tech] Molmol Upgrade - 8.30PM onward on Thursday the 20th October
David Adam
zanchey at ucc.asn.au
Sat Oct 29 22:51:38 AWST 2016
On Sat, 15 Oct 2016, Mitchell Pomery wrote:
> Zanchey and I will be upgrading FreeBSD on Molmol and upgrading its
> storage from 8.30PM onwards on Thursday the 20th of October.
>
> Molmol will be taken offline during this time and as such things that rely
> on it (like clubroom logins) will be unavailable during this time. We
> apologise for any inconvenience.
>
> If you are interested in helping us to do this (or interested in
> learning more about server hardware, FreeBSD or our storage server setup),
> let one of us know and come join us in the clubroom.
Everything is done!
First, we prepped the OS for upgrade:
# freebsd-update -r 11.0-RELEASE upgrade
(this took an hour or more - should have done it earlier)
Then, we installed the upgrade:
# freebsd-update install
# freebsd-update install
(reboot; now running new kernel and userland)
Then we shut the system down and installed the new SAS card. There was
some fiddling required to get the whole backplane, plus the three SSDs,
all powered - and in the process we discovered that Molmol actually
supports blinkenlichts which show which drives are plugged in! These just
weren't powered up before.
We restarted the machine, plugged all the drives in and started FreeBSD.
Once everything was properly powered, there was no problem.
I decided to try adding some extra drives to the array.
# zpool add space mirror /dev/da0 /dev/da1
Those of you familiar with ZFS are cringing about now; the best practice
is not to use the symbolic name (/dev/da0), as that may change, but
instead to use a guaranteed-unique identifier like
/dev/diskid/DISK-WD-WXN1A56AH3J8. So, I thought I'd fix my mistake:
# zpool offline space /dev/da0
# zpool detach space /dev/da0
(hangs forever)
I got sick of waiting (repeat investigation shows that this spins in a
ZFS-specific lock) and rebooted the machine, which promptly kernel
panicked on restart:
panic: Solaris(panic): blkptr at 0xfffff800120bb048 DVA 0 has invalid VDEV 5
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80b24077 at kdb_backtrace+0x67
#1 0xffffffff80ad93e2 at vpanic+0x182
#2 0xffffffff80ad9253 at panic+0x43
#3 0xffffffff8262a192 at vcmn_err+0xc2
#4 0xffffffff824afcdd at zfs_panic_recover+0x5d
#5 0xffffffff824d6903 at zfs_blkptr_verify+0x2c3
#6 0xffffffff824d694f at zio_read+0x2f
#7 0xffffffff824526b3 at arc_read+0x8d3
#8 0xffffffff8246e0ad at dmu_objset_open_impl+0xed
#9 0xffffffff8248861a at dsl_pool_init+0x2a
#10 0xffffffff824a4552 at spa_load+0x802
#11 0xffffffff824a379e at spa_load_best+0x6e
#12 0xffffffff8249ff12 at spa_open_common+0x102
#13 0xffffffff824a028f at spa_get_stats+0x4f
#14 0xffffffff824ef875 at zfs_ioc_pool_stats+0x25
#15 0xffffffff824f3e55 at zfsdev_ioctl+0x5f5
#16 0xffffffff809861cf at devfs_ioctl_f+0x13f
#17 0xffffffff80b41ab4 at kern_ioctl+0x2d4
I tried lots of things to fix this, but the one thing that actually worked
was flushing the cached array information with `mv /boot/zfs/zpool.cache
/boot/zfs/zpool.cache.0` and rebooting. That way, ZFS didn't get confused
about which drives were still available or not and was happy to reload the
pool just by inspecting the drives.
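
For anyone repeating this, the "move the cache aside" step can be sketched as a small shell helper. This is only an illustration: set_aside_cache is a made-up name, and picking the next free numeric suffix is my assumption, generalising the ".0" used above so an earlier copy is never overwritten.

```shell
#!/bin/sh
# Sketch: set a ZFS pool cache file aside under the next free numeric
# suffix (zpool.cache.0, zpool.cache.1, ...) instead of deleting it.
# set_aside_cache is a hypothetical helper, not part of FreeBSD or ZFS.
set_aside_cache() {
    cache="$1"
    if [ ! -f "$cache" ]; then
        echo "no cache file at $cache" >&2
        return 1
    fi
    # Find the first suffix that is not already taken.
    n=0
    while [ -e "$cache.$n" ]; do
        n=$((n + 1))
    done
    # Print the new name so it is easy to put the file back later.
    mv "$cache" "$cache.$n" && echo "$cache.$n"
}
```

Run as root with `set_aside_cache /boot/zfs/zpool.cache`, then reboot as described above. Keeping the old file around means it can be moved back if this turns out not to help.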
Finally, I added the drives properly - by disk ID - and added them all as
mirrors.
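
To avoid repeating the /dev/da0 mistake, one option is a guard that refuses symbolic device names before anything reaches zpool. The sketch below is an assumption about how that could look, not what we actually ran: is_stable_dev and build_zpool_add are hypothetical helpers, and treating /dev/diskid/ and /dev/gpt/ as the "stable" prefixes is my choice, based on the names that appear in our pool. It only prints the command for review.

```shell
#!/bin/sh
# Sketch: refuse to build a "zpool add" command from unstable device names.
# is_stable_dev and build_zpool_add are hypothetical helpers, not part of ZFS.

# Succeeds only for paths under /dev/diskid/ or /dev/gpt/.
is_stable_dev() {
    case "$1" in
        /dev/diskid/?*|/dev/gpt/?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the zpool command for review instead of running it.
# usage: build_zpool_add POOL DEV1 DEV2
build_zpool_add() {
    pool="$1"
    shift
    for dev in "$@"; do
        if ! is_stable_dev "$dev"; then
            echo "refusing unstable device name: $dev" >&2
            return 1
        fi
    done
    echo "zpool add $pool mirror $*"
}
```

For example, `build_zpool_add space /dev/diskid/DISK-WD-WXN1A56AH3J8 /dev/diskid/DISK-WD-WXN1A56NDYS0` prints the corresponding `zpool add space mirror ...` line, while `build_zpool_add space /dev/da0 /dev/da1` fails with a complaint about /dev/da0.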
A few `pkg upgrade` runs and one final `freebsd-update install` later, the
machine was sorted.
  pool: space
 state: ONLINE
  scan: scrub repaired 0 in 17h44m with 0 errors on Fri Oct 28 21:30:56 2016
config:

NAME                             STATE     READ WRITE CKSUM
space                            ONLINE       0     0     0
  mirror-0                       ONLINE       0     0     0
    diskid/DISK-WD-WXF1A8371196  ONLINE       0     0     0
    diskid/DISK-WD-WXF1A83E2255  ONLINE       0     0     0
  mirror-1                       ONLINE       0     0     0
    diskid/DISK-WD-WXF1A8372507  ONLINE       0     0     0
    diskid/DISK-WD-WX11E83HKN64  ONLINE       0     0     0
  mirror-2                       ONLINE       0     0     0
    diskid/DISK-WD-WXM1E83KPU73  ONLINE       0     0     0
    diskid/DISK-WD-WXM1E83KPT93  ONLINE       0     0     0
  mirror-3                       ONLINE       0     0     0
    diskid/DISK-WD-WXM1E83JZD83  ONLINE       0     0     0
    diskid/DISK-WD-WX11E83HKM57  ONLINE       0     0     0
  mirror-5                       ONLINE       0     0     0
    diskid/DISK-WD-WXT1EB54LMF4  ONLINE       0     0     0
    diskid/DISK-WD-WXL1A560AY6Z  ONLINE       0     0     0
  mirror-6                       ONLINE       0     0     0
    diskid/DISK-WD-WXN1A56AH3J8  ONLINE       0     0     0
    diskid/DISK-WD-WXN1A56NDYS0  ONLINE       0     0     0
  mirror-7                       ONLINE       0     0     0
    diskid/DISK-WD-WX21A561V3C9  ONLINE       0     0     0
    diskid/DISK-WD-WXL1A567K16D  ONLINE       0     0     0
  mirror-8                       ONLINE       0     0     0
    diskid/DISK-WD-WXN1A56NDAKY  ONLINE       0     0     0
    diskid/DISK-WD-WXL1A560AEEX  ONLINE       0     0     0
logs
  mirror-4                       ONLINE       0     0     0
    gpt/molmol-slog              ONLINE       0     0     0
    gpt/molmol-slog0             ONLINE       0     0     0
cache
  gpt/molmol-l2arc1              ONLINE       0     0     0

errors: No known data errors

NAME    USED  AVAIL  REFER  MOUNTPOINT
space  3.42T  3.60T   311G  /space

Thanks to Mitch [BG3] and Sam [SAS] for their contribution!
David Adam
UCC Wheel Member
zanchey@