[tech] martello downtime

James Andrewartha trs80 at ucc.gu.uwa.edu.au
Tue Sep 2 18:20:09 WST 2008


Both drives on martello's sil24 SATA controller got kicked off briefly,
which was enough to break its RAID 5 array: it has four members, and RAID 5
only survives losing one. After recovering it using the method described in
http://ubuntuforums.org/showthread.php?t=410136 and
http://kev.coolcavemen.com/2008/07/heroic-journey-to-raid-5-data-recovery/
I ran fsck on all the volumes and everything seems to be OK. I checked the
logs and noticed the disks had been having ATA bus errors recently, so I've
switched back to the 2.6.18 kernel.

Here's the log; I'll be following this up on linux-ide:
Sep  2 13:12:03 martello kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0xa frozen
Sep  2 13:12:03 martello kernel: ata6.00: irq_stat 0x01100010, PHY RDY changed
Sep  2 13:12:03 martello kernel: ata6: SError: { 10B8B }
Sep  2 13:12:03 martello kernel: ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Sep  2 13:12:03 martello kernel:          res 50/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Sep  2 13:12:03 martello kernel: ata6.00: status: { DRDY }
Sep  2 13:12:03 martello kernel: ata6: hard resetting link
Sep  2 13:12:03 martello kernel: ata8.00: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0xa frozen
Sep  2 13:12:03 martello kernel: ata8.00: irq_stat 0x01100010, PHY RDY changed
Sep  2 13:12:03 martello kernel: ata8: SError: { 10B8B }
Sep  2 13:12:03 martello kernel: ata8.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Sep  2 13:12:03 martello kernel:          res 50/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Sep  2 13:12:03 martello kernel: ata8.00: status: { DRDY }
Sep  2 13:12:03 martello kernel: ata8: hard resetting link
Sep  2 13:12:05 martello kernel: ata6: SATA link down (SStatus 0 SControl 0)
Sep  2 13:12:05 martello kernel: ata6: failed to recover some devices, retrying in 5 secs
Sep  2 13:12:05 martello kernel: ata8: SATA link down (SStatus 0 SControl 0)
Sep  2 13:12:05 martello kernel: ata8: failed to recover some devices, retrying in 5 secs
Sep  2 13:12:10 martello kernel: ata6: hard resetting link
Sep  2 13:12:10 martello kernel: ata8: hard resetting link
Sep  2 13:12:12 martello kernel: ata6: SATA link down (SStatus 0 SControl 0)
Sep  2 13:12:12 martello kernel: ata6: failed to recover some devices, retrying in 5 secs
Sep  2 13:12:12 martello kernel: ata8: SATA link down (SStatus 0 SControl 0)
Sep  2 13:12:12 martello kernel: ata8: failed to recover some devices, retrying in 5 secs
Sep  2 13:12:17 martello kernel: ata6: hard resetting link
Sep  2 13:12:17 martello kernel: ata8: hard resetting link
Sep  2 13:12:19 martello kernel: ata6: SATA link down (SStatus 0 SControl 0)
Sep  2 13:12:19 martello kernel: ata6.00: disabled
Sep  2 13:12:19 martello kernel: ata8: SATA link down (SStatus 0 SControl 0)
Sep  2 13:12:19 martello kernel: ata8.00: disabled
Sep  2 13:12:19 martello kernel: ata6: EH complete
Sep  2 13:12:19 martello kernel: ata8: EH complete
Sep  2 13:12:19 martello kernel: ata6.00: detaching (SCSI 5:0:0:0)
Sep  2 13:12:19 martello kernel: sd 5:0:0:0: [sdc] Synchronizing SCSI cache
Sep  2 13:12:19 martello kernel: md: super_written gets error=-5, uptodate=0
Sep  2 13:12:19 martello kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 3 devices
Sep  2 13:12:19 martello kernel: sd 7:0:0:0: rejecting I/O to offline device
Sep  2 13:12:19 martello last message repeated 2 times
Sep  2 13:12:19 martello kernel: md: super_written gets error=-5, uptodate=0
Sep  2 13:12:19 martello kernel: raid5: Disk failure on sde1, disabling device. Operation continuing on 2 devices
Sep  2 13:12:19 martello kernel: sd 5:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep  2 13:12:19 martello kernel: sd 5:0:0:0: [sdc] Stopping disk
Sep  2 13:12:19 martello kernel: sd 5:0:0:0: [sdc] START_STOP FAILED
Sep  2 13:12:19 martello kernel: sd 5:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep  2 13:12:19 martello kernel: ata8.00: detaching (SCSI 7:0:0:0)
Sep  2 13:12:19 martello kernel: sd 7:0:0:0: [sde] Synchronizing SCSI cache
Sep  2 13:12:19 martello kernel: sd 7:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep  2 13:12:19 martello kernel: sd 7:0:0:0: [sde] Stopping disk
Sep  2 13:12:19 martello kernel: sd 7:0:0:0: [sde] START_STOP FAILED
Sep  2 13:12:19 martello kernel: sd 7:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep  2 13:12:19 martello kernel: RAID5 conf printout:
Sep  2 13:12:19 martello kernel:  --- rd:4 wd:2
Sep  2 13:12:19 martello kernel:  disk 0, o:1, dev:sdd1
Sep  2 13:12:19 martello kernel:  disk 1, o:1, dev:sdb1
Sep  2 13:12:19 martello kernel:  disk 2, o:0, dev:sde1
Sep  2 13:12:19 martello kernel:  disk 3, o:0, dev:sdc1
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 23082301
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 8705
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: Aborting journal on device dm-3.
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 55458
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: journal commit I/O error
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 18258036
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 18238496
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 5551239
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: Buffer I/O error on device dm-3, logical block 16650170
Sep  2 13:12:19 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:19 martello kernel: journal commit I/O error
Sep  2 13:12:19 martello kernel: ext3_abort called.
Sep  2 13:12:19 martello kernel: EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted journal
Sep  2 13:12:19 martello kernel: Remounting filesystem read-only
Sep  2 13:12:19 martello kernel: journal commit I/O error
Sep  2 13:12:19 martello kernel: RAID5 conf printout:
Sep  2 13:12:19 martello kernel:  --- rd:4 wd:2
Sep  2 13:12:19 martello kernel:  disk 0, o:1, dev:sdd1
Sep  2 13:12:19 martello kernel:  disk 1, o:1, dev:sdb1
Sep  2 13:12:19 martello kernel:  disk 2, o:0, dev:sde1
Sep  2 13:12:19 martello kernel: RAID5 conf printout:
Sep  2 13:12:19 martello kernel:  --- rd:4 wd:2
Sep  2 13:12:19 martello kernel:  disk 0, o:1, dev:sdd1
Sep  2 13:12:19 martello kernel:  disk 1, o:1, dev:sdb1
Sep  2 13:12:19 martello kernel:  disk 2, o:0, dev:sde1
Sep  2 13:12:19 martello kernel: RAID5 conf printout:
Sep  2 13:12:19 martello kernel:  --- rd:4 wd:2
Sep  2 13:12:19 martello mdadm: Fail event detected on md device /dev/md0, component device /dev/sde1
Sep  2 13:12:19 martello kernel:  disk 0, o:1, dev:sdd1
Sep  2 13:12:19 martello kernel:  disk 1, o:1, dev:sdb1
Sep  2 13:12:19 martello mdadm: Fail event detected on md device /dev/md0, component device /dev/sdc1
Sep  2 13:12:36 martello kernel: ata6: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xa frozen
Sep  2 13:12:36 martello kernel: ata6: irq_stat 0x00a00080, device exchanged
Sep  2 13:12:36 martello kernel: ata6: hard resetting link
Sep  2 13:12:36 martello kernel: ata8: exception Emask 0x10 SAct 0x0 SErr 0x40000 action 0xa frozen
Sep  2 13:12:36 martello kernel: ata8: irq_stat 0x00800080, device exchanged
Sep  2 13:12:36 martello kernel: ata8: SError: { CommWake }
Sep  2 13:12:36 martello kernel: ata8: hard resetting link
Sep  2 13:12:38 martello kernel: printk: 5 messages suppressed.
Sep  2 13:12:38 martello kernel: Buffer I/O error on device dm-3, logical block 0
Sep  2 13:12:38 martello kernel: lost page write due to I/O error on dm-3
Sep  2 13:12:44 martello kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 0)
Sep  2 13:12:44 martello kernel: ata6.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
Sep  2 13:12:44 martello kernel: ata6.00: 390721968 sectors, multi 0: LBA48
Sep  2 13:12:44 martello kernel: ata6.00: configured for UDMA/100
Sep  2 13:12:44 martello kernel: ata6: EH complete
Sep  2 13:12:44 martello kernel: scsi 5:0:0:0: Direct-Access     ATA      ST3200822AS      3.01 PQ: 0 ANSI: 5
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] 390721968 512-byte hardware sectors (200050 MB)
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Write Protect is off
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] 390721968 512-byte hardware sectors (200050 MB)
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Write Protect is off
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep  2 13:12:44 martello kernel:  sdf: sdf1
Sep  2 13:12:44 martello kernel: sd 5:0:0:0: [sdf] Attached SCSI disk
Sep  2 13:12:44 martello kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 0)
Sep  2 13:12:44 martello kernel: ata8.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
Sep  2 13:12:44 martello kernel: ata8.00: 390721968 sectors, multi 0: LBA48
Sep  2 13:12:44 martello kernel: ata8.00: configured for UDMA/100
Sep  2 13:12:44 martello kernel: ata8: EH complete
Sep  2 13:12:44 martello kernel: scsi 7:0:0:0: Direct-Access     ATA      ST3200822AS      3.01 PQ: 0 ANSI: 5
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] 390721968 512-byte hardware sectors (200050 MB)
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Write Protect is off
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Mode Sense: 00 3a 00 00
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] 390721968 512-byte hardware sectors (200050 MB)
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Write Protect is off
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Mode Sense: 00 3a 00 00
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep  2 13:12:44 martello kernel:  sdg: sdg1
Sep  2 13:12:44 martello kernel: sd 7:0:0:0: [sdg] Attached SCSI disk
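
For anyone wanting to pull the key facts out of a log like this without
reading every line, here is a rough Python sketch. It only looks for the
"Disk failure on ..." and "rd:N wd:M" lines seen above; the function name
summarize_raid5 and the regular expressions are my own assumptions based on
this one log, not a general md log parser.

```python
import re

# Three lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
Sep  2 13:12:19 martello kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 3 devices
Sep  2 13:12:19 martello kernel: raid5: Disk failure on sde1, disabling device. Operation continuing on 2 devices
Sep  2 13:12:19 martello kernel:  --- rd:4 wd:2
"""

# Patterns are assumptions derived from the message format in this log only.
FAILURE_RE = re.compile(r"raid5: Disk failure on ([^,\s]+), disabling device")
CONF_RE = re.compile(r"--- rd:(\d+) wd:(\d+)")


def summarize_raid5(log_text):
    """Collect failed members and the last reported rd/wd counts."""
    failed = []
    raid_disks = None
    working = None
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            failed.append(match.group(1))
            continue
        match = CONF_RE.search(line)
        if match:
            raid_disks = int(match.group(1))
            working = int(match.group(2))
    # RAID 5 keeps one device's worth of parity, so the array only stays
    # usable while at least rd - 1 members are still working.
    survivable = raid_disks is not None and working >= raid_disks - 1
    return {
        "failed": failed,
        "raid_disks": raid_disks,
        "working": working,
        "survivable": survivable,
    }


if __name__ == "__main__":
    print(summarize_raid5(SAMPLE_LOG))
```

Run against the full log, this should report sdc1 and sde1 as failed and
two of four members working, which is why the array could not keep going.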

-- 
# TRS-80              trs80(a)ucc.gu.uwa.edu.au #/ "Otherwise Bub here will do \
# UCC Wheel Member     http://trs80.ucc.asn.au/ #|  what squirrels do best     |
[ "There's nobody getting rich writing          ]|  -- Collect and hide your   |
[  software that I know of" -- Bill Gates, 1980 ]\  nuts." -- Acid Reflux #231 /

