[committee] Fwd: [tech] Fail event on /dev/md/0:medico (fwd)

Mon Jun 23 16:52:17 AWST 2014

I think it was more of a case of "Wheel needs to tell us exactly what 
they need"

medico is our VM server; it hosts all our virtual machines, including 
mussel, which does pretty much everything.

The system disks are two SSDs in raid1 (i.e. the same data is on both 
disks). One of these has just died. Medico will still run on the other 
disk, but if that one dies as well, bad things will happen*.
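
For anyone who wants to check the array state by hand, here is a rough
Python sketch that reads mdstat-style text and reports arrays with a
missing member. It is only an illustration: the function name
degraded_arrays and the SAMPLE text are my own, and this is not how mdadm
itself does its monitoring. The sample mirrors the /proc/mdstat output
quoted in the forwarded message.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member.

    degraded_arrays is a hypothetical helper written for illustration;
    it is not part of mdadm.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array header looks like "md0 : active raid1 sda1[3](F) sdb1[2]".
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends with e.g. "[2/1] [_U]"; "_" marks a missing member.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded[current] = status.group(3)
            current = None
    return degraded


# Sample text copied from the mdadm mail in this thread.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sda1[3](F) sdb1[2]
        125032767 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a real machine you would pass in the contents of /proc/mdstat instead
of SAMPLE.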

Last time this happened, we got a 128GB Samsung 840 Pro.
The remaining disk in medico should be this model.

This is where we got it from:
http://www.pccasegear.com/index.php?main_page=product_info&cPath=210_902_1370&products_id=22102

I suggest we get this or something similar to replace the failed SSD.

[SZM]

* See also: http://zanchey.ucc.asn.au/qdb/index.cgi?id=9261

On 23/06/14 15:41, Andrew Adamson wrote:
> There are people on committee@ who aren't on tech@? Seriously?
>
> Andrew Adamson
> bob at ucc.asn.au
>
> |"If you can't beat them, join them, and then beat them."                |
> | ---Peter's Laws                                                        |
>
> On Mon, 23 Jun 2014, Sam Moore wrote:
>
>> Since we got complaints it wasn't brought to the committee...
>>
>> [SZM]
>>
>>
>> -------- Original Message --------
>> Subject: [tech] Fail event on /dev/md/0:medico (fwd)
>> Date: Sat, 21 Jun 2014 21:17:39 +0800 (WST)
>> From: Andrew Adamson <bob at ucc.gu.uwa.edu.au>
>> To: tech at ucc.gu.uwa.edu.au
>>
>> This appears to be the second of the two original SSDs in medico to die,
>> and needs to be replaced ASAP.
>>
>> Andrew Adamson
>> bob at ucc.asn.au
>>
>> |"If you can't beat them, join them, and then beat them."                |
>> | ---Peter's Laws                                                        |
>>
>> ---------- Forwarded message ----------
>> Date: Sat, 21 Jun 2014 14:35:53 +0800 (WST)
>> From: mdadm monitoring <root at ucc.gu.uwa.edu.au>
>> To: root at ucc.gu.uwa.edu.au
>> Subject: Fail event on /dev/md/0:medico
>>
>> This is an automatically generated mail message from mdadm
>> running on medico
>>
>> A Fail event had been detected on md device /dev/md/0.
>>
>> Faithfully yours, etc.
>>
>> P.S. The /proc/mdstat file currently contains the following:
>>
>> Personalities : [raid1]
>> md0 : active raid1 sda1[3](F) sdb1[2]
>>         125032767 blocks super 1.2 [2/1] [_U]
>>
>> unused devices: <none>
>> _______________________________________________
>> List Archives: http://lists.ucc.gu.uwa.edu.au/pipermail/tech
>>
>> Unsubscribe here:
>> http://lists.ucc.gu.uwa.edu.au/mailman/options/tech/matches%40ucc.gu.uwa.edu.au
>>