[tech] Motsugo Monitoring

Andrew Adamson bob at ucc.gu.uwa.edu.au
Sun Mar 18 13:36:21 WST 2012


This is probably more relevant to hostmasters, but sent to tech so that 
others can learn.

Motsugo has been spamming hostmasters with email alerts about RAM 
temperature pretty much every day since the management controller got 
un-broken/rebooted. This is probably due to the internal layout of the 
server causing decreased airflow over some RAM modules. Anyway, today I 
finally got sick of it and raised the alert thresholds by 5 
degrees.

Since you can't change thresholds (or do anything useful) from the web 
config page, you have to use ipmitool from the command line. Run 
`ipmitool sensor list' to get the list of available sensors, and then 
`ipmitool sensor thresh "P1-DIMM1A Temp" upper 70 75 80' (and similar 
for the other DIMMs). The three temperatures are the upper non-critical, 
critical and non-recoverable thresholds respectively.
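
For anyone who wants to do all the DIMMs in one go, something like the 
following rough sketch might help. Only "P1-DIMM1A Temp" is a sensor name 
I have actually quoted above; "P1-DIMM1B Temp" is a placeholder, so take 
the real names from `ipmitool sensor list'. The function name and the 
DRY_RUN variable are made up for this example and are not part of 
ipmitool.

```shell
#!/bin/sh
# Sketch: apply the same upper thresholds to several temperature sensors.
# DRY_RUN and set_upper_thresholds are illustrative names, not ipmitool features.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

set_upper_thresholds() {
    # $1 = non-critical, $2 = critical, $3 = non-recoverable,
    # remaining arguments = sensor names exactly as `ipmitool sensor list` shows them
    unc="$1"; ucr="$2"; unr="$3"
    shift 3
    for sensor in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            printf 'ipmitool sensor thresh "%s" upper %s %s %s\n' \
                "$sensor" "$unc" "$ucr" "$unr"
        else
            ipmitool sensor thresh "$sensor" upper "$unc" "$ucr" "$unr"
        fi
    done
}

# "P1-DIMM1B Temp" is an assumed name; replace it with the real sensor names.
set_upper_thresholds 70 75 80 "P1-DIMM1A Temp" "P1-DIMM1B Temp"
```

By default this only prints the commands it would run, so you can check 
them first. Setting DRY_RUN=0 runs them for real, which probably needs 
root on the machine itself.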

Andrew Adamson
bob at ucc.asn.au

|"If you can't beat them, join them, and then beat them."                |
| ---Peter's Laws    

