[tech] Motsugo Monitoring
Andrew Adamson
bob at ucc.gu.uwa.edu.au
Sun Mar 18 13:36:21 WST 2012
This is probably more relevant to hostmasters, but sent to tech so that
others can learn.
Motsugo has been spamming hostmasters with email alerts about RAM
temperature pretty much every day since the management controller got
un-broken/rebooted. This is probably due to the internal layout of the
server causing decreased airflow over some RAM modules. Anyway, today I
finally got sick of it and raised the alert thresholds by 5
degrees.
Since you can't change thresholds (or do anything useful) from within the
webpage config, you have to use ipmitool from the command line. The
required commands are `ipmitool sensor list' to get the list of available
sensors, and then `ipmitool sensor thresh "P1-DIMM1A Temp" upper 70 75 80'
(and similar for the other DIMMs). The three temperatures are the upper
non-critical, critical and non-recoverable thresholds respectively.
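
For anyone who doesn't want to type that once per DIMM, here is a rough
sketch of a loop that does the lot. It assumes every DIMM temperature
sensor has "DIMM" and "Temp" in its name, like "P1-DIMM1A Temp" above, and
that `ipmitool sensor list' prints one sensor per line with fields separated
by `|'. The function names dimm_sensors and set_dimm_thresholds are made up
for this example. Check the sensor names on the box before trusting it.

```shell
# Print the names of DIMM temperature sensors, reading
# `ipmitool sensor list' output from stdin.
# Assumption: the name is the first |-separated field, padded with spaces.
dimm_sensors() {
    awk -F'|' '$1 ~ /DIMM.*Temp/ { gsub(/^[ \t]+|[ \t]+$/, "", $1); print $1 }'
}

# Set the upper thresholds on every DIMM temperature sensor.
# $1 $2 $3: non-critical, critical and non-recoverable thresholds.
set_dimm_thresholds() {
    ipmitool sensor list | dimm_sensors |
    while IFS= read -r sensor; do
        # </dev/null keeps ipmitool from touching the loop's input
        ipmitool sensor thresh "$sensor" upper "$1" "$2" "$3" </dev/null
    done
}

# Usage (as root on the host):
#   set_dimm_thresholds 70 75 80
```

Running `ipmitool sensor list' again afterwards should show the new values
in the last three columns for each DIMM.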
Andrew Adamson
bob at ucc.asn.au
|"If you can't beat them, join them, and then beat them." |
| ---Peter's Laws