[tech] [committee] Temperature Monitoring in Server Room [repost]

Andrew Williams andrew at ucc.gu.uwa.edu.au
Mon Mar 18 22:54:00 AWST 2019


On 2019-03-18 9:58 PM, David Adam wrote:
> On Mon, 18 Mar 2019, Melissa Star wrote:
>> I just realised - if you have smartmontools installed on linux machines,
>> each hard drive or SSD will provide its “Airflow Temperature”, which I
>> can extract via script.
>>
>> I'm thinking of centralising this for all the servers I run, and
>> collecting the data to chart, having a display at home that gives me
>> live info for all machines under my control.
> 
> We used to do this on all the servers, but I think evil is the only one
> still running:

Rather than rolling your own temperature monitoring scripts and the code
to display them, I highly recommend installing Nagios/Icinga or an
equivalent. It will monitor network services (web, database, NTP, SSH,
etc.), host state, disk space, rack and internal temperatures, voltages,
fan speeds and so on, across tens or hundreds of machines.
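
For a sense of what a custom check involves, here is a minimal sketch of
a Nagios-style plugin in Python that reads the Airflow_Temperature_Cel
attribute from `smartctl -A` output. It is not one of the MWA plugins.
The function names, the default device and the 45/55 C thresholds are
assumptions for illustration. It relies only on the usual plugin
conventions: exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, and
performance data after a '|' in the form label=value;warn;crit.

```python
import subprocess

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_airflow_temp(smartctl_output):
    """Return the raw Airflow_Temperature_Cel value (Celsius), or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED
        # WHEN_FAILED RAW_VALUE..., so the raw value is the 10th field.
        if len(fields) >= 10 and fields[1] == "Airflow_Temperature_Cel":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def check(temp, warn=45, crit=55):
    """Map a temperature to (exit_code, plugin_output). Thresholds are assumed."""
    if temp is None:
        return UNKNOWN, "UNKNOWN - no Airflow_Temperature_Cel attribute found"
    if temp >= crit:
        state, label = CRITICAL, "CRITICAL"
    elif temp >= warn:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    # Text after the '|' is performance data: label=value;warn;crit
    return state, f"{label} - airflow temperature {temp}C | temp={temp};{warn};{crit}"


def main(device="/dev/sda"):
    """Run smartctl on one device (hypothetical default) and report its state."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    code, message = check(parse_airflow_temp(result.stdout))
    print(message)
    return code
```

To use it as an actual check, you would end the script with
`sys.exit(main())` (after `import sys`) and run it with enough privilege
for smartctl to read the drive.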

Here's the Icinga2 setup for the MWA telescope - it's using a mix of 
built-in and third-party plugins for the sort of things you'd see in a 
normal server room, plus custom plugins to monitor the actual telescope 
hardware and software health.

http://icinga.mwa128t.org/icingaweb2/monitoring/list/hostgroups

(username 'guest', password 'mwa-guest')

The performance data (raw values from every sensor or measurement) is
automatically piped from Icinga to a Whisper/Carbon backend, and we use
Graphite to view the time series plots:

http://graphite.mwa128t.org/dashboard
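
Icinga does this forwarding for you, but as an illustration of what ends
up on the wire, here is a rough sketch of Carbon's plaintext protocol:
one line per data point, "path value timestamp", sent over TCP (port
2003 by default). The function names, the host and the metric path are
hypothetical and are not taken from the MWA setup.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one Carbon plaintext line: '<path> <value> <unix_time>\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    """Send a formatted line to a Carbon listener (host and port assumed)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

For example, `send_metric(format_metric("icinga2.somehost.temp.value",
33))` would record a single reading under that (made-up) path.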

You can either go to Dashboard/Finder and choose one of our pre-saved
plot layouts (please don't change them or save new ones), or drill down
through the monitoring point tree in the top half of the page. Start at
'icinga2', go down through a hostname, then a service on that host,
until you reach a leaf node ending in '.value', and add a graph of that
value to the dashboard. I usually prefer the Tree interface instead: go
to Dashboard/Configure UI, then choose 'Tree (left nav)'.
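
The dotted path you build up while drilling down is also what Graphite's
render API takes as its 'target' parameter, so the same series can be
fetched from a script. A minimal sketch that only builds the URL; the
base URL and metric path below are placeholders, and whether the render
endpoint is reachable on any given server is an assumption.

```python
from urllib.parse import urlencode


def render_url(base, target, since="-24h", fmt="json"):
    """Build a Graphite /render URL for one metric path."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{base.rstrip('/')}/render?{query}"


# Placeholder server and path, for illustration only.
print(render_url("http://graphite.example.org/", "icinga2.somehost.temp.value"))
```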

Andrew

