[tech] [committee] Temperature Monitoring in Server Room [repost]

Melissa Star melissa at netexperts.com.au
Tue Mar 19 12:21:50 AWST 2019


Hi Andrew,

Thanks, this looks interesting.

I may try this rather than writing my own. Or I may write my own anyway for the exercise, and use this as a point of comparison.

It's important to me that the tool can both send me SMS warnings and automatically shut down a server in extreme conditions.
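
As a rough sketch of the roll-your-own version, assuming the drive reports an Airflow_Temperature_Cel attribute in the output of `smartctl -A` (many drives report Temperature_Celsius instead, or neither), something like the following could pull out the raw value and decide what to do with it. The thresholds and the function names are made up for illustration, and the SMS and shutdown steps are left out because they depend on the gateway and the host.

```python
import subprocess

# Hypothetical thresholds in degrees Celsius; tune per machine.
WARN_C = 45
SHUTDOWN_C = 55


def parse_airflow_temp(smartctl_output):
    """Return the raw Airflow_Temperature_Cel value from `smartctl -A` text,
    or None if the attribute is not present."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; RAW_VALUE is the 10th.
        if len(fields) >= 10 and fields[1] == "Airflow_Temperature_Cel":
            return int(fields[9])
    return None


def classify(temp_c, warn=WARN_C, shutdown=SHUTDOWN_C):
    """Map a temperature reading to 'ok', 'warn', 'shutdown' or 'unknown'."""
    if temp_c is None:
        return "unknown"
    if temp_c >= shutdown:
        return "shutdown"
    if temp_c >= warn:
        return "warn"
    return "ok"


def read_drive_temp(device):
    """Run smartctl against one device (usually needs root) and parse it."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_airflow_temp(result.stdout)
```

A caller would then do something like `classify(read_drive_temp("/dev/sda"))`, sending the SMS on "warn" and triggering the shutdown on "shutdown".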

Regards,

Melissa

> On 18 Mar 2019, at 10:54 pm, Andrew Williams <andrew at ucc.gu.uwa.edu.au> wrote:
> 
> On 2019-03-18 9:58 PM, David Adam wrote:
>> On Mon, 18 Mar 2019, Melissa Star wrote:
>>> I just realised - if you have smartmontools installed on linux machines,
>>> each hard drive or SSD will provide its “Airflow Temperature”, which I
>>> can extract via script.
>>> 
>>> I'm thinking of centralising this for all the servers I run, and
>>> collecting the data to chart, having a display at home that gives me
>>> live info for all machines under my control.
>> We used to do this on all the servers, but I think evil is the only one
>> still running:
> 
> Rather than rolling your own temperature monitoring scripts and code to display them, I highly recommend installing Nagios/Icinga or equivalent. That will monitor network services (web, database, NTP, SSH, etc), host state, disk space, rack and internal temperatures, voltages, fan speeds, etc, on tens or hundreds of machines.
> 
> Here's the Icinga2 setup for the MWA telescope - it's using a mix of built-in and third-party plugins for the sort of things you'd see in a normal server room, plus custom plugins to monitor the actual telescope hardware and software health.
> 
> http://icinga.mwa128t.org/icingaweb2/monitoring/list/hostgroups
> 
> (username 'guest', password 'mwa-guest')
> 
> The performance data (raw values from every sensor or measurement) is automatically piped from icinga to a Whisper/Carbon backend, and we use Graphite to view the time series plots:
> 
> http://graphite.mwa128t.org/dashboard
> 
> You can either go to Dashboard/Finder and choose one of our pre-saved plot layouts (please don't change them or save new ones), or drill down through the monitoring point tree in the top half of the page: start with icinga2., go down through a hostname, then a service on that host, until you reach a ....value leaf node, then add a graph showing that value to the dashboard. I usually prefer to use the Tree interface instead: go to Dashboard/Configure UI, then choose 'Tree (left nav)'.
> 
> Andrew
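
For comparison with the Nagios/Icinga route suggested above, a custom check is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN), with optional performance data after a `|`. The sketch below only builds that exit code and line for a temperature reading; the function name, the "TEMP" label and the thresholds are assumptions for illustration, not anything taken from the MWA setup.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_check(temp_c, warn, crit):
    """Return (exit_code, status_line) for a temperature reading.

    The part after '|' is performance data in the usual
    label=value;warn;crit form, which a backend such as Graphite can plot.
    """
    if temp_c is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading"
    if temp_c >= crit:
        code, state = CRITICAL, "CRITICAL"
    elif temp_c >= warn:
        code, state = WARNING, "WARNING"
    else:
        code, state = OK, "OK"
    line = f"TEMP {state} - {temp_c}C | temp={temp_c};{warn};{crit}"
    return code, line


def main(temp_c, warn=45, crit=55):
    """Print the status line and exit with the matching code."""
    code, line = nagios_check(temp_c, warn, crit)
    print(line)
    sys.exit(code)
```

The monitoring server handles scheduling, notification and graphing, so the check itself stays this small.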


