[tech] UCC system monitoring, was Re: Runaway chromium processes on motsugo?
Andrew Adamson
bob at ucc.gu.uwa.edu.au
Sat Feb 26 19:38:11 AWST 2022
Further to this, I've had to kill all of neromirt's chromium processes
on motsugo, as they were making the machine visibly lag for other users.
Andrew Adamson
bob at ucc.asn.au
|"If you can't beat them, join them, and then beat them." |
| ---Peter's Laws |
On Sat, 26 Feb 2022, Nick Bannon wrote:
> Hi there - good work getting the ~/tmp $TMPDIR issue under control, but
> there was enough there to keep motsugo busy all night and interactive
> logins for the rest of us were visibly suffering. Also, ~neromirt/tmp
> has hung around long enough to add 1.8GB to the daily backups.
>
> I take it you're learning UNIX/bash/Python scripting? You'll need to
> ask questions in a bit more of a visible place to get more help:
> - email the tech mailing list (I've taken the liberty of getting us
> started there)
> - https://lists.ucc.gu.uwa.edu.au/mailman/listinfo/tech
> - or chat - have you checked out the Fresher Guide?
> - matrix.ucc.asn.au (or Discord or IRC)
>
> For a start, check system load with uptime(1).
> $ uptime
> 11:28:49 up 144 days, 1:02, 46 users, load average: 25.25, 24.52, 23.46
>
> Monitor your processes with ps(1), and keep load average below CPU
> count. On motsugo that's 8 cores, but it's a shared machine with lots
> of other people, too.
>
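As a rough illustration of the "keep load average below CPU count" rule,
here is a minimal Python sketch. It assumes a Unix-like host where
os.getloadavg() is available; load_per_core is a made-up helper name for
this example, not an existing UCC tool.

```python
import os


def load_per_core(load_1min=None, cores=None):
    """Return the 1-minute load average divided by the number of CPUs.

    Both arguments are optional overrides (handy for testing); when they
    are omitted the live values are read from the operating system.
    """
    if load_1min is None:
        # First of the three figures that uptime(1) prints.
        load_1min = os.getloadavg()[0]
    if cores is None:
        # os.cpu_count() can return None, so fall back to 1.
        cores = os.cpu_count() or 1
    return load_1min / cores
```

Calling load_per_core() with no arguments reads the current values. A
result at or above 1.0 suggests the machine already has at least as much
runnable work as it has cores. With the uptime figures quoted above
(25.25 on 8 cores) the ratio comes out a little over 3.
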
> For a broader view:
> Log in to:
> - http://uccmonitor.ucc.asn.au:3000/login
> - that will need to be from inside the UCC network
> - https://gitlab.ucc.asn.au/users/sign_in
>
> I would like some people to help make a uccmonitor/Grafana dashboard that
> we can display on cerberus - the three screens at the door to the clubroom.
>
> This shows motsugo since last night - there's the "CPU Basic" and the
> "System Detail -> System Load" further down.
>
> http://uccmonitor.ucc.asn.au:3000/d/uYiRn3BZk/node-exporter-full?orgId=1&var-job=other&var-name=motsugo&var-node=motsugo.ucc.asn.au&var-port=9100&from=1645790400000&to=1645848000000
>
> Thanks,
> Nick.
>
> On Mon, Jan 24, 2022 at 03:47:33PM +0000, Ming Han Ong (22493665) wrote:
> > Hi Nick,
> >
> > Sorry, I am a bit new to using Chromium and Selenium with Python, and wasn't aware of the extra processes being created and clogging up the system (I guess even headless Chrome still finds a way to eat up your system).
> >
> > I will try to be more careful next time. Could you provide any tips on how I could keep track of how many system resources I am using, so that I can try to prevent this from happening in the future?
> >
> > Regards,
> > Ming Han
> > ________________________________
> > From: Nick Bannon <nick at ucc.gu.uwa.edu.au>
> > Sent: 24 January 2022 19:29
> > To: Ming Han Ong <neromirt at ucc.gu.uwa.edu.au>
> > Cc: wheel at ucc.gu.uwa.edu.au <wheel at ucc.gu.uwa.edu.au>
> > Subject: Runaway chromium processes on motsugo?
> >
> > Hi there!
> >
> > Would you be able to cut back the number of Chromium process instances a
> > bit and make sure the rest of /tmp/.org.chromium.Chromium.* directories
> > are cleaned up when their processes are?
> >
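One way to make sure a browser and its scratch files are cleaned up
together, even when the script crashes, is a context manager. The sketch
below is an assumption-laden illustration rather than anything from the
original script: managed_browser and make_driver are hypothetical names,
and make_driver is expected to return an object with a quit() method
(Selenium's WebDriver has one).

```python
import contextlib
import shutil
import tempfile


@contextlib.contextmanager
def managed_browser(make_driver):
    """Yield a driver, then always quit it and delete its scratch directory.

    make_driver is a hypothetical callable supplied by the caller: it is
    given the path of a fresh scratch directory and must return an object
    with a quit() method.
    """
    scratch = tempfile.mkdtemp(prefix="browser-")
    driver = None
    try:
        driver = make_driver(scratch)
        yield driver
    finally:
        # Runs on normal exit and on exceptions alike.
        if driver is not None:
            driver.quit()
        shutil.rmtree(scratch, ignore_errors=True)
```

How make_driver uses the scratch directory is up to the caller, for
example as the browser's profile directory via Chromium's
--user-data-dir option. Whether that also captures every temporary file
Chromium creates is not something this sketch guarantees.
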
> > It looks like since about Friday night there's been a big bunch of
> > Chromium processes on motsugo, which ended up filling /tmp to 100%
> > with their temporary files, which caused new logins to fail.
> >
> > About 208 directories and cache contents similar to this:
> > drwx------ 2 neromirt 40 Jan 21 21:17 /tmp/.org.chromium.Chromium.BOIwJw
> >
> > motsugo$ df -hT /tmp
> > Filesystem Type Size Used Avail Use% Mounted on
> > none tmpfs 2.0G 2.0G 0 100% /tmp
> >
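A script can make a similar check on a filesystem before it starts more
work. This is a minimal sketch using shutil.disk_usage from the Python
standard library; the function names and the 90% default limit are
arbitrary choices for illustration.

```python
import shutil


def percent_used(path="/tmp"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def nearly_full(path="/tmp", limit=90.0):
    """Return True when the filesystem is at or above `limit` percent used."""
    return percent_used(path) >= limit
```

The figure may differ slightly from the Use% column of df, which can
account for reserved blocks differently.
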
> > Plus 700+ processes, e.g.:
> > neromirt 24035 1 0 14:13 pts/121 00:01:44 python3 debpw1.pyc 9
> > neromirt 24128 24035 4 14:13 pts/121 00:12:54 \_ chromedriver --port=50117
> > neromirt 24172 24128 4 14:13 pts/121 00:13:40 \_ /usr/lib/chromium/chromium --show-component-extension-options --
> > neromirt 24188 24172 0 14:13 pts/121 00:00:00 \_ /usr/lib/chromium/chromium --type=zygote --no-zygote-sandbox
> > [...]
> >
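To keep an eye on how many processes a script has spawned, the output of
ps(1) can be tallied by command name. The sketch below assumes a ps that
accepts "-u USER -o comm=" (the trailing "=" suppresses the header
line); both function names are made up for this example.

```python
import collections
import getpass
import subprocess


def count_by_command(ps_output):
    """Count processes per command name from `ps -o comm=` style output."""
    names = (line.strip() for line in ps_output.splitlines())
    return collections.Counter(name for name in names if name)


def my_process_counts():
    """Run ps for the current user and tally the command names."""
    result = subprocess.run(
        ["ps", "-u", getpass.getuser(), "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_by_command(result.stdout)
```

my_process_counts().most_common(5) then lists the five most numerous
command names, which makes a pile-up of browser processes easy to spot.
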
> > I've freed up a bit of space in /tmp - are you OK to clean up the rest?
> >
> > Thanks,
> > Nick.
>
> --
> Nick Bannon | "I made this letter longer than usual because
> nick-sig at rcpt.to | I lack the time to make it shorter." - Pascal