From jimbo at ucc.asn.au  Mon Jul  1 00:22:21 2019
From: jimbo at ucc.asn.au (James Arcus)
Date: Mon, 1 Jul 2019 00:22:21 +0800
Subject: [tech] Server donation to UCC: HP Proliant DL360p (+intention to remain on wheel)
In-Reply-To: <3a915c39-bd51-408d-9693-8526f6407028@email.android.com>
References: <3a915c39-bd51-408d-9693-8526f6407028@email.android.com>
Message-ID: 

That sounds quite exciting. We have two other as-yet-unused servers, a
Cisco UCS that is currently being set up and a Dell PowerEdge R710. I
believe the intention is to use one of those as an upgrade for Mooneye,
which would leave the other available for use as another Proxmox host.

I have been told that an odd number of cluster hosts is ideal (the cluster
needs a majority of votes for quorum, and an odd number avoids tied
votes), and one of those + your donated HP would bring us up to 5.

Cheers,
James [MPT]

On 30/6/19 10:36 pm, Dylan H wrote:
> Hi All,
>
> I have recently acquired a HP Proliant DL360p 1RU server from work
> (getting rid of old hardware), and I intend to donate it to UCC and
> set it up as part of our Proxmox cluster.
>
> This has 192GiB of DDR3-1333 ECC RAM, 2x 8c/16t Xeon E5-2690 CPUs at
> 2.9GHz, and no storage, so hopefully there's room and sufficient power
> for this in the rack, because I think this could grow our cluster
> quite well.
>
> I don't yet know when I'll have time to bring this over and set it up,
> but I'll probably drop it off one weekend in the next month or so, and
> either set it up then or at the next project night.
>
> Also, sorry I've been so uninvolved with wheel and more generally the
> club lately; hopefully this will spell the end of my hiatus.
>
> I should also mention, before I catch up on my other emails (there are
> a lot), that I do intend to remain an active wheel member, if UCC will
> allow me.
>
> Kind regards,
> Dylan Hicks [333]
>
> _______________________________________________
> List Archives: http://lists.ucc.asn.au/pipermail/tech
>
> Unsubscribe here: https://lists.ucc.gu.uwa.edu.au/mailman/options/tech/jimbo%40ucc.asn.au

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.ucc.gu.uwa.edu.au/pipermail/tech/attachments/20190701/b8cf7bc0/attachment.htm

From frekk at ucc.asn.au  Wed Jul 10 21:19:59 2019
From: frekk at ucc.asn.au (Felix von Perger)
Date: Wed, 10 Jul 2019 21:19:59 +0800
Subject: [tech] Maintenance 10am Friday - core switch upgrade
Message-ID: 

Hi all,

[MPT] and I will be doing network upgrades on the machine room switch -
upgrading the WS-C4507R chassis with its Supervisor IV modules to a faster
WS-C4506-E chassis with a Supervisor 6-E, copying over the configuration
and moving over the ports.

If anyone wants to help out or learn something about Cisco switch
configuration, please come along and join us starting at 10am this Friday
(2019-07-12).

Note that the network will probably go up and down a bit during this time,
and to ensure that nothing gets too badly broken, some services may be
shut down for the duration of the maintenance. We will hopefully be
finished before the committee meeting at 2pm.

As always, let me know if you have any thoughts/questions.

Best regards,

Felix von Perger [FVP]

From zanchey at ucc.gu.uwa.edu.au  Thu Jul 11 18:22:02 2019
From: zanchey at ucc.gu.uwa.edu.au (David Adam)
Date: Thu, 11 Jul 2019 18:22:02 +0800 (AWST)
Subject: [tech] Backups
Message-ID: 

As probably my last action as a Wheel member, I've sorted out something
about the backup server that's been bugging me for ages.
At present it has an SSD wedged in as a boot drive, and four 2 TB drives
(WD Red, 2 x WD Black, Hitachi Deskstar). Until just now, three of these
drives were in hardware RAID-5 with LVM and ext4 on top. The fourth was in
a RAID-0 (single disk) with ZFS on top.

I thought that the array could not be expanded without wiping it, and we
only had three drives initially. I put ZFS on the new drive to get
something out of compression, though we were only getting a compression
ratio of 1.01, so it's hardly worth it.

However, I discovered the PERC 5/i in the machine *can* resize the RAID -
online, even! - and it is currently doing so using this awful series of
commands:

Delete the old single-disk array, virtual disk 1 on adapter 0, with:
# megacli -CfgLdDel -L1 -a0
Start a reconfiguration operation on a RAID-5 array, adding the new disk
(disk numbers from the incredibly useful megaclisas-status tool) to
virtual disk 0 on adapter 0:
# megacli -LDRecon -Start -r5 -Add -PhysDrv '[8:2]' -L0 -a0
Watch paint dry^W^W the array rebuild with:
# megacli -LDRecon -ProgDsply -L0 -a0

This is likely to take some hours, so I have disabled the backups for now
by commenting them out in the "backups" user crontab.

Mollitz itself is not backed up (except locally), so /backups/conf
probably needs putting into git or similar.

Once it's done I will resize the partition, the LVM volume and the
filesystem.

David Adam
Soon-to-be-not UCC Wheel Member
zanchey@

From trs80 at ucc.gu.uwa.edu.au  Thu Jul 11 21:14:29 2019
From: trs80 at ucc.gu.uwa.edu.au (James Andrewartha)
Date: Thu, 11 Jul 2019 21:14:29 +0800 (AWST)
Subject: [tech] Backups
In-Reply-To: 
References: 
Message-ID: 

Mollitz will probably be inaccessible from 4pm tomorrow until some time
Saturday afternoon due to network maintenance at CCGS.

On Thu, 11 Jul 2019, David Adam wrote:

> As probably my last action as a Wheel member, I've sorted out something
> about the backup server that's been bugging me for ages.
>
> At present it has an SSD wedged in as a boot drive, and four 2 TB
> drives (WD Red, 2 x WD Black, Hitachi Deskstar). Until just now, three of
> these drives were in hardware RAID-5 with LVM and ext4 on top. The fourth
> was in a RAID-0 (single disk) with ZFS on top.
>
> I thought that the array could not be expanded without wiping it, and we
> only had three drives initially. I put ZFS on the new drive to get
> something out of compression, though we were only getting a compression
> ratio of 1.01, so it's hardly worth it.
>
> However, I discovered the PERC 5/i in the machine *can* resize the RAID -
> online, even! - and it is currently doing so using this awful series of
> commands:
>
> Delete the old single-disk array, virtual disk 1 on adapter 0, with:
> # megacli -CfgLdDel -L1 -a0
> Start a reconfiguration operation on a RAID-5 array, adding the new disk
> (disk numbers from the incredibly useful megaclisas-status tool) to
> virtual disk 0 on adapter 0:
> # megacli -LDRecon -Start -r5 -Add -PhysDrv '[8:2]' -L0 -a0
> Watch paint dry^W^W the array rebuild with:
> # megacli -LDRecon -ProgDsply -L0 -a0
>
> This is likely to take some hours, so I have disabled the backups for now
> by commenting them out in the "backups" user crontab.
>
> Mollitz itself is not backed up (except locally), so /backups/conf
> probably needs putting into git or similar.
>
> Once it's done I will resize the partition, the LVM volume and the
> filesystem.
>
> David Adam
> Soon-to-be-not UCC Wheel Member
> zanchey@
> _______________________________________________
> List Archives: http://lists.ucc.asn.au/pipermail/tech
>
> Unsubscribe here: https://lists.ucc.gu.uwa.edu.au/mailman/options/tech/trs80%40ucc.gu.uwa.edu.au
>

-- 
# TRS-80              trs80(a)ucc.gu.uwa.edu.au #/ "Otherwise Bub here will do \
# UCC Wheel Member    http://trs80.ucc.asn.au/  #|  what squirrels do best     |
[ "There's nobody getting rich writing          ]|  -- Collect and hide your   |
[  software that I know of" -- Bill Gates, 1980 ]\  nuts." -- Acid Reflux #231 /

From dylanh333 at ucc.asn.au  Sat Jul 13 10:42:20 2019
From: dylanh333 at ucc.asn.au (Dylan H)
Date: Sat, 13 Jul 2019 10:42:20 +0800
Subject: [tech] Status update on new servers: Mudkip and Magikarp
Message-ID: 

An HTML attachment was scrubbed...
URL: https://lists.ucc.gu.uwa.edu.au/pipermail/tech/attachments/20190713/8c7ff1f5/attachment.htm

From jimbo at ucc.asn.au  Sat Jul 13 12:29:23 2019
From: jimbo at ucc.asn.au (James Arcus)
Date: Sat, 13 Jul 2019 12:29:23 +0800
Subject: [tech] Machine Room Switch Upgrade
Message-ID: <63e3d67a-c269-b249-274a-2c6866df2c2b@ucc.asn.au>

Hi All,

I'm pleased to say that the Cisco switch chassis/supervisor upgrade on
Friday was successful. We are now running a Cisco Catalyst 4506-E with
Supervisor 6-E and the latest firmware in place of the previous 4507R with
Supervisor IV. The new switch is named Kerosene and increases throughput
per line card from 6 Gb/s to 24 Gb/s, with the option of 48 Gb/s in the
same chassis in the future. It also adds 10G capability and the ability to
run the latest software version.

Kerosene is set up nearly identically to Bitumen, save for a few things:

* A different management IP that is in DNS
* The 2xGbE LACP link from Bitumen to Walnut has been replaced with
  10G SR fibre
* The connections to Bitumen's 2 24-port line cards have been moved
  onto a 48-port line card
* Updated ROMMON firmware and Cisco IOS

The ports have been minimally rearranged to fit the new line cards.
Essentially, both switches physically consist of 4 rows of 24 ports, and
the row/column position of each connection has been maintained. This is
also the tagging system that was used when labelling the ends of the patch
leads during the move. As the 48-port line card numbers ports in
top/bottom pairs first (i.e. port 2 is the first port on the bottom row,
not the second port on the top row), the interfaces in the switch
configuration had to be renumbered appropriately.

The configs of both Walnut and Kerosene were also altered to use the same
settings for the 10G link as were previously applied to the copper pair.
(Failing to remember to change Walnut as well was the source of the
network's initial failure to come back up.)
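To make "the same settings" concrete: the snippet below is not a paste of
Kerosene's running config, just a sketch of the shape of the uplink stanza
on each end - the interface number, description and VLAN list here are
made up for illustration:

interface TenGigabitEthernet1/1
 ! 10G SR fibre uplink between Kerosene and Walnut (illustrative numbers)
 description Uplink to Walnut
 switchport mode trunk
 switchport trunk allowed vlan 1,10,20

Both ends of the link have to agree on this (trunk mode and the allowed
VLAN list), which is exactly the kind of mismatch that kept the network
down at first.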
Just a final note on the physical removal/installation of the switches:
they are very heavy, definitely at least a two-person job. For future
reference, we found that moving the patch panel underneath the switch out
of the way allowed lifting it directly up into position while mounting,
which was much easier.

Thanks to [FVP] and [DAS] for their effort and help.

If anyone has further questions about the process, don't hesitate to
reply.

Cheers,

James Arcus [MPT]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.ucc.gu.uwa.edu.au/pipermail/tech/attachments/20190713/9a760750/attachment-0001.htm

From jimbo at ucc.asn.au  Sat Jul 13 12:35:54 2019
From: jimbo at ucc.asn.au (James Arcus)
Date: Sat, 13 Jul 2019 12:35:54 +0800
Subject: [tech] Machine Room Switch Upgrade - TL;DR
In-Reply-To: <63e3d67a-c269-b249-274a-2c6866df2c2b@ucc.asn.au>
References: <63e3d67a-c269-b249-274a-2c6866df2c2b@ucc.asn.au>
Message-ID: <56a93247-4b37-dd08-0b1a-9139a923ad45@ucc.asn.au>

I know my previous email was quite long, so I thought I'd just summarize
the main points here.

* New switch was installed; it is faster and newer and lets us use
  10 Gb/s fibre
* We now have a 10 Gb/s link from the machine room servers to the rest
  of our network
* Everything else should be working the same; if it isn't, contact me or
  [FVP]
* Cisco switches are heavy and difficult to lift
* Lifting them directly up into place, instead of moving them up and
  sideways to the rack from outside it, is much easier

Cheers,
James [MPT]

On 13/7/19 12:29 pm, James Arcus wrote:
>
> Hi All,
>
> I'm pleased to say that the Cisco switch chassis/supervisor upgrade on
> Friday was successful. We are now running a Cisco Catalyst 4506-E with
> Supervisor 6-E and the latest firmware in place of the previous 4507R
> with Supervisor IV. The new switch is named Kerosene and increases
> throughput per line card from 6 Gb/s to 24 Gb/s, with the option of 48
> Gb/s in the same chassis in the future. It also adds 10G capability
> and the ability to run the latest software version.
>
> Kerosene is set up nearly identically to Bitumen, save for a few things:
>
> * A different management IP that is in DNS
> * The 2xGbE LACP link from Bitumen to Walnut has been replaced with
>   10G SR fibre
> * The connections to Bitumen's 2 24-port line cards have been moved
>   onto a 48-port line card
> * Updated ROMMON firmware and Cisco IOS
>
> The ports have been minimally rearranged to fit the new line cards.
> Essentially, both switches physically consist of 4 rows of 24 ports,
> and the row/column position of each connection has been maintained.
> This is also the tagging system that was used when labelling the ends
> of the patch leads during the move. As the 48-port line card numbers
> ports in top/bottom pairs first (i.e. port 2 is the first port on the
> bottom row, not the second port on the top row), the interfaces in the
> switch configuration had to be renumbered appropriately.
>
> The configs of both Walnut and Kerosene were also altered to use the
> same settings for the 10G link as were previously applied to the
> copper pair. (Failing to remember to change Walnut as well was the
> source of the network's initial failure to come back up.)
>
> Just a final note on the physical removal/installation of the
> switches: they are very heavy, definitely at least a two-person job.
> For future reference, we found that moving the patch panel underneath
> the switch out of the way allowed lifting it directly up into position
> while mounting, which was much easier.
>
> Thanks to [FVP] and [DAS] for their effort and help.
>
> If anyone has further questions about the process, don't hesitate to
> reply.
>
> Cheers,
>
> James Arcus [MPT]
>
> _______________________________________________
> List Archives: http://lists.ucc.asn.au/pipermail/tech
>
> Unsubscribe here: https://lists.ucc.gu.uwa.edu.au/mailman/options/tech/jimbo%40ucc.asn.au

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.ucc.gu.uwa.edu.au/pipermail/tech/attachments/20190713/3bbc356c/attachment.htm

From zanchey at ucc.gu.uwa.edu.au  Thu Jul 18 09:43:19 2019
From: zanchey at ucc.gu.uwa.edu.au (David Adam)
Date: Thu, 18 Jul 2019 09:43:19 +0800 (AWST)
Subject: [tech] Backups
In-Reply-To: 
References: 
Message-ID: 

On Thu, 11 Jul 2019, David Adam wrote:
> However, I discovered the PERC 5/i in the machine *can* resize the RAID -
> online, even! - and it is currently doing so using this awful series of
> commands:
>
> Delete the old single-disk array, virtual disk 1 on adapter 0, with:
> # megacli -CfgLdDel -L1 -a0
> Start a reconfiguration operation on a RAID-5 array, adding the new disk
> (disk numbers from the incredibly useful megaclisas-status tool) to
> virtual disk 0 on adapter 0:
> # megacli -LDRecon -Start -r5 -Add -PhysDrv '[8:2]' -L0 -a0
> Watch paint dry^W^W the array rebuild with:
> # megacli -LDRecon -ProgDsply -L0 -a0

This finally finished - it took about five days to first rebuild and then
initialise.

> This is likely to take some hours, so I have disabled the backups for now
> by commenting them out in the "backups" user crontab.

I have reversed this change and the backups will run as usual at 0200.

> Mollitz itself is not backed up (except locally), so /backups/conf
> probably needs putting into git or similar.

I have not done this.

> Once it's done I will resize the partition, the LVM volume and the
> filesystem.

I've done this. I rescanned the device with:

# echo 1 > /sys/block/sda/device/rescan

The partition table then needs fixing for the new size;
https://serverfault.com/a/833738/ was helpful here, as fdisk spat out the
error "MyLBA mismatch with real position at backup header." I installed
gdisk, ran `gdisk /dev/sda`, entered expert mode ('x'), ran the command to
'relocate backup data structures to the end of the disk' ('e'), and wrote
the partition table with 'w'. (I also changed the partition type from
"Microsoft basic data" (0700) to "Linux LVM" (8e00), though that wasn't
essential.) Then I reloaded the kernel's view of the partition table with:

# partprobe

I unmounted the /backups array, used parted to resize the partition, and
then ran the LVM commands to expand the PV, and then the LV and
filesystem:

# pvresize /dev/sda1
# lvresize --extents +100%FREE --resizefs backups/uccbackups

Then I mounted it again and restored the backups crontab.

David Adam
zanchey at ucc.gu.uwa.edu.au

From jimbo at ucc.asn.au  Sat Jul 20 11:09:08 2019
From: jimbo at ucc.asn.au (James Arcus)
Date: Sat, 20 Jul 2019 11:09:08 +0800
Subject: [tech] Logins Working on Catfish
Message-ID: 

Hi tech,

I finished the setup on Catfish's Linux Mint on Thursday (while sitting at
a cafe, yay for remote admin). The only steps left were joining it to the
domain and ensuring the login manager accepted free-form usernames.

Catfish can now be logged in to by all members under both Windows and
Linux Mint.

Cheers,
James [MPT]

From jimbo at ucc.asn.au  Sat Jul 20 11:17:00 2019
From: jimbo at ucc.asn.au (James Arcus)
Date: Sat, 20 Jul 2019 11:17:00 +0800
Subject: [tech] Making admin logins easier on Linux desktops
Message-ID: 

Hi tech (again),

Previously on the Linux desktops, anyone in wheel or sprocket could use
sudo to do administrative tasks with their own password. However, if they
were using a graphical interface, they would usually be asked for the
machine's root password instead. The graphical admin dialogs are handled
by software called polkit, which had not been configured to recognise any
UCC groups as administrators.
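For the curious: on Debian-based systems like Mint, polkit's Local
Authority reads its list of "admin identities" from drop-in files under
/etc/polkit-1/localauthority.conf.d/. The fix is a single small file along
these lines - the file name and exact contents below are a sketch rather
than a paste of what was actually deployed:

# /etc/polkit-1/localauthority.conf.d/60-ucc-admins.conf (illustrative name)
# Treat members of wheel and sprocket as polkit administrators, so that
# graphical auth dialogs prompt for their own password rather than root's.
[Configuration]
AdminIdentities=unix-group:wheel;unix-group:sprocket

Files in that directory are read in order, and the AdminIdentities value
from the last file read wins - hence the high-numbered prefix.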
I have pushed out a configuration change to polkit for all of the
clubroom desktops. Now, if you are a member of either wheel or sprocket,
you will be given the chance to authenticate using your own password
instead of having to look up (or know) the machine's root password.

Cheers,
James [MPT]

From bob at ucc.asn.au  Thu Jul 25 21:43:04 2019
From: bob at ucc.asn.au (Bob Adamson)
Date: Thu, 25 Jul 2019 21:43:04 +0800
Subject: [tech] Downtime & R.I.P. Maltair
In-Reply-To: 
References: <000d01d42fe7$c03d9840$40b8c8c0$@ucc.asn.au>
Message-ID: <002f01d542ee$ec82c360$c5884a20$@ucc.asn.au>

Hi All,

Just updating this old thread for the benefit of all the people who are
coming across it on the internet (I've had a few emails now). The photo
below will hopefully make it into the list archives - sorry it's blurry,
but it does the job. The VT261 is in the area circled in the picture, just
to the side (left or right, can't remember) of the large, rectangular,
dark-grey inductor near the internal SAS card PCIe slot.

Regarding our experience with trying to swap out the chip with an
eBay/AliExpress one: alas, no luck. The lack of a datasheet means we could
not even check if we had the right chip or that it was actually broken.
I've done a fair bit of surface-mount rework, but this chip takes the
cake - minuscule pads, in a non-standard layout, all under the chip. So
damn hard to solder, and then basically impossible to inspect. If I were
to do it again, I would try to figure out some way of creating a tiny
single-chip solder stencil to do it with, because just sticking on a bit
of solder paste didn't cut it. Regardless, I believe the chip sits on some
sort of comms bus for control and may not work without some
pre-programming anyway - again, not easy to work out without a datasheet
or a working one to probe.

Since my last post, the club replaced the M4 with another M4, and even
though we did the firmware upgrade, it died a few months later. We also
got donated another few M4s - same thing again. Basically, the M4 is a
lemon; cut your losses. The most cost-effective response we found was to
buy a barebones second-hand HP server of a similar generation and
transplant the disks, CPUs and RAM across.

I would love to be proven wrong on this, by the way - if anyone in the
club wants to have another crack at one (even as just a learning
experience), we have several to try on!

Cheers,
Bob

-----Original Message-----
From: tech-bounces+bob=ucc.gu.uwa.edu.au at ucc.gu.uwa.edu.au On Behalf Of
bob at ucc.gu.uwa.edu.au
Sent: Tuesday, 14 August 2018 6:26 PM
To: Bob Adamson
Cc: tech at ucc.asn.au
Subject: Re: [tech] Downtime & R.I.P. Maltair

Update: I managed to find the VT261 on the mobo last night. It looks like
the one in the AliExpress link in my last email. I've ordered a couple off
AliExpress, but they will take a few weeks to get here. When they arrive,
we have some Damn Finicky soldering to do (it's surrounded by 0402-sized
components).

Oh, and [TPG] had a chat to a rep from Maxim, and apparently datasheets
for the Volterra VT261 were never made public, so we kinda just have to
hope that this chip is the thing that's broken.

Andrew Adamson
bob at ucc.asn.au

|"If you can't beat them, join them, and then beat them." |
| ---Peter's Laws |

On Thu, 9 Aug 2018, Bob Adamson wrote:

> Felix and I de-racked maltair tonight and I pulled its mobo out. The
> Lenovo page lists only a "VT261" 5V regulator as probably being
> damaged, so I figured we should just be able to find and replace it.
> Famous last words.
>
> Google turns up VT261WFQR-ADJ as (the only) possible candidate for what
> VT261 refers to. Unfortunately, googling further for the VT261WFQR-ADJ
> datasheet only shows up a Maxim datasheet, which makes sense since they
> bought out Volterra in 2013. Just to make things really interesting, the
> Kynix site (the only result that has a datasheet) links to an Intersil
> datasheet: https://www.kynix.com/uploadfiles/pdf8827/ICL7660ACBA-T.pdf .
> The Maxim site was a bit more forthcoming once I knew a newer part
> number ( https://datasheets.maximintegrated.com/en/ds/ICL7660-MAX1044.pdf
> ), but I didn't have any luck looking for 7660 on any of the mobo chips.
>
> More googling later, and even turning to countries that have a robust
> market for *ahem* aftermarket goods, shows up this:
> https://ru.aliexpress.com/item/VT261WF-VT261MF-VT261WFQX-ADJ-QFN-1-integrated-circuit/32818058390.html
> , which is possibly-maybe the thing we should be looking for on the
> mobo. There were a few shiny chips on the board, but I need to return at
> a later date with my shiny new USB microscope to check further.
>
> If anyone else wants to take a look at it, please be careful about
> flexing the board while handling (it's very big) and also be careful not
> to knock off any components (they're very small, and I mean like >.< this big).
>
> Oh, and I manually migrated all network-stored VMs to medico today, and
> I believe Felix did the remaining locally stored VMs this evening.
>
> --Bob
>
> -----Original Message-----
> From: tech-bounces+bob=ucc.gu.uwa.edu.au at ucc.gu.uwa.edu.au
> [mailto:tech-bounces+bob=ucc.gu.uwa.edu.au at ucc.gu.uwa.edu.au] On
> Behalf Of Felix von Perger
> Sent: Wednesday, 8 August 2018 11:51 PM
> To: tech at ucc.asn.au
> Subject: [tech] Downtime & R.I.P. Maltair
>
> Dear tech subscribers,
>
> For those of you who have not been following the committee discussions
> of the last week or so, there was a total service outage this morning
> between 8:00 and 10:00, which was due to RCD testing in Cameron Hall.
> Apologies for any inconvenience.
>
> Sadly, in the process of turning things back on after the power was
> restored, an IMM2 firmware bug on Maltair seems to have rendered it
> permanently unbootable (see
> https://support.lenovo.com/au/en/solutions/ht118532). [CFE] performed a
> firmware upgrade this evening to the latest version (v6.8) from v4.3;
> however, it seems like the damage has already been done, and either the
> entire motherboard or the built-in 5V voltage regulator will need to be
> replaced or repaired.
>
> Due to Maltair being presently out of action, additional downtime may be
> experienced for certain services that were previously hosted on Maltair.
> Since Maltair accounted for most of our RAM availability, member VMs
> with large RAM requirements may remain powered off for the time being or
> have their maximum RAM reduced.
>
> Any suggestions for replacement hardware for Maltair are welcome. The
> existing server is a 1RU IBM System x3550 M4 (7914/7915), and it is
> likely that the majority of its parts (CPU, RAM, RAID, 10Gb NIC, PSUs)
> are still functional despite the system board being fried.
>
> Best regards,
>
> Felix von Perger [FVP]
> UCC Secretary & Wheel Member
>
> _______________________________________________
> List Archives: http://lists.ucc.gu.uwa.edu.au/pipermail/tech
>
> Unsubscribe here:
> http://lists.ucc.gu.uwa.edu.au/mailman/options/tech/bob%40ucc.gu.uwa.edu.au
>
> _______________________________________________
> List Archives: http://lists.ucc.gu.uwa.edu.au/pipermail/tech
>
> Unsubscribe here:
> http://lists.ucc.gu.uwa.edu.au/mailman/options/tech/bob%40ucc.gu.uwa.edu.au
>

_______________________________________________
List Archives: http://lists.ucc.gu.uwa.edu.au/pipermail/tech

Unsubscribe here: http://lists.ucc.gu.uwa.edu.au/mailman/options/tech/bob%40ucc.gu.uwa.edu.au

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.ucc.gu.uwa.edu.au/pipermail/tech/attachments/20190725/4d05e2b4/attachment-0001.htm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 135941 bytes
Desc: not available
Url : https://lists.ucc.gu.uwa.edu.au/pipermail/tech/attachments/20190725/4d05e2b4/attachment-0001.jpeg