One of our Ironic baremetal nodes was suffering a cleaning failure. Fixing it was easy…once we knew the cause.
Category Archives: Provisioning
Select only the Jades
Some custom jq for RegEx selection of OpenStack Ironic baremetal nodes. Our Server types show up in their names. I want to be able to build lists of only the Mt. Jade Servers, which have names that look like this:
jade09-r097
openstack baremetal node list --sort provision_state:asc -c UUID -c Name -f json | jq '.[] | select(.Name | test("jade."))'
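The pattern "jade." matches anywhere in a name, so it would also accept a name that merely contains "jade" followed by any character. A minimal sketch of a stricter, anchored check in plain bash follows. It assumes every Mt. Jade node is named jade<digits>-r<digits>, which is inferred from the single example above; the function name and the extra sample names are hypothetical.

```shell
#!/usr/bin/env bash
# Return success only for names shaped like jade09-r097.
# Assumption: every Mt. Jade node is named jade<digits>-r<digits>.
is_jade_node() {
    [[ "$1" =~ ^jade[0-9]+-r[0-9]+$ ]]
}

# Hypothetical sample names; only the first comes from the post.
for name in jade09-r097 notjade01-r001 jade-spare; do
    if is_jade_node "$name"; then
        echo "$name"
    fi
done
```

Under the same naming assumption, the anchored expression ^jade[0-9]+-r[0-9]+$ could be passed to jq's test() in place of "jade.".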
What Nodes are broken?
While I tend to think about the nodes in OpenStack terms, the people who physically move the servers around are more familiar with their IPMI addresses. We have several nodes that are not responding to IPMI requests. Some have been put into the manageable state; some are in error.
Here’s the query I used to list them.
Legible Error traces from openstack server show
If an OpenStack server (Ironic or Nova) has an error, it shows up in a nested field. That field is hard to read in its normal layout because of the JSON formatting. Using jq to strip the formatting helps a lot.
The nested field is fault.details.
The -r option strips off the quotes.
Debugging a Clean Failure in Ironic
My team is running a small OpenStack cluster with responsibility for providing bare metal nodes via Ironic. Currently, we have a handful of nodes that are not usable. They show up as “Cleaning failed.” I’m learning how to debug this process.
IPMI triggering a PXE install
To reinstall a machine that is managed by IPMI, you tell it to PXE boot and then power cycle. Here are my notes.
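As a rough sketch of those two steps, the helper below only prints the ipmitool commands rather than running them. It assumes the lanplus interface and a password exported as IPMI_PASSWORD (which ipmitool reads when given -E); the function name, BMC address, and user name are hypothetical.

```shell
#!/usr/bin/env bash
# Print the two ipmitool invocations for a PXE reinstall of one BMC.
# Assumptions: the lanplus interface is in use, and the password is
# exported as IPMI_PASSWORD (ipmitool reads it when given -E).
pxe_reinstall_cmds() {
    local bmc_host="$1" bmc_user="$2"
    local base="ipmitool -I lanplus -H ${bmc_host} -U ${bmc_user} -E"
    # Step 1: ask for a PXE boot on the next start.
    echo "${base} chassis bootdev pxe"
    # Step 2: power cycle so the boot device setting takes effect.
    echo "${base} chassis power cycle"
}

# Hypothetical BMC address and user name.
pxe_reinstall_cmds 192.0.2.10 admin
```

Printing the commands first makes it easy to review them before pasting them into a shell that can reach the BMC.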
Querying hostnames from beaker
If you have requested a single host from beaker, the following one-liner will print its hostname.
bkr job-results $( bkr job-list -o $USER --unfinished | jq -r ".[]" ) | xpath -q -e string\(/job/recipeSet/recipe/roles/role/system/@value\)
This requires jq and xpath, as well as the beaker command line packages.
For me on Fedora 33 the packages are:
- perl-XML-XPath-1.44-7.fc33.noarch
- jq-1.6-5.fc33.x86_64
- python3-beaker-1.10.0-9.fc33.noarch
- beaker-redhat-0.2.1-2.fc33eng.noarch
- beaker-common-28.2-1.fc33.noarch
- beaker-client-28.2-1.fc33.noarch
Merging root and home filesystems
Yocto takes up a lot of space when it builds. If the /home partition is 30 GB or smaller, I am going to fill it up. The systems I get provisioned from Beaker are routinely splitting their disks between / and /home. These are both logical volumes in the same volume group. This is easy to merge.
In order to merge them I find myself performing the following steps.
umount /home/
mkdir /althome
I then modify /etc/fstab so that the /home entry now points to /althome. If I have done any work in /home/ayoung (almost always), I have to copy it to the new /home, which is now a plain directory on the root filesystem.
mount /althome
cp -a /althome/ayoung /home/ayoung
Once the home volume has been cleared and unmounted, I can reclaim the space. The following lines will vary depending on the name of the machine.
umount /althome
lvremove /dev/rhel_hpe-moonshot-02-c07/home
lvresize -L +32.48G /dev/rhel_hpe-moonshot-02-c07/root
I am explicitly reclaiming the size of the /home volume, which in this case is 32.48 GB.
A little bit of foresight can obviously avoid this problem; properly allocate the disks according to the workload. Requesting a machine with more disk is also an option.
But sometimes we have to fix mistakes.
Note that I use the lvdisplay command to see the names of the volumes.
In order to make use of the new space, I have to resize the file system. Since it is XFS, I use the xfs_growfs command. I want the full size, so I don’t need to pass a size parameter.
xfs_growfs /dev/mapper/rhel_hpe--moonshot--02--c07-root
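Note that the /dev/mapper path is not the same string as the /dev/<vg>/<lv> path used with lvremove and lvresize: device-mapper doubles each hyphen inside the volume group and logical volume names, then joins the two with a single hyphen. A small sketch of that mapping, with a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Build the /dev/mapper path for a volume group and logical volume.
# device-mapper doubles every hyphen inside each name and joins the
# two names with a single hyphen.
mapper_path() {
    local vg="$1" lv="$2"
    echo "/dev/mapper/${vg//-/--}-${lv//-/--}"
}

# Volume group and volume names taken from the commands above.
mapper_path rhel_hpe-moonshot-02-c07 root
```

This reproduces the path passed to xfs_growfs above from the names that lvdisplay reports.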
Jamulus Server with a Low Latency Kernel on F33
I’m trying to run a Jamulus server. I got it running, but the latency was high. My first step was to add the real time kernel from CCRMA.
CCRMA no longer ships a super-package for core. The main thing missing seems to be the rtirq package.
- installed the ccrma repo file.
- installed the real time Kernel
- Set the RT kernel as the default.
- installed the rtirq scripts rpm
- enabled the systemd unit for rtirq
- rebooted
- cloned the Jamulus repo from git
- configured, built, and installed Jamulus from the sources
- added a systemd unit for Jamulus
- set selinux to permissive mode (starting Jamulus failed without this)
- started Jamulus
- ensured I could connect to it
- stopped jamulus
- set selinux to enforcing mode
- restarted Jamulus from systemctl
- connected from my desktop to the Jamulus server
- Jammed
The real time kernel does not seem to have much impact on the latency I am seeing. I think that is bound more by the network.
Setting the Default Kernel on Fedora 33
I have a server that I want to run the Real Time Kernel from CCRMA. Once I followed the steps to get the kernel installed, I had to reboot to use it.
Rebooting on a server with a short timeout for grub is frustrating.
Since the Fedora Kernel is installed, and I want to be able to run it as a backup kernel, I had to figure out how to change the default Kernel for Grub2. Most of the docs out there assume that you can list the menu-items in the grub2 config file, but that is a thing of the past. The lines are now auto-generated from a regex match of the places where one might place the vmlinuz files.
I ended up booting the machine and looking at the grub menu, which showed three Kernels installed: two Fedora Kernels and the RT Kernel from CCRMA. The RT Kernel was the second one on the list. Grub menu entries are numbered from 0, so to set the default Kernel:
sudo grub2-set-default 1
The next time it booted, it was set to the RT kernel:
$ uname -r
5.10.2-200.rt20.1.fc33.ccrma.x86_64+rt
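Counting menu entries by eye is easy to get wrong by one. The sketch below shows the zero-based counting explicitly: given the menu titles in the order grub displays them, it prints the index of the first one containing a pattern. The helper name and the two Fedora titles are hypothetical; only the RT kernel version comes from the output above.

```shell
#!/usr/bin/env bash
# Print the zero-based position of the first menu title containing a pattern.
# Titles are passed in the order the grub menu shows them.
grub_index_of() {
    local pattern="$1"; shift
    local i=0 title
    for title in "$@"; do
        if [[ "$title" == *"$pattern"* ]]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    # No title matched the pattern.
    return 1
}

# Hypothetical titles; only the RT kernel version comes from the post.
grub_index_of "+rt" \
    "Fedora (5.9.16-200.fc33.x86_64) 33" \
    "Fedora (5.10.2-200.rt20.1.fc33.ccrma.x86_64+rt) 33" \
    "Fedora (5.9.8-200.fc33.x86_64) 33"
```

The printed number is the value that would be handed to grub2-set-default.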