Ironic Clean PXE failure

Posted on October 15, 2021 by Adam Young

One of our Ironic baremetal nodes was suffering a cleaning failure. Fixing it was easy…once we knew the cause.

Select only the Jades

Posted on October 1, 2021 by Adam Young

Some custom jq for regex selection of OpenStack Ironic baremetal nodes. Our server types show up in the node names. I want to be able to build lists of only the Mt. Jade servers, which have names that look like this:

jade09-r097

openstack baremetal node list  --sort provision_state:asc   -c UUID -c Name -f json | jq '.[] | select(.Name | test("jade."))'
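The pattern "jade." matches anywhere in the name, which is fine as long as no other server type contains that string. A minimal sketch of a stricter, anchored variant follows. It runs against hypothetical sample data instead of a live cloud: only jade09-r097 comes from this post, and the other names and the UUID values are made up.

```shell
#!/bin/sh
# Hypothetical sample of what the node list might return; only jade09-r097
# comes from the post, the other names and UUIDs are invented.
sample_nodes='[
  {"UUID": "uuid-1", "Name": "jade09-r097"},
  {"UUID": "uuid-2", "Name": "other01-r097"},
  {"UUID": "uuid-3", "Name": "notjade01-r001"}
]'

# Print the names that start with "jade" followed by at least one digit.
select_jades() {
    jq -r '.[] | select(.Name | test("^jade[0-9]+")) | .Name'
}

printf '%s\n' "$sample_nodes" | select_jades
```

With the unanchored pattern, the made-up notjade01-r001 entry would also be selected. The anchored version only prints jade09-r097.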

What Nodes are broken?

Posted on September 30, 2021 by Adam Young

While I tend to think about the nodes in OpenStack terms, the people who physically move the servers around are more familiar with their IPMI addresses. We have several nodes that are not responding to IPMI requests. Some have been put into the manageable state, some are in error.

Here’s the query I used to list them.

Continue reading →

Legible Error traces from openstack server show

Posted on September 30, 2021 by Adam Young

If an OpenStack server (Ironic or Nova) has an error, it shows up in a nested field. That field is hard to read in its normal layout, due to JSON formatting. Using jq to strip the formatting helps a bunch.

The nested field is fault.details.

The -r option strips off the quotes.
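To illustrate what -r changes, here is a small sketch that runs jq against a hypothetical, trimmed-down JSON document. The structure (a fault object with a details string) follows the field named above; the server name, code, and traceback text are invented for the example and are not real output.

```shell
#!/bin/sh
# Hypothetical stand-in for the JSON form of a server record; the traceback
# text is invented for illustration.
sample_server='{"name": "demo", "fault": {"code": 500, "details": "Traceback (most recent call last):\n  File \"example.py\", line 1\nExampleError: boom"}}'

# Without -r the field prints as one quoted JSON string with \n escapes.
printf '%s\n' "$sample_server" | jq '.fault.details'

# With -r the string is emitted raw, so the escapes become real line breaks.
printf '%s\n' "$sample_server" | jq -r '.fault.details'
```

The second form is the one that reads like a normal Python traceback.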

Continue reading →

Debugging a Clean Failure in Ironic

Posted on September 30, 2021 by Adam Young

My team is running a small OpenStack cluster with responsibility for providing bare metal nodes via Ironic. Currently, we have a handful of nodes that are not usable. They show up as “Cleaning failed.” I’m learning how to debug this process.

Continue reading →

IPMI triggering a PXE install

Posted on August 16, 2021 by Adam Young

To reinstall a machine that is managed by IPMI, you tell it to PXE boot and then power cycle. Here are my notes.
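A rough sketch of those two steps using ipmitool, not necessarily the exact commands from my notes. The BMC address and user name are placeholders, and the password is assumed to be in the IPMI_PASSWORD environment variable, which ipmitool reads when given -E. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: the BMC address and user below are placeholders.
# Set DRY_RUN=0 to actually run ipmitool; by default commands are printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

pxe_reinstall() {
    bmc_host="$1"
    bmc_user="$2"
    # Ask the BMC to boot from the network on the next boot...
    run ipmitool -I lanplus -H "$bmc_host" -U "$bmc_user" -E chassis bootdev pxe
    # ...then power cycle so that the network boot happens now.
    run ipmitool -I lanplus -H "$bmc_host" -U "$bmc_user" -E chassis power cycle
}

pxe_reinstall 192.0.2.10 admin
```

Keeping the dry-run default makes it harder to power cycle the wrong machine by accident.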

Continue reading →

Querying hostnames from beaker

Posted on April 5, 2021 by Adam Young

If you have requested a single host from beaker, the following one-liner will print its hostname.

bkr job-results   $( bkr job-list  -o $USER  --unfinished | jq -r  ".[]" )   | xpath -q -e string\(/job/recipeSet/recipe/roles/role/system/@value\)

This requires jq and xpath, as well as the beaker command line packages.

For me on Fedora 33 the packages are:

perl-XML-XPath-1.44-7.fc33.noarch
jq-1.6-5.fc33.x86_64
python3-beaker-1.10.0-9.fc33.noarch
beaker-redhat-0.2.1-2.fc33eng.noarch
beaker-common-28.2-1.fc33.noarch
beaker-client-28.2-1.fc33.noarch

Merging root and home filesystems

Posted on March 24, 2021 by Adam Young

Yocto takes up a lot of space when it builds. If the /home partition is 30 GB or smaller, I am going to fill it up. The systems I get provisioned from Beaker are routinely splitting their disks between / and /home. These are both logical volumes in the same volume group. This is easy to merge.

In order to merge them I find myself performing the following steps.

umount /home/
mkdir /althome

I then modify /etc/fstab so that the /home entry is now pointing to /althome. If I have done any work in /home/ayoung (almost always), I have to copy it to the new /home directory, which now lives on the root filesystem.

mount /althome/
cp -a /althome/ayoung /home/ayoung

Once the home volume has been cleared, I can reclaim the space. The following lines will vary depending on the name of the machine.

lvremove /dev/rhel_hpe-moonshot-02-c07/home
lvresize  -L   +32.48G  /dev/rhel_hpe-moonshot-02-c07/root

I am explicitly reclaiming the size of the /home volume, which in this case is 32.48 GB.
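An alternative sketch that avoids typing the size by hand: lvresize accepts -l +100%FREE, which gives the root volume every free extent in the volume group. That is only equivalent to the command above if the removed home volume is the only free space in the group. The script below just prints the commands, and the old home volume has to be unmounted before lvremove will succeed.

```shell
#!/bin/sh
# Sketch only: prints the LVM commands instead of running them.
# The volume group name is the one from this post; it varies per machine.
vg="rhel_hpe-moonshot-02-c07"

reclaim_home_cmds() {
    # Remove the (already unmounted) home volume, then hand every free
    # extent in the volume group to root.
    printf 'lvremove /dev/%s/home\n' "$1"
    printf 'lvresize -l +100%%FREE /dev/%s/root\n' "$1"
}

reclaim_home_cmds "$vg"
```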

A little bit of foresight can obviously avoid this problem; properly allocate the disks according to the workload. Requesting a machine with more disk is also an option.

But sometimes we have to fix mistakes.

Note that I use the lvdisplay command to see the names of the volumes.

In order to make use of the new space, I have to resize the file system. Since it is XFS, I use the xfs_growfs command. I want the full size, so I don’t need to pass a size parameter.

xfs_growfs /dev/mapper/rhel_hpe--moonshot--02--c07-root
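Note that the name under /dev/mapper is not the same as the /dev/VG/LV path used earlier. device-mapper joins the volume group and logical volume names with a single hyphen, so hyphens inside either name are doubled. A small sketch of that mapping (the dm_path function name is just for illustration):

```shell
#!/bin/sh
# Build the /dev/mapper path for a volume group and logical volume.
# Hyphens inside either name are doubled, because a single hyphen is
# used as the separator between the two names.
dm_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

dm_path rhel_hpe-moonshot-02-c07 root
# prints /dev/mapper/rhel_hpe--moonshot--02--c07-root
```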

Jamulus Server with a Low Latency Kernel on F33

Posted on March 18, 2021 by Adam Young

I’m trying to run a Jamulus server. I got it running, but the latency was high. My first step was to add the real time kernel from CCRMA.

CCRMA no longer ships a super-package for core. The main thing missing seems to be the rtirq package.

installed the CCRMA repo file
installed the real time kernel
set the RT kernel as the default
installed the rtirq scripts rpm
enabled the systemd unit for rtirq
rebooted
cloned the Jamulus repo from git
configured, built, and installed Jamulus from the sources
added a systemd unit for Jamulus
set selinux to permissive mode (starting Jamulus failed without this)
started Jamulus
ensured I could connect to it
stopped Jamulus
set selinux to enforcing mode
restarted Jamulus from systemctl
connected from my desktop to the Jamulus server
jammed

The real time kernel does not seem to have much impact on the latency I am seeing. I think that is bound more by the network.

Setting the Default Kernel on Fedora 33

Posted on March 18, 2021 by Adam Young

I have a server that I want to run the Real Time Kernel from CCRMA. Once I followed the steps to get the kernel installed, I had to reboot to use it.

Rebooting on a server with a short timeout for grub is frustrating.

Since the Fedora Kernel is installed, and I want to be able to run it as a backup kernel, I had to figure out how to change the default Kernel for Grub2. Most of the docs out there assume that you can list the menu-items in the grub2 config file, but that is a thing of the past. The lines are now auto-generated from a regex match of the places where one might place the vmlinuz files.

I ended up booting the machine and looking at the grub menu, which showed three kernels installed: two Fedora kernels and the RT kernel from CCRMA. The RT kernel was the second one on the list. Grub numbers its entries from 0, so to set the default kernel:

sudo grub2-set-default 1

The next time it booted, it was set to the RT kernel:

$ uname -r
5.10.2-200.rt20.1.fc33.ccrma.x86_64+rt
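The off-by-one between what the menu shows and what grub2-set-default wants is easy to get wrong, so here is a tiny sketch of the conversion. The function name is made up for illustration, and the script only prints the command.

```shell
#!/bin/sh
# Convert a 1-based position in the boot menu to the 0-based index
# that grub2-set-default expects.
menu_position_to_index() {
    echo $(( $1 - 1 ))
}

# The RT kernel was the second entry in the menu.
index=$(menu_position_to_index 2)
echo "sudo grub2-set-default $index"
# prints: sudo grub2-set-default 1
```

On Fedora, sudo grubby --info=ALL should also list the installed kernels along with their index values, which would avoid a reboot just to read the menu. I have not compared its ordering against the menu on this machine, so treat that as something to verify.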

Adam Young's Web Log

The Notebook of a Programmer Climber Musician Ex-Soldier Woodworker and a few other things

Category Archives: Provisioning
