How not to waste time developing long-running processes

Developing long-running tasks might be my least favorite coding activity. I love writing and debugging code…I’d be crazy to be in this profession if I did not. But when a task takes long enough, your attention wanders and you get out of the zone.

Building the Linux Kernel takes time. Even checking the Linux Kernel out of git takes a non-trivial amount of time. The Ansible work I did back in the OpenStack days to build and tear down environments took a good bit of time as well. How do I keep from getting out of the zone while coding on these? It is hard, but here are some techniques.


Keeping the CI logic in bash

As much as I try to be a “real” programmer, the reality is that we need automation, and setting up automation is a grind. A necessary grind.

One thing that I found frustrating was that, in order to test our automation, I needed to kick off a pipeline in our git server (gitlab, but the logic holds for others) even though the majority of the heavy lifting was done in a single bash script.

In order to get to the point where we could run that script in a gitlab runner, we needed to install a bunch of packages (Dwarves, Make, and so forth) as well as do some SSH Key provisioning in order to copy the artifacts off to a server. The gitlab-ci.yml file ended up being a couple dozen lines long, and all those lines were bash commands.

So I pulled the lines out of gitlab-ci.yml and put them into the somewhat intuitively named file workflow.sh. Now my gitlab-ci.yml file is basically a one-liner that calls workflow.sh.

But I also made it so workflow.sh can be called from the bash command line of a new machine. This is the key part. By doing this, I am creating automation that the rest of my team can use without relying on gitlab. Since gitlab still runs the same script, a change that breaks the CI gets caught there, but people can make changes that will make life easier for them on the remote systems.
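For illustration, a stripped-down workflow.sh might look like the following. The package list, the ARTIFACT_SSH_KEY variable, and the artifact server name are stand-ins, not our real ones, but the shape is the same: install dependencies, provision the SSH key, do the work, copy the artifacts off.

#!/bin/bash
# workflow.sh: called by gitlab-ci.yml in CI, but also runnable by hand
# on a freshly provisioned machine.
set -euo pipefail

# Install the build dependencies (dwarves, make, and friends).
sudo dnf install -y dwarves make gcc flex bison bc openssl-devel elfutils-libelf-devel

# Provision the SSH key used to copy artifacts to the archive server.
# In CI the key comes from a masked variable (hypothetical name here); on a
# workstation it is usually already in place, so this step is skipped.
if [ -n "${ARTIFACT_SSH_KEY:-}" ]; then
    mkdir -p ~/.ssh && chmod 700 ~/.ssh
    echo "$ARTIFACT_SSH_KEY" > ~/.ssh/id_ed25519
    chmod 600 ~/.ssh/id_ed25519
fi

# The heavy lifting.
make -j"$(nproc)"
scp artifacts/* builder@artifacts.example.com:/srv/artifacts/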

The next step is to start breaking apart the workflow into separate pipelines, due to CI requirements. To do this, I do three things:

  • Move the majority of the logic into functions, and source a functions.sh file. This lets me share logic across top-level bash scripts.
  • Make one top-level function for each pipeline.
  • Replace workflow.sh with a script per pipeline. These are named pipeline_<stage>. These scripts merely change to the source directory, and then call top-level functions in functions.sh.

The reason for the last split is to keep logic from creeping into the pipeline functions. They are merely interfaces to the single set of logic in functions.sh.
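Here is a minimal sketch of the layout. The function and stage names are made up for the example, but they show the relationship between the pieces:

# functions.sh: all of the real logic lives here, shared by every
# top-level script. REPO_URL and SRC_DIR are placeholder variables
# assumed to be set elsewhere.
checkout_sources() {
    git clone "$REPO_URL" "$SRC_DIR"
}

build_sources() {
    make -C "$SRC_DIR" -j"$(nproc)"
}

publish_artifacts() {
    scp "$SRC_DIR"/artifacts/* builder@artifacts.example.com:/srv/artifacts/
}

# One top-level function per pipeline.
pipeline_build() {
    checkout_sources
    build_sources
    publish_artifacts
}

And the matching per-pipeline wrapper, pipeline_build.sh:

#!/bin/bash
# pipeline_build.sh: a thin wrapper with no logic of its own. It changes
# to the directory the scripts live in and calls the top-level function.
set -euo pipefail
cd "$(dirname "$0")"
source ./functions.sh
pipeline_build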

The goal of having the separate functions source-able is to be able to run interior steps of the overall process without having to run the end-to-end work. This saves the time spent sitting around waiting for a long-running process to complete…more on that in a future article.
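For example, with the hypothetical names above, re-running just the build step on a machine that already has a checkout is:

source ./functions.sh
build_sources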

Remotely checking out from git using ssh key forwarding

Much of my work is done on machines that are only on loan to me, not permanently assigned. Thus, I need to be able to provision them quickly and with a minimum of fuss. One action I routinely need to do is to check code out of a git server, such as gitlab.com. We use ssh keys to authenticate to gitlab. I need a way to do this securely when working on a remote machine. Here’s what I have found.
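The core of the approach is standard ssh agent forwarding: the private key stays on my workstation, and git on the remote machine authenticates through the forwarded agent. A minimal sketch, with a made-up host and repository:

# On my workstation: load the key and connect with agent forwarding (-A).
ssh-add ~/.ssh/id_ed25519
ssh -A root@borrowed-machine.example.com

# On the remote machine: the clone authenticates via the forwarded agent,
# so no private key is ever copied over.
git clone git@gitlab.com:myteam/myproject.git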


Print the line after a match using AWK

We have an internal system for allocating hardware to developers on a short-term basis. While the software does have a web API, it is not enabled by default, nor in our deployment. Thus, we end up caching a local copy of the data about the machine. The machine names are a glom of architecture and location. So I make a file with the name of the machine, and a symlink to the one I am currently using.
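The awk idiom the title refers to looks like this; the pattern and file name are placeholders for the real values:

# Print the line immediately after each line matching "pattern".
awk '/pattern/ { getline; print }' machine-info.txt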


XPath for libvirt external snapshot path

The following xmllint XPath query will pull out the name of the backing file for a VM named fedora-server-36 and an external snapshot named fedora-server-36-post-install:

virsh snapshot-dumpxml fedora-server-36 fedora-server-36-post-install | xmllint --xpath "string(//domainsnapshot/disks/disk[@snapshot='external']/source/@file)" -

The string function extracts the attribute value.

This value can be used in the process of using or deleting the snapshot.
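For example, the path can be captured in a shell variable and inspected with qemu-img (just an illustration; the variable name is arbitrary):

SNAP_FILE=$(virsh snapshot-dumpxml fedora-server-36 fedora-server-36-post-install | \
  xmllint --xpath "string(//domainsnapshot/disks/disk[@snapshot='external']/source/@file)" -)
qemu-img info "$SNAP_FILE"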

Date format suitable for file names

It is rare that you want to write something without later wanting to be able to read it back. One common way of organizing files that are generated regularly is by time stamp. If you want to add a timestamp to a file name, you can do so using the date command.

In order for the filenames to sort in the right order, you want the name to go from largest unit to smallest.

Here is an example that creates a filename-suitable string ordered from year down to seconds. I remove all unnecessary formatting characters.

date --rfc-3339=seconds | sed -E -e 's! |-|:!!g'

The date command reads the current date/time on the local system. --rfc-3339=seconds produces output that looks like this:

$ date --rfc-3339=seconds 
2021-11-03 10:57:14-04:00

In order to keep the regular expression concise inside the sed command, the -E switch tells it to use extended regular expressions, which include the alternation character ‘|’. Thus, the regex ‘ |-|:’ matches a space, a dash, or a colon.
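For example, to stamp a copy of a log file with that string (the file names are only for illustration):

TS=$(date --rfc-3339=seconds | sed -E -e 's! |-|:!!g')
cp build.log "build-$TS.log"   # e.g. build-202111031057140400.log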

Querying hostnames from beaker

If you have requested a single host from beaker, the following one-liner will tell you the hostname for it.

bkr job-results   $( bkr job-list  -o $USER  --unfinished | jq -r  ".[]" )   | xpath -q -e string\(/job/recipeSet/recipe/roles/role/system/@value\)

This requires jq and xpath, as well as the beaker command line packages.

For me on Fedora 33 the packages are:

  • perl-XML-XPath-1.44-7.fc33.noarch
  • jq-1.6-5.fc33.x86_64
  • python3-beaker-1.10.0-9.fc33.noarch
  • beaker-redhat-0.2.1-2.fc33eng.noarch
  • beaker-common-28.2-1.fc33.noarch
  • beaker-client-28.2-1.fc33.noarch
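Once those are installed, the hostname can be captured in a variable and used directly; for example (illustrative only, assuming the job has a single system):

HOST=$(bkr job-results $(bkr job-list -o $USER --unfinished | jq -r ".[]") | \
  xpath -q -e 'string(/job/recipeSet/recipe/roles/role/system/@value)')
ssh root@"$HOST"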