As much as I try to be a “real” programmer, the reality is that we need automation, and setting up automation is a grind. A necessary grind.
One thing that I found frustrating was that, in order to test our automation, I needed to kick off a pipeline in our git server (gitlab, but the logic holds for others) even though the majority of the heavy lifting was done in a single bash script.
In order to get to the point where we could run that script in a gitlab runner, we needed to install a bunch of packages (Dwarves, Make, and so forth) as well as do some SSH key provisioning in order to copy the artifacts off to a server. The gitlab-ci.yml file ended up being a couple dozen lines long, and all of those lines were bash commands.
So I pulled the lines out of gitlab-ci.yml and put them into the somewhat intuitively named file workflow.sh. Now my gitlab-ci.yml file is basically a one-liner that calls workflow.sh.
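To make that concrete, here is a minimal sketch of what workflow.sh might contain, assuming a Debian-style runner. The package names and the copy-to-a-server step come from the description above, but the host, paths, and variable names (`DEPLOY_KEY`, `ARTIFACT_HOST`) are placeholders for whatever your setup provides. The gitlab-ci.yml job then shrinks to a single `script:` entry that runs `./workflow.sh`.

```bash
#!/usr/bin/env bash
# workflow.sh - the whole build-and-publish flow, runnable from a gitlab runner
# or from the command line of a fresh machine
set -euo pipefail

# Install the build dependencies that used to live inline in gitlab-ci.yml
sudo apt-get update
sudo apt-get install -y dwarves make

# Provision the SSH key used to copy artifacts off to a server
# (DEPLOY_KEY is a placeholder for a CI variable or local secret)
mkdir -p ~/.ssh
printf '%s\n' "${DEPLOY_KEY}" > ~/.ssh/id_ed25519
chmod 600 ~/.ssh/id_ed25519

# Do the actual work, then ship the result
make
scp build/artifact.tar.gz "builder@${ARTIFACT_HOST}:/srv/artifacts/"
```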
But I also made it so workflow.sh can be called from the bash command line of a new machine. This is the key part. By doing this, I am creating automation that the rest of my team can use without relying on gitlab. And since the same script is still what gitlab runs, no one can check in a change that breaks CI without the pipeline catching it, but they can make changes that make life easier for them on the remote systems.
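In practice, that means a teammate on a brand-new machine can do something like the following (the repository URL here is, of course, a placeholder):

```bash
# On a fresh machine, with no gitlab runner involved
git clone git@gitlab.example.com:team/project.git
cd project
./workflow.sh
```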
The next step is to start breaking apart the workflow into separate pipelines, due to CI requirements. To do this, I do three things:
- Move the majority of the logic into functions, and source a functions.sh file. This lets me share logic across top-level bash scripts.
- Make one top-level function for each pipeline.
- Replace workflow.sh with a script per pipeline. These are named pipeline_<stage>. These scripts merely change to the source directory, and then call top-level functions in functions.sh.
The reason for the last split is to keep logic from creeping into the pipeline scripts. They are merely interfaces to the single set of logic in functions.sh.
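Here is a hedged sketch of that layout for a hypothetical build pipeline; the function names and their bodies are placeholders, not the real workflow:

```bash
# functions.sh - the single home for all workflow logic, shared by every script
setup_dependencies() {
    sudo apt-get install -y dwarves make
}

build_artifacts() {
    make
}

publish_artifacts() {
    scp build/artifact.tar.gz "builder@${ARTIFACT_HOST}:/srv/artifacts/"
}

# One top-level function per pipeline, stringing the shared pieces together
pipeline_build() {
    setup_dependencies
    build_artifacts
    publish_artifacts
}
```

And the per-pipeline script stays a thin shim:

```bash
#!/usr/bin/env bash
# pipeline_build - change to the source directory, source the logic, call one function
set -euo pipefail
cd "$(dirname "$0")"
source ./functions.sh
pipeline_build
```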
The goal of having the separate functions source-able is to be able to run interior steps of the overall processing without having to run the end-to-end work. This saves the sitting-around time waiting for a long-running process to complete… more on that in a future article.
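In the meantime, running just an interior step from an interactive shell looks roughly like this (the function names follow the hypothetical sketch above):

```bash
# Run only the interior step you care about, without the end-to-end workflow
source ./functions.sh
build_artifacts      # just rebuild
publish_artifacts    # or just re-publish an existing build
```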