Ansible exists to automate the time-consuming, repeated tasks that technologists depend upon. One very common job is creating and tearing down a virtual machine. While cloud technologies have made this possible to perform remotely, there are many times when I’ve needed to set up and tear down virtual machines on standalone Linux servers. In that case, the main interfaces to the machine are ssh and libvirt. I recently worked through an Ansible role to set up and tear down a virtual machine via libvirt, and I’d like to walk through it and record my reasons for some of the decisions I made.
Constant Refactoring
Always work from success. Change one thing at a time, so that you know what broke when things don’t work. Thus, when I work something out, the first iteration is hard-coded. I get it to work, and then I clean it up. The most common refactoring is to introduce a variable. For example, if I am working with a file such as:
/var/lib/libvirt/images/rhel-server-7.5-x86_64-kvm.qcow2
I’ll use exactly that path in the Ansible play to start, such as:
- name: push base vm image to hypervisor
  copy:
    src: /var/lib/libvirt/images/rhel-server-7.5-x86_64-kvm.qcow2
    dest: /var/lib/libvirt/images/rhel-server-7.5-x86_64-kvm.qcow2
    owner: qemu
    group: qemu
    mode: u=rw,g=r,o=r
Once I get that to work, I’ll clean it up to something like:
- name: push base vm image to hypervisor
  copy:
    src: "{{ source_image_dir }}/{{ source_image_file }}"
    dest: "{{ target_image_dir }}/{{ source_image_file }}"
    owner: qemu
    group: qemu
    mode: u=rw,g=r,o=r
The definitions of the variables go into the role’s defaults/main.yml file.
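For reference, a minimal sketch of what those defaults might look like; the values are just the pieces of the hard-coded path above, so adjust them for your environment:

---
# defaults/main.yml (illustrative values)
source_image_dir: /var/lib/libvirt/images
source_image_file: rhel-server-7.5-x86_64-kvm.qcow2
target_image_dir: /var/lib/libvirt/images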
Customizing the VM Backing Store image
The backing store for the virtual machine is created by copying the original VM image file to a new file, and then using virt-customize to modify the image. This is a little expensive in terms of disk space; I could, instead, clone the original file and use the qcow2 backing-file mechanism to provide the same image base to all of the VMs generated from it. I might end up doing that in the long run. However, that puts cross-file dependencies in place. If I do something to the original file, I lose all of the VMs built off it. If I want to copy a VM to a remote machine, I would have to copy both files and keep them in the same directory. I may end up doing some of that in the future, if disk space becomes an issue.
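For comparison, the thin-clone alternative would be something like the task below. This is only a sketch of the approach I am describing, not part of the role; the output name vm1.qcow2 and the use of the command module are placeholders:

- name: create a thin clone backed by the pristine base image
  command: >
    qemu-img create -f qcow2
    -b {{ target_image_dir }}/{{ source_image_file }}
    -F qcow2
    {{ target_image_dir }}/vm1.qcow2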
The virtual machine base image
The code block above shows how I copy the raw image file over to the hypervisor. I find that I am often creating multiple VMs off of the same base file. While I could customize this file directly, it would then no longer match the checksum of the file I downloaded from the Red Hat site, and I would have no way to confirm I was using a safe image. Also, copying the file to the remote machine is one of the longest tasks in this playbook, so I do not remove it in the cleanup task list.
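If you want to make that safety check explicit, something like the following would do it. The stat and assert modules are standard Ansible; the published_sha256 variable is an assumption, standing in for the value published alongside the download:

- name: compute the checksum of the base image on the hypervisor
  stat:
    path: "{{ target_image_dir }}/{{ source_image_file }}"
    checksum_algorithm: sha256
  register: base_image

- name: fail if the base image no longer matches the published checksum
  assert:
    that:
      - base_image.stat.checksum == published_sha256  # assumed variable, set from the vendor download page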
Templatization of files to modify the base image
Before I can modify the VM image, I need to copy templated files from the Ansible host to the remote system. This two-step process is necessary, as I cannot fill in a template during the call to virt-customize. Thus, all templatization is done in the template tasks. For this script, I use the /tmp directory as the interim location. This could be problematic, and I would be better off creating a deliberate subdirectory under /home/cloud-user or another known location. That would be safer and less likely to have a conflict.
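As an illustration, one of those template tasks might look like this. The template name authorized_keys.j2 is my assumption, but /tmp/authorized_keys is the path the virt-customize call later in this post actually consumes:

- name: template the authorized_keys file into the interim location
  template:
    src: authorized_keys.j2   # assumed template name
    dest: /tmp/authorized_keys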
Network card access and Static IP Addresses
The virtual machine I am building is going to have to work with a PXE service and also be available to physical machines outside the cluster. As such, I want it to have a network interface linked to a physical one on its host, and to assign that interface a static IP address. The physical passthrough is handled by making the device into a macvtap device. The XML fragment for it looks like this:
<interface type='direct'>
  <mac address='52:54:00:26:29:db'/>
  <source dev='em1' mode='bridge'/>
  <model type='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</interface>
The flat naming of the variable ethernet_device will be problematic over time, and I will probably make it a dictionary value under the with_items collection.
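If I do that, the per-host entry in the cluster_hosts collection (introduced below) might gain a key like this; em1 comes from the fragment above, and the key name is my guess:

cluster_hosts:
  - name: passimian
    ethernet_device: em1   # replaces the flat ethernet_device variable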
To assign this device a static IP address, I copied an ifcfg-eth1 file and templatized it.
Multiple Inventories
I have a fairly powerful laptop that is supposed to work as a portable demo machine. I want to be able to use the same playbook to deploy VMs on the laptop as I do on the workstation I’ve been testing this on. On my laptop, I typically run with sshd disabled, and only enable it when I want to run this or similar Ansible playbooks.
Part of the constant refactoring is moving variables from the tasks, to defaults, to the inventory files.
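For the record, a minimal sketch of what the two inventories might look like in YAML form; the hostnames and layout here are illustrative assumptions, not the files I actually use:

# inventories/workstation.yml (illustrative)
all:
  hosts:
    workstation:
      ansible_host: workstation.example.com   # assumed hostname

# inventories/laptop.yml (illustrative)
all:
  hosts:
    laptop:
      ansible_host: 127.0.0.1   # sshd only enabled when running these playbooks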
More and more, my inventory setup is starting to look like Ansible Tower. Eventually, I expect to have something like its template mechanism to be able to track “run this playbook with that inventory and these specific values.”
Creating servers in a loop
While my current task requires only a single VM, eventually I am going to want two or more. This means that I need to create the set of servers in a loop. This actually ends up flowing into all tasks that modify the base image. This is one case where constant refactoring comes in, but also where I can easily break the multi-inventory setup. For example, the addresses that are hard-coded into the XML fragment above really need to vary per host. Thus, that fragment should look like this:
<interface type='direct'>
  <mac address='{{ item.mac }}'/>
  <source dev='em1' mode='bridge'/>
  <model type='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</interface>
And the ethernet configuration should look like this:
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
IPADDR={{ item.static_ip_address }}
PREFIX=24
GATEWAY=10.127.0.1
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eth1
DEVICE=eth1
ONBOOT=yes
ZONE=public
DNS1=10.127.0.7
PEERDNS=no
UUID={{ item.uuid }}
…and that still hard-codes some values. The collection that I iterate through to create the servers now needs these additional keys. Thus, my defaults file should look like this:
---
cluster_hosts:
  - name: passimian
    uuid: 9c92fad9-6ecb-3e6c-eb4d-8a47c6f50c0
    static_ip_address: 10.127.0.3
    mac: "52:54:00:26:29:db"
The task for copying in the network configuration currently looks like this:
- template:
    src: ifcfg-eth1.j2
    dest: '{{ hypervisor_keystore_dir }}/ifcfg-eth1'
It will have to be modified to:
- template:
    src: ifcfg-eth1.j2
    dest: '{{ hypervisor_keystore_dir }}/{{ item.name }}-ifcfg-eth1'
  with_items: "{{ cluster_hosts }}"
And the configuration of the VM image would also have to reflect this. Currently the call is:
--run-command 'id -u cloud-user &>/dev/null || /usr/sbin/useradd -u 1000 cloud-user'
--ssh-inject cloud-user:file:/tmp/authorized_keys
--hostname {{ item.name }}.home.younglogic.net
--copy-in {{ hypervisor_keystore_dir }}/ifcfg-eth1:/etc/sysconfig/network-scripts
--selinux-relabel
with_items: "{{ cluster_hosts }}"
The --copy-in flag would need to be updated to:
--copy-in {{ hypervisor_keystore_dir }}/{{ item.name }}-ifcfg-eth1:/etc/sysconfig/network-scripts |
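Putting the pieces together, the loop-aware customization task might look roughly like this. The flags are the ones shown above; the wrapping command-module invocation of virt-customize -a and the per-host image name {{ item.name }}.qcow2 are my assumptions:

- name: customize the VM image for each cluster host
  command: >
    virt-customize -a {{ target_image_dir }}/{{ item.name }}.qcow2
    --run-command 'id -u cloud-user &>/dev/null || /usr/sbin/useradd -u 1000 cloud-user'
    --ssh-inject cloud-user:file:/tmp/authorized_keys
    --hostname {{ item.name }}.home.younglogic.net
    --copy-in {{ hypervisor_keystore_dir }}/{{ item.name }}-ifcfg-eth1:/etc/sysconfig/network-scripts
    --selinux-relabel
  with_items: "{{ cluster_hosts }}"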
Since I start by making the changes in defaults/main.yml, I only have to make them once. Once I push the cluster_hosts definition to the inventory files, refactoring gets harder: I cannot atomically make a change without breaking one of the configurations. Once I have more than one system using this playbook, adding parameters this way introduces a non-backwards-compatible change.
Conclusion
Like much system administration work, this task is going to be used before it is completed. It is perpetually a work in progress. This is healthy. As soon as we start getting afraid of our code, it calcifies and breaks. Even worse, we treat the code as immutable and build layers around it, making simple things more complicated. These notes serve to remind me (and others) why things look the way they do, and where things are going. Hopefully, when the time comes to change things in the future, these notes will help this code base grow to match the needs I have for it.