Deploying KubeVirt on Origin Master

Now that I have a functional OpenShift Origin built from source, I need to deploy KubeVirt on top of it.

Here are my notes. This is rough, and not production quality yet, but should get you started.

Prerequisites

As I said in that last post, in order to build KubeVirt I had to upgrade to a later version of Go (the RPM had 1.6; I now have 1.8).

Docker

In order to build the manifests with specific versions, I overrode some config options as I described here.

Specifically, I used docker_tag=devel to make sure I didn't accidentally download the released versions from Docker Hub, and set the master_ip address.
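For reference, the overrides in my config-local.sh ended up looking roughly like this (the variable names are the ones the build scripts in my checkout expect, so treat this as an illustration rather than a canonical file):

# config-local.sh: local overrides for the KubeVirt build
docker_tag=devel                # never pull the released images from Docker Hub
master_ip=192.168.122.233       # address of the machine running the Origin cluster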

To generate the Docker images and manifests:

make docker

make manifests

 

Config Changes

In order to make the configuration changes, and have them stick:

oc cluster down

Edit the master config:
sudo vi /var/lib/origin/openshift.local.config/master/master-config.yaml

Under:

networkConfig:
  clusterNetworkCIDR: 10.128.0.0/14

add:

  externalIPNetworkCIDRs: ["0.0.0.0/0"]
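So the relevant section of master-config.yaml ends up looking roughly like this (any other keys under networkConfig stay as they were):

networkConfig:
  clusterNetworkCIDR: 10.128.0.0/14
  externalIPNetworkCIDRs: ["0.0.0.0/0"]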

Then bring the cluster up with:

oc cluster up --use-existing-config --loglevel=5 --version=413eb73 --host-data-dir=/var/lib/origin/etcd/ 2>&1 | tee /tmp/oc.log

Networking

Got an error showing:

Ensure that access to ports tcp/8443, udp/53 and udp/8053 is allowed on 192.168.122.233.

[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=8443/tcp
[sudo] password for ayoung:
[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=8053/udp
success
[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=53/udp

Those won't persist as is, so:

[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=8443/tcp
success
[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=53/udp
success
[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=8053/udp
success
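To double-check that both the runtime and permanent rules are in place (just a sanity check, not something the setup strictly requires):

sudo firewall-cmd --zone=public --list-ports
sudo firewall-cmd --permanent --zone=public --list-ports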

Log in as admin

Redeploy the cluster and then log in:

$ oc login -u system:admin
Logged into "https://127.0.0.1:8443" as "system:admin" using existing credentials.

You have access to the following projects and can switch between them with 'oc project <projectname>':

    default
    kube-public
    kube-system
  * myproject
    openshift
    openshift-infra

Using project "myproject".
[ayoung@drifloon kubevirt]$ oc project kube-system
Now using project "kube-system" on server "https://127.0.0.1:8443".

Deploying Manifests

You need the updated manifests from this branch.

To apply the manifests:

$ for MANIFEST in `ls manifests/*yaml` ; do kubectl apply -f $MANIFEST ; done

The pods would not schedule:

message: 'No nodes are available that match all of the following predicates::
  MatchNodeSelector (1).'
reason: Unschedulable

Head node is not schedulable.

oc adm manage-node localhost --schedulable=true
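To confirm it worked, check the node list; a cordoned-off node shows up as Ready,SchedulingDisabled, while a schedulable one just shows Ready:

oc get nodes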

And now….

for MANIFEST in `ls manifests/*yaml` ; do kubectl delete -f $MANIFEST ; done

Wait a bit, then re-apply:

for MANIFEST in `ls manifests/*yaml` ; do kubectl apply -f $MANIFEST ; done
$ kubectl get pods
NAME                                     READY     STATUS             RESTARTS   AGE
haproxy-858199412-m78n5                  0/1       CrashLoopBackOff   8          16m
kubevirt-cockpit-demo-4250553349-gm8qm   0/1       CrashLoopBackOff   224        18h
spice-proxy-1193136539-gr7b3             0/1       CrashLoopBackOff   225        18h
virt-api-4068750737-j7bwj                1/1       Running            0          18h
virt-controller-3722000252-bsbsr         1/1       Running            0          18h

Why are those three crashing? Permissions.

$ kubectl logs kubevirt-cockpit-demo-4250553349-gm8qm
cockpit-ws: /etc/cockpit/ws-certs.d/0-self-signed.cert: Failed to open file '/etc/cockpit/ws-certs.d/0-self-signed.cert': Permission denied
$ kubectl logs haproxy-858199412-hqmk1
<7>haproxy-systemd-wrapper: executing /usr/local/sbin/haproxy -p /haproxy/run/haproxy.pid -f /usr/local/etc/haproxy/haproxy.cfg -Ds
[ALERT] 228/180509 (15) : [/usr/local/sbin/haproxy.main()] Cannot create pidfile /haproxy/run/haproxy.pid
[ayoung@drifloon kubevirt]$ kubectl logs spice-proxy-1193136539-gr7b3
FATAL: Unable to open configuration file: /home/proxy/squid.conf: (13) Permission denied

virt-api runs as a strange user:

1000050+ 14464 14450  0 Aug16 ?  00:00:01 /virt-api --port 8183 --spice-proxy 192.168.122.233:3128

1000050+ is, I am guessing, a UID made up by Kubernetes.

Looks like I am tripping over the fact that OpenShift security policy by default prohibits you from running as known users (thanks, claytonc).

A pull request has been merged for changing these in KubeVirt.

Service Accounts:

oc create serviceaccount -n kube-system kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt

Modify the libvirt and virt-handler manifests like so (this is in the version from the branch above):

     spec:
+       serviceAccountName: kubevirt
       containers:
       - name: virt-handler

and

     spec:
+       serviceAccountName: kubevirt
       hostNetwork: true
       hostPID: true
       hostIPC: true
       securityContext:
         runAsUser: 0

OK, a few more notes: we need manifest changes so that the various resources end up in the kube-system namespace and run as the appropriate kubevirt or kubevirt-admin users. See this pull request.

SecurityContextConstraints

You have to manually apply the permissions.yaml file, then add the service accounts to the SCC to get the daemon pods to schedule:

oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt-admin
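The -z flag is shorthand for a service account in the target namespace; spelled out in full, and with a quick check that the privileged SCC actually picked the accounts up, the same thing looks like:

oc adm policy add-scc-to-user privileged system:serviceaccount:kube-system:kubevirt
oc adm policy add-scc-to-user privileged system:serviceaccount:kube-system:kubevirt-admin
oc get scc privileged -o yaml | grep -A 10 'users:'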

 

You could also run as the default service account and just run:

oc  adm policy add-scc-to-user privileged -n kube-system -z default

But that is not a good long-term strategy.

 

In order to launch a VM, it turns out we need an eth1: the default network setup by the libvirt image assumes it is there. The easiest way to get one is to modify the VM to use a second network card, which requires restarting the cluster. You can also set the name of the interface in config-local.sh to the appropriate network device/connection for your system using the primary_nic value.
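If you go the config-local.sh route instead of adding a second NIC, it is a one-line override; primary_nic is the variable mentioned above, and the value should be whatever your device or connection is actually called (ens3 here is only an example):

# in config-local.sh
primary_nic=ens3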

Disable SELinux

Try to deploy a VM with:

kubectl apply -f cluster/vm.yaml

Check the status using:

 kubectl get vms testvm -o yaml

The VM failed to deploy.

Check the libvirtd container log:

oc logs libvirt-ztt0w -c libvirtd 
2017-08-25 13:03:07.253+0000: 5155: error : qemuProcessReportLogError:1845 : internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/local/bin/qemu-system-x86_64: Permission denied

For now, disable SELinux. In the future, we'll need a custom SELinux policy to allow this.

 sudo setenforce 0
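Note that setenforce 0 only lasts until the next boot; if you want the relaxed mode to survive a reboot while experimenting, set SELINUX=permissive in /etc/selinux/config. Either way you can verify with:

getenforce
# should now report: Permissive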

 

iSCSI vs PXE

Finally, the iscsi pod defined in the manifests trips over a bunch of OpenShift permissions-hardening issues. Prior to working those out, I just wanted to run a PXE-bootable VM, so I copied vm.yaml to vm-pxe.yaml and applied that.
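The mechanics of that were roughly the following; the actual edits inside vm-pxe.yaml (a new name, and network boot in place of the iSCSI disk) depend on the VM spec, so this only sketches the shell steps:

cp cluster/vm.yaml cluster/vm-pxe.yaml
# edit cluster/vm-pxe.yaml: new metadata.name, boot from the network instead of the iSCSI disk
kubectl apply -f cluster/vm-pxe.yaml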

Lesson Learned: SecurityContextConstraints can’t be in manifests.

Using the FOR loop for the manifests won't work long term. We'll need to apply the permissions.yaml file first, then run the two oc commands to add the users to the SCC, and finally run the rest of the manifests. Adding users to SCCs cannot be done via a manifest apply, as it has to modify an existing resource. The ServiceAccounts need to be created prior to any of the daemonset or deployment manifests and added to the SCCs, or the node selection criteria will not be met and no pods will be scheduled.
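Put together, the ordering looks something like this (assuming permissions.yaml sits alongside the other manifests and is what creates the service accounts; adjust paths to your tree):

# 1. Service accounts, roles, and bindings first
kubectl apply -f manifests/permissions.yaml

# 2. SCC membership cannot be expressed as a manifest, so do it imperatively
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt-admin

# 3. Everything else
for MANIFEST in `ls manifests/*yaml | grep -v permissions` ; do kubectl apply -f $MANIFEST ; done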

 

 
