Now that I have a functional OpenShift Origin built from source, I need to deploy KubeVirt on top of it.
Here are my notes. They are rough, and not production quality yet, but they should get you started.
Prerequisites
As I said in that last post, in order to build KubeVirt I had to upgrade to a later version of Go (the RPM had 1.6; I now have 1.8).
Docker
In order to build the manifests with specific versions, I overrode some config options, as I described here.
Specifically, I used docker_tag=devel to make sure I didn't accidentally download the released versions from Docker Hub, and I also set the master_ip address.
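For reference, my config-local.sh overrides looked roughly like this (a sketch; the variable names are the ones the KubeVirt build scripts read, and the IP is my host's):

# config-local.sh overrides (sketch)
docker_tag=devel          # tag local builds as :devel so the released images are never pulled
master_ip=192.168.122.233 # the address of my OpenShift host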
To generate the Docker images and manifests:
make docker
make manifests
Config Changes
In order to make the configuration changes and have them stick:
oc cluster down
Edit the master config:

sudo vi /var/lib/origin/openshift.local.config/master/master-config.yaml
Under networkConfig: (which contains clusterNetworkCIDR: 10.128.0.0/14), add:

externalIPNetworkCIDRs: ["0.0.0.0/0"]
Then bring the cluster up with:
oc cluster up --use-existing-config --loglevel=5 --version=413eb73 --host-data-dir=/var/lib/origin/etcd/ 2>&1 | tee /tmp/oc.log
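Once it comes up, a quick sanity check (oc cluster status reports whether the cluster is running and where the console is):

oc cluster status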
Networking
I got an error saying:

Ensure that access to ports tcp/8443, udp/53 and udp/8053 is allowed on 192.168.122.233.
[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=8443/tcp
[sudo] password for ayoung:
[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=8053/udp
success
[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=53/udp
Those won't persist as-is, so:
[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=8443/tcp
success
[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=53/udp
success
[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=8053/udp
success
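To double-check that the permanent rules stuck, firewall-cmd can list the ports for the zone:

sudo firewall-cmd --permanent --zone=public --list-ports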
Log in as admin
Redeploy the cluster, and then:
$ oc login -u system:admin
Logged into "https://127.0.0.1:8443" as "system:admin" using existing credentials.

You have access to the following projects and can switch between them with 'oc project <projectname>':

    default
    kube-public
    kube-system
  * myproject
    openshift
    openshift-infra

Using project "myproject".

[ayoung@drifloon kubevirt]$ oc project kube-system
Now using project "kube-system" on server "https://127.0.0.1:8443".
Deploying Manifests
You need updated manifests from this branch.
To apply all of the manifests:
$ for MANIFEST in `ls manifests/*yaml` ; do kubectl apply -f $MANIFEST ; done
message: 'No nodes are available that match all of the following predicates:: MatchNodeSelector (1).'
reason: Unschedulable
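That message/reason pair comes from the pods' status conditions; something like this will surface it (the grep is just to narrow the output):

kubectl get pods -o yaml | grep -B 2 Unschedulable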
The head node is not schedulable, so make it schedulable:
oc adm manage-node localhost --schedulable=true
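You can confirm with oc get nodes; the node should no longer be listed as SchedulingDisabled:

oc get nodes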
And now… tear the manifests down:
for MANIFEST in `ls manifests/*yaml` ; do kubectl delete -f $MANIFEST ; done
Wait a bit, then reapply:
for MANIFEST in `ls manifests/*yaml` ; do kubectl apply -f $MANIFEST ; done
$ kubectl get pods
NAME                                     READY     STATUS             RESTARTS   AGE
haproxy-858199412-m78n5                  0/1       CrashLoopBackOff   8          16m
kubevirt-cockpit-demo-4250553349-gm8qm   0/1       CrashLoopBackOff   224        18h
spice-proxy-1193136539-gr7b3             0/1       CrashLoopBackOff   225        18h
virt-api-4068750737-j7bwj                1/1       Running            0          18h
virt-controller-3722000252-bsbsr         1/1       Running            0          18h
Why are those three crashing? Permissions.
$ kubectl logs kubevirt-cockpit-demo-4250553349-gm8qm
cockpit-ws: /etc/cockpit/ws-certs.d/0-self-signed.cert: Failed to open file '/etc/cockpit/ws-certs.d/0-self-signed.cert': Permission denied

kubectl logs haproxy-858199412-hqmk1
<7>haproxy-systemd-wrapper: executing /usr/local/sbin/haproxy -p /haproxy/run/haproxy.pid -f /usr/local/etc/haproxy/haproxy.cfg -Ds
[ALERT] 228/180509 (15) : [/usr/local/sbin/haproxy.main()] Cannot create pidfile /haproxy/run/haproxy.pid

[ayoung@drifloon kubevirt]$ kubectl logs spice-proxy-1193136539-gr7b3
FATAL: Unable to open configuration file: /home/proxy/squid.conf: (13) Permission denied
virt-api runs as a strange user:
1000050+ 14464 14450 0 Aug16 ? 00:00:01 /virt-api --port 8183 --spice-proxy 192.168.122.233:3128
1000050+ is, I am guessing, a UID made up by Kubernetes.
It looks like I am tripping over the fact that OpenShift security policy by default prohibits you from running as known users (thanks, claytonc).
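As an aside, that UID comes from the per-project range OpenShift allocates; you can see it in the namespace annotations (openshift.io/sa.scc.uid-range), for example:

oc get namespace kube-system -o yaml | grep sa.scc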
A pull request has been merged to change these in KubeVirt.
Service Accounts:
oc create serviceaccount -n kube-system kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
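To verify the service account was actually added, the privileged SCC's users list should now include it:

oc get scc privileged -o yaml | grep kubevirt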
Modify the libvirt and virt-handler manifests like so (this is in the version from the branch above):
 spec:
+  serviceAccountName: kubevirt
   containers:
     - name: virt-handler
and
 spec:
+  serviceAccountName: kubevirt
   hostNetwork: true
   hostPID: true
   hostIPC: true
   securityContext:
     runAsUser: 0
OK, a few more notes: we need manifest changes so that the various resources end up in the kube-system namespace, as well as run as the appropriate kubevirt or kubevirt-admin users. See this pull request.
SecurityContextConstraints
You have to manually apply the permissions.yaml file, then add the service accounts to the SCC to get the daemon pods to schedule:
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt-admin
You could also run as the default serviceAccount user and just run:
oc adm policy add-scc-to-user privileged -n kube-system -z default
But that is not a good long-term strategy.
In order to launch a VM, it turns out we need an eth1: the default network set up by the libvirt image assumes it is there. The easiest way to get one is to modify the VM to use a second network card, which requires restarting the cluster. You can also set the name of the interface in config-local.sh to the appropriate network device/connection for your system using the primary_nic value.
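For example, in config-local.sh (eth1 here is a placeholder; use whatever device or connection name your host actually has):

primary_nic=eth1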
Disable SELinux
Try to deploy a VM with:
kubectl apply -f cluster/vm.yaml
Check the status using:
kubectl get vms testvm -o yaml
The VM failed to deploy.
Check the libvirt container log:
oc logs libvirt-ztt0w -c libvirtd
2017-08-25 13:03:07.253+0000: 5155: error : qemuProcessReportLogError:1845 : internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/local/bin/qemu-system-x86_64: Permission denied
For now, disable SELinux. In the future, we'll need a custom SELinux policy to allow this.
sudo setenforce 0
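setenforce only affects the running system; to keep SELinux permissive across reboots you would also change /etc/selinux/config, for example:

sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config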
iSCSI vs PXE
Finally, the iSCSI pod defined in the manifests trips over a bunch of OpenShift permissions-hardening issues. Prior to working those out, I just wanted to run a PXE-bootable VM, so I copied vm.yaml to vm-pxe.yaml and applied that.
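Roughly (a sketch; the exact edits needed to make the VM PXE boot depend on the KubeVirt VM spec of the day):

cp cluster/vm.yaml cluster/vm-pxe.yaml
# edit vm-pxe.yaml: give the VM a new name and switch its boot device to the network
kubectl apply -f cluster/vm-pxe.yaml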
Lesson Learned: SecurityContextConstraints can’t be in manifests.
Using the for loop over the manifests won't work long term. We'll need to apply the permissions.yaml file first, then run the two oc commands to add the users to the SCCs, and finally apply the rest of the manifests. Adding users to SCCs cannot be done via a manifest apply, as it has to modify an existing resource. The ServiceAccounts need to be created and added to the SCCs prior to any of the DaemonSet or Deployment manifests, or the node selection criteria will not be met and no pods will be scheduled.
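A rough sketch of that ordering (assuming permissions.yaml lives alongside the other manifests and creates the kubevirt and kubevirt-admin service accounts):

kubectl apply -f manifests/permissions.yaml
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt-admin
for MANIFEST in manifests/*.yaml ; do
    # permissions.yaml was already applied above
    [ "$MANIFEST" = "manifests/permissions.yaml" ] && continue
    kubectl apply -f $MANIFEST
done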