KubeStack: dynamic Jenkins slaves using Kubernetes

In this article I will present KubeStack, a Python daemon and command line tool to spin up dynamic Jenkins slaves using Kubernetes:

https://github.com/kubestack/kubestack

How to install it

KubeStack is currently a POC, so it is not yet published as a package. To install it, clone the repository from the URL above. The KubeStack project is divided into three directories:

ansible: a set of scripts and documentation explaining how to set up a Kubernetes cluster on OpenStack
images: Dockerfile and scripts used to generate Jenkins slaves
app: KubeStack application to interact with Kubernetes and Jenkins

To install the KubeStack application, go to the app/kubestack directory and install the requirements listed in the requirements.txt file.
You can then run the daemon with:
python kubestack/cmd/kubestackd.py (add -d if you do not want it to run as a daemon).

A CLI tool is also available at kubestack/cmd/kubestackcmd.py

How to configure

KubeStack relies on a configuration file, which lives at /etc/kubestack/config.yaml
The file has the following format:

demand-listeners:
    - name: jenkins-queue
      type: jenkins_queue
destroy-listeners:
    - name: zmq
      type: zmq
      host: zmq_host
      port: 8888
jenkins:
    external_url: 'http://canonical_jenkins_url'
    internal_url: 'http://internal_jenkins_url'
    user: jenkins_user
    pass: jenkins_pass
kubernetes:
    url: 'http://kubernetes_master_url:8080'
    api_key: kubernetes_key
labels:
    - name: dummy-image
      image: yrobla/jenkins-slave-swarm-infra
      cpu: 250m
      memory: 512Mi
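
Before starting the daemon it can help to check that the file contains the sections shown above. The following is only a minimal sketch: it works on the dictionary that a YAML parser would return for this file, and both the function name `missing_sections` and the decision to treat every section as required are my assumptions for illustration, not KubeStack behaviour.

```python
# Sections that appear in the example config.yaml above.
# ASSUMPTION: all of them are treated as required here.
EXPECTED_SECTIONS = (
    "demand-listeners",
    "destroy-listeners",
    "jenkins",
    "kubernetes",
    "labels",
)


def missing_sections(config):
    """Return the expected top-level sections that are absent from config."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


if __name__ == "__main__":
    # A partial config, as a parsed dictionary, to show the check.
    partial = {"jenkins": {"user": "jenkins_user"}, "labels": []}
    print(missing_sections(partial))
```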

demand-listeners

KubeStack uses configurable listeners to track the demand for slaves. Two types of listener are currently available: jenkins_queue and gearman.
The jenkins_queue listener needs no extra configuration; it uses the settings from the jenkins section to watch the Jenkins queue and detect demand.
The gearman listener needs host and port settings that point to the Gearman server holding the job demand.
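
As a rough illustration of what a demand listener has to compute, the sketch below counts how many slaves are requested per label. The queue item shape (a dictionary with a `label` key) and the function name `count_demand` are hypothetical and simplified for this example; they are not taken from KubeStack or from the Jenkins API.

```python
from collections import Counter


def count_demand(queue_items, known_labels):
    """Count queued jobs per label, ignoring labels that are not configured.

    queue_items: iterable of dicts with a hypothetical 'label' key.
    known_labels: label names from the 'labels' section of config.yaml.
    """
    known = set(known_labels)
    demand = Counter()
    for item in queue_items:
        label = item.get("label")
        if label in known:
            demand[label] += 1
    return dict(demand)


if __name__ == "__main__":
    queue = [
        {"job": "unit-tests", "label": "dummy-image"},
        {"job": "docs", "label": "dummy-image"},
        {"job": "other", "label": "unconfigured-label"},
    ]
    print(count_demand(queue, ["dummy-image"]))
```

Jobs that ask for a label missing from the labels section are skipped here, since there would be no image to start a slave from.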

destroy-listeners

KubeStack uses configurable listeners to detect job completion, so that it can disconnect and destroy Jenkins slaves when a job finishes.
Currently only the zmq listener is available. It interacts with the Jenkins ZMQ plugin (https://github.com/openstack-infra/zmq-event-publisher). This plugin publishes the status of jobs to a ZMQ queue, and KubeStack listens to that queue to learn about job completion and react to it.
To configure the zmq listener, only the host and port of the ZMQ publisher need to be provided.
In the future, more destroy-listeners will be added.
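
A destroy listener mainly has to recognise the "job finished" event and work out which slave ran the job. The sketch below parses a single message, assuming it consists of a topic word followed by a JSON payload, with the slave name stored under `build.node_name`. That layout and the `onFinalized` topic are assumptions based on how I understand the plugin's events, so check them against the plugin before relying on this. The function names are hypothetical.

```python
import json


def parse_event(message):
    """Split a 'topic {json}' message into (topic, job name, node name).

    ASSUMPTION: the message layout and the 'name' / 'build.node_name'
    fields are assumed for illustration, not confirmed by this article.
    """
    topic, payload = message.split(None, 1)
    data = json.loads(payload)
    build = data.get("build", {})
    return topic, data.get("name"), build.get("node_name")


def slave_to_destroy(message):
    """Return the slave name if the message reports a finished job, else None."""
    topic, _job, node = parse_event(message)
    if topic == "onFinalized":  # assumed topic name for a finished job
        return node
    return None
```

Keeping the parsing separate from the socket handling makes it possible to test the listener logic without a running ZMQ publisher.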

jenkins

This section defines the settings for the Jenkins master where all the jobs will run and where the Kubernetes-backed Jenkins slaves will be attached. The following settings need to be provided:

external_url -> Jenkins URL, used to interact with the Jenkins API
internal_url -> internal Jenkins URL (if it is different from the external one), which the Jenkins slaves use to connect to the Jenkins master
user -> username of the Jenkins user, used to connect to the Jenkins API
pass -> password of the Jenkins user, used to connect to the Jenkins API

kubernetes

This section defines the settings needed to interact with the Kubernetes API: the URL and the API key used to connect to it.

labels

This section defines all the labels (image types) that the system will use. Jenkins lets you define labels that restrict where certain jobs can run (for example, images of different operating systems or different flavors).
Here we define the same labels that Jenkins needs. Each label is defined by:

name -> name of the label, which needs to match the Jenkins label name
image -> name of the Docker image used by this label. It needs to be based on a Jenkins Swarm slave image
cpu and memory (optional) -> if no resource constraints are defined, Kubernetes will spin up pods without constraints, which can affect performance. For a Jenkins slave, it is better to define the minimum cpu and memory needed to guarantee proper performance on each slave.
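
To make the mapping from a label entry to a pod more concrete, here is a minimal sketch that builds a pod manifest as a plain dictionary. It uses the current Kubernetes v1 Pod format, which may differ from the API version KubeStack targets. Mapping cpu and memory to resource requests, the container name, the `jenkins-label` metadata key and the function name `build_pod_manifest` are all my assumptions for illustration.

```python
import json


def build_pod_manifest(label, pod_name):
    """Build a Kubernetes v1 Pod manifest (as a dict) for one label entry.

    label: one entry of the 'labels' list, for example
           {"name": ..., "image": ..., "cpu": ..., "memory": ...}
    pod_name: hypothetical unique name chosen by the caller.
    """
    container = {"name": "jenkins-slave", "image": label["image"]}

    # cpu and memory are optional; only add the keys that are present.
    # ASSUMPTION: they are mapped to resource requests (a guaranteed minimum).
    requests = {key: label[key] for key in ("cpu", "memory") if key in label}
    if requests:
        container["resources"] = {"requests": requests}

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name, "labels": {"jenkins-label": label["name"]}},
        "spec": {"containers": [container], "restartPolicy": "Never"},
    }


if __name__ == "__main__":
    entry = {
        "name": "dummy-image",
        "image": "yrobla/jenkins-slave-swarm-infra",
        "cpu": "250m",
        "memory": "512Mi",
    }
    print(json.dumps(build_pod_manifest(entry, "dummy-image-1"), indent=2))
```

When a label omits cpu and memory, the manifest has no resources key at all, which matches the unconstrained behaviour described above.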

How does it work

KubeStack is a daemon that interacts with Kubernetes and Jenkins. It listens for demand for Jenkins slaves, creates those slaves on a Kubernetes cluster, and attaches them to Jenkins.
Each time a new job is started on Jenkins, demand for a specific label is generated.

Labels can be associated with a given job using the NodeLabel Parameter plugin (https://wiki.jenkins-ci.org/display/JENKINS/NodeLabel+Parameter+Plugin).

Once KubeStack is aware of that demand, it creates a pod with the image specified in config.yaml and with the cpu and memory restrictions specified there.
The pod attaches to Jenkins using the Swarm plugin (https://wiki.jenkins-ci.org/display/JENKINS/Swarm+Plugin), which connects it as a slave and makes it available to run jobs.

Once Jenkins sees that the slave is online, it executes the job that requested it.
Jenkins publishes the status of the job (started, finished...) on the ZMQ queue. KubeStack listens to that queue, and once the job has finished it disconnects the slave and destroys the pod, making room for more slaves.
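
The whole cycle can be summarised with a small in-memory simulation. This is only a sketch of the flow described above, not KubeStack code: `FakeCluster` stands in for the Kubernetes cluster, and all class and function names are hypothetical.

```python
class FakeCluster:
    """In-memory stand-in for a Kubernetes cluster (hypothetical)."""

    def __init__(self):
        self.pods = {}  # pod name -> label name
        self._counter = 0

    def create_pod(self, label):
        self._counter += 1
        name = "%s-%d" % (label, self._counter)
        self.pods[name] = label
        return name

    def delete_pod(self, name):
        self.pods.pop(name, None)


def handle_demand(cluster, demand):
    """Create one pod per requested slave; demand maps label -> count."""
    created = []
    for label, count in sorted(demand.items()):
        for _ in range(count):
            created.append(cluster.create_pod(label))
    return created


def handle_job_finished(cluster, slave_name):
    """Destroy the pod behind a slave once its job has finished."""
    cluster.delete_pod(slave_name)


if __name__ == "__main__":
    cluster = FakeCluster()
    slaves = handle_demand(cluster, {"dummy-image": 2})
    print("running:", sorted(cluster.pods))
    handle_job_finished(cluster, slaves[0])
    print("after one job finished:", sorted(cluster.pods))
```

In the real daemon the demand would come from a demand listener and the finished-job notification from a destroy listener, and the pod would register itself in Jenkins through the Swarm plugin.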

KubeStack in the future

This project is mostly a POC at the moment. More work is needed on configuration, new listeners and reliability.
The following ideas are on the roadmap:

Scale: KubeStack currently relies on a fixed-size Kubernetes cluster, which makes it difficult to scale. The idea is to monitor the load of the Kubernetes cluster and add or remove minions as needed.
Jenkins multi-master: currently only one Jenkins master is supported, so you need to run a separate daemon for each master. Multi-master support is planned for the future.
Configure Jenkins slave connection: currently the only way to attach Jenkins slaves is through the Swarm plugin. In the future, more ways of adding Jenkins slaves will be created, so that they can be configured flexibly.
More demand and destroy listeners: currently we are limited to a small set of demand listeners (gearman, jenkins_queue) and destroy listeners (zmq). Jenkins has a wide range of plugins that can provide demand and job notifications, so more listeners should be added to support them.
Not only Jenkins: KubeStack relies on Jenkins as the platform that runs jobs and has slaves attached to it. But there are other ways to execute jobs (Ansible, Travis CI... or any custom system you need). KubeStack should support these systems and be flexible enough to add new ones on demand.

Want to contribute?

Any ideas for improvement and any contributions are welcome. Please reach me by email (info@ysoft.biz) or on IRC (yolanda on irc.freenode.net) with any comments.

Credits

The structure of KubeStack was mostly inspired by Nodepool (http://docs.openstack.org/infra/system-config/nodepool.html). Thanks to #openstack-infra for all the knowledge and expertise they provided to help create this project.
