Clear Containers
Clear Containers takes a different track: it proposes to start KVM-based virtual machines as fast as Linux containers. Each container brings its own Linux kernel and does not rely on the popular cgroups and namespaces to isolate the application.
The "Kernel Virtual Machine", popularly known as "KVM", has been part of the Linux kernel for years. The long debate was about kvmtool, a stripped-down userspace tool for KVM that was developed inside a kernel tree under tools/kvm [1]. Linus Torvalds did not want it merged into the mainline kernel, and as far as I can tell it never was; it is now maintained as a standalone project. Its utility, called "lkvm", can be used to start these small virtual machines.
lkvm (Linux KVM Tool)
This native Linux KVM tool was announced at the end of March 2011 [2] and was written by Pekka Enberg, Cyrill Gorcunov, and Asias He. According to the announcement, "The goal of this tool is to provide a clean, from-scratch, lightweight KVM host tool implementation that can boot Linux guest images with no BIOS dependencies and with only the minimal amount of legacy device emulation." At the time of the announcement the tool was still in development and only around 5,000 lines of C code, yet it was already capable of booting a Linux guest image on top of KVM. The tool can be launched with simple CLI options: a userspace image and a path to a bootable Linux kernel bzImage [2].
Virtual Devices (virtio)
Virtio, or virtual I/O, lets virtualized Linux guests share the host's devices. Rather than have a variety of device emulation mechanisms (for network, block, and other drivers), virtio provides a common front end for these device emulations to standardize the interface and increase code reuse across platforms. This design allows a hypervisor to support a common set of emulated devices through a predefined set of APIs. The guest operating system implements front-end drivers, which use the virtio APIs, and the back-end drivers are implemented in the hypervisor [3].
Virtio has interfaces for block, network, PCI, balloon, console and 9p. Most of these are self-explanatory; the one that catches my attention is 9p, or VirtFS. VirtFS is a paravirtualized filesystem interface designed to improve passthrough in the KVM environment. It is based on the virtio framework and uses the 9P protocol. Containers created using Clear Linux use VirtFS, that is the virtio 9p protocol, to expose volumes from the underlying system inside the virtual machine.
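The front-end / back-end split described above can be modelled with a toy example. This is only a sketch of the idea, not the real virtio ring layout or any actual driver API: the class names (ToyVirtqueue, ToyBlockFrontend, ToyBlockBackend) and the request format are invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyVirtqueue:
    """Stand-in for a virtqueue: one queue of requests, one of completions."""
    available: deque = field(default_factory=deque)  # posted by the guest
    used: deque = field(default_factory=deque)       # completed by the host


class ToyBlockBackend:
    """Hypothetical back end: lives in the hypervisor and owns the storage."""

    def __init__(self, queue, sector_count):
        self.queue = queue
        self.sectors = {n: b"" for n in range(sector_count)}

    def process(self):
        # Drain every pending request and post one completion per request.
        while self.queue.available:
            op, sector, payload = self.queue.available.popleft()
            if op == "write":
                self.sectors[sector] = payload
                self.queue.used.append(("ok", None))
            elif op == "read":
                self.queue.used.append(("ok", self.sectors[sector]))
            else:
                self.queue.used.append(("error", None))


class ToyBlockFrontend:
    """Hypothetical front end: the guest driver, which only sees the queue."""

    def __init__(self, queue, backend):
        self.queue = queue
        self.backend = backend

    def _submit(self, request):
        self.queue.available.append(request)
        self.backend.process()  # stands in for the "kick" that notifies the host
        return self.queue.used.popleft()

    def write(self, sector, payload):
        return self._submit(("write", sector, payload))[0]

    def read(self, sector):
        return self._submit(("read", sector, None))[1]


queue = ToyVirtqueue()
disk = ToyBlockFrontend(queue, ToyBlockBackend(queue, sector_count=8))
disk.write(3, b"hello")
print(disk.read(3))
```

The point of the sketch is that the front end never touches the storage directly; it only posts requests to the shared queue, so the same guest-side code could sit in front of any back end that speaks the same queue protocol.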
Direct access (DAX)
Non-volatile memory (NVM) devices promise terabytes of fast persistent storage at close to RAM speeds. It is easy to wrap a block device around part or all of an NVM device. Direct access, or DAX, is a kernel patch set that replaces the older XIP (execute-in-place) code and allows files on directly addressable devices to be mapped straight into user space, bypassing the page cache. Clear Containers use this feature to map the root filesystem image into the virtual machine.
Clear Containers & Docker
The Intel engineers have extended Docker to launch Clear Containers using the docker CLI. This code is not yet part of the main Docker repository, but it can be installed easily by following the instructions on the Clear Linux website [4]. The installation brings in all the components needed to run Clear Containers using the docker CLI. All the commands that normally work with Docker work here as is.
After the installation, verify that the modified version of Docker is the one in use:
'sudo docker -v'
docker version 1.8.1-clear-containers, build d12ea79-clear-containers
Now, try running an Ubuntu container:
'sudo docker run -d ubuntu sleep 5000'
Check that lkvm is running, and look at the parameters with which it is invoked:
lkvm run -c 6 -m 1024 --name 99ed6cf540e3e93baa9a5d452dbeb7df38ebb19441454c16781bc33a33b87c07 --console virtio --kernel /usr/lib/kernel/vmlinux.container --params root=/dev/plkvm0p1 rootfstype=ext4 rootflags=dax,data=ordered init=/usr/lib/systemd/systemd systemd.unit=container.target rw tsc=reliable systemd.show_status=false no_timer_check rcupdate.rcu_expedited=1 console=hvc0 quiet ip=172.17.0.19::172.17.42.1::99ed6cf540e3::off --shmem 0x200000000:0:file=/var/lib/docker/clear-4740-containers.img:private --network mode=tap,script=none,tapif=tb-99ed6cf540e3,guest_mac=02:42:ac:11:00:13 --9p /var/lib/docker/aufs/mnt/99ed6cf540e3e93baa9a5d452dbeb7df38ebb19441454c16781
Some observations:
a) The root file-system is exposed using DAX
b) Mounts from the host filesystem are exposed as VirtFS using the 9p protocol
c) The init system is 'systemd'
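These observations can also be read off the command line programmatically. The following is a minimal sketch that assumes the invocation has been copied into a string (for example from ps output); the helper name kernel_params is made up for this example, and the sample string is a shortened copy of the invocation above, with the long container IDs and the truncated --9p path left out.

```python
import shlex

# Shortened copy of the invocation shown above; the --name, --shmem and
# truncated --9p arguments are left out on purpose.
CMDLINE = (
    "lkvm run -c 6 -m 1024 --console virtio "
    "--kernel /usr/lib/kernel/vmlinux.container "
    "--params root=/dev/plkvm0p1 rootfstype=ext4 rootflags=dax,data=ordered "
    "init=/usr/lib/systemd/systemd systemd.unit=container.target rw "
    "console=hvc0 quiet "
    "--network mode=tap,script=none,tapif=tb-99ed6cf540e3,guest_mac=02:42:ac:11:00:13"
)


def kernel_params(cmdline):
    """Return the guest kernel parameters that follow --params as a dict."""
    tokens = shlex.split(cmdline)
    start = tokens.index("--params") + 1
    params = {}
    for token in tokens[start:]:
        if token.startswith("--"):  # next lkvm option: parameter list is over
            break
        key, _, value = token.partition("=")
        params[key] = value  # bare flags such as "rw" map to an empty string
    return params


params = kernel_params(CMDLINE)
print("rootflags:", params["rootflags"])
print("init:", params["init"])
```

The rootflags entry carries the dax flag from observation a), and the init entry points at systemd as in observation c).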
References
[1] https://github.com/penberg/linux-kvm/tree/master/tools/kvm & https://lkml.org
[2] https://lkml.org/lkml/2011/3/31/406
[3] http://www.ibm.com/developerworks/library/l-virtio/
[4] https://clearlinux.org/blogs/clear-containers-docker-engine