How to use Kata Containers with virtiofs

Update for Kata Containers 1.7 and later

This HowTo is obsolete as of Kata Containers 1.7. Mainline Kata Containers now includes virtiofs, which can be enabled as described in the official Kata Containers documentation, so it is no longer necessary to build from the virtiofs repositories.

Overview

This document describes how to set up Kata Containers with virtiofs. Container images will be exposed to the sandbox VM using virtiofs.

Use this guide if you wish to test or benchmark Kata Containers workloads. If you only want to develop and debug virtiofs itself, it is easier to do so standalone without Kata because fewer components are involved. A guide for manually running QEMU with virtiofs is available here.

Kata Containers is an OCI runtime that runs containers inside a virtual machine for better isolation. Docker and Kubernetes (CRI-O) can be configured to launch Kata Containers instead of their default OCI runtimes.

Each VM is called a sandbox. In Kubernetes, a sandbox can be thought of as the same thing as a pod.

Containers run inside sandboxes. Each sandbox contains an agent process that communicates with the Kata runtime on the host. Commands include creating new containers, executing processes inside containers, querying sandbox information, etc.

Prerequisites

Components

The following components need to be built:

  1. A guest kernel with virtiofs support
  2. A QEMU with virtiofs support
  3. The example virtiofs daemon (virtiofsd)
  4. Kata Containers with virtiofs support
  5. A Kata Containers initramfs with virtiofs support

The instructions assume that you already have a Linux host on which you can build and run the components.
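
Before starting, it can help to confirm that the host has the tools the later steps invoke. The following is a minimal sketch, not part of the original instructions: missing_tools is a hypothetical helper, and the tool list in the example comment is an assumption based on the commands used in this guide.

```shell
# Print each command from the argument list that is not found in PATH.
# Prints nothing and returns 0 when every command is available;
# returns 1 when at least one command is missing.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example (the tool list is an assumption; adjust it to your distribution):
#   missing_tools git make gcc go wget docker
```

Any name the example prints needs to be installed before continuing.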

The guest kernel

Note that an upstream Linux 5.4 kernel or later can be used as long as the DAX feature is not used.

On the host, download the virtiofs kernel tree and switch to its development branch:

git clone https://gitlab.com/virtio-fs/linux.git
cd linux
git checkout virtio-fs-dev

Configure and build this kernel with the following .config file:

wget -O .config https://gitlab.com/virtio-fs/linux/snippets/1846957/raw
make -j$(nproc)
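
A quick way to confirm that the downloaded .config actually enables virtiofs is to look for the relevant options before building. This is a sketch under assumptions: kconfig_enabled is a hypothetical helper, and CONFIG_FUSE_FS and CONFIG_VIRTIO_FS are the option names used by upstream Linux 5.4 and later, which the virtio-fs-dev branch is assumed to share.

```shell
# Return 0 if the given option is set to y or m in a kernel .config file.
# The second argument defaults to .config in the current directory.
kconfig_enabled() {
    local option="$1" config="${2:-.config}"
    grep -Eq "^${option}=(y|m)\$" "$config"
}

# Example, run from the kernel tree after fetching .config
# (option names are assumed to match upstream Linux):
#   for opt in CONFIG_FUSE_FS CONFIG_VIRTIO_FS; do
#       kconfig_enabled "$opt" || echo "warning: $opt is not enabled"
#   done
```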

Building QEMU

On the host, download and build the virtiofs QEMU tree:

git clone https://gitlab.com/virtio-fs/qemu.git
cd qemu
./configure --target-list=x86_64-softmmu
make -j$(nproc)

In the same tree, build the virtiofs daemon:

make -j$(nproc) virtiofsd

Building Kata Containers

Setting up environment variables

Decide where you want Go to put source code and packages:

export GOPATH=$HOME/go

Decide where you want to build the sandbox root filesystem:

export ROOTFS_DIR=$GOPATH/src/github.com/kata-containers/osbuilder/rootfs-builder/rootfs-ClearLinux

Cloning Kata Containers repositories

Kata consists of several components, each with its own git repository. You need them all.

go get -d -u github.com/kata-containers/runtime
go get -d -u github.com/kata-containers/agent
go get -d -u github.com/kata-containers/osbuilder
go get -d -u github.com/kata-containers/proxy
go get -d -u github.com/kata-containers/shim

Ensure you are on Kata 1.6.1:

(cd $GOPATH/src/github.com/kata-containers/agent && git checkout 1.6.1)
(cd $GOPATH/src/github.com/kata-containers/osbuilder && git checkout 1.6.1)
(cd $GOPATH/src/github.com/kata-containers/proxy && git checkout 1.6.1)
(cd $GOPATH/src/github.com/kata-containers/shim && git checkout 1.6.1)
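
The four checkouts above follow one pattern, so they can also be written as a loop. The sketch below assumes the repositories live where go get placed them under GOPATH; kata_repo_dir and kata_checkout are hypothetical helper names introduced only for illustration.

```shell
# Print the directory that `go get` uses for a Kata component.
# Falls back to $HOME/go when GOPATH is unset.
kata_repo_dir() {
    printf '%s/src/github.com/kata-containers/%s\n' "${GOPATH:-$HOME/go}" "$1"
}

# Check out the same tag in each listed component.
# Stops and returns 1 at the first checkout that fails.
kata_checkout() {
    local version="$1" component
    shift
    for component in "$@"; do
        git -C "$(kata_repo_dir "$component")" checkout "$version" || return 1
    done
}

# Example:
#   kata_checkout 1.6.1 agent osbuilder proxy shim
```

The runtime is deliberately left out of the example because it is switched to the virtiofs branch in the next step.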

Use the virtiofs repositories for Kata components that require virtiofs integration:

cd $GOPATH/src/github.com/kata-containers/runtime
git remote add virtio-fs https://gitlab.com/virtio-fs/runtime.git
git fetch virtio-fs
git checkout virtio-fs/virtio-fs
cd -

Building the runtime

The runtime presents an OCI-compliant runtime interface to Docker and CRI-O. This is where sandbox setup happens and containers are orchestrated.

cd $GOPATH/src/github.com/kata-containers/runtime/
make

Building the proxy

The proxy forwards communications between the host and the sandbox.

cd $GOPATH/src/github.com/kata-containers/proxy
make

Building the shim

The shim is a placeholder process on the host that forwards terminal I/O and signals to the actual container process running inside the sandbox.

cd $GOPATH/src/github.com/kata-containers/shim
make

Building the Kata Containers initramfs

Build the Clear Linux root file system:

cd $GOPATH/src/github.com/kata-containers/osbuilder/rootfs-builder
./rootfs.sh clearlinux

Now build the sandbox initramfs:

cd $GOPATH/src/github.com/kata-containers/osbuilder/initrd-builder
./initrd_builder.sh $ROOTFS_DIR

Configuring Kata

The Kata configuration file controls the behavior of all Kata components.

wget -O /etc/kata-containers/configuration.toml https://gitlab.com/virtio-fs/runtime/snippets/1846963/raw

Set the following [hypervisor.qemu] variables:

Set the following [proxy.kata] variables:

Set the following [shim.kata] variables:

Also set all enable_debug variables to true for verbose output.
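
Turning on every enable_debug setting by hand is error-prone, so a small filter can do it. This sketch assumes the configuration file contains lines of the form enable_debug = ... (possibly commented out with a leading #), as the stock Kata configuration does; enable_all_debug is a hypothetical helper name.

```shell
# Rewrite every enable_debug assignment, commented out or not, to
# "enable_debug = true". The result goes to stdout, so the input file
# is left untouched and the output can be reviewed before installing it.
enable_all_debug() {
    sed -E 's/^[[:space:]]*#?[[:space:]]*enable_debug[[:space:]]*=.*$/enable_debug = true/' "$1"
}

# Example:
#   enable_all_debug /etc/kata-containers/configuration.toml > /tmp/configuration.toml
#   diff /etc/kata-containers/configuration.toml /tmp/configuration.toml
```

Writing to a separate file first keeps a working configuration in place until the changes have been checked.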

Add the Kata OCI runtime to Docker:

# mkdir -p /etc/systemd/system/docker.service.d
# cat >/etc/systemd/system/docker.service.d/kata-containers.conf
[Service]
Type=simple
ExecStart=
ExecStart=/usr/bin/dockerd-current -D \
          --add-runtime oci=/usr/libexec/docker/docker-runc-current \
          --add-runtime kata-runtime=YOUR_GOPATH/src/github.com/kata-containers/runtime/kata-runtime \
          --default-runtime=oci \
          --containerd /run/containerd.sock \
          --userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
          --init-path=/usr/libexec/docker/docker-init-current \
          $OPTIONS \
          $DOCKER_STORAGE_OPTIONS \
          $DOCKER_NETWORK_OPTIONS \
          $ADD_REGISTRY \
          $BLOCK_REGISTRY \
          $INSECURE_REGISTRY \
          $REGISTRIES
^D
# sed -i "s%YOUR_GOPATH%$GOPATH%g" /etc/systemd/system/docker.service.d/kata-containers.conf
# systemctl daemon-reload
# systemctl restart docker
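
After the restart, it is worth checking that Docker picked up the new runtime. The sketch below relies on the Runtimes: line that docker info prints; runtime_listed is a hypothetical helper that reads that output on stdin.

```shell
# Read `docker info` output on stdin and succeed if the named runtime
# appears on the "Runtimes:" line.
runtime_listed() {
    grep -i 'Runtimes:' | grep -qw -- "$1"
}

# Example:
#   docker info 2>/dev/null | runtime_listed kata-runtime \
#       && echo "kata-runtime is registered"
```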

Ensure there are enough hugepages available on the host for the Kata sandbox VM. The default sandbox RAM size is 2 GB, so reserve 1024 hugepages of 2 MB each (1024 x 2 MB = 2 GB). This is necessary since QEMU is launched with the shared memory option (share=on) required by vhost-user.

sysctl vm.nr_hugepages=1024
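
The hugepage count depends on the sandbox RAM size and on the host's hugepage size, so it can be computed rather than hard-coded. This is a sketch: hugepages_needed and host_hugepage_kib are hypothetical helpers, and the second one assumes the Hugepagesize: line that Linux exposes in /proc/meminfo.

```shell
# Number of hugepages needed to back mem_mib MiB of guest RAM when each
# hugepage is page_kib KiB, rounded up to a whole page.
hugepages_needed() {
    local mem_mib="$1" page_kib="$2"
    echo $(( (mem_mib * 1024 + page_kib - 1) / page_kib ))
}

# Hugepage size of the running host in KiB, read from /proc/meminfo.
host_hugepage_kib() {
    awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo
}

# Example for the default 2 GB sandbox:
#   sysctl vm.nr_hugepages="$(hugepages_needed 2048 "$(host_hugepage_kib)")"
```

With 2 MB hugepages, hugepages_needed 2048 2048 yields 1024, matching the value used above.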

Running containers

Launch a container:

docker run --runtime kata-runtime -it busybox sh
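
One rough way to confirm that the container really runs inside a sandbox VM is to compare kernel releases, since a Kata container sees the guest kernel rather than the host kernel. This is only a heuristic sketch: runs_in_vm is a hypothetical helper, and the check is inconclusive if the host itself boots the same kernel build as the guest.

```shell
# Succeed only when a guest kernel release was reported and it differs
# from the host kernel release.
runs_in_vm() {
    local host_kernel="$1" guest_kernel="$2"
    [ -n "$guest_kernel" ] && [ "$host_kernel" != "$guest_kernel" ]
}

# Example:
#   runs_in_vm "$(uname -r)" \
#       "$(docker run --rm --runtime kata-runtime busybox uname -r)" \
#       && echo "container is using a separate guest kernel"
```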