Porting Linux to Nabla Containers
This post introduces a port of the Linux Kernel Library (LKL) to Nabla Containers.
runnc is an OCI runtime that runs process-level isolated unikernels. It is built on top of Solo5, a sandbox for unikernels on which several unikernels (MirageOS, IncludeOS, Rumprun) run. The original runnc uses Rumprun, a NetBSD-based unikernel. However, since Docker originated on Linux, containers need system-call-level compatibility with Linux. Therefore, I ported the Linux Kernel Library (LKL) and musl libc to Solo5 and integrated them with runnc.
frankenlibc on Solo5
frankenlibc is a set of tools for running Rump unikernels in various environments. There is a fork of it that adds LKL and some libraries. I took this frankenlibc fork and added Solo5 platform support.
Building frankenlibc
Clone the repository and check out the solo5 branch.
$ git clone https://github.com/retrage/frankenlibc.git
$ cd frankenlibc
$ git checkout solo5
Clone the full Solo5 repository to avoid a build failure, then update the submodules.
$ git clone https://github.com/Solo5/solo5.git
$ git submodule update --init
Apply some patches.
$ for file in `find patches/solo5/ -maxdepth 1 -type f` ; do patch -p1 < $file ; done
Finally, run the build script.
$ ./build.sh -k linux notests solo5
After a successful build, you can find the libraries and toolchain wrappers in the rump directory.
Testing
Even if notests is specified, build.sh builds simple tests into rumpobj/tests.
Create a tap device named tap100.
$ sudo ip tuntap add tap100 mode tap
$ sudo ip addr add 10.0.0.1/24 dev tap100
$ sudo ip link set dev tap100 up
Create a disk image named disk.img.
Because LKL/frankenlibc creates directories on initialization, some operations fail if a read-only ISO image is used. To avoid this issue, we use an Ext4 file system image.
$ dd if=/dev/zero of=disk.img bs=1024 count=20480
$ mkfs.ext4 -F disk.img
Note that Solo5 requires an application manifest at build time, which is embedded in the unikernel binary. In the current frankenlibc Solo5 support, the manifest is common across binaries and specifies a rootfs block device and a tap network device. We have to provide these devices even if the application does not use them.
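To make the shape of that shared manifest concrete, here is a minimal Go sketch that prints a manifest declaring the two devices. The struct and function names are my own for illustration, and the JSON layout follows the Solo5 manifest format as I understand it; the actual manifest file in the frankenlibc fork may differ in detail.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// device is one entry in the manifest's device list (illustrative type).
type device struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

// manifest mirrors the assumed layout of a Solo5 application manifest.
type manifest struct {
	Type    string   `json:"type"`
	Version int      `json:"version"`
	Devices []device `json:"devices"`
}

// commonManifest returns a manifest with one block device named "rootfs"
// and one network device named "tap", matching the names used in this post.
func commonManifest() manifest {
	return manifest{
		Type:    "solo5.manifest",
		Version: 1,
		Devices: []device{
			{Name: "rootfs", Type: "BLOCK_BASIC"},
			{Name: "tap", Type: "NET_BASIC"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(commonManifest(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The device names matter because the tender refers to them on its command line, which is why rootfs and tap show up again in the rexec invocation.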
Run the hello test.
$ RUMP_VERBOSE=1 ./rump/bin/rexec rumpobj/tests/hello rootfs:disk.img tap:tap100
On the Linux platform, rexec provides a seccomp-based sandbox environment for unikernels, similar to Solo5’s tenders. On the Solo5 platform, it is just a shell script wrapper for the spt tender.
LKL Nabla Containers
Now, it’s time to integrate with Nabla Containers. Since the original runnc imports an older version of Solo5, I updated it and adapted the runnc code base.
Updating Supplied Arguments
Below is the original code that creates the arguments for the Solo5 tender.
var args []string
if mac != "" {
	args = []string{r.NablaRunBin,
		"--x-exec-heap",
		"--mem=" + strconv.FormatInt(r.Memory, 10),
		"--net-mac=" + mac,
		"--net=" + r.Tap,
		"--disk=" + disk,
		r.UniKernelBin,
		unikernelArgs}
} else {
	args = []string{r.NablaRunBin,
		"--x-exec-heap",
		"--mem=" + strconv.FormatInt(r.Memory, 10),
		"--net=" + r.Tap,
		"--disk=" + disk,
		r.UniKernelBin,
		unikernelArgs}
}
In the latest Solo5 (the version the frankenlibc Solo5 platform uses), the --net-mac option has been removed, and multiple block and network devices can be specified with the --block: and --net: options. Ideally, runnc should support multiple devices. However, as described above, the manifest can specify only rootfs and tap. So the port ends up supporting just these two devices, like this.
var args []string
args = []string{r.NablaRunBin,
	"--mem=" + strconv.FormatInt(r.Memory, 10),
	"--net:tap=" + r.Tap,
	"--block:rootfs=" + disk,
	r.UniKernelBin}
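For reference, supporting multiple devices could look roughly like the following sketch. It is not part of the port: namedDevice and tenderArgs are hypothetical names, and it assumes each device name also appears in the manifest embedded in the unikernel binary.

```go
package main

import (
	"fmt"
	"strconv"
)

// namedDevice pairs a manifest device name with the host resource backing it.
// This type is hypothetical and does not exist in runnc.
type namedDevice struct {
	Name string // device name as declared in the manifest, e.g. "rootfs"
	Host string // host-side tap interface or disk image path
}

// tenderArgs builds a tender command line with one --net:NAME=IFACE option
// per network device and one --block:NAME=PATH option per block device.
func tenderArgs(tenderBin string, memMB int64, nets, blocks []namedDevice, unikernelBin string) []string {
	args := []string{tenderBin, "--mem=" + strconv.FormatInt(memMB, 10)}
	for _, n := range nets {
		args = append(args, "--net:"+n.Name+"="+n.Host)
	}
	for _, b := range blocks {
		args = append(args, "--block:"+b.Name+"="+b.Host)
	}
	return append(args, unikernelBin)
}

func main() {
	args := tenderArgs("nabla-run", 512,
		[]namedDevice{{Name: "tap", Host: "tap100"}},
		[]namedDevice{{Name: "rootfs", Host: "disk.img"}},
		"hello.nabla")
	fmt.Println(args)
}
```

With a single tap and a single rootfs entry, this produces the same options as the fixed argument list above.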
Creating Disk Image
I added the CreateExt4() function and llmodules/fs/ext4_storage.go to create the Ext4 rootfs.
// CreateExt4 creates ext4 raw disk image from the dir argument
func CreateExt4(dir string, target *string) (string, error) {
	var fname string
	if target == nil {
		f, err := ioutil.TempFile("/tmp", "nabla")
		if err != nil {
			return "", err
		}
		fname = f.Name()
		if err := f.Close(); err != nil {
			return "", err
		}
	} else {
		var err error
		fname, err = filepath.Abs(*target)
		if err != nil {
			return "", errors.Wrap(err, "Unable to resolve abs target path")
		}
	}
	absDir, err := filepath.Abs(dir)
	if err != nil {
		return "", errors.Wrap(err, "Unable to resolve abs dir path")
	}
	cmd := exec.Command("virt-make-fs", "-F", "raw", "-t", "ext4",
		absDir, fname)
	err = cmd.Run()
	if err != nil {
		return "", errors.Wrap(err, "Unable to run virt-make-fs command")
	}
	return fname, nil
}
virt-make-fs, a part of libguestfs, has an interface similar to that of genisoimage. It would be better to switch between NewISOFsHandler() and NewExt4FsHandler() at run time.
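One way such a run-time switch might be structured is sketched below. The fsHandler interface, the two stub types, and newFsHandler are all hypothetical stand-ins; the real NewISOFsHandler() and NewExt4FsHandler() signatures in runnc are not shown in this post, so this only illustrates the selection logic.

```go
package main

import "fmt"

// fsHandler is a hypothetical stand-in for runnc's rootfs handler type.
type fsHandler interface {
	Kind() string
}

// isoHandler and ext4Handler are stubs standing in for the handlers
// returned by NewISOFsHandler() and NewExt4FsHandler().
type isoHandler struct{}
type ext4Handler struct{}

func (isoHandler) Kind() string  { return "iso" }
func (ext4Handler) Kind() string { return "ext4" }

// newFsHandler picks a handler from a setting supplied at run time,
// for example "iso" for Rumprun images and "ext4" for LKL images.
func newFsHandler(kind string) (fsHandler, error) {
	switch kind {
	case "iso":
		return isoHandler{}, nil
	case "ext4":
		return ext4Handler{}, nil
	default:
		return nil, fmt.Errorf("unsupported rootfs type %q", kind)
	}
}

func main() {
	h, err := newFsHandler("ext4")
	if err != nil {
		panic(err)
	}
	fmt.Println("selected handler:", h.Kind())
}
```

Where the setting would come from (a runtime flag, an image label, or detection of the unikernel type) is left open here.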
Building and Installing runnc
The steps are the same as for the original runnc.
$ git clone https://github.com/retrage/runnc.git
$ mkdir -p $GOPATH/github.com/retrage
$ ln -sf $PWD/runnc $GOPATH/github.com/retrage/runnc
$ cd runnc
$ git apply patches/0001-solo5-elf-segment-align-workaround.patch
$ make build
$ make install
Testing with Docker Images
I provide a set of Makefiles that build the LKL Nabla Containers base Docker images. They build Solo5, frankenlibc, and the Docker images.
I also pushed pre-built Docker images to Docker Hub.
You can use the images like this.
$ sudo docker run --rm --runtime=runnc retrage/lkl-nabla-python3-base:latest -c "print(\'hello\')"
[sudo] password for akira:
nabla-run arg [/opt/runnc/bin/nabla-run --mem=512 --net:tap=tap28157ba5950e --block:rootfs=/var/run/docker/runtime-runnc/moby/28157ba5950e3e84824bd843fd1dafb06eccc7de2020a0619d6a5b463e5f2c2b/rootfs.img /var/lib/docker/overlay2/3d36c19950e53eefded8e1933f3d7e51990fc4c7b065be6c00776eeab8fb3136/merged/python3.nabla __RUMP_FDINFO_NET_tap=4 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=28157ba5950e PYTHONHASHSEED=1 PYTHONHOME=/usr/local HOME=/ -- -c print(\'hello\')]
            |      ___|
  __|  _ \  |  _ \ __ \
\__ \ (   | | (   |  ) |
____/\___/ _|\___/____/
Solo5: Bindings version v0.6.4-6-g756accf-dirty
Solo5: Memory map: 512 MB addressable:
Solo5: reserved @ (0x0 - 0xfffff)
Solo5: text @ (0x100000 - 0x889fff)
Solo5: rodata @ (0x88a000 - 0xb4cfff)
Solo5: data @ (0xb4d000 - 0xe7dfff)
Solo5: heap >= 0xe7e000 < stack < 0x20000000
sleeping 50000 usec
hello
Solo5: solo5_exit(0) called
Conclusion
In this post, I gave a brief introduction to LKL Nabla Containers. The port is still at an early stage and has room for improvement, but it already runs practical applications such as Python. Next, I would like to measure its performance and evaluate its pros and cons.
Below is the TODO list:
- Replace workaround for Solo5
- Flexible manifest.json handling at build time
- Pass lkl.json through run time arguments
- Do not pass the __RUMP_FDINFO_NET_tap=4 environment variable at run time
Update: May 1st, 2020
After writing this post, I found that LKL must use the network information created by the container runtime. Otherwise, the network does not work properly. I added the third item in the TODO list above to frankenlibc and runnc.
The OCI runtime builds a JSON config for LKL and passes it at startup. LKL parses it along with the environment variables and arguments.
Now, popular network applications such as Nginx and Redis work on LKL Nabla Containers. They are available as base Docker images.
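As a rough illustration of the runtime side of that JSON config, the sketch below builds a small document from the address, prefix length, and gateway that a container runtime would assign. The struct names, the buildLKLConfig helper, and the JSON key names are assumptions made for this example; the actual schema parsed by LKL in the fork should be checked against its source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// lklInterface describes one network interface for LKL.
// The JSON key names are assumptions made for this sketch.
type lklInterface struct {
	Type    string `json:"type"`
	Param   string `json:"param"`
	IP      string `json:"ip"`
	MaskLen string `json:"masklen"`
}

// lklConfig is the top-level document handed to LKL at startup (assumed layout).
type lklConfig struct {
	Gateway    string         `json:"gateway"`
	Interfaces []lklInterface `json:"interfaces"`
}

// buildLKLConfig fills the config from the network settings chosen by the
// container runtime, so that LKL uses the same address and gateway.
func buildLKLConfig(ip string, maskLen int, gateway, tapName string) lklConfig {
	return lklConfig{
		Gateway: gateway,
		Interfaces: []lklInterface{{
			Type:    "tap",
			Param:   tapName,
			IP:      ip,
			MaskLen: strconv.Itoa(maskLen),
		}},
	}
}

func main() {
	cfg := buildLKLConfig("10.0.0.2", 24, "10.0.0.1", "tap")
	out, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Generating the document in the runtime keeps the network settings in one place, instead of duplicating them in environment variables.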