retrage.github.io

Librarizing Linux kernel for Unikernels

I ported the Linux kernel to Unikraft as an external library. This makes it possible to reuse the rich functions of the Linux kernel for Unikernel with less functionality. In this blog post, I describe the overview of the library.

Background

Linux Kernel Library

The Linux kernel is a well-maintained mature open source OS kernel. Recently, there have been researches that propose reuse its components. The Linux Kernel Library (LKL) is one of them, which uses the Linux kernel as a form of Library OS with minimum modifications. LKL is not currently official Linux project, but it is actively being developed (v4.19 is latest). Below is the architecture of LKL.

The architecture of LKL

LKL has a host-independent architecture named lkl, and actual host-dependent code is separated from the arch. The independent and dependent code is placed under arch/lkl and tools/lkl correspondingly. For POSIX host environment, tools/lkl/lib/posix-host.c plays the role.

Unikraft

Unikraft is an experimental project by Xen Project which aims to build small and lightweight Unikernel images by providing divided Unikernel functions. The following is the architecture.

The architecture of Unikraft

Unikraft itself consists of three parts: main libs, platform libs, and architecture libs. main libs contain architecture and platform independent libraries. platform libs provide platform dependent code as libraries. It currently supports Xen, KVM, and Linux userspace as platforms. architecture libs are libraries for architecture; x86, arm, and arm64. A user has to specify the target architecture and platform by Kconfig when building Unikraft application. The Unikraft build system generates a Unikernel image corresponding to each target.

Unikraft supports external libraries along with internal libraries. There are several official external libraries in public such as newlib, lwip, compiler-rt, eigen, libcxx, libcxxabi, libunwind, and libuuid.

For more details about porting external libraries to Unikraft, see External Library Development.

LKL on Unikraft

Since Unikraft is still at an early stage, it does not have a mature network stack or file system. To tackle this issue, we ported LKL as an external library for Unikraft. Here we introduce two types of the port.

LKL on Unikraft v1

First of all, we integrated LKL to Unikraft build system. The architecture is shown below.

The architecture of LKL on Unikraft v1

In this version of the port, we added new architecture and platform libraries for LKL. They are just stub and only used when specified in Kconfig. The disadvantage of this design is that it is impossible to build LKL for other architectures or platforms. In addition, it can not cooperate with other Unikraft libraries.

The code is available at:

LKL on Unikraft v2

Next, we introduce v2 which can be used as an actual library. Below is the overview.

The architecture of LKL on Unikraft v2

By this design, we can choose any architecture and platform in concept since LKL is separated from architecture and platform. In addition, other Unikraft libraries can use LKL functions.

For implementation, we added a new host-dependent code called uk-host.c to support Unikraft as a new LKL host environment. The LKL host-dependent code requires some primitives such as mutex, semaphore, thread, timer, etc. on the host side, however, Unikraft main libs cannot satisfy the requirements because of the lack of functionality. Therefore, we ported these primitives from littlekernel, an embedded kernel to LKL. The only LKL host has to do is that calling callback functions periodic time to run the preemptive scheduler. The port of littlekernel is independent of Unikraft. Here is the implementation.

Implementation

The port of LKL supports only x86_64 architecture and KVM platform for some reasons.

Unikraft has two types of libc implementation, newlib, and nolibc. newlib is an official external library that supports full libc functions. On the other hand, nolibc provides minimal libc functions so that general libc functions can be used even if newlib does not exist. This version of LKL port is designed to work with nolibc to reduce dependencies. However, since nolibc have enough functions and constants required by LKL, we added the following functions and constants to nolibc.

  • stdbool
  • fputc, putchar
  • STD{IN,OUT,ERR}_FILENO
  • strncat
  • strtok_r
  • setjmp/longjmp

The modified LKL expects the callback function is called periodically as mentioned above, but Unikraft does not provide an interface to register a callback function. Fortunately, it starts a periodic timer at startup, so we just added the interface.

In the KVM platform, a final image is generated using the custom linker script. Since The LKL binary liblkl.o has symbols that are not referred explicitly, they are deleted or hidden by linking using the linker script. For this reason, the linker script is modified so that the symbols from LKL are kept correctly.

The modified Unikraft and the port of LKL is here:

Demonstration

To demonstrate LKL on Unikraft, we ported tools/lkl/tests/boot from LKL as uk-lkl/boot.

$ mkdir unikraft && cd unikraft
$ git clone git@github.com:uk-lkl/unikraft.git --branch=retrage/lkl-v2
$ mkdir libs && cd libs
$ git clone --recursive git@github.com:uk-lkl/lkl.git
$ cd ..
$ mkdir apps && cd apps
$ git clone git@github.com:uk-lkl/boot.git
$ cd boot
$ make menuconfi

Select x86 architecture and KVM guest platform. Save and exit Kconfig. Then, run make.

$ make

To run the final image, run run.sh.

$ ./run.sh

It will output as follow:

[    0.721134] ERR:  [libukboot] boot.c @ 88   : Failed to initialize bus driver 0x56b060: -1
Welcome to  _ __             _____
 __ _____  (_) /__ _______ _/ _/ /_
/ // / _ \/ /  '_// __/ _ `/ _/ __/
\_,_/_//_/_/_/\_\/_/  \_,_/_/ \__/
                  Titan 0.2~ebcb42a
1..33 # boot
* 1 mutex
ok 1 mutex
 ---
 time_us: 0
 log: |
 ...
* 2 semaphore
ok 2 semaphore
 ---
 time_us: 0
 log: |
 ...
* 3 join
ok 3 join
 ---
 time_us: 1
 log: |
  joined 7909384
 ...
* 4 start_kernel
ok 4 start_kernel
 ---
 time_us: 9281
 log: |
  [    0.000000] Linux version 4.19.0+ (akira@akira-Z270) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11)) #1 Thu May 30 23:10:09 JST 2019
  [    0.000000] bootmem address range: 0x1c001000 - 0x1d000000
  [    0.000000] On node 0 totalpages: 4095
  [    0.000000]   Normal zone: 56 pages used for memmap
  [    0.000000]   Normal zone: 0 pages reserved
  [    0.000000]   Normal zone: 4095 pages, LIFO batch:0
  [    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
  [    0.000000] pcpu-alloc: [0] 0 
  [    0.000000] Built 1 zonelists, mobility grouping off.  Total pages: 4039
  [    0.000000] Kernel command line: mem=16M loglevel=8
  [    0.000000] Dentry cache hash table entries: 2048 (order: 2, 16384 bytes)
  [    0.000000] Inode-cache hash table entries: 1024 (order: 1, 8192 bytes)
  [    0.000000] Memory available: 16088k/16380k RAM
  [    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
  [    0.000000] NR_IRQS: 4096
  [    0.000000] lkl: irqs initialized
  [    0.000000] clocksource: lkl: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
  [    0.198000] lkl: time and timers initialized (irq1)
  [    2.093000] pid_max: default: 4096 minimum: 301
  [   15.926000] Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
  [   18.024000] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes)
  [ 2722.817000] console [lkl_console0] enabled
  [ 2732.709000] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
  [ 2760.874000] random: get_random_u32 called from bucket_table_alloc.isra.6+0x9b/0x250 with crng_init=0
  [ 2774.823000] NET: Registered protocol family 16
  [ 3121.319000] clocksource: Switched to clocksource lkl
  [ 3181.520000] NET: Registered protocol family 2
  [ 3261.648000] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes)
  [ 3263.125000] TCP established hash table entries: 512 (order: 0, 4096 bytes)
  [ 3264.198000] TCP bind hash table entries: 512 (order: 0, 4096 bytes)
  [ 3265.141000] TCP: Hash tables configured (established 512 bind 512)
  [ 3286.743000] UDP hash table entries: 128 (order: 0, 4096 bytes)
  [ 3287.830000] UDP-Lite hash table entries: 128 (order: 0, 4096 bytes)
  [ 3456.331000] workingset: timestamp_bits=62 max_order=12 bucket_order=0
  [ 4041.077000] SGI XFS with ACLs, security attributes, no debug enabled
  [ 6032.987000] jitterentropy: Initialization failed with host not compliant with requirements: 2
  [ 6038.248000] io scheduler noop registered
  [ 6038.908000] io scheduler deadline registered
  [ 6067.383000] io scheduler cfq registered (default)
  [ 6068.093000] io scheduler mq-deadline registered
  [ 6068.764000] io scheduler kyber registered
  [ 7894.266000] NET: Registered protocol family 10
  [ 7979.259000] Segment Routing with IPv6
  [ 7988.492000] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
  [ 8097.859000] Warning: unable to open an initial console.
  [ 8106.456000] This architecture does not have kernel memory protection.
  [ 8107.119000] Run /init as init process
  lkl_start_kernel(&lkl_host_ops, "mem=16M loglevel=8") = 0 
 ...
* 5 getpid
ok 5 getpid
 ---
 time_us: 5
 log: |
  lkl_sys_getpid() = 1 
 ...
* 6 syscall_latency
ok 6 syscall_latency
 ---
 time_us: 126
 log: |
  avg/min/max: lkl:107822000/104000000/450000000 native:6788000/6000000/77000000
 ...
* 7 umask
ok 7 umask
 ---
 time_us: 0
 log: |
  lkl_sys_umask(0777) = 18 
 ...
* 8 umask2
ok 8 umask2
 ---
 time_us: 0
 log: |
  lkl_sys_umask(0) = 511 
 ...
* 9 creat
ok 9 creat
 ---
 time_us: 9
 log: |
  lkl_sys_creat("/file", access_rights) = 0 
 ...
* 10 close
ok 10 close
 ---
 time_us: 0
 log: |
  lkl_sys_close(0) = 0 
 ...
* 11 failopen
ok 11 failopen
 ---
 time_us: 9
 log: |
  lkl_sys_open("/file2", 0, 0) = -2 No such file or directory
 ...
* 12 open
ok 12 open
 ---
 time_us: 7
 log: |
  lkl_sys_open("/file", LKL_O_RDWR, 0) = 0 
 ...
* 13 write
ok 13 write
 ---
 time_us: 4
 log: |
  lkl_sys_write(0, wrbuf, sizeof(wrbuf)) = 5 
 ...
* 14 lseek_cur
ok 14 lseek_cur
 ---
 time_us: 0
 log: |
  lkl_sys_lseek(0, 0, LKL_SEEK_CUR) = 5 
 ...
* 15 lseek_end
ok 15 lseek_end
 ---
 time_us: 0
 log: |
  lkl_sys_lseek(0, 0, LKL_SEEK_END) = 5 
 ...
* 16 lseek_set
ok 16 lseek_set
 ---
 time_us: 0
 log: |
  lkl_sys_lseek(0, 0, LKL_SEEK_SET) = 0 
 ...
* 17 read
ok 17 read
 ---
 time_us: 1
 log: |
  lkl_sys_read=5 buf=test
 ...
* 18 fstat
ok 18 fstat
 ---
 time_us: 1
 log: |
  lkl_sys_fstat=0 mode=100721 size=5
 ...
* 19 mkdir
ok 19 mkdir
 ---
 time_us: 8
 log: |
  lkl_sys_mkdir("/mnt", access_rights) = 0 
 ...
* 20 stat
ok 20 stat
 ---
 time_us: 7
 log: |
  lkl_sys_stat("/mnt")=0 mode=40721
 ...
* 21 pipe2
ok 21 pipe2
 ---
 time_us: 9
 log: |
 ...
* 22 epoll
ok 22 epoll
 ---
 time_us: 5
 log: |
 ...
* 23 mount_fs_proc
ok 23 mount_fs_proc
 ---
 time_us: 18
 log: |
  lkl_mount_fs("proc") = 0 
 ...
* 24 chdir_proc
ok 24 chdir_proc
 ---
 time_us: 7
 log: |
  lkl_sys_chdir("proc") = 0 
 ...
* 25 open_cwd
ok 25 open_cwd
 ---
 time_us: 7
 log: |
 ...
* 26 getdents64
ok 26 getdents64
 ---
 time_us: 6
 log: |
  4 . .. fs bus irq net sys tty kmsg maps misc stat iomem crypto driver 
  mounts uptime vmstat cmdline cpuinfo devices ioports loadavg meminfo version 
  consoles kallsyms slabinfo softirqs zoneinfo buddyinfo diskstats interrupts 
  partitions timer_list 
 ...
* 27 close_dir_fd
ok 27 close_dir_fd
 ---
 time_us: 0
 log: |
  lkl_sys_close(dir_fd) = 0 
 ...
* 28 chdir_root
ok 28 chdir_root
 ---
 time_us: 7
 log: |
  lkl_sys_chdir("/") = 0 
 ...
* 29 umount_fs_proc
ok 29 umount_fs_proc
 ---
 time_us: 11
 log: |
  lkl_umount_timeout("proc", 0, 1000) = 0 
 ...
* 30 lo_ifup
ok 30 lo_ifup
 ---
 time_us: 44
 log: |
  lkl_if_up(1) = 0 
 ...
* 31 gettid
ok 31 gettid
 ---
 time_us: 0
 log: |
  7893000
 ...
* 32 many_syscall_threads
ok 32 many_syscall_threads
 ---
 time_us: 26
 log: |
 ...
* 33 stop_kernel
ok 33 stop_kernel
 ---
 time_us: 531
 log: |
  [145272.871000] reboot: Restarting system
  lkl_sys_halt() = 0 
 ...

Issues

The current LKL on Unikraft has the following issues.

  • It needs modifications to Unikraft.
  • It does not support disk operations and networks.
  • It hangs under some certain situations.

The first issue is described above in details. The second issue is because of the lack of file system and network support by Unikraft. The third issue is not only on Unikraft, but also with retrage/linux:retrage/fiber.

As a design issue, the use of Linux kernel does not match with Unikraft policy. Unikraft aims to build ‘slim’ Unikernel images by building the necessary libraries, but LKL reuses existing Linux kernel which is ‘fat’. The size of the final image tends to be large if LKL is integrated.

References