DEV Community

Stjepan

Building a KVM Virtual Machine in Rust: Memory Setup

Recap

This is a continuation of my previous article, which dealt with
reverse-engineering QEMU using strace to learn how KVM works. Now it's time to
follow the steps we extracted from the strace logs and build our own KVM-based
virtual machine in Rust.

KVM Headers

I chose not to use the existing KVM libraries written specifically for Rust and
opted for the libc crate instead, which provides the required ioctl bindings and
helper functions. The main reason is that I want a complete understanding of
what is happening under the hood. For each of the KVM ioctl calls we can use
the Linux headers for reference. For example, to find out how to construct the
ioctl number for KVM_CREATE_VM, we can simply run:

$ grep -Rn 'KVM_CREATE_VM' /usr/include/linux/
/usr/include/linux/kvm.h:855:/* machine type bits, to be used as argument to KVM_CREATE_VM */
/usr/include/linux/kvm.h:882:#define KVM_CREATE_VM             _IO(KVMIO,   0x01) /* returns a VM fd */

Luckily, the libc crate provides Rust equivalents of the _IO family of C macros
(as const functions), but we still need the value of the KVMIO macro:

$ grep -Rn 'define KVMIO' /usr/include/linux/
/usr/include/linux/kvm.h:853:#define KVMIO 0xAE

We can now construct the ioctl number for KVM_CREATE_VM:

use libc::{_IOW, _IO, _IOR};

const KVMIO: u32 = 0xae;
const KVM_CREATE_VM: u64 = _IO(KVMIO, 0x01);
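
To make the encoding less magical, here is a minimal sketch of what the _IO family computes, assuming the generic Linux ioctl layout used on x86-64 (8 bits for the command number, 8 for the type, 14 for the argument size, 2 for the direction). The names ioc, io and iow are my own stand-ins for illustration, not the libc crate's API, and some other architectures split the bits differently:

```rust
// Hand-rolled sketch of the generic Linux ioctl number encoding.
// Assumption: x86-64 bit layout. `ioc`, `io` and `iow` are illustrative
// names, not functions from the libc crate.
const IOC_NRSHIFT: u32 = 0; // command number: bits 0..8
const IOC_TYPESHIFT: u32 = 8; // type ("magic"): bits 8..16
const IOC_SIZESHIFT: u32 = 16; // argument size: bits 16..30
const IOC_DIRSHIFT: u32 = 30; // direction: bits 30..32

const IOC_NONE: u32 = 0; // no argument is transferred
const IOC_WRITE: u32 = 1; // userspace writes, the kernel reads

const fn ioc(dir: u32, ty: u32, nr: u32, size: u32) -> u64 {
    ((dir << IOC_DIRSHIFT)
        | (size << IOC_SIZESHIFT)
        | (ty << IOC_TYPESHIFT)
        | (nr << IOC_NRSHIFT)) as u64
}

/// Counterpart of the C macro _IO(type, nr).
const fn io(ty: u32, nr: u32) -> u64 {
    ioc(IOC_NONE, ty, nr, 0)
}

/// Counterpart of the C macro _IOW(type, nr, T), with the size of T passed explicitly.
const fn iow(ty: u32, nr: u32, size: u32) -> u64 {
    ioc(IOC_WRITE, ty, nr, size)
}

const KVMIO: u32 = 0xAE;
```

With these definitions, io(KVMIO, 0x01) evaluates to 0xae01 and iow(KVMIO, 0x46, 32) to 0x4020ae46, where 32 is the size in bytes of struct kvm_userspace_memory_region. These are the same request numbers that strace prints for KVM_CREATE_VM and KVM_SET_USER_MEMORY_REGION.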

Note that the exact integer type of the request number depends on the platform
and the libc definitions. Then, in main, we can do:

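use std::fs::OpenOptions;
use std::os::unix::io::AsRawFd;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Keep `file` alive for as long as the raw descriptor is in use.
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .open("/dev/kvm")?;

    let fd = file.as_raw_fd();
    // Returns a new file descriptor for the VM on success, -1 on failure.
    let vm_fd = unsafe { libc::ioctl(fd, KVM_CREATE_VM, 0usize) };
    Ok(())
}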

This is the procedure we will follow for each relevant ioctl call. In fact,
we can "reverse-engineer" our own program with strace and compare its trace
with QEMU's, to make sure we are doing the right thing:

$ strace cargo run

     ### omitted irrelevant strace output ###

openat(AT_FDCWD, "/dev/kvm", O_RDWR|O_CLOEXEC) = 3
ioctl(3, KVM_CREATE_VM, 0)              = 4

Important note

In production code we would immediately check for a negative return value and
convert errno into a Rust error. To keep the example focused, I am omitting
proper error handling in this article.
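
As a rough idea of what such a check could look like, here is a minimal sketch. The helper name check_ret is hypothetical (it is not part of libc or the standard library); it relies only on std::io::Error::last_os_error(), which reads errno on Unix:

```rust
use std::io;

/// Hypothetical helper: turn a C-style return value (negative on failure)
/// into an io::Result, capturing errno when the call failed.
fn check_ret(ret: i32) -> io::Result<i32> {
    if ret < 0 {
        // Call this right after the failing syscall, before anything else
        // can overwrite errno.
        Err(io::Error::last_os_error())
    } else {
        Ok(ret)
    }
}

// A call site could then look like this (not compiled here, needs the libc crate):
//     let vm_fd = check_ret(unsafe { libc::ioctl(fd, KVM_CREATE_VM, 0usize) })?;
```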

Setting memory region

Recall that the next step is setting up the memory region shared by the KVM
guest and the host. This region will be the memory of our virtual machine, the
place where we will load our binary. In the QEMU trace it looked like this:

140900 mmap(NULL, 1075838976, 0 /* PROT_NONE */, 0x22 /* MAP_PRIVATE|MAP_ANONYMOUS */, -1, 0) = 0x7768b3e00000
140900 mmap(0x7768b3e00000, 1073741824, 0x3 /* PROT_READ|PROT_WRITE */, 0x32 /* MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS */, -1, 0) = 0x7768b3e00000
140900 ioctl(9<anon_inode:kvm-vm>, 0x4020ae46 /* KVM_SET_USER_MEMORY_REGION */, {slot=0, flags=0, guest_phys_addr=0, memory_size=1073741824, userspace_addr=0x7768b3e00000}) = 0

Recreating mmap call

The first thing we need to do is follow the same logic for mmap, which is also
available in the libc crate. After creating the virtual machine, we could
simply recreate both mmap calls from the strace output. However, notice that
QEMU first reserves a slightly larger address range (1 GiB plus 2 MiB) with
PROT_NONE and then uses MAP_FIXED to map the 1 GiB it actually intends to use
at the start of that range. For our prototype we do not need to mimic this
reservation pattern, so a single mmap call is enough:

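// Needs `use std::ptr;` at the top of the file.
let mem_size: u64 = 256 * 1024; // 256 KiB of guest memory
let mem = unsafe {
    libc::mmap(
        ptr::null_mut(), // let the kernel choose the address
        mem_size as usize,
        libc::PROT_READ | libc::PROT_WRITE,
        libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
        -1, // anonymous mapping, no backing file
        0,
    )
};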

For this experiment we are only allocating 256 KiB (262144 bytes) because we
are not yet booting a full operating system and therefore need very little
guest memory. Also note that, as with ioctl, production code should check
whether mmap() returned MAP_FAILED.
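
One possible shape for that check is a small RAII wrapper that fails on MAP_FAILED and unmaps the region on drop. This is only a sketch under stated assumptions: the GuestMemory type and its methods are hypothetical names of my own, and the extern declarations and constant values are written out by hand for 64-bit Linux so the example stands alone (in the real program they would come from the libc crate):

```rust
use std::ffi::c_void;
use std::io;
use std::ptr;

// Assumption: 64-bit Linux. These mirror what the libc crate exposes there.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, offset: i64)
        -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}
const PROT_READ: i32 = 0x1;
const PROT_WRITE: i32 = 0x2;
const MAP_PRIVATE: i32 = 0x02;
const MAP_ANONYMOUS: i32 = 0x20;
const MAP_FAILED: *mut c_void = !0usize as *mut c_void; // (void *) -1

/// Hypothetical owner of an anonymous mapping used as guest memory.
struct GuestMemory {
    ptr: *mut c_void,
    len: usize,
}

impl GuestMemory {
    fn new(len: usize) -> io::Result<Self> {
        let ptr = unsafe {
            mmap(
                ptr::null_mut(),
                len,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS,
                -1,
                0,
            )
        };
        if ptr == MAP_FAILED {
            // mmap signals failure with MAP_FAILED, not with a null pointer.
            return Err(io::Error::last_os_error());
        }
        Ok(GuestMemory { ptr, len })
    }

    /// Host address of the mapping, suitable for `userspace_addr`.
    fn host_addr(&self) -> u64 {
        self.ptr as u64
    }

    fn len(&self) -> usize {
        self.len
    }
}

impl Drop for GuestMemory {
    fn drop(&mut self) {
        // Release the mapping when the owner goes out of scope.
        unsafe {
            munmap(self.ptr, self.len);
        }
    }
}
```

The main design point is that the mapping cannot outlive or leak past its owner, which matters once early returns on errors are introduced.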

Recreating ioctl call

First we need the definition of KVM_SET_USER_MEMORY_REGION:

$ grep -Rn 'define KVM_SET_USER' /usr/include/linux/ -A1
/usr/include/linux/kvm.h:1433:#define KVM_SET_USER_MEMORY_REGION _IOW(KVMIO, 0x46, \
/usr/include/linux/kvm.h-1434-                                  struct kvm_userspace_memory_region)

We see that this one takes a struct kvm_userspace_memory_region as its
argument. The field values we need are visible in the strace output, but copying
the struct definition into our Rust program by hand is not advisable, since it
could silently drift from the kernel headers. Luckily, Rust has bindgen, which
can generate this KVM struct (and others) from the Linux headers. For this
purpose we add a separate build.rs file (with bindgen listed under
[build-dependencies] in Cargo.toml) which will contain:

use std::env;
use std::path::PathBuf;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Cargo sets OUT_DIR for build scripts; generated files belong there.
    let out_path = PathBuf::from(env::var("OUT_DIR")?);

    bindgen::Builder::default()
        .header("/usr/include/linux/kvm.h")
        // Only emit the struct we need (plus the types it depends on).
        .allowlist_type("kvm_userspace_memory_region")
        .generate_comments(false)
        .generate()?
        .write_to_file(out_path.join("kvm-bindings.rs"))?;

    Ok(())
}

Now, in main.rs, we can include the generated bindings with:

include!(concat!(env!("OUT_DIR"), "/kvm-bindings.rs"));
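
For orientation, the generated type should look roughly like the hand-written mirror below. This is my own approximation of the C definition in linux/kvm.h, not actual bindgen output (bindgen uses its own aliases such as __u32 and adds its own derives), and it is renamed KvmUserspaceMemoryRegion to make clear that it is only illustrative:

```rust
use std::mem::size_of;

/// Illustrative mirror of `struct kvm_userspace_memory_region`. In the real
/// program the bindgen-generated type should be used instead.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
struct KvmUserspaceMemoryRegion {
    slot: u32,            // memory slot index
    flags: u32,           // 0 for a plain read-write region
    guest_phys_addr: u64, // where the region starts in guest physical memory
    memory_size: u64,     // size in bytes
    userspace_addr: u64,  // start of the backing host (userspace) allocation
}

// Compile-time check of the layout assumption: 4 + 4 + 8 + 8 + 8 = 32 bytes.
const _: () = assert!(size_of::<KvmUserspaceMemoryRegion>() == 32);
```

The 32-byte size is what shows up as 0x20 in the request number 0x4020ae46 printed by strace.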

Then, after the mmap() call, we register the memory region, imitating what
QEMU does in the strace output:

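// Request number for the ioctl. This assumes the libc crate's generic
// `_IOW::<T>(ty, nr)` form, which takes the payload size from T.
const KVM_SET_USER_MEMORY_REGION: u64 = _IOW::<kvm_userspace_memory_region>(KVMIO, 0x46);

let region = kvm_userspace_memory_region {
    slot: 0,
    flags: 0,
    guest_phys_addr: 0x0,
    memory_size: mem_size,
    userspace_addr: mem as u64,
};

// KVM reads the struct through the pointer; the call returns 0 on success.
let _ret = unsafe { libc::ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) };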

We register this region starting at guest physical address 0x0, meaning the
first byte of our allocated host memory will appear as physical address 0 inside
the guest. Also note that the KVM_SET_USER_MEMORY_REGION call does not copy
any memory. It only tells KVM that a range of guest physical addresses will be
backed by a specific userspace memory region.
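
To illustrate the mapping, the sketch below translates a guest physical address into the host address that backs it. The MemRegion type and the gpa_to_hva function are hypothetical helpers of my own, not part of the KVM API; they only model, conceptually, the relationship that the ioctl establishes:

```rust
/// Hypothetical description of one registered slot (the same values we pass to KVM).
struct MemRegion {
    guest_phys_addr: u64,
    memory_size: u64,
    userspace_addr: u64,
}

/// Translate a guest physical address to the host virtual address backing it,
/// or return None if the address falls outside the region.
fn gpa_to_hva(region: &MemRegion, gpa: u64) -> Option<u64> {
    // Offset of the address from the start of the region (None if below it).
    let offset = gpa.checked_sub(region.guest_phys_addr)?;
    if offset < region.memory_size {
        region.userspace_addr.checked_add(offset)
    } else {
        None
    }
}
```

For a region of 262144 bytes registered at guest address 0 and backed by host address 0x7d63fdfa2000, guest address 0x1000 corresponds to host address 0x7d63fdfa3000, while guest address 262144 is already outside the region.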

Running the code

The next thing to do is to run it. We still haven't added any checks after the
mmap and ioctl calls, but for this prototype we can again simply use strace on
our own code:

$ strace -yy -X verbose -e trace=ioctl,mmap,openat,read,write cargo run

                ### omitted irrelevant strace output ###

openat(-100 /* AT_FDCWD */</home/stjepan/Develop/KVM/rust>, "/dev/kvm", 0x80002 /* O_RDWR|O_CLOEXEC */) = 3</dev/kvm<char 10:232>>
ioctl(3</dev/kvm<char 10:232>>, 0xae01 /* KVM_CREATE_VM */, 0) = 4<anon_inode:kvm-vm>
mmap(NULL, 262144, 0x3 /* PROT_READ|PROT_WRITE */, 0x22 /* MAP_PRIVATE|MAP_ANONYMOUS */, -1, 0) = 0x7d63fdfa2000
ioctl(4<anon_inode:kvm-vm>, 0x4020ae46 /* KVM_SET_USER_MEMORY_REGION */, {slot=0, flags=0, guest_phys_addr=0, memory_size=262144, userspace_addr=0x7d63fdfa2000}) = 0

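The trace matches what we expect: KVM_CREATE_VM returns a new file descriptor,
KVM_SET_USER_MEMORY_REGION returns 0, and the request number 0x4020ae46 is
identical to the one in the QEMU trace. Note that a full working example with
proper error checking and a more idiomatic Rust approach can be found on my
GitHub page: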

https://github.com/StjepanPoljak/kvm-rust/tree/kvm-part2-code

Next steps

At this point we have a VM object and guest memory, but nothing is actually executing yet. In the next part we will create a vCPU, initialize its state, load a small binary into guest memory and enter the first KVM_RUN loop.
