Notes
This article assumes a single-core RV64I RISC-V CPU with OpenSBI firmware.
Getting booted by OpenSBI
OpenSBI (the most common firmware on RISC-V machines) typically loads itself at 0x80000000, which is where RAM starts on platforms such as QEMU's virt machine; anything below is reserved for MMIO [1] devices. To make sure your kernel doesn't overlap OpenSBI's own code and data, you would typically load it at an address such as 0x80200000. (Bonus: because that address is 2MB-aligned, once you set up paging you can map the kernel with large pages [2], which means fewer TLB misses and faster code.)
1: MMIO, or memory-mapped I/O, is a scheme in which you communicate with external devices by reading and writing special memory addresses. While CPU architectures such as x86-64 also offer port I/O (dedicated in/out instructions), RISC-V relies on MMIO alone for simplicity.
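To make the idea concrete, here is a minimal sketch of MMIO-style accessors in C. The names mmio_write8, mmio_read8 and fake_device are made up for illustration, and the "device" is an ordinary array so the sketch runs on a normal host (which is also why it can use assert). On real hardware, base would be the device's physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Write one byte to the register at base + offset.
 * volatile forces the compiler to really perform the access. */
static inline void mmio_write8(volatile uint8_t *base, uintptr_t offset, uint8_t value) {
    base[offset] = value;
}

/* Read one byte from the register at base + offset. */
static inline uint8_t mmio_read8(const volatile uint8_t *base, uintptr_t offset) {
    return base[offset];
}

/* Hypothetical stand-in for a device's register block (host-only). */
static volatile uint8_t fake_device[8];

/* Write a byte to register 0 and read it back. */
static void mmio_demo(void) {
    mmio_write8(fake_device, 0, 'A');
    assert(mmio_read8(fake_device, 0) == 'A');
}
```

A real device register does not have to read back what you wrote; the array only behaves that way because it is plain memory.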
2: With Sv39 paging on RV64, pages come in sizes of 4KB, 2MB, and 1GB. These sizes are fixed by the page-table format that the MMU (Memory Management Unit) walks. Large pages here means 2MB pages (RISC-V calls them megapages), often used for large memory regions in order to minimise TLB misses.
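The alignment claim is easy to check by hand. The sketch below is host-side C, not kernel code; is_aligned and offset_from_ram_base are hypothetical helper names, and the RAM base of 0x80000000 is an assumption that holds for QEMU's virt machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB(n) ((uint64_t)(n) << 20)

/* True if addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Distance of addr from the assumed start of RAM (0x80000000). */
static uint64_t offset_from_ram_base(uint64_t addr) {
    return addr - UINT64_C(0x80000000);
}
```

With these, is_aligned(0x80200000, MIB(2)) is true and offset_from_ram_base(0x80200000) is exactly 2MB, so the kernel sits one large page above the start of RAM.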
Preparing to jump into C code
Unfortunately, we have to set a few things up ourselves in assembly before we can run C code.
Setting up the stack
Setting up the stack is fairly simple. First, we have to define a section that we can use for the stack:
.section .stack
.global stack_top
.balign 16 # 16-byte (128-bit) alignment; the RISC-V calling convention requires sp to be 16-byte aligned
.space 16384 # 16KB stack
stack_top:
We're going to place the .stack section later, in our linker file.
Preparing an entry stub in Assembly
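This is what OpenSBI will jump to. It enters the kernel in supervisor mode, with the hart ID in a0 and a pointer to the device tree in a1.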
.global _start
.extern stack_top
.extern kernel_main
.section .text.entry # Its own section, so the linker file can place this function at the start of kernel memory
.balign 8 # 64-bit alignment
_start:
# Zero out the .bss section, 32 bits at a time
la t0, __bss_start
la t1, __bss_end
bss_loop:
bgeu t0, t1, bss_done # Unsigned compare, since these are addresses
sw zero, 0(t0)
addi t0, t0, 4
j bss_loop
bss_done:
# Load the stack pointer
la sp, stack_top
call kernel_main # This is what jumps to our C code
# If kernel_main returns, spin indefinitely
spin:
j spin
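If the .bss loop in _start is hard to read, this is roughly the same logic in C. It is an illustration meant to run on a normal host, not a replacement for the assembly: zero_words is a made-up name, and the assert is only there because the sketch is hosted.

```c
#include <assert.h>
#include <stdint.h>

/* Zero the range [start, end) one 32-bit word at a time.
 * Both pointers are assumed to be 4-byte aligned, which is what
 * the ALIGN directives around .bss in the linker file guarantee. */
static void zero_words(uint32_t *start, uint32_t *end) {
    assert(((uintptr_t)start & 3) == 0 && ((uintptr_t)end & 3) == 0);
    /* volatile keeps the compiler from turning the loop into a memset call */
    for (volatile uint32_t *p = start; p < end; p++) {
        *p = 0; /* same effect as: sw zero, 0(t0) */
    }
}
```

The end pointer is exclusive, just like __bss_end in the assembly version, so a range where start equals end zeroes nothing.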
Making the linker file
This is going to look a lot like dark magic, but I am going to explain as much as possible using comments.
OUTPUT_ARCH(riscv) /* Tell ld that we want to use RISC-V */
ENTRY( _start ) /* Tell ld that our program should start at _start */
MEMORY
{
RAM (rwxa) : ORIGIN = 0x80200000, LENGTH = 1G /* This is where usable memory starts. rwxa makes it readable, writable, executable, and allocatable */
STACKRAM (rw) : ORIGIN = 0xc0200000, LENGTH = 16K /* Starts right where RAM ends (0x80200000 + 1G). Separate the stack from other data (practically inserting guard pages if you ever set up paging). By keeping it non-allocatable, leftover memory from .bss can't flow into STACKRAM */
}
SECTIONS
{
. = 0x80200000; /* Remember that magic load address? */
.text : {
/* This is our code */
*(.text.entry) /* make sure _start is first */
*(.text .text.*)
} > RAM
.rodata : {
/* This is our read-only data */
. = ALIGN(16);
*(.rodata .rodata.*)
} > RAM
.data : {
. = ALIGN(16); /* Indeed more alignment (misaligned accesses can trap or be slow on RISC-V) */
*(.data .data.*)
} > RAM
/* Uninitialised variables */
.bss : {
. = ALIGN(16);
__bss_start = .; /* Tell it where to start */
*(.bss .bss.*)
*(COMMON)
. = ALIGN(4);
__bss_end = .; /* Tell it where to end */
} > RAM
/* Our stack section from earlier */
.stack (NOLOAD) : { /* (NOLOAD) makes the .stack section not be loaded from within the binary */
*(.stack)
} > STACKRAM
}
Making a C entrypoint
Now that we've got all the setup issues out of the way, let's make a basic hello world program in C.
Making a UART driver
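// uart.h
#ifndef UART_H
#define UART_H
typedef unsigned char uchar;
void uart_putc(uchar c);
void uart_puts(const char *s); // Takes plain char so string literals can be passed without warnings
#endif
// uart.c
#include <stdint.h>
#include "uart.h"
#define UART_BASE 0x10000000UL // NS16550A UART on QEMU's virt machine. Note: This can change between machines
#define UART_THR 0x00 // Transmit Holding Register: write a byte here to send it
#define UART_LSR 0x05 // Line Status Register: is the transmitter occupied?
#define LSR_TX_IDLE 0x20 // LSR bit 5: THR is empty and can take another byte
#define UART_REG(reg) ((volatile uint8_t *)(UART_BASE + (reg))) // A helper to get UART addresses as volatile (the compiler can't cache or drop the access)
void uart_putc(uchar c) {
// Wait until we can write to UART again
while ((*UART_REG(UART_LSR) & LSR_TX_IDLE) == 0);
__asm__ volatile("fence o, o" :::"memory"); // Keep this device write ordered after any earlier ones
*UART_REG(UART_THR) = c;
}
void uart_puts(const char *s) {
while (*s) {
uart_putc((uchar)*s);
s++;
}
}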
(Finally) a hello world program
#include "uart.h"
void kernel_main(void) {
uart_puts("Hello from Kernel World!\n");
}
Congrats! You've now written your first hello world program in kernel space, with only ~100 lines of code.
You can now run it in a QEMU VM to see if it works! Make sure to give it something like 2GB of RAM, or update the linker script for a lower RAM requirement.
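Where does the 2GB figure come from? Here is a small host-side sketch of the arithmetic, using the numbers from the MEMORY block of the linker script. It assumes RAM starts at 0x80000000, as it does on QEMU's virt machine; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DRAM_BASE    UINT64_C(0x80000000) /* assumed start of RAM (QEMU virt) */
#define RAM_ORIGIN   UINT64_C(0x80200000) /* RAM region in the linker script */
#define RAM_LENGTH   (UINT64_C(1) << 30)  /* 1G */
#define STACK_ORIGIN UINT64_C(0xc0200000) /* STACKRAM region */
#define STACK_LENGTH (UINT64_C(16) << 10) /* 16K */

/* Highest address the kernel uses: the initial value of sp. */
static uint64_t stack_top_address(void) {
    return STACK_ORIGIN + STACK_LENGTH;
}

/* Bytes of RAM the machine needs so that stack_top is backed by memory. */
static uint64_t min_ram_bytes(void) {
    assert(stack_top_address() > DRAM_BASE);
    return stack_top_address() - DRAM_BASE;
}

/* Nonzero if a machine with ram_bytes of RAM can hold this layout. */
static int fits_in(uint64_t ram_bytes) {
    return min_ram_bytes() <= ram_bytes;
}
```

The stack top works out to 0xc0204000, which is 1GB + 2MB + 16KB above the start of RAM. That is just over 1GB, so -m 1G is too small and -m 2G is a comfortable fit. Shrinking LENGTH for RAM (and moving STACKRAM down to match) lowers the requirement.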
If you're stuck, here is how I built and ran it:
riscv64-unknown-elf-gcc main.c uart.c entry.S stack.S -T linker.ld -o kernel.elf -mcmodel=medany -ffreestanding -nostdlib -nostartfiles
qemu-system-riscv64 -machine virt -cpu rv64 -m 2G -nographic -bios default -kernel kernel.elf
Next Steps
Now that you have a Hello World program up and running, here are a few challenges left for the reader:
- Making a printf() implementation
- Adding support for paging
- Adding a basic fault handler
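As a possible starting point for the printf() challenge, here is a sketch of one building block: printing a number in hexadecimal through a putc-style callback. print_hex, putc_fn and the capture helpers are made-up names, and the capture buffer (like the assert) exists only so the sketch can be tried on a normal host. In the kernel you could pass uart_putc instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as uart_putc, so either can be passed in. */
typedef void (*putc_fn)(unsigned char c);

/* Print value as 0x-prefixed lowercase hex without leading zeros. */
static void print_hex(putc_fn put, uint64_t value) {
    static const char digits[] = "0123456789abcdef";
    int started = 0;
    assert(put != 0);
    put('0');
    put('x');
    for (int shift = 60; shift >= 0; shift -= 4) {
        unsigned nibble = (unsigned)((value >> shift) & 0xf);
        /* Skip leading zeros, but always print the last digit */
        if (nibble != 0 || started || shift == 0) {
            put((unsigned char)digits[nibble]);
            started = 1;
        }
    }
}

/* Host-only sink that records output instead of sending it to a UART. */
static char capture_buf[32];
static size_t capture_len;

static void capture_putc(unsigned char c) {
    if (capture_len < sizeof capture_buf - 1) {
        capture_buf[capture_len++] = (char)c;
        capture_buf[capture_len] = '\0';
    }
}
```

Taking the output function as a parameter keeps the formatting code independent of the UART, which makes it easy to test before it ever runs on the kernel.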