DEV Community

Farhan


Writing An Operating System - The Boot Process (Part 1)

This post was originally published here

My journey on learning to build a simple OS

How many times have you read an OS book but not been able to code one? Operating system (OS) books are tedious, and theory alone makes it hard to understand how an OS actually works. Here is my attempt to write a simple OS and document some of the concepts I learned along the way.

Before You Start

On a Mac, install Homebrew and then run brew install qemu nasm.
On some systems QEMU is split into multiple binaries, so you may need to call qemu-system-x86_64 binfile instead of qemu binfile.

QEMU

To test these low-level programs without continuously rebooting a machine or risking scrubbing important data off a disk, we will use QEMU, a CPU emulator.
I'm working on a Mac with an M1 chip. QEMU has some issues on the M1, so you can run these experiments inside a Docker container: docker run -it ubuntu bash.
Inside the Docker container, run QEMU with the -nographic and -curses arguments to display the VGA output in text mode.

NASM

NASM is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32) and 64-bit (x86-64) programs.


When we start our computer, initially, it has no notion of an operating system. Somehow, it must load the operating system --- whatever variant that may be --- from some permanent storage device that is currently attached to the computer (e.g. a floppy disk, a hard disk, a USB dongle, etc.).

The Boot Process

Booting an operating system consists of transferring control along a chain of small programs, each one more “powerful” than the previous one, where the operating system is the last “program”.

BIOS

When the PC is turned on, the computer will start a small program that adheres to the Basic Input Output System (BIOS) [16] standard. This program is usually stored on a read only memory chip on the motherboard of the PC. BIOS is a collection of software routines that are initially loaded from a chip into memory and initialised when the computer is switched on. BIOS provides auto-detection and basic control of your computer’s essential devices, such as the screen, keyboard, and hard disks.

Note: Modern operating systems do not use the BIOS’ functions; they use drivers that interact directly with the hardware, bypassing the BIOS. Today, BIOS mainly runs some early diagnostics (power-on self-test) and then transfers control to the bootloader.

Boot Sector


BIOS cannot simply load a file that represents your operating system from a disk, since BIOS has no notion of a filesystem. BIOS must read specific sectors of data (usually 512 bytes in size) from specific physical locations on the disk devices, such as Cylinder 2, Head 3, Sector 5.
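
To make "a specific physical location" concrete, here is a minimal Python sketch of the conventional conversion from a Cylinder/Head/Sector address to a linear sector index. The geometry used here (16 heads, 63 sectors per track) is an assumption chosen for illustration and is not taken from this post; a real disk reports its own geometry. Note that CHS sector numbers start at 1, while cylinders and heads start at 0.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text


def chs_to_lba(cylinder, head, sector, heads_per_cylinder=16, sectors_per_track=63):
    """Convert a CHS address to a zero-based linear sector index.

    The default geometry is an assumed example, not a property of any real disk.
    """
    if cylinder < 0:
        raise ValueError("cylinder must not be negative")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head is out of range for this geometry")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("CHS sector numbers run from 1 to sectors_per_track")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


# The example address from the text, under the assumed geometry.
lba = chs_to_lba(2, 3, 5)
print("linear sector:", lba, "byte offset:", lba * SECTOR_SIZE)
```

Under any geometry, the very first sector of a disk (cylinder 0, head 0, first sector) maps to linear sector 0, that is, byte offset 0 of the disk image.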
So, the easiest place for BIOS to find our OS is in the first sector of one of the disks (i.e. Cylinder 0, Head 0, Sector 1, since CHS sector numbers start at 1), known as the boot sector. To make sure that the disk is bootable, the BIOS checks that the last two bytes of the alleged boot sector (bytes 511 and 512) are 0x55 and 0xAA, which read as the little-endian 16-bit value 0xAA55. If so, the BIOS loads the first sector to address 7C00h, sets the program counter to that address, and lets the CPU execute code from there. This is the simplest boot sector ever:

e9 fd ff 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 29 more lines with sixteen zero-bytes each ]
00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa


Note that, in the above boot sector, the three important features are:

1) The initial three bytes, in hexadecimal 0xe9, 0xfd and 0xff, are machine code instructions, as defined by the CPU manufacturer, that perform an endless jump.
2) The last two bytes, 0x55 and 0xaa, make up the magic number 0xaa55 (stored in little-endian format), which tells BIOS that this is indeed a boot block and not just data that happens to be on a drive’s boot sector.
3) The file is padded with zeros (the bracketed line stands for the all-zero lines omitted for brevity), basically to position the magic BIOS number at the end of the 512-byte disk sector.
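
As a rough check of the first two points, the bytes from the hex dump above can be decoded by hand. The Python sketch below is only an illustration, and the variable names are made up for this example. It reads the jump displacement and the signature as little-endian values.

```python
import struct

first_bytes = bytes([0xE9, 0xFD, 0xFF])  # first three bytes of the hex dump
last_bytes = bytes([0x55, 0xAA])         # last two bytes of the hex dump

# 0xE9 is the x86 opcode for a near jump; in 16-bit mode it is followed by a
# signed 16-bit displacement stored in little-endian order.
opcode = first_bytes[0]
(displacement,) = struct.unpack("<h", first_bytes[1:3])

# The displacement is relative to the address of the *next* instruction,
# i.e. the jump's own offset plus its length of three bytes.
jump_offset = 0
target = jump_offset + len(first_bytes) + displacement

# Read the two signature bytes as one little-endian 16-bit word.
(magic,) = struct.unpack("<H", last_bytes)

print(hex(opcode), displacement, target, hex(magic))
```

The displacement comes out as -3, so the jump lands back on its own first byte (offset 0), which is why it never ends. The bytes 55 aa read as the word 0xaa55.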

The first sector is called the Master Boot Record, or MBR. The program in the first sector is called the MBR bootloader.

So, BIOS loops through each storage device (e.g. floppy drive, hard disk, CD drive, etc.), reads the boot sector into memory, and instructs the CPU to begin executing the first boot sector it finds that ends with the magic number. This is where we seize control of the computer.
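
The search described above can be modelled in a few lines of Python. This is a simulation under simplifying assumptions, not how any particular BIOS is implemented. The device names and the helper functions `is_bootable` and `pick_boot_device` are hypothetical.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # the magic number as it appears on disk


def is_bootable(first_sector):
    """True if the sector is 512 bytes long and ends with the magic number."""
    return len(first_sector) == SECTOR_SIZE and first_sector[510:512] == BOOT_SIGNATURE


def pick_boot_device(devices):
    """Return the name of the first bootable device, or None.

    `devices` is a list of (name, first_sector_bytes) pairs in boot order.
    """
    for name, first_sector in devices:
        if is_bootable(first_sector):
            return name
    return None


blank = bytes(SECTOR_SIZE)                                # no signature
bootable = b"\xe9\xfd\xff" + bytes(507) + BOOT_SIGNATURE  # jump + padding + magic
devices = [("floppy", blank), ("hard disk", bootable), ("cd", bootable)]
print(pick_boot_device(devices))
```

In this made-up setup the floppy is skipped because its first sector lacks the signature, and the hard disk is chosen because it comes first among the bootable devices.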

The Bootloader

The BIOS program will transfer control of the PC to a program called a bootloader. A bootloader loads an OS, or an application that runs and communicates directly with the hardware. To run an OS, the first thing to write is a bootloader. Here is a simple bootloader.

;
; A simple boot sector program that loops forever.
;

; Define a label, "loop", that will allow
; us to jump back to it, forever.
; Use a simple CPU instruction that jumps
; to a new memory address to continue execution.
; In our case, jump to the address of the current
; instruction.

loop:
    jmp loop

; When compiled, our program must fit into 512 bytes,
; with the last two bytes being the magic number,
; so here, tell our assembly compiler to pad out our
; program with enough zero bytes (db 0) to bring us to the
; 510th byte.

times 510-($-$$) db 0

; Last two bytes (one word) form the magic number,
; so BIOS knows we are a boot sector.

dw 0xaa55
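
The padding line is plain arithmetic. In NASM, `$` is the current position and `$$` is the start of the section, so `$-$$` is the number of bytes emitted so far, and `510-($-$$)` is how many zero bytes are still needed before the signature. The Python sketch below mirrors that calculation. `build_boot_sector` is a hypothetical helper written for this illustration and is not part of NASM.

```python
SECTOR_SIZE = 512
SIGNATURE = (0xAA55).to_bytes(2, "little")  # dw 0xaa55 is stored as bytes 55 aa


def build_boot_sector(code):
    """Pad `code` with zeros up to offset 510, then append the signature."""
    if len(code) > SECTOR_SIZE - len(SIGNATURE):
        raise ValueError("code does not fit in front of the signature")
    padding = bytes(510 - len(code))  # same count as: times 510-($-$$) db 0
    return code + padding + SIGNATURE


# Use the three-byte endless jump from the hex dump shown earlier.
sector = build_boot_sector(bytes([0xE9, 0xFD, 0xFF]))
print(len(sector), sector[:3].hex(), sector[-2:].hex())
```

Writing `sector` to a file should give an image laid out like the hex dump shown earlier. The bytes NASM emits for `jmp loop` may differ from it, for example if the assembler picks a shorter jump encoding, but the padding and signature end up in the same place.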


We assemble the code with nasm and write it to a bin file:

nasm -f bin boot_sect_simple.asm -o boot_sect_simple.bin
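
Before booting the image, it can be worth checking that the file really has the shape of a boot sector. The following Python sketch is one possible check. `check_boot_image` is a hypothetical helper, and the exact-512-byte rule is an assumption that fits this single-sector example rather than boot images in general.

```python
import os


def check_boot_image(path):
    """Return a list of problems found in a boot sector image (empty if none)."""
    problems = []
    with open(path, "rb") as f:
        data = f.read()
    if len(data) != 512:
        problems.append(f"expected 512 bytes, got {len(data)}")
    if data[510:512] != b"\x55\xaa":
        problems.append("missing 0x55 0xAA signature at offsets 510-511")
    return problems


if __name__ == "__main__":
    image = "boot_sect_simple.bin"  # output name used in the nasm command above
    if os.path.exists(image):
        print(check_boot_image(image) or "looks like a valid boot sector")
    else:
        print(f"{image} not found; assemble it first")
```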

Let's try it out:

qemu boot_sect_simple.bin

On some systems, you may have to run

qemu-system-x86_64 boot_sect_simple.bin

You will see a window open which says "Booting from Hard Disk..." and nothing else. There you go, a simple boot loader is ready!

Continue reading more here

Sorry, copy pasting it from ghost was tough!


Top comments (3)

James Livesey
Comment hidden by post author
Farhan • Edited

The link which you've posted on GitHub is a summary by cfenollosa of a book and not his code. The code I've written is a simple boot sector which is built on interrupts, something which you can find in various books, and is something basic.

Yes, I will reference the textbooks I am referring to in the upcoming post for further research.

James Livesey

I see! Didn't know that, my bad. I was just checking; attributions are quite important tbh. Good to hear that you'll add references to posts in the future 👍
