So Microsoft just dropped what they're calling the earliest DOS source code ever discovered, and naturally, my first instinct was to try building it. If you had the same idea and hit a wall of cryptic assembler errors, you're not alone.
Here's how I got it compiling and running on a modern machine, and what I learned about 8086 assembly along the way.
The Problem: Ancient Assembly Meets Modern Toolchains
The DOS source code is written in 8086 assembly from the early 1980s, before most of us were born (or at least before we were writing code). You can't just clone the repo and run make. The original toolchain isn't something you can install from a package manager anymore, modern assemblers choke on the syntax, and even if you get a binary out, you need something to actually run it on.
The three walls you'll hit:
- No compatible assembler: the original code was built with Microsoft's MASM from the early '80s, which expects syntax that modern MASM versions handle differently
- Missing build infrastructure: no Makefile, no build scripts, just raw .ASM files and maybe a batch file or two
- No hardware to run it on: unless you've got an original IBM PC in your garage
Let's solve each one.
Step 1: Get the Source Code
The code is available on Microsoft's GitHub. If you've looked at their previous DOS releases (MS-DOS 1.25 and 2.0, open-sourced back in 2018), this follows the same pattern — MIT license, public repository.
# Clone the repository
git clone https://github.com/microsoft/MS-DOS.git
cd MS-DOS
# Check what we're working with
find . -name "*.ASM" -o -name "*.asm" | head -20
You'll see a collection of .ASM files. The core of DOS is surprisingly small — the entire operating system fits in a handful of assembly source files. The key files to look for are the BIOS interface layer, the command processor, and the DOS kernel itself.
Step 2: Set Up a Working Assembler
Here's where most people get stuck. If you try using NASM directly, you'll get syntax errors everywhere. NASM uses Intel-style operand order too, but it doesn't understand MASM's directives (such as SEGMENT/ENDS and ASSUME) or its macro language.
Your best options:
Option A: JWasm / UASM (recommended)
JWasm is an open-source MASM-compatible assembler that handles vintage syntax surprisingly well. UASM is its actively-maintained fork.
# On macOS with Homebrew
brew install jwasm
# On Linux (build from source)
git clone https://github.com/Baron-von-Riedesel/JWasm.git
cd JWasm
make -f GccUnix.mak
# Try assembling a source file
jwasm -bin -Fo output.com COMMAND.ASM
Option B: DOSBox + period-correct MASM
For maximum authenticity, you can run an old version of MASM inside DOSBox. This is the nuclear option but it sidesteps all compatibility issues.
# Install DOSBox
brew install dosbox # macOS
sudo apt install dosbox # Debian/Ubuntu
# Mount your working directory
# In dosbox.conf or at the DOSBox prompt:
mount c /path/to/dos-source
c:
masm COMMAND;
You'll need to source a copy of MASM 1.x or 2.x — these are abandonware at this point and findable in vintage software archives.
Step 3: Handle the Inevitable Assembly Errors
Even with a compatible assembler, you'll likely hit issues. Here are the common ones I ran into:
Missing include files
Early DOS source files reference includes that may be in separate directories or missing entirely.
; You might see something like:
INCLUDE DOSSYM.ASM
INCLUDE DEVSYM.ASM
These files define constants and structures used throughout the kernel. Make sure they're in the same directory as the file you're assembling, or adjust the include paths.
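To find missing includes before the assembler does, you can scan a source file for INCLUDE lines and check whether each named file is sitting next to it. This is a minimal sketch, assuming includes are written one per line as INCLUDE NAME.EXT and resolved relative to the source file's directory. The function name list_missing_includes is my own, not part of any toolchain.

```shell
# list_missing_includes FILE
# Print every INCLUDE target that is not present in FILE's directory.
# Hypothetical helper for illustration only.
list_missing_includes() {
    src=$1
    dir=$(dirname "$src")
    # Strip CRs from DOS line endings, keep lines that start with INCLUDE
    # (any case, optional indent), and take the file name in field 2.
    tr -d '\r' < "$src" |
        grep -i '^[[:space:]]*include[[:space:]]' |
        awk '{ print $2 }' |
        while read -r inc; do
            [ -f "$dir/$inc" ] || echo "$inc"
        done
}
```

Run it as list_missing_includes COMMAND.ASM. On a case-sensitive filesystem, a file saved as dossym.asm will be reported as missing when the source asks for DOSSYM.ASM, which is usually exactly the problem you want to catch.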
Segment ordering issues
The 8086 memory model is segment-based, and the original code assumes a specific segment layout. If your assembler complains about segment ordering:
; The original code might assume segments are ordered like this:
CODE SEGMENT
ASSUME CS:CODE, DS:CODE, ES:CODE, SS:CODE
; ... code here
CODE ENDS
Make sure you're not accidentally using flat memory model flags. For JWasm, avoid the command-line memory-model options (the -m family of switches) and let the source file's own directives control the memory model.
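Before assembling a file, it helps to see which segment layout it expects. Here's a rough sketch that lists the SEGMENT, ENDS, ASSUME, and ORG lines with their line numbers. The helper name show_segments is hypothetical, and the pattern is a plain keyword match, so it may also pick up those words inside comments.

```shell
# show_segments FILE
# List lines that mention segment-layout directives, with line numbers.
# Hypothetical helper; a plain keyword match, not a real parser.
show_segments() {
    # tr strips CRs from DOS line endings.
    # grep: -n line numbers, -i ignore case, -w whole words, -E alternation.
    tr -d '\r' < "$1" | grep -n -i -w -E 'segment|ends|assume|org'
}
```

Calling show_segments COMMAND.ASM gives a quick outline of the file's segments without reading the whole thing.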
Obsolete pseudo-ops
Some very early MASM pseudo-ops were deprecated decades ago. If you see errors on directives you don't recognize, check whether your assembler has a compatibility mode:
# JWasm has options for older syntax compatibility
jwasm -Zm # enable MASM 5.1 compatibility mode
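When you're chasing errors across several files, it's easier to assemble everything in one pass and keep a log per file. This is a sketch of that loop, not a real build script. The assemble_all name and the ASM and ASMFLAGS override variables are my own inventions, and it assumes jwasm is on your PATH.

```shell
# assemble_all DIR
# Try to assemble every .ASM/.asm file in DIR, one log file per source.
# ASM and ASMFLAGS are hypothetical overrides; defaults are jwasm and -Zm.
assemble_all() {
    dir=$1
    for src in "$dir"/*.ASM "$dir"/*.asm; do
        [ -f "$src" ] || continue      # skip glob patterns that matched nothing
        if ${ASM:-jwasm} ${ASMFLAGS:--Zm} "$src" > "$src.log" 2>&1; then
            echo "OK   $(basename "$src")"
        else
            echo "FAIL $(basename "$src") (see $src.log)"
        fi
    done
}
```

Run assemble_all . from the source directory, then open the .log files for whatever failed.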
Step 4: Actually Run the Thing
Once you have a binary, you need somewhere to run it. You're not going to boot this on bare metal (please don't), so here are your emulator options:
PCem or 86Box: these are low-level PC emulators that aim to reproduce the original IBM PC hardware, timing included. They're your best bet for running something this old.
# 86Box is available on GitHub
# https://github.com/86Box/86Box
# You'll need:
# 1. A ROM image for an IBM PC 5150
# 2. A floppy disk image containing your built DOS binary
# Create a floppy image with mtools
dd if=/dev/zero of=dos_boot.img bs=512 count=720
mformat -i dos_boot.img -f 360 ::
mcopy -i dos_boot.img COMMAND.COM ::
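The dd numbers above come from simple arithmetic: a 360 KB floppy holds 720 sectors of 512 bytes each, which is 368,640 bytes. Here's a small sanity check for a raw image file, in case a typo in count or bs slips through. The name check_floppy_size is just illustrative.

```shell
# check_floppy_size IMAGE
# Confirm a raw image is exactly 360 KB (720 sectors of 512 bytes).
# Hypothetical helper for illustration.
check_floppy_size() {
    expected=$((720 * 512))               # 368640 bytes
    actual=$(wc -c < "$1" | tr -d ' ')    # strip padding some wc versions add
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $actual bytes"
    else
        echo "unexpected size: $actual bytes (wanted $expected)"
        return 1
    fi
}
```

Run check_floppy_size dos_boot.img before handing the image to an emulator. A non-zero exit status means the size is off.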
QEMU also works for slightly less vintage DOS versions, though for the very earliest code, the hardware emulation might not be precise enough:
qemu-system-i386 -fda dos_boot.img -boot a -m 640K
Note the -m 640K — because 640K ought to be enough for anybody.
What's Actually Interesting in the Source
Once you get past the build challenges, the code itself is fascinating from a systems programming perspective. A few things that stood out to me:
- The entire OS is tiny. We're talking kilobytes, not megabytes. The whole DOS kernel fits in less space than a single modern favicon.
- No abstraction layers. The code talks directly to hardware through BIOS interrupts. There's no HAL, no driver model, no nothing. INT 21h is the API.
- String handling is brutal. Everything is done byte-by-byte with LODSB and STOSB instructions. Looking at this code makes you appreciate even C's strcmp.
- The file system code is where the complexity lives. FAT12 handling is the most intricate part of the codebase, and honestly it's impressively well-structured for the era.
Prevention Tips (For Your Own Retro-Computing Projects)
If you're planning to work with more vintage source code — and Microsoft has been on a trend of releasing these — here's how to save yourself time:
- Keep a DOSBox environment ready with period-correct tools. It takes 10 minutes to set up and saves hours of fighting with cross-assembler compatibility.
- Read the segment directives first before trying to assemble anything. Understanding the memory model the code expects heads off a large share of build errors.
- Use version control on your build attempts. Sounds obvious, but when you're tweaking assembler flags and patching includes, you'll want to know what combination actually worked.
- Join the retro-computing community. The folks on forums like Vogons and the 86Box Discord have collectively debugged more vintage code than anyone. They're incredibly helpful.
Why This Matters Beyond Nostalgia
Look, I get it — building 40-year-old operating system code isn't exactly a resume builder. But there's genuine value here for modern developers.
Studying DOS source teaches you what an operating system actually is at its most fundamental level. No abstractions, no frameworks, no layers of indirection. Just a program that loads other programs and talks to hardware. If you've ever been confused by how modern OS concepts like system calls, file descriptors, or memory mapping work, seeing the primitives they evolved from makes everything click.
Plus, reading well-structured assembly from this era is a masterclass in writing code under extreme constraints. When your entire OS needs to fit in 64K, every byte matters. That kind of discipline doesn't hurt, even when you're working with modern hardware that has memory to spare.
The source is sitting on GitHub right now. Go clone it, break your assembler, fix it, and see what computing looked like before we had the luxury of complaining about npm install times.