I've been using Git for years. Committing, pushing, pulling, occasionally panicking when things break. But if you'd asked me what actually happens when I run git commit, I'd have given you some vague answer about "saving changes" and hoped you wouldn't ask more questions.
That bothered me. So I built Veridian, my own version control system in Rust. Not because the world needs another Git, but because I needed to understand the one we already have.
Turns out, Git is way simpler than I thought.
Why Everyone Finds Git Confusing
We learn Git backwards. We memorize commands without understanding what they do. git add stages files. Okay, but what does staging actually mean? git commit saves your work. Cool, but saves it where and how?
I spent years just following commands. I could use Git, but I couldn't understand it. Building Veridian changed that.
Here's What Git Actually Is
Strip away all the commands and features, and Git is just a content-addressable storage system. Sounds complicated, but it's not.
You have a storage system where things are saved by their content, not their name. Put in the same content twice? Same storage location. Change one character? Different location.
That's Git. The "location" is a SHA-1 hash. The "storage" is the .git/objects folder.
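To make "saved by content, not by name" concrete, here's a toy sketch in Python (not Veridian's Rust code). The ContentStore class is hypothetical and keeps everything in a dictionary; it also hashes the raw bytes directly, while real Git adds a small header first.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is derived from the content itself."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = data  # storing identical content again changes nothing
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]


store = ContentStore()
a = store.put(b"hello world\n")
b = store.put(b"hello world\n")   # same content, same key
c = store.put(b"hello world!\n")  # one extra character, different key
```

Here a and b are the same key and the store holds only two objects, even though put was called three times.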
Three Types of Objects
Blobs are file contents. Take your file, add a header like blob <size>\0, hash it, compress it, store it. Done. Git doesn't care about filenames here. Only content.
Trees are directory listings. They say "here's what this folder looked like" by listing files and folders with their hashes. Trees point to blobs and other trees.
Commits are snapshots with context. Each commit points to a tree (what your project looked like), points to parent commits (what came before), and has metadata like author, time, and message.
Three object types. That's the whole system.
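The blob recipe above (header, then hash) fits in a few lines. This is a minimal Python sketch rather than Veridian's Rust code, and blob_hash is a name I made up for illustration. It only computes the ID; compression and storage are left out.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Return the object ID Git would assign to a blob with this content."""
    # The header is the object type, a space, the size in bytes, and a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(blob_hash(b"hello world\n"))
```

If you have Git installed, you can compare the result with what you get from piping the same bytes into git hash-object --stdin.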
Why the Hash System is Smart
Same file in multiple commits? Stored once. Changed one line in a big file? Git writes one new blob for that file and keeps reusing the existing blobs for everything you didn't touch. Want to check if two files are identical? Compare hashes, instant answer.
Git isn't making copies of your project over and over. It's storing unique pieces and building snapshots from them. That's why repos with hundreds of commits aren't huge.
Branches Are Just Files
A branch is literally a file with a commit hash in it.
The file .git/refs/heads/main has 40 characters in it, the hash of your latest commit. When you make a branch, Git writes a new file with the current commit hash. When you commit, Git updates the file with the new hash.
No copying. Just updating a small text file. That's why branches are "lightweight."
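Here's roughly what "a branch is a file" looks like, as a Python sketch under some assumptions: the helper names (write_ref, read_ref, create_branch) are hypothetical, the hashes are placeholders rather than real commits, and the repo lives in a temporary directory.

```python
import tempfile
from pathlib import Path


def write_ref(repo: Path, name: str, commit_hash: str) -> None:
    """Point a branch at a commit by writing the hash into its file."""
    ref = repo / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_hash + "\n")


def read_ref(repo: Path, name: str) -> str:
    return (repo / "refs" / "heads" / name).read_text().strip()


def create_branch(repo: Path, new_name: str, from_name: str) -> None:
    """A new branch is just a new file holding the same hash."""
    write_ref(repo, new_name, read_ref(repo, from_name))


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    write_ref(repo, "main", "a" * 40)       # placeholder hash, not a real commit
    create_branch(repo, "feature", "main")  # copies 40 characters, nothing else
    write_ref(repo, "feature", "b" * 40)    # "committing" on feature moves only feature
    print(read_ref(repo, "main"), read_ref(repo, "feature"))
```

Nothing in the object store is touched when the branch is created, which is the whole reason it's cheap.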
The Compression Part is Cool
Git uses zlib to compress everything before storing it. So your object files aren't just raw content, they're compressed. When I was building Veridian, I had to handle this compression and decompression for every read and write.
Here's what happens: Git takes your blob (with header), compresses it, then stores it in .git/objects/ab/cdef123... where ab is the first two characters of the hash and cdef123... is the rest. The two-character split is just to avoid having thousands of files in one directory, which would slow down file systems.
Reading it back means finding the file, decompressing with zlib, parsing the header to check object type and size, then giving you the content. Rust's standard library doesn't have zlib built in, so I used the flate2 crate for this. Took like 5 lines of code.
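The write and read paths described above can be sketched like this. It's Python with the standard zlib module, not the flate2 code from Veridian, and write_object and read_object are hypothetical names. It loads the whole object into memory, so treat it as an illustration rather than something you'd point at a 5GB file.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_object(objects_dir: Path, obj_type: str, content: bytes) -> str:
    """Add the header, hash, compress, and store under <first 2 chars>/<remaining 38>."""
    store = f"{obj_type} {len(content)}".encode("ascii") + b"\0" + content
    digest = hashlib.sha1(store).hexdigest()
    path = objects_dir / digest[:2] / digest[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return digest


def read_object(objects_dir: Path, digest: str):
    """Find the file, decompress it, and split the header from the content."""
    raw = zlib.decompress((objects_dir / digest[:2] / digest[2:]).read_bytes())
    header, _, content = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"
    oid = write_object(objects, "blob", b"hello world\n")
    print(oid, read_object(objects, oid))
```

The hash is computed over the uncompressed header plus content, so the compression level never affects the object ID.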
What Building Veridian Taught Me
I thought building a version control system would be hard. It wasn't.
Building in Rust was interesting because Rust makes you think about memory and ownership. When you're hashing files and building trees, you need to handle errors properly (what if the file doesn't exist?) and manage buffers carefully (you can't just load a 5GB file into memory).
But honestly, the version control logic itself is simple. Most of my code is just reading files, computing SHA-1 hashes, and writing compressed data. The hard part wasn't the algorithm, it was understanding what Git was actually doing.
Init Command
Make a .veridian folder. Add subfolders for objects and refs. Create a HEAD file. Done, you have a repo.
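As a rough Python sketch of those steps: init_repo is a hypothetical name, and both the refs/heads subfolder and the HEAD contents are assumptions borrowed from how Git lays things out, not necessarily what Veridian writes.

```python
import tempfile
from pathlib import Path


def init_repo(root: Path, default_branch: str = "main") -> Path:
    """Create the repo skeleton: objects/, refs/heads/, and a HEAD file."""
    repo = root / ".veridian"
    (repo / "objects").mkdir(parents=True, exist_ok=True)
    (repo / "refs" / "heads").mkdir(parents=True, exist_ok=True)
    head = repo / "HEAD"
    if not head.exists():
        # Assumption: HEAD names a branch the same way Git's HEAD file does.
        head.write_text(f"ref: refs/heads/{default_branch}\n")
    return repo


with tempfile.TemporaryDirectory() as tmp:
    repo = init_repo(Path(tmp))
    print(sorted(p.name for p in repo.iterdir()))
```

Note that the branch file itself doesn't exist yet. It only appears once the first commit is written.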
Hash-Object Command
Read file, add header, hash it with SHA-1, compress with zlib, write to .veridian/objects/. Return the hash. That's how files enter the system.
Write-Tree Command
Go through a directory. Hash every file (making blobs). Put all the names and hashes into a tree object. Hash that tree. Now you have a snapshot of your directory.
One thing I learned: tree entries need to be sorted by name. If you don't sort them, the same directory structure produces different hashes depending on the order you process files. Git sorts them byte by byte (comparing directory names as if they ended in a slash) to keep hashes consistent. Small detail, but it matters.
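Here's a sketch of tree serialization with the sorting step, in Python. The function names are hypothetical, and it assumes Git's tree layout: each entry is the mode, a space, the name, a NUL byte, then the 20 raw bytes of the hash. Directories use mode 40000 and regular files 100644.

```python
import hashlib


def tree_sort_key(entry):
    mode, name, _ = entry
    # Names compare as bytes; a directory sorts as if its name ended in "/".
    return name.encode() + (b"/" if mode == "40000" else b"")


def build_tree(entries):
    """entries: iterable of (mode, name, 40-char hex hash). Returns the tree's ID."""
    body = b""
    for mode, name, hex_hash in sorted(entries, key=tree_sort_key):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_hash)
    store = b"tree " + str(len(body)).encode() + b"\0" + body
    return hashlib.sha1(store).hexdigest()


blob = "3b18e512dba79e4c8300dd08aeb37f8e728b8dad"
one = build_tree([("100644", "b.txt", blob), ("100644", "a.txt", blob)])
two = build_tree([("100644", "a.txt", blob), ("100644", "b.txt", blob)])
print(one == two)
```

Because of the sort, the two calls produce the same ID no matter what order the entries arrive in.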
Commit-Tree Command
Take a tree hash. Add parent commit hash if there is one. Add author info and time. Add message. Hash it all. Write to objects. Update branch pointer. Update HEAD. That's a commit.
The implementation is surprisingly small. It works like Git because Git is actually this simple.
Fun fact: Git stores timestamps as Unix timestamps (seconds since 1970) with timezone info. So a commit object has something like 1760211794 +0530 which is the timestamp and timezone offset. When you git commit, it grabs your system time and timezone. I used Rust's chrono crate for this, but you could do it with any language.
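To show how the timestamp ends up inside a commit, here's a Python sketch of building the commit text and hashing it. The names format_tz and build_commit are hypothetical, the author is made up, and it assumes Git's layout of tree, parent, author, and committer lines followed by a blank line and the message. It doesn't write anything to disk or touch refs.

```python
import hashlib


def format_tz(offset_seconds: int) -> str:
    """Turn a UTC offset in seconds into the +HHMM / -HHMM form."""
    sign = "+" if offset_seconds >= 0 else "-"
    hours, rem = divmod(abs(offset_seconds), 3600)
    return f"{sign}{hours:02d}{rem // 60:02d}"


def build_commit(tree, parents, author, timestamp, tz_offset, message):
    """Return (commit ID, commit body) for the given metadata."""
    when = f"{timestamp} {format_tz(tz_offset)}"
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # no parent lines for a first commit
    lines.append(f"author {author} {when}")
    lines.append(f"committer {author} {when}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    store = b"commit " + str(len(body)).encode() + b"\0" + body
    return hashlib.sha1(store).hexdigest(), body


commit_id, body = build_commit(
    tree="4b825dc642cb6eb9a060e54bf8d69288fbee4904",  # the empty tree
    parents=[],
    author="Ada Example <ada@example.com>",           # made-up author
    timestamp=1760211794,
    tz_offset=19800,                                  # 5 hours 30 minutes east of UTC
    message="Initial commit",
)
print(body.decode())
```

Since the timestamp is part of the hashed text, committing the same tree one second later gives a different commit ID.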
Things That Finally Clicked
Why Git is fast: To tell whether something changed, it compares hashes, not file contents. 40-character strings. Super quick.
Why detached HEAD happens: HEAD normally points to a branch file, which points to a commit. Check out a commit directly? HEAD points at the commit, skipping the branch. You're detached because you're not on a branch, you're on a specific commit.
Why you can recover deleted commits: They're still in .git/objects. Just unreferenced. Use git reflog, find the hash, get it back. Only garbage collection deletes them for real.
Why merge conflicts exist: Two commits share a common ancestor but change the same part of the same file in different ways. Git can't pick which one wins. It needs you to decide.
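The detached HEAD case from this list is easy to see in code. This is a Python sketch with a hypothetical resolve_head helper. It assumes HEAD contains either a line like ref: refs/heads/main or a bare commit hash, and it ignores edge cases such as a branch with no commits yet or refs stored in a packed file.

```python
import tempfile
from pathlib import Path


def resolve_head(repo: Path):
    """Return (commit hash, ref path), where ref path is None when HEAD is detached."""
    head = (repo / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref = head[len("ref: "):]
        # Normal case: HEAD names a branch file, and that file holds the hash.
        return (repo / ref).read_text().strip(), ref
    # Detached case: HEAD holds the commit hash itself.
    return head, None


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "refs" / "heads").mkdir(parents=True)
    (repo / "refs" / "heads" / "main").write_text("a" * 40 + "\n")  # placeholder hash

    (repo / "HEAD").write_text("ref: refs/heads/main\n")
    on_branch = resolve_head(repo)

    (repo / "HEAD").write_text("a" * 40 + "\n")  # check out the commit directly
    detached = resolve_head(repo)
    print(on_branch, detached)
```

Both calls resolve to the same commit. The only difference is whether there's a branch file in between that a new commit would move.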
What I Learned
Git isn't hard because it's complicated. It's hard because we learn it wrong.
Once you get that Git is a key-value store where the key is a content hash, and there are three types of values (blob, tree, commit), everything makes sense. Branches are pointers. Merging combines trees. Rebasing replays commits on a different parent.
I used Git for years without getting it. Then I built Veridian in a week and suddenly Git made sense. Not because building is magic, but because it forces you to understand what's happening.
Why You Should Try This
You don't need to build something perfect. Just start.
Make a repo. Store a file as a blob. Build a tree. Make a commit. Do those four things and you'll understand Git better than most developers.
Building Veridian took maybe a week. Now when I use Git, I actually know what's happening. It's just data structures and file operations. Nothing complicated.
Veridian isn't perfect. It's missing features. Probably has bugs. But it taught me how Git works, and that was the point.
If you want to actually learn Git, not just use it, build something. Even if it's small. Even if it breaks. You'll learn more building for a week than reading docs for months.
Check out Veridian on GitHub. Break it, fix it, learn from it. That's how this works.