I’d been using Git comfortably for months before getting uncomfortable with how little I actually understood.
git add, git commit, and git push were muscle memory at this point. But ask me what Git was actually doing under the hood? I couldn’t tell you. I had some (mostly wrong) assumptions, but that was about it.
Then I discovered the .git/ directory…
Git Is Not What You Think It Is
Many people think of Git as a “track changes” tool. That’s the way I thought of it. Like the version history in a Google Doc. That’s not wrong, but it’s not the whole picture.
The Pro Git book describes it more precisely:
“Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it.”
So, what is a content-addressable filesystem?
Instead of storing files by name or location, Git stores them by their content. Every piece of data gets run through a hashing algorithm (SHA-1 by default) that produces a 40-character hexadecimal fingerprint. That fingerprint is the object’s identity in Git’s database. The same content always produces the same hash.
Understanding this immediately explained much of what felt confusing about Git.
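Here is a minimal sketch of that idea in Python, using only the standard library. The function name git_blob_sha1 is mine, not Git’s. It relies on one detail the prose above skips: for a file, Git hashes a short header ("blob", the size in bytes, and a NUL byte) followed by the raw content, not the content alone.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to a blob holding these bytes."""
    # Git hashes "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Same bytes in, same ID out. One changed byte gives a different ID.
same = git_blob_sha1(b"hello\n") == git_blob_sha1(b"hello\n")
different = git_blob_sha1(b"hello\n") != git_blob_sha1(b"hello!\n")
print(same, different)

# The ID of an empty file.
print(git_blob_sha1(b""))
```

If you have Git handy, git hash-object <file> should print the same ID this function computes for that file’s bytes, assuming a SHA-1 repository.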
The .git/ Directory Is Your Entire Repository
Open any Git project and run this:
ls .git/
You’ll see something like:
HEAD config description hooks info objects refs
.git/ is a hidden directory created when a Git repository is initialized.
This directory is your entire repository. Every commit, every version of every file, the full history — all of it lives inside .git/. Your working directory (the files you actually edit) is just a projection of what’s in there.
You can copy .git/ somewhere else, and you’ll have everything. On the other hand, if you delete .git/ you lose your entire history. Not just your latest changes. Everything.
It’s hidden for a good reason, but it is hugely helpful to be aware of it.
Two very important directories:
- objects/: Git’s content database. Every blob, tree, and commit lives here.
- refs/: Pointers. Branch names, tags, remotes. All of them are just pointers to commits in objects/.
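To make those two directories concrete, here is a small sketch of how names map to paths inside them. The helper names are hypothetical. It assumes "loose" storage, where an object sits in a subdirectory named after the first two characters of its SHA and a local branch sits in a file under refs/heads/. Git can also pack objects and refs into other files, which this sketch ignores.

```python
from pathlib import Path


def loose_object_path(git_dir: Path, sha: str) -> Path:
    """Map a full 40-character object ID to its loose-object path."""
    if len(sha) != 40:
        raise ValueError("expected a full 40-character SHA-1")
    # First two hex characters name the directory, the other 38 name the file.
    return git_dir / "objects" / sha[:2] / sha[2:]


def branch_ref_path(git_dir: Path, branch: str) -> Path:
    """Map a local branch name to its ref file under refs/heads/."""
    return git_dir / "refs" / "heads" / branch


git_dir = Path(".git")
print(loose_object_path(git_dir, "b7aec520dec0a7516c18eb4c68b64ae1eb9b5a5e"))
print(branch_ref_path(git_dir, "main"))
```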
Three Objects. One Chain.
Git stores everything as one of three object types. Understanding these three types makes everything else click.
Blob — The File
A blob (Binary Large OBject) is an immutable object with a header. It contains the raw file content and nothing more.
It does not store the filename or its location. It has no reference to itself or any other files.
If one character in a file changes, Git creates a brand new blob with a completely different SHA. The old one stays untouched.
Run this on any blob SHA to see for yourself:
git cat-file -p <blob SHA>
You’ll see the raw file content. No filename. No path. Just the content.
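If you are curious what sits on disk, a loose object is (as far as I can tell from poking at them) just zlib-compressed bytes: the header, a NUL byte, then the body. The sketch below builds one in memory and takes it apart again. encode_object and decode_object are names I made up for illustration.

```python
import zlib


def encode_object(obj_type: str, body: bytes) -> bytes:
    """Build the compressed bytes of a loose object of the given type."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return zlib.compress(header + body)


def decode_object(compressed: bytes):
    """Return (type, body) from the compressed bytes of a loose object."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


stored = encode_object("blob", b"hello\n")
obj_type, body = decode_object(stored)
# Only a type and the content come back: no filename, no path.
print(obj_type, body)
```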
Tree — The Snapshot
A tree object represents a directory. It’s a list of entries — each one containing a file mode, object type, SHA, and filename.
This is where filenames live. The tree connects a name to a blob. Run git cat-file -p on a tree SHA and you’ll see something like:
100644 blob b7aec520dec0a7516c18eb4c68b64ae1eb9b5a5e README.md
Mode, type, SHA, filename. The tree is the snapshot of your project at a given moment.
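For the curious, here is a rough sketch of how a tree’s entries are laid out in the object body: the mode, a space, the filename, a NUL byte, then the SHA as 20 raw bytes. The type column that cat-file -p prints is derived from the mode rather than stored. The function names are hypothetical, and the sort here is a simplification of Git’s real ordering rules for directories.

```python
def encode_tree(entries):
    """Serialize (mode, name, sha_hex) tuples into a tree body."""
    body = b""
    # Simplification: plain sort by name (Git treats directory names specially).
    for mode, name, sha_hex in sorted(entries, key=lambda e: e[1]):
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\x00"
        body += bytes.fromhex(sha_hex)  # 20 raw bytes, not 40 hex characters
    return body


def decode_tree(body):
    """Parse a tree body back into (mode, name, sha_hex) tuples."""
    entries, pos = [], 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8")
        sha_hex = body[nul + 1:nul + 21].hex()
        entries.append((mode, name, sha_hex))
        pos = nul + 21
    return entries


def format_entry(mode, name, sha_hex):
    """Render one entry roughly the way git cat-file -p does."""
    obj_type = {"40000": "tree", "160000": "commit"}.get(mode, "blob")
    return f"{mode.zfill(6)} {obj_type} {sha_hex}\t{name}"


entries = [("100644", "README.md", "b7aec520dec0a7516c18eb4c68b64ae1eb9b5a5e")]
for entry in decode_tree(encode_tree(entries)):
    print(format_entry(*entry))
```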
Commit — The Pointer
A commit object is simpler than most people expect. Run git cat-file -p on a commit SHA:
tree 7102e6ffc1a508e552d53f28bbeb0a976124d7e6
parent d48a0fbe5c378c69ad9522883d1d2f8dbe5ebd69
author Gmcjr <gmcostajr@gmail.com> 1779654747 -0500
committer Gmcjr <gmcostajr@gmail.com> 1779654747 -0500

second commit
That’s it. A commit is:
- A pointer to a tree (the snapshot)
- A pointer to its parent commit (the history)
- Metadata that includes the author (who made changes/wrote the code) and committer (who made the commit), as well as when the commit was made, and the commit message.
The chain is always:
commit → tree → blob(s)
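A commit is plain text, so it is easy to take apart. The sketch below parses the kind of output shown above into its pointers and metadata. parse_commit is a hypothetical helper, the author line is made up, and it assumes the usual layout: header lines, one blank line, then the message.

```python
def parse_commit(text: str) -> dict:
    """Split commit text into tree, parents, author, committer, and message."""
    headers, _, message = text.partition("\n\n")
    commit = {"tree": None, "parents": [], "author": None,
              "committer": None, "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # merge commits have several
        elif key in ("tree", "author", "committer"):
            commit[key] = value
    return commit


# Example input; the author and committer values are invented.
SAMPLE = (
    "tree 7102e6ffc1a508e552d53f28bbeb0a976124d7e6\n"
    "parent d48a0fbe5c378c69ad9522883d1d2f8dbe5ebd69\n"
    "author Example Author <author@example.com> 1700000000 -0500\n"
    "committer Example Author <author@example.com> 1700000000 -0500\n"
    "\n"
    "second commit\n"
)

commit = parse_commit(SAMPLE)
print(commit["tree"], commit["parents"], commit["message"])
```

Storing parents as a list keeps the same function usable for a root commit (no parents) and a merge commit (two or more).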
There’s actually a fourth object type — the annotated tag. I won’t go into detail here, but the annotated tag wraps a commit with its own metadata and SHA, and is often used for marking official releases.
The History Is a Graph, Not a Line
The chain — commit → tree → blob — implies something linear. It isn’t.
Every commit points to its parent. Most commits have one. Merge commits have two. Over time, those parent-child relationships form a DAG: a Directed Acyclic Graph.
Directed means the edges go one way — a commit points to its parent, never the reverse. Acyclic means no loops — you can’t follow the chain and end up back where you started. Graph means the structure branches and rejoins freely.
This is why git log --graph can look like a tree but isn’t one. Trees don’t rejoin. A DAG can. Every time you merge two branches, the merge commit has two parent pointers (two outgoing edges, one to each branch tip), and the graph folds back on itself.
Understanding that Git’s history is a DAG explains a lot: why git log --graph draws forks and merges the way it does, why reachability matters (an object is “live” if you can reach it by following edges from a named ref), and why Git’s garbage collector targets objects with no incoming path from any branch or tag.
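Reachability is easy to sketch with a dictionary. The history below is invented: single letters stand in for commit SHAs, and each commit maps to its list of parents. The traversal is a generic graph walk, not Git’s actual implementation.

```python
def reachable(parents: dict, tips) -> set:
    """Return every commit reachable from the given tips via parent edges."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


# Hypothetical history: A <- B <- C and B <- D, merged by M (two parents).
# X was committed on a branch that has since been deleted.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "M": ["C", "D"], "X": ["A"]}
refs = {"main": "M"}

live = reachable(parents, refs.values())
unreachable = set(parents) - live
print(sorted(live), sorted(unreachable))
```

Here X still exists in the dictionary, but no ref leads to it, which makes it the kind of object a garbage collector would eventually consider.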
The Stuff That Trips People Up
A Branch Is a Text File
Run this:
cat .git/refs/heads/main
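You’ll get a single SHA. That’s your main branch. A 41-character text file: 40 hex characters plus a newline. That’s the entire concept of a branch in Git, a file containing the SHA of the commit it currently points to. (If the file is missing, Git has probably packed the ref into .git/packed-refs; git rev-parse main prints the same SHA either way.)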
Creating a branch creates a file. Switching branches changes which file HEAD points to. Deleting a branch deletes the file.
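Here is a sketch of that lookup, run against a throwaway directory so it doesn’t touch a real repository. resolve_head is a hypothetical helper. It assumes loose ref files only and ignores packed refs and other edge cases.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path):
    """Return (commit_sha, branch); branch is None when HEAD holds a raw SHA."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref = head[len("ref: "):]                    # e.g. refs/heads/main
        sha = (git_dir / ref).read_text().strip()
        prefix = "refs/heads/"
        branch = ref[len(prefix):] if ref.startswith(prefix) else ref
        return sha, branch
    return head, None                                # HEAD points at a commit


def demo():
    sha = "d48a0fbe5c378c69ad9522883d1d2f8dbe5ebd69"
    with tempfile.TemporaryDirectory() as tmp:
        git_dir = Path(tmp) / ".git"
        (git_dir / "refs" / "heads").mkdir(parents=True)
        # "Creating a branch" is writing a file with a SHA in it.
        (git_dir / "refs" / "heads" / "main").write_text(sha + "\n")
        (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
        on_branch = resolve_head(git_dir)
        # Writing a SHA straight into HEAD models pointing HEAD at a commit.
        (git_dir / "HEAD").write_text(sha + "\n")
        direct = resolve_head(git_dir)
    return on_branch, direct


print(demo())
```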
Which brings us to something important.
Deleting a Branch Doesn’t Delete Your Commits
When you delete a branch, you delete the pointer. You do not delete the objects in .git/objects/. The commits, trees, and blobs that branch pointed to are still in the database. They’re just unreachable by name.
You can prove it. Delete a branch, then run git cat-file -p on a commit SHA that was on it. It’s still there.
Git’s garbage collector (git gc) will eventually clean up truly unreachable objects, but by default it leaves anything newer than about two weeks alone. So if you accidentally delete a branch, git reflog can usually give you the SHA you need to bring it back.
Staging Already Writes to the Database
Most people think git add just “queues” a file for the next commit. It does more than that.
When you run git add, Git immediately:
- Hashes your file content
- Creates a blob object
- Writes it to .git/objects/
- Updates the index (staging area) to track it
The blob is already stored before you ever run git commit. The commit just formalizes it — creating the tree and commit objects that make the blob part of your permanent history.
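Those steps can be sketched in a few lines. This is a simplified model, not Git’s code: stage_file is a hypothetical name, and the index here is a plain dictionary from path to SHA, whereas the real index is a binary file that also records modes and file metadata.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def stage_file(git_dir: Path, index: dict, path: str, content: bytes) -> str:
    """Model of git add: hash the content, store the blob, record it in the index."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\x00" + content
    sha = hashlib.sha1(store).hexdigest()
    obj_path = git_dir / "objects" / sha[:2] / sha[2:]
    if not obj_path.exists():              # identical content is stored once
        obj_path.parent.mkdir(parents=True, exist_ok=True)
        obj_path.write_bytes(zlib.compress(store))
    index[path] = sha                      # simplified stand-in for .git/index
    return sha


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    index = {}
    sha = stage_file(git_dir, index, "README.md", b"test content\n")
    blob_exists = (git_dir / "objects" / sha[:2] / sha[2:]).exists()
    print(sha, blob_exists, index)
```

Note that no commit exists anywhere in this model yet, and the blob is already on disk.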
git reset Moves a Pointer. It Doesn’t Delete Data.
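git reset --hard <SHA> feels destructive. For committed work, it’s not, at least not immediately. (Uncommitted changes to tracked files are the exception: --hard overwrites them, and since they were never committed, the reflog can’t bring them back.)
What it actually does: moves the branch pointer to the specified commit, then resets the index and working directory to match. The commits you “reset away” are still in .git/objects/. They’re just unreachable from your current branch.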
Run git reflog after a reset and you’ll see every place HEAD has pointed. Your “lost” commits are listed there, recoverable by SHA.
git revert works differently. Instead of moving a pointer, it creates a brand new commit that applies the inverse of a previous commit’s changes. History is preserved. This is why git revert is safe on shared branches and git reset is not — reset rewrites history, revert adds to it.
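The difference is easier to see in a toy model. Everything here is invented for illustration: commits have short made-up IDs instead of SHAs, and each commit’s "change" is just a number, so the inverse of a change is its negation.

```python
class ToyRepo:
    """Toy model: commits live in a dict, the branch is a single pointer."""

    def __init__(self):
        self.commits = {}    # commit id -> {"parent": id or None, "delta": int}
        self.branch = None   # id of the commit the branch points at
        self.reflog = []     # every commit the pointer has pointed at

    def _move_branch(self, commit_id):
        self.branch = commit_id
        self.reflog.append(commit_id)

    def commit(self, commit_id, delta):
        self.commits[commit_id] = {"parent": self.branch, "delta": delta}
        self._move_branch(commit_id)

    def reset(self, commit_id):
        # Only the pointer moves. Nothing is removed from self.commits.
        self._move_branch(commit_id)

    def revert(self, new_id, target_id):
        # A new commit that applies the inverse of the target's change.
        self.commit(new_id, -self.commits[target_id]["delta"])

    def history(self):
        out, cur = [], self.branch
        while cur is not None:
            out.append(cur)
            cur = self.commits[cur]["parent"]
        return out

    def value(self):
        return sum(self.commits[c]["delta"] for c in self.history())


a = ToyRepo()
a.commit("c1", 1)
a.commit("c2", 5)
a.reset("c1")
print(a.history(), "c2" in a.commits, a.reflog)

b = ToyRepo()
b.commit("c1", 1)
b.commit("c2", 5)
b.revert("c3", "c2")
print(b.history(), b.value())
```

After the reset, c2 is gone from the history but still in the store and the reflog. After the revert, the history got longer, and the net content matches the state before c2.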
Merge Commits Have Two Parents
When you merge two branches that have diverged, Git compares three versions of each changed file: the common ancestor, your version, and their version. This is called a three-way merge.
If the same lines changed on both sides, Git writes conflict markers to your working directory and waits for you to resolve them. Once you do, git add creates a new blob. Then git commit creates a merge commit with two parent pointers, one for each branch tip. (An ordinary commit has one parent; an octopus merge of several branches can have more than two.)
That two-parent structure is exactly what the DAG looks like in the object database.
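The decision rule behind a three-way merge fits in a few lines. This sketch works on a whole file at a time, which is a deliberate simplification: Git applies the same logic to regions of lines within a file. merge_file and the marker labels are my own names.

```python
def merge_file(base: str, ours: str, theirs: str):
    """Return (result, conflict) for one file, compared as a whole."""
    if ours == theirs:
        return ours, False        # both sides agree (or neither changed)
    if theirs == base:
        return ours, False        # only our side changed
    if ours == base:
        return theirs, False      # only their side changed
    # Both sides changed differently: leave markers for a human to resolve.
    marked = (
        "<<<<<<< ours\n" + ours +
        "=======\n" + theirs +
        ">>>>>>> theirs\n"
    )
    return marked, True


print(merge_file("v1\n", "v2\n", "v1\n"))   # only ours changed
print(merge_file("v1\n", "v1\n", "v3\n"))   # only theirs changed
print(merge_file("v1\n", "v2\n", "v3\n"))   # conflict
```

The common ancestor is what makes this work: without it, "we changed it" and "they changed it" would look identical.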
Why This Mental Model Matters
You don’t need to know this stuff to use Git day-to-day. But once you do, a few things happen:
- Detached HEAD stops being scary. It just means HEAD is pointing directly at a commit SHA instead of a branch name.
- Merge conflicts make sense. Two commits changed the same lines. Git can’t pick, so it’s asking you to decide what the content should be.
- “Lost” commits stop feeling lost. The objects are almost always still there. git reflog is your safety net.
- You stop being afraid of git reset, git rebase, and other commands that feel dangerous.
- git log --graph stops being decorative. It’s a literal rendering of the DAG: every fork is a point where history diverged, every convergence is a merge commit.
Git isn’t magic. It’s a content-addressable filesystem with a great interface. A few text files, a database of hashed objects, and a set of rules for how they point to each other.
Open .git/ and look around. It’s less mysterious than you think.