DEV Community

SATYA SOOTAR

How Git Works Internally: Understanding the .git Folder

Many people use Git every day, yet only a few truly grasp what happens behind the scenes. This post aims to build that mental model.

If you truly understand the .git folder, Git stops feeling like magic and starts feeling predictable.

What Is the .git Folder?

The .git folder is the heart of a Git repository. It is the hidden directory that Git uses to store all the information necessary for managing your project's version control history. This folder is where Git keeps track of all the changes made to your files, allowing you to revert to previous versions, collaborate with others, and maintain a complete history of your project.

Why Does the .git Folder Exist?

Git needs a place to:

  1. Store snapshots of files
  2. Track relationships between commits
  3. Ensure data integrity using hashes
  4. Move between versions instantly

The .git folder exists so Git can do all of this locally, without relying on a server.

Structure of the .git Folder

When you run git init or clone a repository, Git creates the .git folder at the root of your project.

Git folder structure showing directories and files including HEAD, config, description, hooks, info, objects, and refs

Core Files Inside .git

HEAD

  1. HEAD is a very small but very important file.
  2. It tells Git where you are right now.

The HEAD file typically contains a single line:

ref: refs/heads/main

This means:

  • You are on the main branch
  • HEAD points to refs/heads/main

If you open .git/refs/heads/main, you will see a 40-character SHA-1 hash:

33a0481b440426f0268c613d036b820bc064cdea

That hash represents the latest commit on this branch.
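
To make the HEAD → branch → commit chain concrete, here is a minimal Python sketch that follows it by reading the files directly. The function name resolve_head is made up for this post, and the sketch assumes the branch exists as a loose file under refs/ (it does not look inside .git/packed-refs).

```python
from pathlib import Path


def resolve_head(git_dir):
    """Return (ref_name, commit_hash) for a repository's HEAD.

    ref_name is None when HEAD is detached, i.e. it holds a hash directly.
    """
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref_name = head[len("ref: "):]
        # Assumption: the ref is stored as a loose file. Refs that only
        # exist in .git/packed-refs, or a branch with no commits yet,
        # are not handled by this sketch.
        commit_hash = (git_dir / ref_name).read_text().strip()
        return ref_name, commit_hash
    return None, head
```

Calling resolve_head(".git") inside a repository that is on main would return the pair ("refs/heads/main", <hash of the latest commit>).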

config

The .git/config file is the repository-specific configuration file that controls Git's behavior for that particular repository. It contains settings that override both system-wide and global Git configurations.

Common sections include:

  • core: Core repository behavior
  • remote: Remote repository URLs
  • branch: Branch tracking rules
  • user: Local user identity (if set)

This is how Git knows where to push, pull, and how to behave for this repo only.

Git config file showing core settings, remote origin URL, and branch tracking configuration

description

This file is mostly used by GitWeb and similar tools.

Default content looks like this:

Unnamed repository; edit this file 'description' to name the repository.

For most developers, this file is rarely touched.

index

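The index file is Git's staging area. It is a binary file that lists every tracked path together with the hash of its staged blob, its file mode, and some file metadata (such as timestamps and size) that Git uses to detect changes quickly. In other words, it holds the snapshot that will become the next commit.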

Whenever you run:

git add

Git writes the file's current content into the object store and records the resulting blob hash for that path in the index, so the file is ready to be committed.
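
Because the index is binary, you cannot just open it in an editor, but its first 12 bytes are easy to decode: a 4-byte signature (DIRC), a 4-byte version number, and a 4-byte entry count, all big-endian. The sketch below reads only that header; the function name read_index_header is invented for illustration, and parsing the entries themselves is left out.

```python
import struct


def read_index_header(data):
    """Parse the 12-byte header at the start of a .git/index file.

    Returns (version, entry_count).
    """
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    return version, entry_count
```

In a real repository you could pass it the bytes of the file, for example read_index_header(open(".git/index", "rb").read()). For a human-readable view of the staged entries, git ls-files --stage prints them directly.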

Core Directories Inside .git

objects

The objects directory stores all actual data of your repository. Every file, directory, and commit is stored here as an immutable object.

Structure looks like this:

objects/
├── 10/
│   └── 93da429f08e0e54cdc2b31526159e745d98ce0
├── 33/
│   └── a0481b440426f0268c613d036b820bc064cdea
├── 9f/
│   └── 83ee7550919867e9219a75c23624c92ab5bd83
├── e6/
│   └── 9de29bb2d1d6434b8b29ae775ad8c2e48c5391
├── ...
├── info/
│   ├── packs
│   └── alternates
└── pack/
    ├── pack-<hash>.idx
    └── pack-<hash>.pack

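For loose objects, Git uses the first two characters of the 40-character hash as the folder name and the remaining 38 as the filename, which keeps any single directory from growing too large. Objects that have been packed live inside the files under pack/ instead.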
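
The naming rule is simple enough to write down in a few lines of Python. This is only a sketch of the loose-object layout; the helper name loose_object_path is invented here, and it assumes a 40-character SHA-1 object ID.

```python
from pathlib import PurePosixPath


def loose_object_path(object_id, git_dir=".git"):
    """Return where a loose object with this ID would live on disk."""
    if len(object_id) != 40 or any(c not in "0123456789abcdef" for c in object_id):
        raise ValueError("expected a 40-character lowercase hex SHA-1")
    # First two hex characters name the folder, the other 38 name the file.
    return PurePosixPath(git_dir) / "objects" / object_id[:2] / object_id[2:]


print(loose_object_path("33a0481b440426f0268c613d036b820bc064cdea"))
# .git/objects/33/a0481b440426f0268c613d036b820bc064cdea
```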

Git's Three Core Object Types

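Git has three core object types (a fourth type, the annotated tag object, also exists but is not needed to understand everyday history):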

  • Blob
  • Tree
  • Commit

Once you understand these three, most of Git's behavior becomes easy to reason about.

Blob

A blob stores the raw content of a file.

  • No filename
  • No directory info
  • No metadata

When you run:

git add file.txt

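Git does the following:

  • Prepends a short header (the object type and the content size) to the file content
  • Hashes the header plus content using SHA-1
  • Compresses the result with zlib and stores it as a blob in .git/objects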

The blob is identified only by its hash.
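
You can reproduce a blob's hash without Git at all. The following Python sketch builds the same bytes Git hashes for a blob (the header "blob <size>" followed by a NUL byte, then the content) and compresses them with zlib. The function name hash_blob is made up for this example.

```python
import hashlib
import zlib


def hash_blob(content):
    """Return (object_id, compressed_bytes) for a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    store = header + content
    object_id = hashlib.sha1(store).hexdigest()  # hash of header + content
    compressed = zlib.compress(store)            # what ends up on disk
    return object_id, compressed


object_id, compressed = hash_blob(b"")
print(object_id)
# e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

That output is the ID of the empty blob, which is why every empty file in every repository shares the same hash. Notice that the filename never enters the calculation.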

Tree

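A tree is a small object that represents a directory listing. Each entry names either a file, whose content is stored as a blob, or a subdirectory, which is stored as another tree.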

It contains:

  • File names
  • File modes (permissions and entry type)
  • References to blobs
  • References to other trees

Example:

$ git cat-file -p 9f83ee7550919867e9219a75c23624c92ab5bd83
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 .gitignore
100644 blob 665c637a360874ce43bf74018768a96d2d4d219a hello.py
040000 tree 24420a1530b1f4ec20ddb14c76df8c2c48f76a6 lib

Diagram:

Git tree structure diagram showing how a tree object references blobs for files and other trees for directories
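
Under the hood a tree is stored in a compact binary form rather than the text that git cat-file -p prints. Each entry is the mode, a space, the name, a NUL byte, and then the 20 raw bytes of the referenced object's hash. The sketch below serializes entries that way and hashes the result; tree_object_id is a name invented for this post, and the entries in the usage line are illustrative rather than taken from a real repository.

```python
import hashlib


def tree_object_id(entries):
    """Serialize a tree and return (object_id, body).

    entries: iterable of (mode, name, object_id_hex). Modes are written
    as Git stores them, e.g. "100644" for a file and "40000" for a
    directory (cat-file -p pads the latter to "040000").
    """
    def sort_key(entry):
        mode, name, _ = entry
        # Git sorts entries by name, treating a directory as if its
        # name ended with "/".
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for mode, name, object_id in sorted(entries, key=sort_key):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(object_id)
    store = b"tree " + str(len(body)).encode() + b"\0" + body
    return hashlib.sha1(store).hexdigest(), body


print(tree_object_id([])[0])
# 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The printed value is the well-known ID of the empty tree. A tree with one empty file could be built with tree_object_id([("100644", ".gitignore", "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391")]).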

Commit

A commit ties everything together.

It contains:

  • A reference to a tree
  • Parent commit(s)
  • Author and committer info
  • Commit message

Example:

$ git cat-file -p 1093da429f08e0e54cdc2b31526159e745d98ce0
tree 9f83ee7550919867e9219a75c23624c92ab5bd83
parent 33a0481b440426f0268c613d036b820bc064cdea
author satya <satya@example.com> 1706120622 -0500
committer satya <satya@example.com> 1706120622 -0500

add hello.py

A commit does not store files directly.
It stores a pointer to a tree, which points to blobs.

Git commit object structure showing commit pointing to tree, which points to blobs and other trees
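
A commit object is just text in the layout shown above, hashed with a "commit <size>" header in front. Here is a rough Python sketch that assembles one from its parts. The function name build_commit is hypothetical, and the sketch makes no claim to reproduce the hash in the example output, since that depends on the exact bytes of the real commit (real commits may also carry extra headers such as signatures).

```python
import hashlib


def build_commit(tree, parents, author, committer, message):
    """Return (object_id, body) for a commit assembled from these fields."""
    lines = ["tree " + tree]
    lines += ["parent " + parent for parent in parents]  # none for a root commit
    lines += ["author " + author, "committer " + committer, "", message]
    body = ("\n".join(lines) + "\n").encode()
    store = b"commit " + str(len(body)).encode() + b"\0" + body
    return hashlib.sha1(store).hexdigest(), body


identity = "satya <satya@example.com> 1706120622 -0500"
commit_id, body = build_commit(
    tree="9f83ee7550919867e9219a75c23624c92ab5bd83",
    parents=["33a0481b440426f0268c613d036b820bc064cdea"],
    author=identity,
    committer=identity,
    message="add hello.py",
)
print(body.decode())
```

Because the parent hash is part of the hashed text, changing any earlier commit would change the ID of every commit that comes after it.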

What Happens Internally During git add

Let's say you modify index.js and run:

git add index.js

Internal flow:

  • Git creates a blob for the file content
  • Git stores the blob in .git/objects
  • Git updates the index file

That's it.
No commit yet. No history yet.
You have only prepared the snapshot.
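
The flow above can be sketched in Python as follows. This is a simplified model, not Git's implementation: add_file is an invented name, and a plain dict stands in for the binary index (the real index also records file modes and stat data).

```python
import hashlib
import zlib
from pathlib import Path


def add_file(git_dir, index, path, content):
    """Store content as a loose blob and stage it under the given path."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    object_id = hashlib.sha1(store).hexdigest()
    object_file = Path(git_dir) / "objects" / object_id[:2] / object_id[2:]
    if not object_file.exists():
        # Identical content always hashes to the same ID, so an existing
        # object never needs to be rewritten.
        object_file.parent.mkdir(parents=True, exist_ok=True)
        object_file.write_bytes(zlib.compress(store))
    index[path] = object_id  # hypothetical stand-in for updating .git/index
    return object_id
```

Staging the same content twice, even under two different paths, produces only one object on disk; only the index gains a second entry.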

Mental Model:

git add means:

"Store this file safely and prepare it for the next commit."

What Happens Internally During git commit

When you run:

git commit -m "Initial commit"

Git does the following:

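  1. Creates tree objects from the staging area (one per directory)
  2. Creates a commit object that points to the root tree
  3. Links the commit to its parent (the commit HEAD currently resolves to)
  4. Moves the current branch pointer (for example refs/heads/main) to the new commit
  5. Leaves HEAD pointing at the branch, so HEAD now resolves to the new commit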

Now the history officially exists.
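
Here is a small in-memory model of those steps in Python. Everything in it is a simplification made for this post: the names put and commit are invented, dicts stand in for the object store, the refs, and HEAD, and the index is assumed to be flat (no subdirectories) with every file using mode 100644.

```python
import hashlib


def put(objects, kind, body):
    """Store an object in the dict and return its ID."""
    store = kind.encode() + b" " + str(len(body)).encode() + b"\0" + body
    object_id = hashlib.sha1(store).hexdigest()
    objects[object_id] = store
    return object_id


def commit(objects, refs, head, index, message, identity):
    # 1. Build a tree from the staged paths (flat index assumed).
    tree_body = b"".join(
        b"100644 " + path.encode() + b"\0" + bytes.fromhex(blob_id)
        for path, blob_id in sorted(index.items(), key=lambda item: item[0].encode())
    )
    tree_id = put(objects, "tree", tree_body)
    # 2 and 3. Create the commit and link it to the current branch tip.
    branch = head["ref"]
    parent = refs.get(branch)
    lines = ["tree " + tree_id]
    if parent:
        lines.append("parent " + parent)
    lines += ["author " + identity, "committer " + identity, "", message]
    commit_id = put(objects, "commit", ("\n".join(lines) + "\n").encode())
    # 4. Move the branch pointer. HEAD is untouched: it still names the branch.
    refs[branch] = commit_id
    return commit_id
```

With head = {"ref": "refs/heads/main"} and an empty refs dict, the first call creates a root commit with no parent line, and a second call records the first commit's ID as its parent.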

How Git Uses Hashes for Integrity

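Every Git object is identified by a hash of its own content: SHA-1 by default, with SHA-256 available as an opt-in object format in newer versions of Git.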

This means:

  • Any change to content changes the hash
  • Objects cannot be silently modified
  • History is tamper-evident

Git is not just version control.
It is a content-addressed database.
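
To see why tampering is detectable, consider a toy content-addressed store in Python: a dict whose keys are the SHA-1 of the stored bytes. This is a sketch of the idea only (in a real repository, git fsck performs this kind of check), and the names object_id and find_corrupt are invented for the example.

```python
import hashlib


def object_id(store):
    """The key under which these bytes must be stored."""
    return hashlib.sha1(store).hexdigest()


def find_corrupt(objects):
    """Return the IDs whose stored bytes no longer hash to their key."""
    return [oid for oid, store in objects.items() if object_id(store) != oid]


objects = {}
original = b"blob 6\0hello\n"
key = object_id(original)
objects[key] = original
print(find_corrupt(objects))  # []

objects[key] = b"blob 6\0hellO\n"  # silently change one byte
print(find_corrupt(objects) == [key])  # True
```

The store itself has no extra checksum field; the key is the checksum, which is what "content-addressed" means.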
