I have always loved Git. In this series of posts I am documenting the Git internals I have learned along the way.
In this post I am writing about Git objects. Simply put, an object is the storage unit of Git. All the files, commits, trees, and tags in a Git repository are stored as objects. If you are new to Git this may seem a bit cloudy at first. Don't worry, the weather will clear up as you go along.
Let's start off by creating a new Git repo to experiment with.
$ mkdir gitinternals && cd gitinternals # create and cd into a dir
$ git init # initialize git
Initialized empty Git repository in /Users/home/gitinternals/.git/
With the above commands you will have initialized a Git repository locally. This creates a .git folder inside your repository, which is where Git stores all its internal data related to this repository.
A quick tree -a on the current folder displays the files and folders created by Git.
$ tree -a
.
└── .git
    ├── HEAD
    ├── config
    ├── description
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── fsmonitor-watchman.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── pre-merge-commit.sample
    │   ├── pre-push.sample
    │   ├── pre-rebase.sample
    │   ├── pre-receive.sample
    │   ├── prepare-commit-msg.sample
    │   └── update.sample
    ├── info
    │   └── exclude
    ├── objects
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags
We are interested in the .git/objects directory, which is also referred to as Git's object database. Initially it has just the info and pack folders, with no objects yet.
Blob Objects
Git treats every physical file that a user works with in the repo, be it text, images, audio, or video, as a blob (Binary Large OBject).
Let's create a new file in the folder and check how Git reacts to it.
$ echo "# Hello world" > README.md
$ tree -a .git/objects/
.git/objects/
├── info
└── pack
Running tree on the objects directory reveals that there are no changes in the object storage. This is because Git has not started tracking the file yet. You can confirm this by running git status.
$ git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
README.md
nothing added to commit but untracked files present (use "git add" to track)
The README.md file is listed under the untracked files, which means that Git does not yet care what you do with it. The git status output also tells us how to make Git track the file.
$ git add README.md
$ tree -a .git/objects/
.git/objects/
├── 71
│   └── 6ed1421c738a75abe6e0c4812ad4aacee0e11a
├── info
└── pack
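At last we see some activity in the objects folder.
The string 716ed1421c738a75abe6e0c4812ad4aacee0e11a is the SHA-1 hash Git computed for README.md. Strictly speaking, Git hashes a short header (the object type and size) followed by the file contents, not the raw file alone.
The first two characters of the hash are used as the folder name and the remaining 38 characters as the file name.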
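The folder/file split is simple enough to sketch in a few lines of Python. This is only an illustration of the naming scheme described above, and the helper name loose_object_path is my own, not something Git provides.

```python
from pathlib import PurePosixPath


def loose_object_path(object_id: str, git_dir: str = ".git") -> PurePosixPath:
    """Return where a loose object with this full hash would be stored."""
    if len(object_id) != 40:
        raise ValueError("expected a full 40-character SHA-1 hex digest")
    # The first two hex characters name the folder, the other 38 the file.
    return PurePosixPath(git_dir) / "objects" / object_id[:2] / object_id[2:]


print(loose_object_path("716ed1421c738a75abe6e0c4812ad4aacee0e11a"))
# .git/objects/71/6ed1421c738a75abe6e0c4812ad4aacee0e11a
```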
We can verify the hash of the file with the hash-object plumbing command.
$ git hash-object README.md
716ed1421c738a75abe6e0c4812ad4aacee0e11a
We can also check the contents, type, and size of an object with the cat-file plumbing command.
$ git cat-file -p 716ed1421c738a75abe6e0c4812ad4aacee0e11a
# Hello world
$ git cat-file -p 716e # we need not type the whole hash; a prefix that uniquely identifies the object is enough
# Hello world
$ git cat-file -t 716e # checking type of object
blob
$ git cat-file -s 716e # checking size of object
14
Here we see that the type of the object is blob, which is expected for the file README.md.
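To get a feel for where cat-file finds the type and size, here is a rough Python sketch of reading a loose object. It assumes the usual loose-object layout: the file is zlib-compressed, and once inflated it starts with "<type> <size>", a NUL byte, and then the body. To keep the example runnable without a repository, it builds the stored bytes in memory instead of opening a file under .git/objects. parse_loose_object is a name I picked for illustration.

```python
import zlib


def parse_loose_object(compressed: bytes) -> tuple[str, int, bytes]:
    """Inflate a loose object and split it into (type, size, body)."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\0")
    obj_type, _, size_text = header.partition(b" ")
    size = int(size_text)
    if size != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type.decode("ascii"), size, body


# Stand-in for the bytes stored on disk for our README.md blob.
stored = zlib.compress(b"blob 14\0# Hello world\n")
obj_type, size, body = parse_loose_object(stored)
print(obj_type)               # blob
print(size)                   # 14
print(body.decode(), end="")  # # Hello world
```

Objects that Git has moved into pack files are stored differently, so this sketch only covers loose objects like the ones in this post.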
Commit Objects
Let's create a new commit in our repository and look at how the objects directory reacts to it.
$ git commit -m "Add readme file"
[master (root-commit) 0948529] Add readme file
1 file changed, 1 insertion(+)
create mode 100644 README.md
$ tree -a .git/objects/
.git/objects/
├── 09
│   └── 4852928af802dfe0f463359c7ade3f7a21fffa
├── 71
│   └── 6ed1421c738a75abe6e0c4812ad4aacee0e11a
├── a5
│   └── ef91ee14be786131cbecfd2eb8c7fef8a2510d
├── info
└── pack
We see two more objects, 0948 and a5ef. Let's check the types of these objects with the cat-file command.
$ git cat-file -t 0948
commit
$ git cat-file -t a5ef
tree
We see that 0948 is a commit object and a5ef is a tree object.
Let's check the contents of the commit object.
$ git cat-file -p 0948
tree a5ef91ee14be786131cbecfd2eb8c7fef8a2510d
author root <root@email.com> 1601801983 +0530
committer root <root@email.com> 1601801983 +0530
Add readme file
tree a5ef91ee14be786131cbecfd2eb8c7fef8a2510d
This is the tree that this commit points to. More on trees next :)
author root <root@email.com> 1601801983 +0530
committer root <root@email.com> 1601801983 +0530
The next two lines contain the author and committer info: a name, an email address, a Unix timestamp, and a timezone offset.
Add readme file
And then, separated by a blank line, we have the commit message.
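The layout above (header lines, a blank line, then the message) can be pulled apart with a small Python sketch. parse_commit is a hypothetical helper, not a Git API, and it is deliberately simplified: it does not handle multi-line headers such as the signature block of a signed commit.

```python
def parse_commit(text: str) -> tuple[dict[str, list[str]], str]:
    """Split commit text into header fields and the commit message."""
    # Headers and message are separated by the first blank line.
    head, _, message = text.partition("\n\n")
    headers: dict[str, list[str]] = {}
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        # A key may repeat (a merge commit has several parent lines).
        headers.setdefault(key, []).append(value)
    return headers, message


commit_text = (
    "tree a5ef91ee14be786131cbecfd2eb8c7fef8a2510d\n"
    "author root <root@email.com> 1601801983 +0530\n"
    "committer root <root@email.com> 1601801983 +0530\n"
    "\n"
    "Add readme file\n"
)
headers, message = parse_commit(commit_text)
print(headers["tree"][0])         # a5ef91ee14be786131cbecfd2eb8c7fef8a2510d
print(headers.get("parent", []))  # []
print(message.strip())            # Add readme file
```

The empty parent list is expected here: this is the root commit, so it has no parent line. Later commits would carry one or more parent headers.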
Tree Objects
Tree objects store a group of blob objects and other tree objects. Conceptually, it is easiest to think of a tree object as a snapshot of a folder.
Each folder can contain multiple files and folders. Likewise, a tree object can contain multiple blobs and trees.
Let's inspect the contents of the tree object a5ef.
$ git cat-file -p a5ef
100644 blob 716ed1421c738a75abe6e0c4812ad4aacee0e11a README.md
This tree object has only one entry, as we have added only one file to the repository.
Each entry in the tree has four parts:
Part | Description |
---|---|
100644 | file mode: 100 marks a regular file and 644 is its permission bits (Git records only 644 or 755 for regular files, not the full on-disk permissions) |
blob | type of the object |
716ed1421c738a75abe6e0c4812ad4aacee0e11a | object name (the hash of the blob) |
README.md | file name |
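What cat-file -p prints is a friendly rendering. On disk, as far as I understand the format, each tree entry is stored as the mode, a space, the file name, a NUL byte, and then the 20 raw bytes of the SHA-1 hash. The type column is not stored at all; it can be derived from the mode. Here is a minimal Python sketch under those assumptions, with parse_tree as a name I made up.

```python
def parse_tree(body: bytes) -> list[tuple[str, str, str]]:
    """Parse a raw tree body into (mode, name, object id) entries."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8")
        # The hash is stored as 20 raw bytes, not as 40 hex characters.
        object_id = body[nul + 1:nul + 21].hex()
        entries.append((mode, name, object_id))
        pos = nul + 21
    return entries


blob_id = "716ed1421c738a75abe6e0c4812ad4aacee0e11a"
# Rebuild the body of our one-entry tree in memory.
tree_body = b"100644 README.md\0" + bytes.fromhex(blob_id)
for mode, name, object_id in parse_tree(tree_body):
    # Directories use mode 40000 and submodules 160000; the rest are blobs.
    kind = {"40000": "tree", "160000": "commit"}.get(mode, "blob")
    print(mode, kind, object_id, name)
# 100644 blob 716ed1421c738a75abe6e0c4812ad4aacee0e11a README.md
```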
Tag Objects
Let's add a new tag to our current commit and observe how the objects directory reacts.
$ git tag -a v0.0.1 -m "my version 0.0.1"
$ tree -a .git/objects/
.git/objects/
├── 08
│   └── a57a8e9c4b340f5674b96652595bf6727b35bd
├── 09
│   └── 4852928af802dfe0f463359c7ade3f7a21fffa
├── 71
│   └── 6ed1421c738a75abe6e0c4812ad4aacee0e11a
├── a5
│   └── ef91ee14be786131cbecfd2eb8c7fef8a2510d
├── info
└── pack
$ git cat-file -p 08a5
object 094852928af802dfe0f463359c7ade3f7a21fffa
type commit
tag v0.0.1
tagger root <root@email.com> 1601811513 +0530
my version 0.0.1
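Upon checking the contents of the tag object 08a5, we see that it points to the current commit 0948 and carries the tag name, the tagger who created it, and the tag message. Note that only annotated tags (created with -a, as above) are stored as objects; a lightweight tag is just a ref that points at a commit.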
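The tag object follows the same headers-then-message layout as a commit, so a similar Python sketch can read it. parse_tag is again a hypothetical helper of mine, and it assumes every header line is a single "key value" pair, which holds for the unsigned tag shown above.

```python
def parse_tag(text: str) -> dict[str, str]:
    """Split tag object text into its header fields plus the message."""
    head, _, message = text.partition("\n\n")
    # Each header line is "key value"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in head.splitlines())
    fields["message"] = message
    return fields


tag_text = (
    "object 094852928af802dfe0f463359c7ade3f7a21fffa\n"
    "type commit\n"
    "tag v0.0.1\n"
    "tagger root <root@email.com> 1601811513 +0530\n"
    "\n"
    "my version 0.0.1\n"
)
tag = parse_tag(tag_text)
print(tag["type"], tag["object"][:4])  # commit 0948
print(tag["tag"])                      # v0.0.1
```

The object and type fields are what link the tag to the commit, so following them takes you from the tag to the commit, then to its tree, and finally to the blob.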
I hope this gave you some insight into the internal workings of Git. Follow me for more posts on Git internals :)