
git deep dive part 1: From initialize to first commit

Kenichiro Nakamura ・6 min read

Everyone uses git. It's a great tool for sure. However, even in 2020, I still hear that some people are not comfortable doing merge, cherry-pick, rebase or reset. The only way to be fully comfortable with these is to know exactly how git works.

If you are a video person, watch the following videos instead of reading my long blog series.

GOTO 2015 • Deep Dive into Git • Edward Thomson
Advanced GIT for Developers - Lorna Jane Mitchell - Laracon EU 2015

*I won't explain what git is in this article.

If you want to learn git with Visual Studio, please refer to the Git and Visual Studio 2017 series. Even though it's based on VS2017, it is still relevant.

Initialize with git init

Let's initialize a repository with the git init command.

mkdir gitDeepDive
cd gitDeepDive
git init

The init command creates a hidden ".git" folder:

.git
│  config
│  description
│  HEAD
├─hooks
│      applypatch-msg.sample
│      commit-msg.sample
│      fsmonitor-watchman.sample
│      post-update.sample
│      pre-applypatch.sample
│      pre-commit.sample
│      pre-merge-commit.sample
│      pre-push.sample
│      pre-rebase.sample
│      pre-receive.sample
│      prepare-commit-msg.sample
│      update.sample
├─info
│      exclude
├─objects
│  ├─info
│  └─pack
└─refs
    ├─heads
    └─tags

"HEAD" may sound familiar to many people. It is a "pointer" to the current commit, and here it gets there through the master branch reference.

gitDeepDive> cat .\.git\HEAD
ref: refs/heads/master

But the refs/heads folder is still empty, because I don't have any commits (yet).

Add files by git add

After creating files and folders, the first command we usually run is git add.

Create a file, git add and examine .git folder

Before trying git add, try git status to show current status.

>git status
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)

Once I add a file to the folder, git status shows it under Untracked files.

>echo 'hello git' > hello.txt

>git status
On branch master
No commits yet
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        hello.txt

nothing added to commit but untracked files present (use "git add" to track)

At this moment, nothing has been added or modified in the .git folder. Now I run git add . and the added files are categorized as "Changes to be committed".

gitDeepDive> git add .
gitDeepDive> git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   hello.txt

When I look at the .git folder, several files have been created. I will omit the hooks folder from now on as it just contains .sample files.

.git
│  config
│  description
│  HEAD
│  index
├─info
│      exclude
├─objects
│  ├─8d
│  │      0e41234f24b6da002d962a26c2495ea16a425f
│  ├─info
│  └─pack
└─refs
    ├─heads
    └─tags
  • index file was created
  • 8d folder was created in objects folder
  • 0e41234f24b6da002d962a26c2495ea16a425f file was created in 8d folder

See index and files in objects folder

These files are git binary files, and I need to use special commands to see the data.

index holds the staging information, which includes the list of files git tracks. To see the content, I use ls-files, which shows hello.txt. So git tracks this file.

gitDeepDive> git ls-files -v
H hello.txt
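
To get a feel for why index needs a special command, here is a minimal Python sketch that reads only the first 12 bytes of an index file: the "DIRC" signature, the format version, and the number of entries. The function name parse_index_header is my own for illustration, and the sample bytes are synthetic rather than copied from this repository.

```python
import struct


def parse_index_header(data: bytes) -> dict:
    """Read the 12-byte header found at the start of a .git/index file."""
    # Three big-endian fields: 4-byte signature, 4-byte version, 4-byte entry count
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a git index file")
    return {"version": version, "entries": entry_count}


# Synthetic header for illustration: version 2 with one entry
sample = struct.pack(">4sII", b"DIRC", 2, 1)
print(parse_index_header(sample))
```

To try it on a real repository, pass the bytes of .git/index instead of the synthetic sample. The entries that follow the header are more involved, so git ls-files remains the practical way to read them.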

The snapshot files are stored in the objects folder, and each file has its own id calculated by git. The first 2 characters become the folder name and the rest becomes the file name. I use cat-file to see the file.

I need to check the type of the file first. As you can see, the parameter is the combination of the folder name (8d) and the file name (0e412..).

gitDeepDive> git cat-file -t 8d0e41234f24b6da002d962a26c2495ea16a425f
blob

The command says it's a blob, so I can simply see the content by running the following command. I can see "hello git" as its content, as expected.

gitDeepDive> git cat-file blob 8d0e41234f24b6da002d962a26c2495ea16a425f
hello git
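
The id itself is not magic. As a rough sketch in Python, assuming the repository uses the default SHA-1 object format, a blob id is the SHA-1 of a small header ("blob", the byte length, a NUL byte) followed by the content. The helper names git_blob_id and loose_object_path are mine, not git's. I also assume git stored the file as "hello git" plus a single LF; different encoding or line-ending settings would change the bytes and therefore the id.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the id git would assign to a blob with this content (SHA-1 repos)."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Split an id into the 2-character folder name and the remaining file name."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


# Assumption: the stored content is exactly these bytes (LF line ending)
object_id = git_blob_id(b"hello git\n")
print(object_id)
print(loose_object_path(object_id))
```

If the assumptions hold, the printed id should be the same one cat-file accepted above, and the path should point at the file in the 8d folder.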

Modify the file

Before trying git commit, let's modify the current content.

Modify the file and see status

Run the following commands. Now I can see two entries for the same hello.txt file.

gitDeepDive> echo 'The second line' >> hello.txt
gitDeepDive> git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   hello.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   hello.txt

But there is no change in the .git directory. We already know that git took a snapshot of hello.txt when I ran git add before, so now it can detect that the same file has been modified.

I can see the change by running the command below. You see the line 8d0e412..20fe8be, which means it compares 8d0e412, the snapshot, with 20fe8be, the modified file. Please note that git hasn't created a snapshot for the modified file yet; it just calculates the id on the fly.

gitDeepDive> git diff .\hello.txt
diff --git a/hello.txt b/hello.txt
index 8d0e412..20fe8be 100644
--- a/hello.txt
+++ b/hello.txt
@@ -1 +1,2 @@
 hello git
+The second line

Create another snapshot

Okay, run the add command again to see what git does. The result looks the same as before.

gitDeepDive> git add .
gitDeepDive> git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   hello.txt

However, it modifies files in the .git folder. I already know the file id will be 20fe8be, as I haven't changed the file since I ran git diff.

.git
│  config
│  description
│  HEAD
│  index
├─objects
│  ├─20
│  │      fe8be9820a49252e2a4dd37a60e678cd5cda14
│  ├─8d
│  │      0e41234f24b6da002d962a26c2495ea16a425f
│  ├─info
│  └─pack
└─refs
    ├─heads
    └─tags
  • updated index
  • Added 20 folder in objects
  • Added fe8be9820a49252e2a4dd37a60e678cd5cda14 file in 20 folder

Let's check them.

gitDeepDive> git cat-file -t 20fe8be9820a49252e2a4dd37a60e678cd5cda14
blob
gitDeepDive> git cat-file blob 20fe8be9820a49252e2a4dd37a60e678cd5cda14
hello git
The second line
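
For the curious, the files under objects are not readable as plain text because git compresses them with zlib. The sketch below, in Python, shows what cat-file roughly has to undo for a loose object: decompress, split off the "type size" header at the first NUL byte, and return the rest. The function name read_loose_object is hypothetical, and the sample bytes are built in memory instead of being read from this repository.

```python
import zlib


def read_loose_object(raw: bytes):
    """Decompress a loose object and return (type, body)."""
    data = zlib.decompress(raw)
    # The header looks like b"blob 26" and ends at the first NUL byte
    header, _, body = data.partition(b"\0")
    kind, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return kind, body


# Build the stored form in memory (assumed content with LF line endings)
content = b"hello git\nThe second line\n"
stored = zlib.compress(b"blob %d\0" % len(content) + content)

kind, body = read_loose_object(stored)
print(kind)
print(body.decode(), end="")
```

Against a real repository, the raw bytes would come from opening a file such as the one in the 20 folder in binary mode. Packed objects are stored differently, so this only applies to loose objects like the ones in this article.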

Commit the file

Finally, let's commit the file to master and see what happens.

What git does on commit

Run the following command to commit the file.

gitDeepDive> git commit -m "commit hello.txt"
[master (root-commit) 16f1fa8] commit hello.txt
 1 file changed, 2 insertions(+)
 create mode 100644 hello.txt
gitDeepDive> git status
On branch master
nothing to commit, working tree clean

There are a bunch of changes in the .git folder.

.git
│  COMMIT_EDITMSG
│  config
│  description
│  HEAD
│  index
├─info
│      exclude
├─logs
│  │  HEAD
│  │
│  └─refs
│      └─heads
│              master
├─objects
│  ├─16
│  │      f1fa822d53d12329e9a68c7463c5697bddc7d1
│  ├─20
│  │      fe8be9820a49252e2a4dd37a60e678cd5cda14
│  ├─8d
│  │      0e41234f24b6da002d962a26c2495ea16a425f
│  ├─a2
│  │      cf761ea993127a4aae5762806441cc18d730f5
│  ├─info
│  └─pack
└─refs
    ├─heads
    │      master
    └─tags
  • index was updated
  • COMMIT_EDITMSG was created
  • logs folder was created with child folders and files
  • f1fa822d53d12329e9a68c7463c5697bddc7d1 was created in 16 folder
  • cf761ea993127a4aae5762806441cc18d730f5 was created in a2 folder
  • heads\master was created in refs folder

That's a lot!! Let's examine one by one.

commit file

We all know that git commit creates a commit. But what is a commit? In the end, it's just another file. Run the following commands to check the file under the objects folder. As I listed above, two files were created, so I just picked one of them first and figured out it was a commit.

gitDeepDive> git cat-file -t 16f1fa822d53d12329e9a68c7463c5697bddc7d1
commit
gitDeepDive> git cat-file commit 16f1fa822d53d12329e9a68c7463c5697bddc7d1
tree a2cf761ea993127a4aae5762806441cc18d730f5
author Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900
committer Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900

commit hello.txt

The commit file contains the tree, author, committer and commit message. The tree line has the id of the other file that was created. Let's check it. I use cat-file to see the type, and ls-tree to see the content of the tree. (I can use cat-file with the -p parameter, too.)

A tree is a folder or directory. When I look at the tree, it has a blob entry in it, which is a pointer to my modified hello.txt.

gitDeepDive> git cat-file -t a2cf761ea993127a4aae5762806441cc18d730f5
tree
gitDeepDive> git ls-tree a2cf761ea993127a4aae5762806441cc18d730f5
100644 blob 20fe8be9820a49252e2a4dd37a60e678cd5cda14    hello.txt
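
A tree gets its id the same way a blob does. As a hedged Python sketch, assuming a SHA-1 repository: each entry is stored as the mode, a space, the file name, a NUL byte, and then the 20 raw bytes of the object id (not the 40 hex characters). The whole body is prefixed with a "tree size" header and hashed. The function name git_tree_id is mine, and it expects the entries to be already sorted by name.

```python
import hashlib


def git_tree_id(entries: list) -> str:
    """Compute a tree id from (mode, name, object_id_hex) tuples sorted by name."""
    body = b""
    for mode, name, object_id in entries:
        # mode SP name NUL, followed by the binary form of the object id
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(object_id)
    header = b"tree %d\0" % len(body)
    return hashlib.sha1(header + body).hexdigest()


# The single entry that ls-tree printed above
entries = [("100644", "hello.txt", "20fe8be9820a49252e2a4dd37a60e678cd5cda14")]
print(git_tree_id(entries))
```

If the format assumptions are right, the printed value should be the tree id recorded in the commit file. This also shows why the tree changes whenever a file inside it changes: the blob id is part of the hashed bytes.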

Let's run the log command to see the commit history. I can see that the information displayed comes from the commit file.

gitDeepDive> git log
commit 16f1fa822d53d12329e9a68c7463c5697bddc7d1 (HEAD -> master)
Author: Kenichiro Nakamura <kenakamu@microsoft.com>
Date:   Thu May 7 00:55:16 2020 +0900

    commit hello.txt
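
To make the link between the commit file and the log output concrete, here is a small Python sketch that parses the commit text shown earlier and converts the author timestamp into a date. The function names parse_commit and parse_when are my own. The parser is deliberately simple: it keeps one value per header name, which is enough for this first commit.

```python
from datetime import datetime, timedelta, timezone


def parse_commit(text: str) -> dict:
    """Split commit text into header fields and the message."""
    head, _, message = text.partition("\n\n")
    fields = {}
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value  # a repeated key would overwrite the earlier value
    fields["message"] = message.strip()
    return fields


def parse_when(identity: str) -> datetime:
    """Turn the trailing '<seconds> <+hhmm>' of an author line into a datetime."""
    seconds, offset = identity.rsplit(" ", 2)[1:]
    sign = -1 if offset[0] == "-" else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    return datetime.fromtimestamp(int(seconds), timezone(sign * delta))


commit_text = """tree a2cf761ea993127a4aae5762806441cc18d730f5
author Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900
committer Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900

commit hello.txt
"""
commit = parse_commit(commit_text)
print(commit["tree"])
print(parse_when(commit["author"]).isoformat())  # 2020-05-07T00:55:16+09:00
```

The converted time is the same moment git log shows as Thu May 7 00:55:16 2020 +0900, so the Date line is just the number stored in the commit file, formatted for display.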

References

I see a bunch of other files were created in addition to the object files. Most of them are just text files.

Start from the refs folder. Do you remember there was nothing in this folder before? Now it has the master file, which the HEAD file in the root points to. It simply contains the commit id. So this is the pointer to the current head of the master branch.

gitDeepDive> cat .\.git\refs\heads\master
16f1fa822d53d12329e9a68c7463c5697bddc7d1
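
Following the pointers by hand takes two reads: HEAD names a ref file, and the ref file holds the commit id. Below is a minimal Python sketch of that lookup. The function name resolve_head is hypothetical, and it only handles the simple layout seen in this article. It assumes the ref exists as a loose file and ignores cases such as packed refs or a repository without commits. The example builds a throwaway folder with the same two files, so it runs anywhere.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: str) -> str:
    """Follow HEAD to the commit id it currently points to."""
    head = (Path(git_dir) / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # HEAD names a branch file such as refs/heads/master
        ref_file = Path(git_dir) / head[len("ref: "):]
        return ref_file.read_text().strip()
    return head  # otherwise HEAD holds a commit id directly


# Recreate the two files from this article in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    (git_dir / "refs" / "heads" / "master").write_text(
        "16f1fa822d53d12329e9a68c7463c5697bddc7d1\n"
    )
    print(resolve_head(str(git_dir)))
```

In a real repository, git rev-parse HEAD is the reliable way to get the same answer, since it handles all the cases this sketch skips.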

The logs folder has two files, and both contain the same log information. I will come back to see how this changes when I do another commit.

gitDeepDive> cat .\.git\logs\HEAD
0000000000000000000000000000000000000000 16f1fa822d53d12329e9a68c7463c5697bddc7d1 Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900    commit (initial): commit hello.txt
gitDeepDive> cat .\.git\logs\refs\heads\master
0000000000000000000000000000000000000000 16f1fa822d53d12329e9a68c7463c5697bddc7d1 Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900    commit (initial): commit hello.txt
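
Each log line has a fixed shape: the previous id, the new id, who made the change, a timestamp with time zone, and then a message. Here is a short Python sketch that splits one line into those parts. The function name parse_reflog_line is mine. I assume the message is separated from the rest by a tab character, which appears as a wide gap in the output above.

```python
def parse_reflog_line(line: str) -> dict:
    """Split one line of a file under .git/logs into its fields."""
    # Assumption: a tab separates the metadata from the message
    meta, _, message = line.rstrip("\n").partition("\t")
    old_id, new_id, rest = meta.split(" ", 2)
    identity, timestamp, tz = rest.rsplit(" ", 2)
    return {
        "old": old_id,
        "new": new_id,
        "who": identity,
        "timestamp": int(timestamp),
        "tz": tz,
        "message": message,
    }


line = (
    "0000000000000000000000000000000000000000 "
    "16f1fa822d53d12329e9a68c7463c5697bddc7d1 "
    "Kenichiro Nakamura <kenakamu@microsoft.com> 1588780516 +0900"
    "\tcommit (initial): commit hello.txt\n"
)
entry = parse_reflog_line(line)
print(entry["old"], "->", entry["new"])
print(entry["message"])
```

The all-zero old id marks that there was no previous commit, which fits the "commit (initial)" message.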

The last one, COMMIT_EDITMSG, is easy to guess. It contains the commit message.

gitDeepDive> cat .\.git\COMMIT_EDITMSG
commit hello.txt

Summary

I think that's quite enough for the first article. In the next article, I will make several more commits to see how git handles them.

Go to next article
