Everyone uses git. It's a great tool for sure. However, even in 2020, I still hear that some people are not comfortable doing merge, cherry-pick, rebase or reset. The only way to become fully comfortable with these is to know exactly how git works.
If you prefer video, watch the following video instead of reading my long blog series.
* I won't explain what git is in this article.
If you want to learn git with Visual Studio, please refer to the Git and Visual Studio 2017 series. Even though it's based on VS2017, it is still relevant.
Let's initialize a git repository with the git init command.

    mkdir gitDeepDive
    cd gitDeepDive
    git init
The init command creates a hidden ".git" folder:

    .git
    │  config
    │  description
    │  HEAD
    ├─hooks
    │      applypatch-msg.sample
    │      commit-msg.sample
    │      fsmonitor-watchman.sample
    │      post-update.sample
    │      pre-applypatch.sample
    │      pre-commit.sample
    │      pre-merge-commit.sample
    │      pre-push.sample
    │      pre-rebase.sample
    │      pre-receive.sample
    │      prepare-commit-msg.sample
    │      update.sample
    ├─info
    │      exclude
    ├─objects
    │  ├─info
    │  └─pack
    └─refs
        ├─heads
        └─tags

"HEAD" may sound familiar to many people. It is a "pointer" to the current commit. Here it does so indirectly, by naming the current branch.
    gitDeepDive> cat .\.git\HEAD
    ref: refs/heads/master
But the refs\heads folder is still empty, so there is nothing for it to point to (yet).
The first command we may use is git add, after creating files and folders. Before git add, try git status to show the current status.

    >git status
    On branch master
    No commits yet
    nothing to commit (create/copy files and use "git add" to track)
Once I add a file to the folder, git status shows it under Untracked files.

    >echo 'hello git' > hello.txt
    >git status
    On branch master
    No commits yet
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            hello.txt
    nothing added to commit but untracked files present (use "git add" to track)
At this moment, nothing has been added or modified in the .git folder. Now I run git add . and the added file is categorized as "Changes to be committed".

    gitDeepDive> git add .
    gitDeepDive> git status
    On branch master
    No commits yet
    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   hello.txt
When I look at the .git folder, several files have been created. I will omit the hooks folder from now on, as it just contains .sample files.

    .git
    │  config
    │  description
    │  HEAD
    │  index
    ├─info
    │      exclude
    ├─objects
    │  ├─8d
    │  │      0e41234f24b6da002d962a26c2495ea16a425f
    │  ├─info
    │  └─pack
    └─refs
        ├─heads
        └─tags
- index file was created
- 8d folder was created in objects folder
- 0e41234f24b6da002d962a26c2495ea16a425f file was created in 8d folder
These are binary files in git's own format, so I need special commands to see the data.
The index holds the list of tracked files and what is staged for the next commit. To see the content, I use ls-files, which shows hello.txt. So git tracks this file.
    gitDeepDive> git ls-files -v
    H hello.txt
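To give a feel for what sits inside the binary index file, here is a minimal Python sketch that reads only its 12-byte header. It assumes the documented index layout (a DIRC signature, a version number, and an entry count, all big-endian). The function name read_index_header is my own, not a git API, and the sample bytes are built by hand rather than read from a real repository.

```python
import struct

def read_index_header(data: bytes) -> tuple[int, int]:
    """Return (version, entry_count) from the first 12 bytes of a git index file."""
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a git index file")
    return version, entry_count

# Hand-built header: version 2 with one entry, standing in for a real index.
sample = b"DIRC" + struct.pack(">II", 2, 1)
print(read_index_header(sample))
```

Against a real repository, the same function could be fed the bytes of .git/index. The entries that follow the header are more involved and are left out of this sketch.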
The snapshot files are stored in the objects folder, and each file has its own id, which git calculates as a hash of the content. The first 2 letters become the folder name and the rest becomes the file name. I use cat-file to see the file.
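As a rough illustration of how such an id comes about, the Python sketch below hashes a small header plus the content with SHA-1, which is how git derives blob ids in SHA-1 repositories. The helper names blob_id and object_path are mine. I do not try to reproduce the 8d0e412 id here, because the result depends on the exact bytes of hello.txt (encoding and line endings), which vary by shell.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Compute the id git gives a blob with this content (SHA-1 repositories)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def object_path(object_id: str) -> str:
    """Split an id into the folder name (first 2 letters) and the file name (the rest)."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"

print(blob_id(b""))  # the id of an empty file
print(object_path("8d0e41234f24b6da002d962a26c2495ea16a425f"))
```

Because the id depends only on the content, two files with identical bytes share one object, whatever their names are.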
I need to check the type of the file first. As you can see, the parameter is the combination of the folder name (8d) and the file name (0e412..).
    gitDeepDive> git cat-file -t 8d0e41234f24b6da002d962a26c2495ea16a425f
    blob
The command says it's a blob, so I can see the content by running the following command. As expected, the content is "hello git".
    gitDeepDive> git cat-file blob 8d0e41234f24b6da002d962a26c2495ea16a425f
    hello git
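What cat-file does for a loose object can be approximated in a few lines of Python. This is a sketch, assuming the usual loose-object layout: zlib-compressed data made of a "type size" header, a NUL byte, and then the content. The name parse_loose_object is hypothetical, and the object is built in memory instead of being read from .git/objects.

```python
import zlib

def parse_loose_object(raw: bytes) -> tuple[str, bytes]:
    """Decompress a loose object and split it into (type, content)."""
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode(), content

# Stand-in for a file under .git/objects: built in memory, not read from disk.
stored = zlib.compress(b"blob 10\0hello git\n")
print(parse_loose_object(stored))
```

The first element of the result corresponds to what cat-file -t prints, and the second to what cat-file blob prints.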
Before git commit, let's modify the current content.
Run the following commands. Now git status shows two entries for the same hello.txt file.

    gitDeepDive> echo 'The second line' >> hello.txt
    gitDeepDive> git status
    On branch master
    No commits yet
    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   hello.txt
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   hello.txt
But there is no change in the .git directory. We already know that git took a snapshot of hello.txt when I ran git add before, so now it can detect that the same file has been modified.
I can see the difference by running the command below. Look at the line with 8d0e412..20fe8be: git compares 8d0e412, which is the snapshot, with 20fe8be, which is the modified file. Please note that git hasn't created a snapshot for the modified file yet; it just calculates the id on the fly.
    gitDeepDive> git diff .\hello.txt
    diff --git a/hello.txt b/hello.txt
    index 8d0e412..20fe8be 100644
    --- a/hello.txt
    +++ b/hello.txt
    @@ -1 +1,2 @@
     hello git
    +The second line
Okay, let's run the add command again to see what git does. The status looks the same as before.

    gitDeepDive> git add .
    gitDeepDive> git status
    On branch master
    No commits yet
    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   hello.txt
However, this time it modifies files in the .git folder. I already know the file id will be 20fe8be, as I haven't changed the file since I ran git diff.
    .git
    │  config
    │  description
    │  HEAD
    │  index
    ├─objects
    │  ├─20
    │  │      fe8be9820a49252e2a4dd37a60e678cd5cda14
    │  ├─8d
    │  │      0e41234f24b6da002d962a26c2495ea16a425f
    │  ├─info
    │  └─pack
    └─refs
        ├─heads
        └─tags
- updated index
- Added 20 folder in objects
- Added fe8be9820a49252e2a4dd37a60e678cd5cda14 file in 20 folder
Let's check them.
    gitDeepDive> git cat-file -t 20fe8be9820a49252e2a4dd37a60e678cd5cda14
    blob
    gitDeepDive> git cat-file blob 20fe8be9820a49252e2a4dd37a60e678cd5cda14
    hello git
    The second line
Finally, let's commit the file to master and see what happens.
Run the following command to commit the file.
    gitDeepDive> git commit -m "commit hello.txt"
    [master (root-commit) 16f1fa8] commit hello.txt
     1 file changed, 2 insertions(+)
     create mode 100644 hello.txt
    gitDeepDive> git status
    On branch master
    nothing to commit, working tree clean
There are a bunch of changes in the .git folder.

    .git
    │  COMMIT_EDITMSG
    │  config
    │  description
    │  HEAD
    │  index
    ├─info
    │      exclude
    ├─logs
    │  │  HEAD
    │  │
    │  └─refs
    │      └─heads
    │              master
    ├─objects
    │  ├─16
    │  │      f1fa822d53d12329e9a68c7463c5697bddc7d1
    │  ├─20
    │  │      fe8be9820a49252e2a4dd37a60e678cd5cda14
    │  ├─8d
    │  │      0e41234f24b6da002d962a26c2495ea16a425f
    │  ├─a2
    │  │      cf761ea993127a4aae5762806441cc18d730f5
    │  ├─info
    │  └─pack
    └─refs
        ├─heads
        │      master
        └─tags
- index was updated
- COMMIT_EDITMSG was created
- logs folder was created with child folders and files
- f1fa822d53d12329e9a68c7463c5697bddc7d1 was created in 16 folder
- cf761ea993127a4aae5762806441cc18d730f5 was created in a2 folder
- heads\master was created in refs folder
That's a lot!! Let's examine them one by one.
We all know that git commit creates a commit. But what is a commit? In the end, it's just another file. Run the following commands to check the files under the objects folder. As I listed above, two files were created, so I picked one of them first and found out it was a commit.
    gitDeepDive> git cat-file -t 16f1fa822d53d12329e9a68c7463c5697bddc7d1
    commit
    gitDeepDive> git cat-file commit 16f1fa822d53d12329e9a68c7463c5697bddc7d1
    tree a2cf761ea993127a4aae5762806441cc18d730f5
    author Kenichiro Nakamura <email@example.com> 1588780516 +0900
    committer Kenichiro Nakamura <firstname.lastname@example.org> 1588780516 +0900

    commit hello.txt
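Since a commit is plain text once decompressed, it is easy to pull apart. The Python sketch below splits the text into header fields and the message, relying on the blank line that separates the two. The name parse_commit is my own, the identity in the sample is a placeholder, and the sketch keeps only the last value per key, so it would not handle a merge commit with several parent lines.

```python
def parse_commit(text: str) -> tuple[dict[str, str], str]:
    """Split commit object text into header fields and the commit message."""
    head, _, message = text.partition("\n\n")
    fields = {}
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields, message.strip()

# Placeholder identity; the tree id and message are the ones from this article.
sample = (
    "tree a2cf761ea993127a4aae5762806441cc18d730f5\n"
    "author A U Thor <author@example.com> 1588780516 +0900\n"
    "committer A U Thor <author@example.com> 1588780516 +0900\n"
    "\n"
    "commit hello.txt\n"
)
fields, message = parse_commit(sample)
print(fields["tree"], "|", message)
```

Note that this first commit has no parent field, which is why git reports it as a root commit.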
The commit file contains the tree, author, committer and commit message. The tree line has the id of the other file that was created. Let's check it. I use cat-file to see the type, but I use ls-tree to see the content of the tree. (I can use cat-file with the -p parameter, too.)
A tree is a folder or directory. When I look at the tree, it has a blob entry in it, which is a pointer to my modified hello.txt.
    gitDeepDive> git cat-file -t a2cf761ea993127a4aae5762806441cc18d730f5
    tree
    gitDeepDive> git ls-tree a2cf761ea993127a4aae5762806441cc18d730f5
    100644 blob 20fe8be9820a49252e2a4dd37a60e678cd5cda14    hello.txt
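Unlike a commit, a tree is stored in a binary form, which is why ls-tree is needed to read it. The sketch below, in Python, encodes and decodes entries under the assumption that each one is laid out as "mode name", a NUL byte, and then the id as 20 raw bytes (SHA-1 repositories). Both function names are hypothetical, and I do not claim the resulting bytes hash to a2cf761, since I have not verified that.

```python
def encode_tree_entry(mode: str, name: str, object_id: str) -> bytes:
    """Encode one tree entry: '<mode> <name>', a NUL byte, then the raw 20-byte id."""
    return f"{mode} {name}".encode() + b"\0" + bytes.fromhex(object_id)

def decode_tree(body: bytes) -> list[tuple[str, str, str]]:
    """Decode a tree object's body into (mode, name, id) tuples."""
    entries = []
    pos = 0
    while pos < len(body):
        nul = body.index(b"\0", pos)  # end of the '<mode> <name>' text
        mode, _, name = body[pos:nul].decode().partition(" ")
        object_id = body[nul + 1:nul + 21].hex()
        entries.append((mode, name, object_id))
        pos = nul + 21  # skip the NUL and the 20 id bytes
    return entries

body = encode_tree_entry("100644", "hello.txt", "20fe8be9820a49252e2a4dd37a60e678cd5cda14")
print(decode_tree(body))
```

The decoded tuple carries the same mode, id, and file name that ls-tree shows for hello.txt.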
Let's run the log command to see the commit history. The information displayed comes from the commit file.
    gitDeepDive> git log
    commit 16f1fa822d53d12329e9a68c7463c5697bddc7d1 (HEAD -> master)
    Author: Kenichiro Nakamura <email@example.com>
    Date:   Thu May 7 00:55:16 2020 +0900

        commit hello.txt
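The Date line is derived from the two numbers at the end of the author line in the commit file: seconds since the Unix epoch and a timezone offset. A small Python sketch of that conversion follows; commit_time is a name I made up for illustration.

```python
from datetime import datetime, timedelta, timezone

def commit_time(timestamp: int, offset: str) -> datetime:
    """Turn '<seconds since epoch> <+hhmm>' from a commit into an aware datetime."""
    sign = -1 if offset[0] == "-" else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    return datetime.fromtimestamp(timestamp, tz=timezone(sign * delta))

print(commit_time(1588780516, "+0900").isoformat())  # 2020-05-07T00:55:16+09:00
```

The result matches the Thu May 7 00:55:16 2020 +0900 shown by git log.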
I see a bunch of other files were created in addition to the files under objects. Most of them are just text files.
Let's start with the refs folder. Do you remember there was nothing in this folder before? Now it has a master file, which is the file that HEAD at the root of .git points to. It simply contains a commit id. So this is the pointer to the current head of the master branch.
    gitDeepDive> cat .\.git\refs\heads\master
    16f1fa822d53d12329e9a68c7463c5697bddc7d1
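Putting HEAD and refs\heads\master together, finding the current commit is a two-step file lookup. Here is a minimal Python sketch of that lookup. The function resolve_head is hypothetical, it only handles refs stored as individual files (not packed refs), and it runs against a throwaway folder that recreates the two files from this article.

```python
import tempfile
from pathlib import Path

def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to a commit id, handling only refs stored as individual files."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # HEAD names a branch; the branch file holds the commit id.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # otherwise HEAD holds the commit id itself

# Recreate the two files from this article in a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    (git_dir / "refs" / "heads" / "master").write_text(
        "16f1fa822d53d12329e9a68c7463c5697bddc7d1\n"
    )
    resolved = resolve_head(git_dir)

print(resolved)
```

Pointing the function at a real .git folder should give the same id that git log reports for HEAD, as long as the branch has not been packed.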
The logs folder has two files, logs\HEAD and logs\refs\heads\master, and at this point both contain the same log entry. I will come back to see how they change when I do another commit.
    gitDeepDive> cat .\.git\logs\HEAD
    0000000000000000000000000000000000000000 16f1fa822d53d12329e9a68c7463c5697bddc7d1 Kenichiro Nakamura <firstname.lastname@example.org> 1588780516 +0900 commit (initial): commit hello.txt
    gitDeepDive> cat .\.git\logs\refs\heads\master
    0000000000000000000000000000000000000000 16f1fa822d53d12329e9a68c7463c5697bddc7d1 Kenichiro Nakamura <email@example.com> 1588780516 +0900 commit (initial): commit hello.txt
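Each log line records a move of the pointer: the id before, the id after, who did it and when, and a short message. The Python sketch below splits one line into those parts. It assumes a tab separates the identity part from the message in the file itself; parse_reflog_line is my own name, and the identity in the sample is a placeholder.

```python
def parse_reflog_line(line: str) -> dict[str, str]:
    """Split one log line into old id, new id, who/when, and the message."""
    # In the file, a tab separates the identity part from the message.
    left, _, message = line.rstrip("\n").partition("\t")
    old_id, new_id, identity = left.split(" ", 2)
    return {"old": old_id, "new": new_id, "identity": identity, "message": message}

# Placeholder identity; ids, timestamp, and message are from this article.
line = (
    "0000000000000000000000000000000000000000 "
    "16f1fa822d53d12329e9a68c7463c5697bddc7d1 "
    "A U Thor <author@example.com> 1588780516 +0900"
    "\tcommit (initial): commit hello.txt\n"
)
entry = parse_reflog_line(line)
print(entry["old"] == "0" * 40, entry["message"])
```

The all-zero old id reflects that the branch pointed to nothing before this first commit.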
The last one, COMMIT_EDITMSG, is easy to guess. It contains the commit message.
    gitDeepDive> cat .\.git\COMMIT_EDITMSG
    commit hello.txt
I think that's quite enough for the first article. I will do several other commits to see how git controls them in the next article.