This post is taken from my blog, so be sure to check it out for more up-to-date content.
Git is a Version Control System (VCS) known to practically every programmer. It makes managing your projects, files, and the changes made to them much easier and more intuitive. But, as it's a big and complex system, people new to programming can have a hard time learning it. Setting up the repo is the easy part - it's what comes after that can feel a bit tricky. That's why in today's post, we're going to go over the basic concepts and functionalities of Git from the ground up! We'll explore everything you need to know to make good use of Git on a daily basis. Let's get started!
Version Control System
Let's start by understanding the most basic concept behind Git, something that is often taken for granted - the Version Control System (VCS). What is it, and why is Git just one of its many implementations? To explain it with an example - let's say that you're creating a project with only one file, for simplicity. In such a scenario, version control helps you document and store the changes you make to your file over time. Every time you change something in your file and want to record it, you commit a change. A VCS also lets you add comments to your changes, to keep things organized and to know later which state you'd like to return your file to.
So, described changes over time that you can later come back to - simple. But, let's say that you want to work on your project together with a friend. As you have only one file and you both work on it at different times, managing such development can be a bit problematic. Don't worry - a VCS can help here too! It keeps track of the changes made to a specific file over time by specific users, so you can later decide how to combine them into one in a process called merging. In this way, the VCS stores data about each specific change, the time it was committed and the user who made it.
The concept introduced above is the very basic idea behind a VCS. It's what the likes of Git and other similar tools implement at their core. Of course, there's still plenty more data that Git stores, but I hope you get the idea.
Git
Now, let's get back to Git itself. As I mentioned, it's one of the most popular and widely-used VCSs out there. Its popularity is definitely a result of some of its most important advantages. First off, it's distributed, meaning that you don't actually need an active internet connection and can just run Git on your local machine. That's possible thanks to its architecture and performance. Which, following this line of thought, leads to Git being relatively easy to use compared to other VCS implementations.
Getting started
With the theory out of the way, it's time to use Git for real. We'll do it on our local machine for the sake of simplicity, so, if you haven't already, you'll have to install Git. Many Linux distros come with Git built-in and, in other cases, you can download it (and check out the docs BTW) from its official website.
git init projectName
cd projectName
The first command above will initialize the Git project in the directory specified by projectName (it will be created if it doesn't exist already). If you don't specify any name, the Git repo will be created in the current directory. After all that, we're going inside!
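As a quick sanity check, here's a minimal sketch that creates a throwaway repo and confirms that Git recognizes it. The temporary directory and the demo-project name are placeholders I picked for illustration.

```shell
set -e

# Work in a throwaway directory so nothing on your machine is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create the repository in a new directory (demo-project is a made-up name).
git init -q demo-project
cd demo-project

# Git keeps all of its data in the hidden .git directory.
inside=$(git rev-parse --is-inside-work-tree)
echo "Inside a Git work tree: $inside"
```

Deleting the .git directory is all it takes to turn the project back into a plain folder.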
Before we go any further, I'd like to make a quick disclaimer. It's a fact that real-world Git projects are often hosted on platforms like GitHub and managed through a web GUI or a desktop client like GitKraken. Still, here, to keep things simple, we're using commands only. That's because I think it's the easiest way to grasp the basic concepts and related naming. The same knowledge can later be applied to GUI tools, which only make things simpler.
Credentials
As said before, Git stores information about the user who committed a given change. But, for this, it needs some additional information about the contributor. That's why, before we start working with any files, we first have to set up our credentials - in most cases, name and email will be enough.
git config --global user.name "User"
git config --global user.email "user@user.com"
With the above commands, we set our name and email address globally, meaning that they will be used for every commit we make from our machine. user.name sets the name of the user (the one for online use) and user.email sets their public email. With that said, the git config command is often used to set many other variables that Git uses internally, but here (and in most cases) name and email are all we need to move on.
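If you'd rather not touch your global settings while experimenting, the same variables can be set per repository by leaving out --global. Below is a small sketch of that, run in a throwaway repo; the name and email are the same example values as above.

```shell
set -e

# Throwaway repository for the experiment.
workdir=$(mktemp -d)
cd "$workdir"
git init -q

# Without --global, the values are stored in this repository only
# (in .git/config) and take precedence over the global ones.
git config user.name "User"
git config user.email "user@user.com"

# Read the values back to see what Git will use for commits here.
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```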
Working with files
Let's say that you already have, or have just created, a file with some content that you'd like to commit to your Git repository. Here's how you can do this:
git add .
With the above command, we add all our files (you can also specify whole directories or individual files) to a kind of container called the staging area, i.e. we stage our changes so that Git notices what has been created, updated or removed. This step is often managed for you by modern code editors with built-in Git support, like VS Code. Keep in mind that our changes haven't been committed yet. For now, Git only knows that they've been made. To actually commit them, meaning save them to our Git repo history, we'll have to use yet another command:
git commit -m "Commit message"
The git commit command doesn't need to take any files or directories as arguments, as they're already known from the previous add command. Instead, here we can pass some additional information about the commit itself. In this example, I'm using one of many helpful options (there are too many to go over all of them) - -m (message) - to add some documentation to my commit, just like we've previously discussed. A nice trick is that, if you have a long message to write, you can use multiple -m options, and each of them will become a separate paragraph. But, this is certainly one example where a GUI makes things much easier.
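To see the whole stage-then-commit cycle in one place, here's a minimal sketch run in a throwaway repo. The file name notes.txt and the commit messages are made up for the example.

```shell
set -e

# Throwaway repository with a local identity.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"

echo "Hello, Git" > notes.txt

git status --short   # "?? notes.txt": the file is untracked
git add notes.txt
git status --short   # "A  notes.txt": the file is staged

# Each -m becomes its own paragraph in the commit message.
git commit -q -m "Add notes" -m "A second -m becomes a separate paragraph."

commits=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "$commits commit(s), latest: $subject"
```

The first -m ends up as the subject line shown in short logs, while the second one lands in the message body.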
Collaboration
While using Git definitely helps you keep track of your personal projects, its real power can be experienced when it's used in collaboration. Great examples of that are the countless open source projects available on GitHub and other platforms powered by Git. Let's see how it works...
Branches
The problem with multiple people working on the same project comes in the form of managing changes that overlap, while not disrupting one another. Git has a solution to that too - it's called branches!
Branches, thinking of Git as a tree, give each person their own separate version to work on. Branches can be based on one another, with the main one, called "master", created with your first "branch-less" commit.
git branch -c my-branch
You can manage your branches using the git branch command with some additional options. Examples of these include -c, -m and -d, standing for copy, move (rename) and delete respectively. The first two accept two arguments (the source branch and the new name) and use the current branch as the source when only one is given. If you want a completely new branch for whatever reason, you don't have to use any additional options - just pass its name.
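Here's a short sketch that walks through these options in a throwaway repo. The branch names (feature, topic, backup) are hypothetical, and I'm assuming a reasonably recent Git, since -c isn't available in older versions.

```shell
set -e

# Throwaway repository with one commit, so branches have something to point at.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git branch feature            # new branch pointing at the current commit
git branch -m feature topic   # move (rename) feature to topic
git branch -c topic backup    # copy topic to backup
git branch -d backup          # delete backup again

# List what is left: the default branch and topic.
git branch
topic=$(git branch --list topic)
backup=$(git branch --list backup)
feature=$(git branch --list feature)
```

Note that none of these commands changes the branch you're currently on.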
Checkouts
After creating a new branch, you'll have to "move" to it, so that everything you commit from now on is applied to that branch. To change your current branch to another one, you'll need to use the git checkout command.
git checkout my-branch
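A small sketch of switching branches in a throwaway repo follows. It also shows the -b option, which creates a branch and switches to it in one step; another-branch is a made-up name.

```shell
set -e

# Throwaway repository with one commit.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Create the branch first, then switch to it.
git branch my-branch
git checkout -q my-branch
first=$(git rev-parse --abbrev-ref HEAD)

# Or do both in one step with -b.
git checkout -q -b another-branch
second=$(git rev-parse --abbrev-ref HEAD)

echo "Visited: $first, then $second"
```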
Merge
Finally, changes made by multiple people need to be merged together somehow. To do this, Git provides an intuitively named command - git merge. With a simple call, you can merge the provided branch into the current one. Git proceeds automatically when it can, and stops to ask you to resolve the conflict whenever changes to the same part of a file collide.
git merge my-branch
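To make this more concrete, here's a minimal sketch of a conflict-free merge in a throwaway repo. The file names are made up, and since the default branch can be called "master" or "main" depending on your setup, the script reads its name instead of assuming it.

```shell
set -e

# Throwaway repository with one commit on the default branch.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Work on a separate branch...
git checkout -q -b my-branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# ...while the default branch moves on too.
git checkout -q "$main_branch"
echo "update" > update.txt
git add update.txt
git commit -q -m "Update main line"

# Merge my-branch into the current branch; --no-edit accepts the
# default merge message instead of opening an editor.
git merge -q --no-edit my-branch

commits=$(git rev-list --count HEAD)
git log --oneline --graph
```

Because both branches changed different files, Git can combine them on its own and records the result as a merge commit.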
Remote
While Git works really well locally, collaboration requires your repository to be stored somewhere all project collaborators can access it. For this, you'll need a remote repository, stored on your own Git server or on a platform like GitHub or GitLab.
Cloning
The most common way of accessing a remote repo is by cloning it. For that, you can use the well-known git clone command.
git clone https://github.com/vuejs/vue.git
This command automatically creates remote-tracking branches for every branch in the remote repo and checks out its default branch (most often named "master").
Remote
If you want to play with remotes a little more, you can use the git remote command for that.
git remote
With no additional sub-commands or options provided, the remote command will list the names of all your remotes, e.g. right after cloning, the only available remote is named "origin".
If you want to add a given remote to an existing repository, instead of just cloning it, you can use the git remote add command. Just pass a short name for your new remote and its URL, and you're done!
git remote add vue https://github.com/vuejs/vue.git
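If you'd like to try this without touching the network, a local bare repository can stand in for a hosted one. The sketch below assumes exactly that; the names shared.git, project and shared are placeholders.

```shell
set -e

workdir=$(mktemp -d)
cd "$workdir"

# A bare repository (no working files) stands in for a hosted remote.
git init -q --bare shared.git

# An ordinary local repository that will point at it.
git init -q project
cd project

# Register the remote under a short name of our choosing.
git remote add shared "$workdir/shared.git"

# -v lists every remote with the URLs used for fetching and pushing.
git remote -v

remotes=$(git remote)
url=$(git remote get-url shared)
```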
Push & pull
To access the data from a remote, you'll need to fetch it first. The command for that is simply named git fetch.
git fetch
As an argument, you can pass the name of the remote you'd like to fetch from. By default, it's "origin".
By using git fetch you retrieve the changes and branches committed by other collaborators. You'll then (most likely) have to merge them with your own local changes by using git merge. It turns out this is such a common thing to do that Git provides a shortcut for it in the form of the git pull command, which automatically fetches the remote branch and merges it into your current local branch.
git pull
Accessing remotes would be pointless if you weren't able to apply changes to them. To push your changes from the current branch to a remote one, you can use the git push command.
git push origin master
Here, git push takes two arguments - the first points at the remote by its name and the second selects the branch to push. Pretty straightforward, huh?
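The whole round trip can be sketched locally with two clones of the same bare repository, playing two collaborators. Everything named here (shared.git, alice, bob, readme.txt) is hypothetical, and the branch name is read from Git rather than assumed to be "master".

```shell
set -e

workdir=$(mktemp -d)
cd "$workdir"

# A bare repository stands in for a hosted remote such as GitHub.
git init -q --bare shared.git

# First collaborator: clone, commit, push.
git clone -q shared.git alice
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
echo "first line" > readme.txt
git add readme.txt
git commit -q -m "Add readme"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"

# Second collaborator: clone the now non-empty remote.
cd "$workdir"
git clone -q shared.git bob

# The first collaborator pushes one more change...
cd "$workdir/alice"
echo "second line" >> readme.txt
git commit -q -am "Extend readme"
git push -q origin "$branch"

# ...and the second one pulls it (fetch + merge in one step).
cd "$workdir/bob"
git pull -q origin "$branch"

bob_commits=$(git rev-list --count HEAD)
last_line=$(tail -n 1 readme.txt)
echo "bob now has $bob_commits commits; last line: $last_line"
```

Nothing travels between the two clones directly; every change goes through the shared remote.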
Personally, I think that getting started with remote origins can be a bit hard for beginners. It's important to see the clear difference between remotes and branches, which may be a bit hard at first. Just remember that remotes are... remote and contain branches within them.
Rebase
If you've already made a number of commits in your local copy, you may want to apply all of them on top of the current code-base, which other collaborators have changed in the meantime. That's where the git rebase command comes in. It takes the commits you've made on the given branch (the current one by default) and re-applies them on top of another base, such as the up-to-date main branch. Called with no arguments, it uses the upstream configured for the current branch.
git rebase
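Here's a minimal sketch of a rebase in a throwaway repo, naming the target branch explicitly instead of relying on a configured upstream. File and branch names are made up for the example.

```shell
set -e

# Throwaway repository with one commit on the default branch.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Our own work, on a separate branch.
git checkout -q -b my-branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile, the default branch gets a new commit.
git checkout -q "$main_branch"
echo "update" > update.txt
git add update.txt
git commit -q -m "Update main line"

# Replay my-branch's commits on top of the updated default branch.
git checkout -q my-branch
git rebase -q "$main_branch"

git log --oneline
commits=$(git rev-list --count HEAD)
merges=$(git rev-list --count --merges HEAD)
latest=$(git log -1 --format=%s)
previous=$(git log -1 --format=%s HEAD~1)
```

Unlike a merge, the result is a straight line of commits with "Add feature" sitting on top of "Update main line".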
Information
We know that Git stores a lot of data beyond just files, so, how do we access it? Git provides a lot of helpful commands for that too! Probably the most important one is git status. By using it, you can receive basic information about the current status of your Git repo, like the currently staged files. You can also pass a path (file or directory) as an argument to limit your results.
git status
The git log command, on the other hand, outputs the whole history of changes made. This includes some important data like the commit ID, which can be pretty useful down the road. As the output of this command can be quite huge, you can always pass the optional path argument to limit your results.
git log
Once you have a commit ID, a branch name or a similar reference, you can pass it to the git diff command to display the actual difference between the current working tree and the given entity. It's extremely useful and utilized by all Git GUI tools.
git diff my-branch
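The three commands work nicely together. Below is a small sketch in a throwaway repo that changes a committed file and then inspects the result; file.txt is a made-up name.

```shell
set -e

# Throwaway repository with one committed file.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "First version"

# Change the file without staging it.
echo "two" >> file.txt

git status --short              # " M file.txt": modified, not staged
git log --oneline -- file.txt   # history limited to a single path
git diff HEAD                   # working tree compared with the last commit

status=$(git status --short)
changed=$(git diff --name-only HEAD)
```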
Finally, beyond the likes of branches, commits, and remotes, Git provides you with tags. Unlike branches, they aren't meant for handling concurrent workflows, but for documenting a given state of your project. They're useful for tagging different builds, versions, and releases of your code-base. Your tags can have custom names, can be compared with git diff and are created with git tag from the current commit.
git tag custom-tag
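As an illustration, the sketch below tags two states of a throwaway project and then compares them. The tag names v1.0 and v2.0 and the file app.txt are hypothetical.

```shell
set -e

# Throwaway repository.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"

# First "release".
echo "version 1" > app.txt
git add app.txt
git commit -q -m "First release"
git tag v1.0

# Second "release".
echo "version 2" > app.txt
git commit -q -am "Second release"
git tag v2.0

# With no arguments, git tag lists the existing tags.
git tag

# Tags can be used anywhere a commit is expected, e.g. in git diff.
git diff v1.0 v2.0
changed=$(git diff --name-only v1.0 v2.0)
first=$(git rev-parse v1.0)
second=$(git rev-parse v2.0)
```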
Changes
Stashes
Naturally, beyond all the goodness of adding, storing and retrieving information mentioned above, Git allows you to do much, much more. This includes obvious changes like removing and editing, but also creating stashes. You can use them through the git stash command. They allow you to create temporary records of your work in progress (WIP), to later come back to it.
git stash
The most basic command above creates a new stash, which can be later seen with:
git stash list
To restore a given stash (even on top of different commits) you'll need to use the git stash apply command and specify the stash through its index, the whole reflog syntax or in some other way (see docs). To find a stash's reference and index, check the output of the git stash list command.
git stash apply stash@{0}
# the same as
git stash apply 0
In general, stashes provide a great way of creating "dirty" snapshots of your current work, only to get back to them later. They can also be manipulated extensively, as, with the commands above, we've barely scratched the surface.
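Here's a quick sketch of the stash round trip in a throwaway repo. The file name and its contents are made up for the example.

```shell
set -e

# Throwaway repository with one committed file.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "stable" > file.txt
git add file.txt
git commit -q -m "Stable state"

# Start some work that isn't ready to be committed.
echo "work in progress" >> file.txt

# Put it aside; the working tree goes back to the last commit.
git stash
clean=$(git status --short)

git stash list

# Bring the changes back. apply keeps the stash in the list.
git stash apply 'stash@{0}'
restored=$(tail -n 1 file.txt)
stashes=$(git stash list)
```

Since apply leaves the entry in the list, the same stash can be applied again later, for example on another branch.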
Reset & revert
Naturally, changes to your files, even their removal, can be staged (with git add) and committed normally, in a predictable way. But, you may end up in a situation where, instead of changing files, you need to change some aspects of the Git state itself, e.g. unstage your staged changes.
The easiest way to do that is to use the git reset command which, with no arguments, unstages all currently staged changes while leaving the files themselves untouched. Of course, you can also provide it with a path to make the operation more focused, or with a commit ID to move the current branch back to the given commit.
git reset
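To see what unstaging looks like in practice, here's a minimal sketch in a throwaway repo; file.txt is a made-up name.

```shell
set -e

# Throwaway repository with one committed file.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "committed" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Modify the file and stage the change.
echo "draft" >> file.txt
git add file.txt
git status --short   # "M  file.txt": the change is staged

# Unstage everything; the edit itself stays in the file.
git reset -q
git status --short   # " M file.txt": modified, but no longer staged

staged=$(git diff --cached --name-only)
status=$(git status --short)
last_line=$(tail -n 1 file.txt)
```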
Another time when Git proves to be really useful is when you need to go back in time and revert some commits that you've previously made. The aptly named git revert command allows you to do just that, simply by specifying a commit ID. It creates a new commit that undoes the changes introduced by the specified one. If the revert conflicts with later changes, Git will ask you to resolve the conflicts before it finishes.
git revert fc95eb54362878829042ac2c6b3a34bd208b2c04
# example commit ID
Just remember that commits store the changes made to specific files - with the locations of the affected lines of code (LOCs) and other details included. Because of that, reverting a given commit may still leave you with a different file than it was before, if more commits were made in the meantime.
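A small sketch of a revert in a throwaway repo is shown below. The commit ID is read from the repo instead of being hard-coded, and the file contents are made up for the example.

```shell
set -e

# Throwaway repository with a good commit followed by a bad one.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "User"
git config user.email "user@user.com"
echo "good" > file.txt
git add file.txt
git commit -q -m "Add good line"

echo "bad" >> file.txt
git commit -q -am "Add bad line"
bad_commit=$(git rev-parse HEAD)

# Undo the bad commit by adding a new commit on top of it.
# --no-edit accepts the default message instead of opening an editor.
git revert --no-edit "$bad_commit"

git log --oneline
commits=$(git rev-list --count HEAD)
content=$(cat file.txt)
```

The history isn't rewritten here: the bad commit stays in the log, followed by a commit that reverses it.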
Caveats
With this quick run-down of Git basics done, I'd like to point out two details about Git that are, in my opinion, very important.
First, you need to know that Git, in comparison to many other VCS implementations, is focused on content instead of files. This means that Git has a greater awareness of your code. It not only tracks the location of files but also the locations of LOCs inside them. It's something that's often taken for granted when using Git, but you should know that it's an additional and very useful feature.
Second, you shouldn't look at Git or, in fact, any other VCS, only as a form of backup. Sure, they allow you to store the past versions of your files, but it's the precision with which these changes are stored that makes a VCS unique. You know who made a given change, why, and when. This makes Git more similar to a kind of journal than to a mere backup.
Want more?
Git is an incredibly well-designed and interesting implementation of a VCS. It's pretty easy to grasp for beginners, while still providing tons of features for advanced users. With that said, even with the fairly complete introduction in this article, there are still countless options, commands and other goodies available within Git. In a world where projects are getting larger and larger, a VCS is a requirement. Thankfully, Git makes that necessity a nicer, faster, more seamless and generally appreciated choice.
So, do you like this article? Let me know if you want to read more about Git on this blog in a similar form-factor or something, with a reaction below. Also, leave your thoughts in the comment section below. As always, please, share the article if you find it helpful and consider following me on Twitter, on my Facebook page and checking out my personal blog. Thanks for reading and have a great day!