Git Concepts I Wish I Knew Years Ago

By Gabriel Abud

The most used technology by developers is not JavaScript.

It's not Python or HTML.

It hardly even gets mentioned in interviews or listed as a pre-requisite for jobs.

I'm talking about Git and version control of course.

Most of us get by with learning a few git concepts for far too long. We learn the minimum to get by working on a small team with a simple feature branch workflow. If you're anything like me, this will carry you for years into your career.

It's time we revisited Git and how important mastering it is for advancing in our careers. This guide serves as a reference for what I believe are necessary, yet lesser-known concepts. Mastering Git will make a huge difference in how you manage code and your own day-to-day workflow. Because Git commands are a bit arcane and hard to remember, this guide is organized by concept and expected behavior instead.

If you don't have a firm grasp of general Git concepts, like the difference between your working directory, local repository, and remote repository, read this guide first. Likewise, if you haven't mastered the basic git commands, start with the official documentation. This guide is not meant to take you from absolute beginner to professional and assumes you are already somewhat proficient with git.

Git Commands

📋 Logging

What did I just do?

git log
git log --oneline # more succinct output
git log --graph # with a visual graph of branches

View your "undo" history

git reflog

Because sometimes git log doesn't cut it, especially for commands that don't show up in your commit history.

reflog is basically your safety net after running "scary" commands like git rebase. You'll be able to see not only the commits you made, but each of the actions that led you there. See this Atlassian article to learn more about how refs work.
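
To make the "safety net" idea concrete, here is a minimal sketch that throws away a commit with a hard reset and then gets it back through the reflog. The temporary repository, file name, and commit messages are made up for illustration; only run something like this in a scratch directory.

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add first note"
echo "two" >> notes.txt
git commit -q -am "Add second note"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1        # the second commit disappears from git log
git --no-pager reflog             # but the reflog still lists it
git reset -q --hard 'HEAD@{1}'    # go back to where HEAD pointed before the reset
recovered=$(git rev-parse HEAD)
```

Here HEAD@{1} means "where HEAD was one move ago", which is why it points at the commit that the reset appeared to destroy.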

View your current state + any merge conflicts

git status

While git status is a pretty basic command we all learn early on, its importance as a learning tool for internalizing git fundamentals bears repeating. It can also help you navigate through a complicated rebase or merge.

See the differences in your staged (or unstaged) changes

git diff --staged # for staged changes
git diff # for unstaged changes

See the differences between two branches

git diff branch1..branch2
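
One subtlety worth seeing in action: with git diff, the two-dot form compares the tips of the two branches, while the three-dot form compares the second branch against the point where the branches diverged. The sketch below builds a tiny scratch repository (branch and file names are invented for the example) to show the difference.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository (hypothetical)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "base" > base.txt
git add base.txt
git commit -q -m "Add base file"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q master
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Add hotfix file"

# Two dots: everything that differs between the two branch tips
two_dot=$(git diff --name-only master..feature)
# Three dots: only what feature changed since it split off from master
three_dot=$(git diff --name-only master...feature)
```

The three-dot form is usually closer to what a pull request shows, since it ignores changes that landed on master after you branched.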

🧭 Navigation

I want to see what I did before

git reset <commit-sha>

This will uncommit and unstage those changes but leave those files in the working directory.

I want to switch to another branch

git switch branch-name   # new syntax (as of Git 2.23)
git checkout branch-name # old syntax

git checkout can be a bit confusing because it can work on the file and branch level. As of Git 2.23, we now have git restore (checkout file) and git switch (checkout branch), which would be my suggestion if you're just starting out to avoid checkout confusion.

I want to go back to the branch I was on

git switch -

📝 Modifications

I dug myself into a rabbit hole, let's start over

git reset --hard HEAD

This will reset your working directory to match the latest commit, discarding all staged and unstaged changes to tracked files.

I want to reset a file back to how it was

git restore <filename>     # new syntax (as of Git 2.23)
git checkout -- <filename> # old syntax

I want to undo the last commit and rewrite history

git reset --hard HEAD~1

I want to rewind back n commits

git reset --hard HEAD~n        # n is the last n commits
git reset --hard <commit-sha>  # or to a specific commit

There is an important distinction between soft, mixed, and hard resets. Basically:

  1. --soft: Uncommit changes but leave those changes staged
  2. --mixed (the default): Uncommit and unstage changes, but changes are left in the working directory
  3. --hard: Uncommit, unstage, and delete changes
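
The three modes are easiest to tell apart by looking at git status right after each one. This is a small sketch in a scratch repository (the file name and messages are placeholders) that undoes the same commit three ways and records the result.

```shell
set -eu
repo=$(mktemp -d)                       # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "v1" > app.txt
git add app.txt
git commit -q -m "Add app file"
echo "v2" > app.txt
git commit -q -am "Update app file"
tip=$(git rev-parse HEAD)

git reset -q --soft HEAD~1
soft_state=$(git status --porcelain)    # change is still staged

git reset -q --hard "$tip"              # put the second commit back
git reset -q --mixed HEAD~1
mixed_state=$(git status --porcelain)   # change is kept but unstaged

git reset -q --hard "$tip"              # put the second commit back again
git reset -q --hard HEAD~1
hard_state=$(git status --porcelain)    # nothing left to report
hard_content=$(cat app.txt)             # file content is back to the first commit
```

In the porcelain output, a letter in the first column means staged and a letter in the second column means unstaged, so the soft and mixed results differ only in which column the M lands in.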

I've rewritten history and now want to push those changes to the remote repository

git push -f

This is necessary any time you have rewritten commits that were already pushed, because your local branch and the remote branch have diverged.

WARNING: Force pushing should be done with great care. In general, on shared branches you should avoid doing any force pushes. Limit force pushing to your own branches, before opening a pull request, so that you don't inadvertently mess up your teammates' git history.
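
Several commenters below suggest git push --force-with-lease as a safer default, since it refuses to overwrite the remote branch if it has moved since you last fetched. The following is a rough sketch of that flow against a local bare repository standing in for the remote; the paths and the my-feature branch name are assumptions made for the example.

```shell
set -eu
work=$(mktemp -d)                                   # scratch area (hypothetical)
git init -q --bare "$work/remote.git"               # stand-in for the remote repository
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

git checkout -q -b my-feature
echo "draft" > feature.txt
git add feature.txt
git commit -q -m "Add feature draft"
git push -q -u origin my-feature

git commit -q --amend -m "Add feature"              # rewrite the pushed commit

# A plain push is refused because the histories have diverged
if git push -q origin my-feature 2>/dev/null; then
  plain_push="accepted"
else
  plain_push="rejected"
fi

# Overwrite the remote branch only if it is still where we last saw it
git push -q --force-with-lease origin my-feature
remote_tip=$(git --git-dir="$work/remote.git" rev-parse my-feature)
local_tip=$(git rev-parse HEAD)
```

If a teammate had pushed to my-feature in the meantime, the last push would fail instead of silently discarding their work.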

I want to add a few more changes to the last commit

git commit --amend

I want to rewrite a bunch of commits locally

git rebase -i <commit hash> # where the commit hash is the one *before* all the changes you want to make

This will open up an interactive prompt where you can select which commits to keep, squash, or delete. You can also change commit messages here. This is very useful when cleaning up typo or linting commits, for example.
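
The interactive prompt is hard to show in text, so here is a scripted variant of the same cleanup instead: mark the typo fix as a fixup commit and let --autosquash arrange the todo list. Setting GIT_SEQUENCE_EDITOR=true simply accepts the generated list without opening an editor. The repository, files, and messages are invented for the example, and this is one possible way to do it rather than the only one.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "readme" > README.txt
git add README.txt
git commit -q -m "Add README"
base=$(git rev-parse HEAD)               # the commit *before* the ones to rewrite

echo "nmae field" > form.txt
git add form.txt
git commit -q -m "Add name field to checkout form"

echo "name field" > form.txt             # fix the typo
git commit -q -a --fixup=HEAD            # creates a "fixup! ..." commit

# Fold the fixup into the commit it targets, accepting the todo list as generated
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

commits=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
content=$(cat form.txt)
```

Dropping the GIT_SEQUENCE_EDITOR=true prefix opens the usual interactive todo list, with the fixup line already moved into place.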

I found rebasing to be one of the more confusing topics when learning Git in depth. See the section on rebasing for more.

This rebase is a mess, let's scrap it

git rebase --abort

You can do this mid rebase. I often find that a rebase is way more trouble than it's worth, especially when rebasing two branches with a lot of similar changes. Until the rebase is complete, you can always abort it.

I want to bring in a commit from a different branch

# From the branch you want to bring the commit *into*
git cherry-pick <commit-sha>

I want to bring in a specific file from a different branch

git checkout <branch-name> <file-name>

I want to stop tracking a file in version control

git rm --cached <file-name>
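
A typical case is a file that was committed by mistake and should stay on disk but out of the repository. This sketch uses a hypothetical .env file and pairs git rm --cached with a .gitignore entry so the file does not get re-added later.

```shell
set -eu
repo=$(mktemp -d)                   # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "API_KEY=example" > .env       # placeholder content, not a real secret
echo "print('hi')" > app.py
git add .env app.py
git commit -q -m "Add app and env file"

git rm -q --cached .env             # remove from the index, keep the file on disk
echo ".env" > .gitignore            # stop git from offering to track it again
git add .gitignore
git commit -q -m "Stop tracking .env"

tracked=$(git ls-files)
```

Note that the file is still present in earlier commits; this only stops tracking it from this commit forward.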

I need to switch branches but my current state is broken

git stash # saves your changes to the top of the stash stack
git stash push -m "message to go along with changes" # git stash save is deprecated in favor of push
git stash -u # stash untracked files as well

I want to see what's in my stash

git stash list

I want to bring back something in my stash

git stash pop # "pops" the most recent item added to the stash stack
git stash apply stash@{stash_index} # apply a given item in the stash (from git stash list)
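
Put together, a stash round trip looks roughly like this. The sketch runs in a scratch repository with made-up file contents and checks the working directory before and after.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "stable" > app.txt
git add app.txt
git commit -q -m "Add app file"

echo "half-finished" >> app.txt            # work in progress
git stash push -q -m "WIP: half-finished change"
during=$(git status --porcelain)           # working directory is clean again
stashed=$(git stash list)                  # the stash entry with our message

git stash pop -q                           # bring the change back and drop the entry
after=$(git status --porcelain)
remaining=$(git stash list)
```

Use git stash apply instead of pop if you want the entry to stay in the stash after restoring it.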

I want to undo a commit without rewriting history

git revert HEAD # undo the last commit
git revert <commit hash> # for a specific commit

This will replay the inverse of the commit specified as a new commit, thereby undoing your changes without undoing history. This is a much safer way to undo a commit on shared branches, where rewriting history has much bigger consequences.
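
As a quick illustration, the sketch below makes a bad commit in a scratch repository and reverts it. The file and messages are placeholders; the point is that history grows by one commit instead of shrinking.

```shell
set -eu
repo=$(mktemp -d)                     # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "good" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
echo "bad" >> greeting.txt
git commit -q -am "Add broken line"

git revert --no-edit HEAD >/dev/null  # new commit that undoes the last one

count=$(git rev-list --count HEAD)    # the broken commit is still in history
content=$(cat greeting.txt)
subject=$(git log -1 --format=%s)
```

Because nothing is rewritten, a normal git push is enough afterwards and teammates can pull as usual.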

I want to find which commit caused a bug

git bisect start
git bisect bad           # Current version is bad
git bisect good v1.1     # v1.1 is known to be good

git help bisect          # For more

This one is a bit tricky so see the documentation for more.
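
If you have a command that can tell good from bad, git bisect run can drive the whole search for you. This is a small sketch, not the manual flow shown above: it fabricates six commits in a scratch repository, where a hypothetical status.txt flips from pass to fail at "Release 4", and lets bisect find that commit.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Six fake releases; the "bug" appears in release 4 and stays
for i in 1 2 3 4 5 6; do
  if [ "$i" -ge 4 ]; then echo "fail" > status.txt; else echo "pass" > status.txt; fi
  echo "$i" > version.txt
  git add status.txt version.txt
  git commit -q -m "Release $i"
done

good=$(git rev-list --max-parents=0 HEAD)   # the very first commit is known good
git bisect start HEAD "$good" >/dev/null    # HEAD is bad, first commit is good

# Exit code 0 marks a commit good, non-zero (here: grep finds no "pass") marks it bad
git bisect run sh -c 'grep -q pass status.txt' >/dev/null

first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1            # return to the original branch
```

In a real project the test command would be your test suite or a small script that reproduces the bug.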

🧹 Cleanup

Oh my god how do I have so many branches?

git branch --no-color --merged | command grep -vE "^(\+|\*|\s*(master|develop|dev)\s*$)" | command xargs -n 1 git branch -d

This will delete all merged branches that you have locally except for master, develop, or dev. If you have different names for your main and dev branches, you can change the grep regex accordingly.

This is a long command to remember, but you can set it to an alias like so:

alias gbda='git branch --no-color --merged | command grep -vE "^(\+|\*|\s*(master|develop|dev)\s*$)" | command xargs -n 1 git branch -d'

If you use Oh My Zsh, this is already done for you. See the section on aliases for more.

Let's clean up stale remote-tracking branches

git fetch --all --prune

This fetches from every remote and removes your local remote-tracking references for branches that no longer exist on the remote. It is especially useful if you've set up your remote repository to delete branches on merge.

⌨️ Aliases

Git commands can be long and really hard to remember. We don't want to type them out each time or spend days memorizing them, so I strongly recommend you make git aliases for them.

Better yet, install a tool like Oh My Zsh for Z shell (Zsh) and you will get a bunch of the most commonly used git commands as aliases by default, plus tab completion for them. I'm lazy about configuring my shell exactly how I want it, so I love open source tools like Oh My Zsh that do this for me. Not to mention it comes with a sweet-looking shell.

Some of my favorites I use almost every day:

gst - git status
gc  - git commit
gaa - git add --all
gco - git checkout
gp  - git push
gl  - git pull
gcb - git checkout -b
gm  - git merge
grb - git rebase
gpsup - git push --set-upstream origin $(current_branch)
gbda  - git branch --no-color --merged | command grep -vE "^(\+|\*|\s*(master|develop|dev)\s*$)" | command xargs -n 1 git branch -d
gfa - git fetch --all --prune

If you forget what these aliases, or any others you have set yourself, stand for, you can simply run:

alias

Or to search for a given alias:

alias | grep <alias-name>

Other Git Tricks

Ignoring Files

Many files do not belong in version control. You should utilize your global gitignore for this. Examples of things that do not belong in version control are node_modules directories, .vscode or other IDE files, and Python virtual environments.

For any sensitive information, you can use environment files and add these to your local .gitignore at the root of your project.

Special Files

You may need to mark certain files as binary files so that git knows to ignore them and doesn't produce lengthy diffs. Git has a .gitattributes file for just this purpose. In a JavaScript project, you may want to add your yarn.lock or package-lock.json so that Git doesn't try to diff them every time you make a change.

# .gitattributes
package-lock.json binary
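
To see the effect, this sketch changes a lockfile in a scratch repository after marking it as binary and captures what git diff prints. The lockfile contents are dummy values chosen for the example.

```shell
set -eu
repo=$(mktemp -d)                              # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo '{"lockfileVersion": 1}' > package-lock.json
git add package-lock.json
git commit -q -m "Add lockfile"

echo "package-lock.json binary" > .gitattributes
echo '{"lockfileVersion": 2}' > package-lock.json

# Instead of a line-by-line diff, git only reports that the file differs
diff_output=$(git diff -- package-lock.json)
echo "$diff_output"
```

The file is still versioned and committed normally; only the textual diff is suppressed.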

Git Workflows

Rebase vs Merge

Your team may choose to adopt either a rebase or merge workflow. There are pros and cons to both, and I've seen both be used effectively. For the most part, unless you really know what you're doing, you should probably opt for a merge workflow.

You can still use rebase effectively even when you're primarily using merge to bring your changes into production. The most common situation that calls for this is when you're working on a feature while another developer merges a feature into master. You could certainly use git merge to bring those changes in, but now you have an extra merge commit for the simple change your teammate made. What you really want is to replay your commits on top of the new master branch.

git rebase master

This should give you a much cleaner commit history now.
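
Here is that scenario as a runnable sketch in a scratch repository. The branch and file names are invented; the teammate's change is simulated by committing directly on master.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository (hypothetical)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Demo User"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "base" > base.txt
git add base.txt
git commit -q -m "Add base file"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q master                     # meanwhile, a teammate's change lands
echo "teammate" > teammate.txt
git add teammate.txt
git commit -q -m "Add teammate change"

git checkout -q feature
git rebase -q master                       # replay our commit on top of the new master

parent=$(git rev-parse HEAD~1)
master_tip=$(git rev-parse master)
merge_count=$(git rev-list --merges --count master..feature)
```

After the rebase, the feature commit sits directly on top of master with no merge commit in between, which is the linear history the section describes.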

To explain the difference in depth would take a whole article (blog post pending), so I suggest you check out the Atlassian docs on the difference instead.

Remote Repository Settings

I am most familiar with GitHub and GitLab, but these settings should be supported by most other remote repository managers.

1. Delete branches on merge

Once things are merged, you should not care about the branch anymore since the history should be reflected on your master/dev branch. This significantly cleans up the number of branches you have to manage. It also makes git fetch --all --prune much more effective in keeping your local repository clean.

2. Prevent pushes directly to master

Without this, it's a very understandable mistake to forget you're on master and do a git push, potentially breaking your production build. Not good.

3. Require at least one approval before merging

Depending on the size of your team, you may want to require more than one approval before merging. The bare minimum should be one though, even if you're on a team of two people. You don't have to spend hours nitpicking every single line, but in general your code should have two sets of eyes on it. Feedback is key to learning and personal growth.

4. Require passing CI tests to merge

Broken changes should not be merged into production. Reviewers will not be able to catch broken changes 100% of the time, so automate these checks instead. Enough said.

Pull Requests

Keep pull requests small and concise, no more than a couple hundred lines ideally. Small and frequent pull requests will make the review process faster, lead to more bug-free code, and make your teammates' life easier. It will also lead to increased productivity and more shared learning. Make a commitment with your team to spend a certain amount of time reviewing open pull requests, every day.

If you're working on a feature that will be in a broken state for a while, use feature flags to disable it in production. This will prevent your feature branch from diverging too much from your dev/master branch and allow you to do more frequent, small pull requests. The longer you go without merging code in, the harder it will be to do so later.

Finally, include a detailed description in your pull request, with images and/or gifs if necessary. If you use a tool like Jira to manage tickets, include the ticket number the pull request addresses. The more descriptive and visual you make your pull request, the more likely your teammates will want to review it and not drag their feet.

Branch Naming

Your team may want to come up with branch naming conventions for easier navigation. I like to start each branch with the first letter of your first name + last name, followed by a forward slash, and the branch description separated by hyphens.

This may seem insignificant but with tab completion and tools like grep, this can really facilitate finding and making sense of all the branches you may have.

For example, when I create a new branch:

git checkout -b gabud/implement-important-feature

A week later when I forget what I called my branch, I can start typing git checkout gabud, press TAB, and my Z shell then shows me all of my local branches to choose from without seeing any of my teammates' branches.

Commit Messages

Language is important. In general I find it best not to commit things in broken states, and each commit should have a succinct message that explains what the change does. Following the official Git recommendation, I find it best to use the present tense, imperative mood for commit messages. Think of each commit message as a command to the computer/git, which you can add to the end of this sentence:

If this commit were applied, it would...

An example of a good commit message in the imperative mood would then be:

git commit -m "Add name field to checkout form"

Which now reads: "If this commit were applied, it would add name field to checkout form". Nice and clear.

Final Thoughts

This is by no means all there is to learn about Git. I suggest you check out the official documentation and git help for more. Don't be afraid to ask your teammates for help with Git. You'd be surprised to learn that most of them have many of the same questions.

What about you? Which Git commands or concepts have been the most instrumental in your workflow?

Posted by:

Gabriel Abud

@g_abud

Software engineer interested in React, Python, and serverless

Discussion


Just a warning on branches that I've learned the hard way.

You have branch foo/bar. Do NOT EVER create a branch foo.

This may have changed since it's an internal git detail, but in the first case, foo would be a directory. In the second, foo is a file. Something can not be both a file and directory at the same time, so you're asking for a world of hurt.

 

I would recommend foo-bar style naming instead. Tab completion works just as well, and there's no unnecessary directory structure created in refs. It also completely avoids this error.

EDIT: I found one downside to this. If you like to set all your primary branches to be the default output for git show-branch and not show topic branches, you have to use subdirectories for your topic branches. I don't use show-branch much at this time, but this could matter to someone who does. Reference: git-scm.com/docs/git-show-branch#_...

 

Interesting, I've never run into this situation but it makes sense. Will keep this in mind.

 

I would suggest the following approach to name branches. Assume your team is working on a project with name "Acme Project".

Assign an abbreviation to the project, let's say "ACM".

Then create tasks in JIRA (or another tracking system) and give every task a unique number. For example:

ACM-1: Implement "About" dialog.
ACM-2: Fix printing issues.
ACM-291: Refactor "Customer" class and reduce methods complexity and size.

Then assign these tasks to developers.

Upon getting a task, let's say ACM-2, a developer creates a local branch with the same name, ACM-2, and works on it. They then push this branch to the remote and open a pull request against master. Once the pull request is approved and merged, the remote branch ACM-2 is deleted and the relevant task is marked as Done.

This approach simplifies communication and task/branch referencing inside the team, and I recommend it.

 

I also follow this kind of branch name and wrote a prepare-commit-msg hook for it, which (among other things) places the issue tracker URL in the commit message based on the branch name.

gitlab.com/bimlas/home/-/blob/f826...

 

Yeah agreed that this makes sense if you use Jira. Not everyone uses project management tools like Jira though.

 

Maybe not Jira but they should be using something!!! Even working alone it's worth using these tools. Kanban, etc. You can do it free with a Gitlab or Azure DevOps account, so why not?

Agreed, i just managed a solo project with gitkraken boards and github issues and i am so glad i did. Kept me on task and eventually was the basis for my generated changelog.

 

Never push -f unless you're the repo manager repairing some terrible mistake that will require everyone to re-clone the repository. I always set my upstream/shared repository to fast-forward pushes only. I call it the "No Jedi Rule" -- force pushing not allowed!

 

Personally I often push force just to keep my commits clean, otherwise have an ugly history like 'bug fix', 'another bug fix', especially when working on new features that require many changes. Also nice to rebase from parent branch to avoid having all the merge commits.

Of course only push force when working alone on your branch.

 

Oh, definitely, when working alone, I break this rule all the time. It's really just part of the greater rule to never rewrite shared history. You can rewrite your own history all you want, though.

I think rebase is under-utilized, too. If I'm working directly on a shared branch, I almost never do a pull. I always fetch and rebase to keep the history clean. It actually does more than that, though, it keeps the first parent path consistent, which makes the log much easier to read and the graph much easier to trace. I'm also a fan of using rebase -i to squash feature branches into a single commit.

 

Very neat summary of git commands! Nice job.

Totally agree that rebase is the most confusing/frustrating thing to learn about git. But once you practice a bit on a test repo (I wouldn't dare to do it on a real project 😂), it's a very useful command for working with many branches.

 

Careful now, git push -f requires that everyone on the team have as good an understanding of git as you do, particularly when dealing with shared branches like master.

We generally don't ever want to see rewriting history on shared branches, because once any other team member grabs the latest (fetch/pull), they're going to have to deal with branches that have diverged. It goes from being a problem for one dev, to being a problem for the whole team.

When others are involved, it's a much better practice to resolve the issue as a fast-forward or merge commit, then tack that on to the HEAD of the remote branch.

 

Agreed, I will probably add that as a warning. Good suggestion.

 

Slight tweak: replace "Add a name field to the checkout form" with "Add name field to checkout form" --removing the articles saves 15% which can help keep you under the 50-character recommended commit message first-line length.

 

git specific aliases are generally better done as git aliases, rather than shell aliases. This way, they only apply when working with git, they don't need to be added to your shell rc, and also, if you decide to switch shells, they'll still work.

E.g., instead of

alias gbda='git branch --no-color --merged | command grep -vE "^(\+|\*|\s*(master|develop|dev)\s*$)" | command xargs -n 1 git branch -d'

Do:

git config --global alias.bda '!git branch --no-color --merged | command grep -vE "^(\+|\*|\s*(master|develop|dev)\s*$)" | command xargs -n 1 git branch -d'

Now, instead of gbda, you run git bda.

 

There is a couple of typos.
Your last sentence should be:
"Now, instead of gbda, you run git bda"

 

Thanks! I've corrected it.

 

I would add git status as a really handy command because it tends to teach you about other commands. For example, it will tell you the git rebase abort command when you are part way through a rebase

 

Added!

git status is a pretty beginner level command so I didn't include it at first but you're right that it does much more than we think and it's possibly one of the best learning tools in really internalizing git.

 

Modern git has replacements for the awkwardness of git checkout doing two different jobs:

  • git switch to switch branches
  • git restore to restore a file to its original state
 

git rerere to make rebase workflow a little less painful?

 

Though git rerere could do with a bit more user facing documentation ;-)

 

Hey Gabriel, nice article!

I would add another command in order to have a less destructive force push: git push --force-with-lease (or ggfl when using oh my zsh).
This works like a git push --force, but will not override the upstream branch if there's some modification. I usually default to this command when pushing my rebased branches.

Here's an article with a good explanation about that: blog.developer.atlassian.com/force...

 

A few things. First, do not use git branch for scripting: it is a user-facing command whose output format can change and is subject to config. git for-each-ref is explicitly meant for scripting.

Second, instead of git push --force, use the safer alternative git push --force-with-lease, which will overwrite only your own changes.

Third, git restore might be easier to use than git reset.

 

Nice article, but IMHO not deep enough. I used git command line for a long time, but then moved to visual tools.
Currently I use an awesome visual client: Fork. It really simplifies my work with git.
In the past I used GitKraken which is similar to Fork, but has more features.

 

Fork looks cool, will try that out. Currently using Gitkraken and built in JetBrains tools but I like the way this looks. I used to use command line too but when you have 20 plus repos it's so much easier to treat these like projects inside a GUI. Thanks for mentioning this. Edit: Dang no Linux client... I need all platforms.

 

For Windows there is Atlassian SourceTree.
BTW many IDEs have git plugins or capabilities to ease your work.

 

I agree, thanks for mentioning it. I used Atlassian SourceTree before, but IMHO it is not as intuitive as Fork or GitKraken.
Regarding IDEs, I sometimes use the integration in WebStorm and IntelliJ IDEA, but still prefer the tools I mentioned.

 

In the CI/CD world, I think git worktree is the most instrumental.

epsi-rns.gitlab.io/devops/2020/02/...

Comparing HEAD for each branches

 

Hi! thanks really useful!

My two cents: in .gitattributes, instead of using
package-lock.json binary
You can use
package-lock.json -diff

It will have the same result I guess, preventing git from showing the differences ;)

(maybe -diff is more correct given the nature of the file)

 

The git diff with the two dots end points A..B should be deprecated as it doesn't actually follow the normal revision range method of being the range of revisions between ^A B.

Rather its just git diff A B, not git diff $(git merge-base A B) B which the two dot notation would normally imply

 

Thanks. I was hoping for more detail on commit messages. Google has a standard that is multi-line, and I've been wondering how (or if there's a convenient CLI tool) to do that on the command line using a template.

 

@kevin - you might be interested in Conventional Commits coupled with tools like Git Hooks, Commitizen or Husky.

I wrote about it here stephencharlesweiss.com/blog/2020-...

 

Sometimes if I want to move some work from one location to another without creating a remote branch I diff into a file and then apply the file on the other end,

git diff > /var/tmp/foo.diff
rsync -av /var/tmp/foo.diff remote:/var/tmp
ssh remote
git apply /var/tmp/foo.diff

 

For a visual representation of things, and sometimes easier than having to remember commands, use something like GitKraken. I love that thing, but I also do use the command line ... sometimes. Could make a difference whether you develop for work or as a hobby.

 

You mentioned that you can see the diff's of all --staged changes and see the diff's of all unstaged changes, but you didn't mention how to show the diff's of all changes?

 

Great summary, I use almost everything of it every day.

 
 

Thanks for this, really useful, I've used git for years - but always a "little nervously" lol. This will help me think about clearing up some of the clutter.

 

Yeah getting past the "nervous" stage takes a lot of time, I know exactly what you mean.

It will make you so much more confident in what you can do with a shared codebase and really changes the way you think about your code though.

 

git add --patch is one of my favourites. It adds files by showing exactly what changed. In OMZ, you can use gapa, which does the same thing.

 
 

tig is a very handy tool on top of git. I like to use tig for viewing all branch history instead of git log. And I use zsh, which gives me all the git command aliases built in.

 

I'll have to check out tig, thanks for the recommendation.

 

Nicely done! I tend to use WebStorms git tools a lot so it's good to refresh my memory on native commands.

 

Great. Thanks for sharing !!

 
 

Really cool! Thank You Gabriel!!