Git is a great tool for managing code changes over time and allowing code to be integrated by teams of developers. But even the most helpful tools can become problematic when misused, and we need to understand these drawbacks in order to avoid them. The most common problems with Git are caused by rewriting history and undoing changes that have already been made.
Let's look at the parts of Git that we should avoid to keep our lives as developers simple.
Squash
The standard process for using Git is to clone a repository, make changes locally, then integrate them with the remote server. When a developer has committed multiple changes to a repository and pushes the code to the remote server, multiple commits will show in the version history. This is beneficial because each change can be tracked, but it is sometimes seen as confusing or hard to read.
Sometimes developers squash their work, for example with git merge --squash or an interactive rebase, to combine the multiple commits into a single commit so that the whole change is in one commit. However, this should be avoided for a few reasons.
Firstly, it removes intermediary states, which means that when something goes wrong we are not able to access the commits between the start and the end of the change.
Secondly, it rewrites the history of the changes and we lose track of when the commits were made. This results in less information being available to help identify when changes from different branches were made and when conflicts arose.
Thirdly, it increases the size of changes, which in turn increases risk. It is far easier to deal with frequent small changes as opposed to infrequent large changes.
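To make these reasons concrete, here is a minimal sketch of a squash in a throwaway repository. It assumes Git is installed; the file name app.txt, the branch name feature, and the commit messages are made up for illustration, and git merge --squash is only one of several ways to squash.

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository, so the sketch never touches real config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any Git version

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Three small commits on a hypothetical feature branch.
git checkout -q -b feature
for step in 1 2 3; do
    echo "step $step" >> app.txt
    git commit -q -am "Feature step $step"
done
feature_commits=$(git rev-list --count main..feature)

# Squash: stage the combined result on main and record it as ONE new commit.
git checkout -q main
git merge -q --squash feature
git commit -q -m "Add feature (squashed)"
main_commits=$(git rev-list --count main)

# Are the individual feature commits part of main's history?
if git merge-base --is-ancestor feature main; then
    in_main=yes
else
    in_main=no
fi

echo "commits on feature: $feature_commits"
echo "commits on main after squash: $main_commits"
echo "feature commits reachable from main: $in_main"
git log --oneline main
```

The three "Feature step" commits still exist on the feature branch, but main only contains the single squashed commit. Once the feature branch is deleted, the intermediate states are no longer reachable from any branch.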
Instead of using squash it is better to keep the commit history unmodified and present each commit as it was when made. This lets us keep the history of changes, review all of the commits made within a change, and manage the risk by making smaller changes.
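As a rough sketch of the alternative, the script below integrates the same kind of hypothetical feature branch with an ordinary merge, so every commit stays in the history. The names app.txt and feature are illustrative, and --no-ff is my own choice to make the merge point visible; a plain fast-forward merge would preserve the commits as well.

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository, so the sketch never touches real config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any Git version

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Three small commits on a hypothetical feature branch.
git checkout -q -b feature
for step in 1 2 3; do
    echo "step $step" >> app.txt
    git commit -q -am "Feature step $step"
done

# Ordinary merge: the individual commits are kept, plus one merge commit.
git checkout -q main
git merge -q --no-ff -m "Merge feature branch" feature
main_commits=$(git rev-list --count main)
kept_commits=$(git rev-list --count --no-merges main)

# The feature commits are now part of main's own history.
if git merge-base --is-ancestor feature main; then
    in_main=yes
else
    in_main=no
fi

echo "commits on main (including the merge): $main_commits"
echo "non-merge commits on main: $kept_commits"
echo "feature commits reachable from main: $in_main"
git log --oneline --graph main
```

Here each "Feature step" commit remains individually visible on main, with its original timestamp, so any intermediate state can still be checked out or reviewed later.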
Force Push
Force push is a way to overwrite changes on the remote server with our version of the code. But doing this can cause issues for other team members and cause us to lose code changes.
When we force push our changes to the remote server, other developers working on the same code base may end up working off old code. If they pull the latest copy and we subsequently force push over the top of it, their local copies will be out of sync with the remote. The problem is compounded if they respond by force pushing their own changes, causing the same issue for us.
Forcing the overwrite of remote code can also discard changes that others pushed since our last pull. This can lead to old bugs coming back, failed builds, and the loss of work.
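The scenario above can be reproduced locally. The sketch below is only an illustration under a few assumptions: a bare repository named remote.git stands in for the remote server, and the working copies alice and bob, the file app.txt, and the commit messages are all hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway identity and working area, so the sketch never touches real config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
cd "$work"

# A local bare repository plays the role of the remote server.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Alice creates the project and pushes the first commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q origin main
cd ..

# Bob clones, commits a fix, and pushes it.
git clone -q remote.git bob
cd bob
echo "bob fix" >> app.txt
git commit -q -am "Bob's fix"
git push -q origin main
bob_commit=$(git rev-parse HEAD)
cd ..

# Alice has not pulled Bob's fix, so her new commit diverges from the remote.
cd alice
echo "alice change" >> app.txt
git commit -q -am "Alice's change"
if git push -q origin main 2>/dev/null; then
    echo "normal push unexpectedly succeeded"
else
    echo "normal push rejected: the remote has a commit Alice does not have"
fi

# Forcing the push replaces the remote branch with Alice's version.
git push -q --force origin main
cd ..

# Is Bob's commit still part of the remote branch?
if git --git-dir=remote.git merge-base --is-ancestor "$bob_commit" main; then
    bob_status=kept
else
    bob_status=lost
fi
echo "Bob's fix on the remote main branch: $bob_status"
git --git-dir=remote.git log --oneline main
```

The ordinary push is refused precisely because it would discard Bob's commit. Pulling and integrating his work first, then pushing normally, keeps both changes.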
Summary
Git is a great tool, but when misused it can cause a lot of issues. As part of using Git we should avoid rewriting history or undoing changes with functionality such as squash and force push. Doing so will help us keep our work simple and efficient.
Follow me here for more content or contact me on:
- Twitter: @BenTorvo
- Email: ben@torvo.com.au
- Website: torvo.com.au
Top comments (4)
I don't fully agree here. Squashing on the main branch is perhaps a no-no, but often you open a pull request for a feature and then need to keep making changes to the code (tests, further adaptations, etc.). I would much prefer to squash these commits and bundle them as one feature when merging to the main branch, rather than keep multiple commits that sometimes drift from the main feature.
This advice makes sense for absolute beginners. When you really don't know what you are doing, it is better not to use “destructive tools” at all. I have already seen a team rebasing and rewriting their master once a month 😱
However, when properly used, squash, interactive rebase, force push, etc. really are part of a developer's day-to-day toolbox, typically to avoid stacking "Address PR" or "Fix typo" commits at the top of your branch.
"Firstly, it removes intermediary states, which means that when something goes wrong we are not able to access the commits between the start and the end of the change."
Can you explain what you mean by "when something goes wrong"?
"Secondly, it rewrites the history of the changes and we lose track of when the commits were made. This results in less information being available to help identify when changes from different branches were made and when conflicts arose."
If you use squash on a main branch I agree, but otherwise your assumption is not correct.
"When we force push our changes to the remote server, other developers working on the same code base may end up working off old code"
This is exactly the situation I described above: never use squash or push --force on a main branch or, more generally, on a branch where more than one person is working. Otherwise it's absolutely fine and very helpful.
I disagree with you on the topic of squashes. For years we have squashed every feature branch when merging into the master branch and have never had a problem with it.
It leads to a much cleaner history. I do not need commits like 'fix typo' or 'revert previous change because it did not work'.
It also gives us the opportunity to consolidate the commit messages.
When you work in conjunction with a remote repository host like GitHub, you also do not lose intermediate states, because you can always check out the pull requests to see what has been done.
I also do not understand your argument regarding "It is far easier to deal with frequent small changes as opposed to infrequent large changes".
The overall size of a change is the same whether you squash or not, and if you want to review a pull request, you have to review it as a whole.
Maybe it helps to have smaller features overall.