As an experienced engineer, I struggle with the commit cadence and find it is a personal development style - others might commit more or less frequently than I do.
As mentioned, break up your features or tasks into branches. Commit often on your local/remote branch at crucial save points or to mark work in progress, with commit messages that are short but to the point.
When you are ready for a pull request and to merge to master, git rebase is your friend. This command allows you to squash your mini commits into one nicely worded and descriptive commit about the changes you have made.
A fellow engineer at Spreedly has a great blog post on commit messages
So true! Sometimes it helps me to git commit -m "wip" after every tiny change and then git rebase -i [branch] once I'm done — this way you can undo every single change in case you fuck up somewhere along the way and after you're all done you can squash the witty wips into one coherent, well-explained message (and nobody will ever know that you broke everything three times in the process)
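For anyone who wants to try the wip-then-squash flow safely, here is a minimal sketch in a throwaway repository. The branch name, file name, identity and final commit message are made up for illustration, and GIT_SEQUENCE_EDITOR with sed only stands in for editing the rebase todo list by hand, which is what you would normally do.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Example Dev"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)    # main or master, depending on config

# Three tiny "wip" commits on a feature branch
git checkout -q -b feature/new_login
for step in 1 2 3; do
  echo "step $step" >> login.txt
  git add login.txt
  git commit -q -m "wip"
done

# Keep the first "pick" and turn the rest into "fixup", which is
# the edit you would otherwise make by hand in the todo list
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/fixup/'" git rebase -q -i "$base"

# Replace the leftover "wip" subject with a descriptive message
git commit -q --amend -m "Add login form with validation"
git log --oneline "$base"..HEAD
```

Using fixup instead of squash avoids the message editor opening in the middle of the rebase; the amend at the end is where the one coherent message gets written.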
What I often find myself doing is making multiple logical changes at once and then interactively adding them to the commit with git add -i in chunks to separate out what I've added. It's easier for me as I don't always have a clear vision in my head when I set out as to what will need to be changed exactly. But when I go through the chunks it helps me to visualize the edits I've made and to logically group them and see where I might have missed changing something and allows me to commit them also in an order that makes sense.
When you do Test Driven Development, a completed test is a very logical point in time to do a commit. Also, it's kind of a breaking point because you will then step to another piece.
With this approach, you get very granular commits (which I personally love)
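To make the "green test, then commit" idea concrete, here is a small sketch. The two scripts, add.sh and test_add.sh, are hypothetical stand-ins for real code and a real test runner; the only point is that the commit happens right after the test passes.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Example Dev"

# The code under test (made up for this example)
cat > add.sh <<'EOF'
#!/bin/sh
# Prints the sum of two integers
echo $(( $1 + $2 ))
EOF

# The test: passes only if add.sh prints the expected sum
cat > test_add.sh <<'EOF'
#!/bin/sh
[ "$(sh add.sh 2 3)" = "5" ]
EOF

# A passing test is the commit point
if sh test_add.sh; then
  git add add.sh test_add.sh
  git commit -q -m "Add add.sh with a passing sum test"
fi
git log --oneline
```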
Absolutely. After I read the article, seeing that the author had trouble in the past finding a good point to commit, I searched for "test driven" just to see if anyone else had the same intuition.
The hard part about using TDD to drive commits is that writing good tests is a very different skill from writing good code, and that you'll (initially) sink a lot of time into writing, maintaining and debugging tests instead of staying in the flow and just writing a test, see it fail, write some code, until you find that the tests you have are sufficient. I think I had to do TDD every day for almost a year before I was able to find good commit points while writing tests.
Great post. I, too, have found myself in this position. I think what @jonbloomer and @johannesEbner commented are also very helpful. I think the most important point is, you must plan. It doesn't mean the plan can't change--but you must have a logical series of steps you want to take to accomplish anything useful. Once this plan is in place, committing becomes easy.
Thanks for sharing.
I was having this problem at work too, until I decided to add sub-tasks to my assigned tasks.
So for each sub-task that should be one step into solving the overall bug or implementing the feature, I create a commit.
For instance, if I have Task "x", I would divide it into Subtask 1, Subtask 2, and so on, and commit each working subtask. When I'm done with the whole task I push the changes.
Totally agree with having a plan of attack! With more complex changes, I do that as part of pre-writing as described here:
Commits absolutely do NOT need to be some coherent chunks, they should be whatever state you feel you want to save. Work in feature branches, make as many or as few commits as you think is suitable, and then when merging use a good merge message describing the feature or changes.
Often if you name your branch feature/new_login and merge that, the common auto generated message that is something like Merged feature/new_login is descriptive enough.
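A small sketch of what that looks like, assuming the merge is done with --no-ff so a merge commit is always created. The file contents and identity are invented; the branch name is the one from the example above.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Example Dev"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)    # main or master, depending on config

# Save points on the feature branch, with throwaway messages
git checkout -q -b feature/new_login
echo "login form" > login.txt
git add login.txt
git commit -q -m "wip"
echo "validation" >> login.txt
git commit -q -am "more wip"

# Merge with the auto-generated message; --no-edit skips the editor
git checkout -q "$base"
git merge -q --no-ff --no-edit feature/new_login
git log -1 --format=%s
```

If the branch name alone is not descriptive enough, passing -m "..." to git merge is one way to supply a fuller message instead.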
If your workplace has some strict policies about commits having to be coherent, or planned, you should complain or work somewhere else. The policies and tools should be made so your life becomes easier, not harder, and if you need to plan when you can commit you're making your life harder.
What you SHOULD strive to do is keep your clone in a state where you can always push and pull, and you don't need to be afraid of breaking something, or something breaking on your end.
I often commit when I've made a significant amount of progress or when I finish the feature or bug fix, and when switching tasks, not much more often than that. When I realize I've got multiple unrelated changes I'll commit them separately. I'm not scared of committing something that's broken or not complete, as I never work on a branch that's supposed to immediately go live or that someone else depends on.
Instead of planning your commits, you SHOULD pay attention to what you commit though. Don't do blind commits of the git add . && git commit -m "stuff" -style that way too many people do. Make sure you actually know what changes you are committing and that you didn't leave debug code or similar things around, unless you need it when you continue working.
Use a GUI tool to diff the changes (your IDE should help here). Make sure you use a similar visual diff to re-check all your changes when merging your branch (or pull request). Beyond that there really isn't a need to "plan" things out.
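If no GUI is at hand, the same check can be roughed out on the command line with git diff --cached, which shows exactly what is staged. This is only a sketch: app.js and the console.log pattern are assumptions for the example, and a real check would look for whatever debug leftovers your own code tends to have.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Example Dev"

printf 'function login() {\n  return true;\n}\n' > app.js
git add app.js
git commit -q -m "Add login stub"

# An edit that accidentally leaves debug output behind
printf 'function login() {\n  console.log("here");\n  return true;\n}\n' > app.js
git add app.js

# Look at the staged changes instead of committing blindly
if git diff --cached -U0 | grep -q -E '^\+.*console\.log'; then
  status="blocked"
  git reset -q -- app.js      # unstage so the leftover can be cleaned up
else
  status="committed"
  git commit -q -m "Update login"
fi
echo "$status"
```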
Some of what you say makes it sound like your workplace doesn't do code reviews. Reviewing changes submitted as incoherent batches of commits is a pain, to put it nicely. Especially if they're large commits. Especially if they're so incoherent that there are a bunch of changes mixed in that are completely unrelated to the change I'm reviewing.
No, code reviews are really simple. Do a PR on GitHub, or BitBucket, or whatever you use and you get a diff of ALL the changes in a branch. If you don't use those tools, you can always do a merge locally and view the full set of changes. I've been doing this for quite a few years and there is no problem in viewing all the changes in a feature branch.
Just learn to use the tools you have instead of making everyone spend extra effort.
The tools can't exclude a bunch of irrelevant changes if they're submitted with the code to be reviewed. Instead of the submitter doing a little extra work they're forcing the reviewer to.
The tools can't explain why changes were made when a bunch of commits have a single comment that doesn't explain anything, like "feature completed". Instead of the submitter doing a little extra work they're forcing the reviewer to.
If everyone learned to use the tools well there wouldn't be these problems, obviously, but it's clearly not as black and white as "Commits absolutely do NOT need to be some coherent chunks".
So don't commit irrelevant changes to your feature branch; this falls under the "blind commits" category of problem.
The best way to explain why certain changes were made is not in the commit messages as you should not review separate commits but the end result of the work in the feature branch.
What I do if some change is unclear is usually one of the following:
Oh and if you don't know which of the changes you made are relevant to your feature branch anymore you don't commit often enough.
There's more people in this world, and in your company, than you, dude. Just because you feel like it's too much of a burden to write a good commit message does not mean the tool is getting in your way. The entire point of git is to have a history, to be able to look at it, and revert certain parts of it. Step one of doing all of that is having a coherent history.
I've been programming using DVCS systems for a living and in teams for quite some time, and know exactly what level of effort you need to put into your DVCS usage to make it easy for other members of the team.
I didn't say your commit messages should be "a" or "stuff", but there is literally no need to be super careful with your single commits, as long as you work on feature branches.
If you're not using feature branches (you should) you must at the very least write a very good commit message, but really you should be using feature branches. When you create your feature branches the name should be clear enough that you don't need to read the commit messages to understand what it's about. Any additional information reviewers might need can be attached to the PR and to the task/issue related to the changes.
If you directly make a single commit to master or similar for your whole feature, you're doing it wrong.
This seems utterly impractical and error-prone to me.
A better approach would be, imho, to just commit as often as you can.
Ideally, after every micro-iteration (at each stage when something is working).
Why is this approach better?
No worries: you committed.
Better time-travel possibilities.
At the end of the day you can squash all of your micro-commits into one big juicy commit that includes every change made to implement a function.
It's nothing hard: it's just git rebase.
Totally agree, merging is to be planned, committing is to be done frequently. Anyone who has ever lost work from hardware failure knows this. At any point when you have work you would not want to lose, commit.
Exactly, early and often, to a user/feature branch, then rebase/squash them as needed before merging to a 'real' branch.
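Besides an interactive rebase, one other way to get the same result is git merge --squash, which nobody above named but which fits the "squash before it reaches a real branch" idea. A minimal sketch, with a made-up branch, file and commit message:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Example Dev"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)    # main or master, depending on config

# Early and often, on a user/feature branch
git checkout -q -b user/settings-page
for step in 1 2 3; do
  echo "settings step $step" >> settings.txt
  git add settings.txt
  git commit -q -m "wip $step"
done

# --squash stages the combined changes without creating a commit
git checkout -q "$base"
git merge -q --squash user/settings-page > /dev/null
git commit -q -m "Add settings page"
git log --oneline
```

The micro-commits stay on the user branch for time travel, while the real branch only receives the single descriptive commit.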
Yeah, I remember a time when I never used to remember to commit at the right places, and I'd end up with a pile of work I had to laboriously sort through. I ended up writing a little bespoke script that helped me get into the habit of committing in the right places. Perhaps it'll help someone else. The crucial idea is to make the changelog a first-class file that I can see and interact with in my text editor alongside my code. As I bounce around my text editor, the changelog keeps popping up in front of me, making sure it's never too far from my thoughts, and encouraging me to write in it. Every time I save the changelog, editor automation records my latest update to the changelog as a new commit message. The changelog now becomes conversational, a record of my thought process as I work through a problem.
Over time I stopped needing these training wheels, but they were very helpful for a couple of years.
If you break up your work into discrete tasks (or user stories) it makes committing code a much more logical procedure. Plan a task, create a new branch for the task, commit logical chunks (of your own choosing) until the task is complete, and then create a PR to be reviewed before merging the branch into master, or, if you're not on a team, just merge it yourself.
If you are new to planning your own work, using something like Trello will help you to break up your work. The smaller the chunk of functionality the better from my experience because it keeps you focussed on documenting your commits and gives you a continuous sense of achievement.
I use stacked git to keep a series of patches going, and then I can separate the idea of committing. I can work on any patch in the series indefinitely, shuffle them around, and when a patch or series of patches becomes ready, cherry pick it into master and push it. It also makes it really easy to keep patches split into small logical pieces, (or to split them, if you didn't in the first place). And it makes it easy to keep things separate in the first place, so when you think "oh, I should have factored out this first...", you pop off your current patch, create a new one to factor it out, then push your patch back on and continue. It's difficult to explain (people often ask "how is this different than plain git") but I find it a nice way to work (nicer than plain git.)
If I want to implement a feature I usually create a new branch and make commits whenever significant changes are made to files that I can group in the same category. So a commit with a message like Add footer is a part of a branch like create-about-page, which is merged with master or dev upon completion.
How do you guys handle commits with changes unrelated to the branch you're working on?
Like Andrew, I often break a large commit into a series of smaller commits by interactively selecting chunks that form logical groups. But usually I'll commit often enough that there's no need to break it up.
I also take notes that will go into the commit if I know some part of a change will need explanation (e.g., why the new code is preferable to the old). But most of the time looking over the chunks while I commit is enough to recall the reason I made the change.
I find it helps me and my teammates to commit frequently because that's a hell of a lot easier to review than one massive commit with a message that provides no useful information.
More frequent commits also mean it's easier to fix the build if you break it.
I find it much easier to make logical meaningful commits if you use a tool like SourceTree or some kind of UI. It's just easier with a tool that lets you select lines of a file to be part of a commit. So after you've done a day or two worth of work, you can spend time adding files / lines to be part of a logical commit, continue doing that on a branch until everything is committed.
I was able to improve this gut feeling about when to commit a lot. Basically I forced this technique on myself and, as many others have already stated, created a commit for each subtask (if the project was well managed and had tasks and subtasks, of course). Although all of this comes more naturally to me now, I still often end up mixing two or three commits, but I think saying "aw, crap" and throwing all the changes into one commit is the wrong way to go then. What I do instead is take the time, extract the single logical units from my pile of changes (if you have a git client that fits your way of working that should be doable - not naming any because taste is different) and commit them one by one. This can be tedious, but it forces you to remember to commit early next time - there are things you have to learn the hard way.
For a long time I used TFS rather than a git-based repo. It got me into the habit of withholding all my commits until I was ready to check in and commit to an integrated build. That could usually mean days or a week of work, all sitting on my local machine (I tended to check-point it with "shelvesets").
Since I've moved to git, being able to check point locally has got me to cut right down on the amount of time between commits. Now, I try to make a list of things I've got to do, and when I check something off the list, it's a commit and the message is the item I've finished. Usually an hour or two of work at most :)
I haven't done really well at working on one thing at a time, so by the time I'm ready to commit my work tree tends to be pretty cluttered with changes.
So I always use -p to selectively stage stuff, making multiple commits after everything is done. Even if the same code is touched by multiple unrelated changes (e.g. refactoring or code style), I try to edit the diff to make sure each commit only does what it should. And on the reverse side, if an unpushed commit has an error, I'll rebase or amend before pushing.
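For the last point, fixing an unpushed commit that is not the most recent one, a sketch using git commit --fixup together with git rebase --autosquash; that is one way to do the rebase, not necessarily the one meant above. File names and messages are invented, and GIT_SEQUENCE_EDITOR=true simply accepts the todo list that autosquash has already arranged.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Example Dev"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse HEAD)

# Two unpushed commits
echo "parse v1" > parser.txt
git add parser.txt
git commit -q -m "Add parser"
echo "docs" > README.txt
git add README.txt
git commit -q -m "Add docs"

# An error turns up in the older commit: record the fix against it
echo "parse v1 (bug fixed)" > parser.txt
git add parser.txt
git commit -q --fixup=HEAD~1

# Fold the fixup into "Add parser" before anything is pushed
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"
git log --format=%s "$base"..HEAD
```

When the broken commit is the latest one, git commit --amend does the same job without a rebase.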
Being able to juggle multiple change sets locally and organizing stuff before making the final push is really one of the most powerful parts of git.
I've had the same struggles and tried the same solution. I even whipped up a Git extension to help the process: fractaledmind.com/projects/git-do/
It allows you to actually write the commit message before you do the work, then just say when you've completed that work.
Commit early, commit often. Like an online CTRL+S
I've been at this for over a decade and will readily admit that my most common commit message is WIP. :-)
try writing your commit message (in your head) before you write the code