I build a lot of public experimental demo repos to support my tutorials, and I often encourage people to fork them and deploy them to Netlify to try things out. The problem is, these experimental repos can often come with a terribly messy git history — and I don't want people to inherit that. And so, before I publish and promote a demo project, I like to clean up the git history ✨.
First, the TL;DR: use these four commands — with caution (more on this later!) — and enjoy a tidy ship.
# 1. Reset the repo to the initial commit, preserving the current state of the files on your local machine
git reset --soft {INITIAL_COMMIT_HASH}
# 2. Stage all files
git add .
# 3. Make a commit
git commit -m "Initial commit"
# 4. Force push to the origin
git push --force
Let's look in detail at what each command does.
Soft reset to the initial commit
Each time you commit a change in git, a unique identifier called a commit hash is generated from various bits of data associated with the commit, such as the commit message, file changes, commit author and date. Commit hashes point to the state of the repository when the commit was made, and allow us to move back and forward in time in the repository.
If your repository is hosted on GitHub, you can view a list of your commit hashes from the home page of your repository, by clicking on the commits link at the top right of the file list.
This will take you to a list of commits, showing a shortened commit hash to the right of the commit message. Scroll down to find the first commit, click to view the commit and full commit hash, or use the copy button to copy the full commit hash to your clipboard, which might look something like this: 64d52970e9b73551e7d837ec367610p.
There are ways to get the first commit hash using a terminal command, but the method may vary depending on the number of root commits in the history — which is beyond the scope of this blog post.
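For the common case of a single root commit, one terminal option is git rev-list with the --max-parents=0 filter, which lists commits that have no parents. The sketch below runs it in a throwaway repository (the directory, file name and commit messages are made up for the demo); in your own project only the last command is needed, and more than one line of output means the history has several roots to choose from.

```shell
# Throwaway repository so nothing real is touched (path comes from mktemp).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Three commits, so the history has one root and some descendants.
for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git add notes.txt
  git commit -q -m "commit $n"
done

# Root commits are the commits with no parents.
# One line of output means one root; several lines mean several roots.
root_commits="$(git rev-list --max-parents=0 HEAD)"
echo "$root_commits"
```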
When you've got your initial commit hash, run the following commands in your terminal. The reset command instructs git to move the current branch back to a particular commit, in this case the initial commit, and the --soft flag leaves all files, and everything already staged, in their most recent up-to-date state.
# reset the repo to the initial commit
git reset --soft {INITIAL_COMMIT_HASH}
# check the status of your local repository
git status
You should now see a message stating your branch is behind the remote branch (for example origin/main) by however many commits you made after the first commit, and that there are some changes to be committed.
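If you'd like to see what a soft reset does before trying it on a real project, here is a small sketch in a throwaway repository. The file names and contents are invented for the demo, and because there is no remote here, git status won't print the "behind" message, only the changes to be committed.

```shell
# Throwaway repository (path comes from mktemp).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# The commit we want to keep as the starting point.
echo "first draft" > README.md
git add README.md
git commit -q -m "Initial commit"
initial_commit="$(git rev-parse HEAD)"

# Some messy follow-up work.
echo "messy experiment" >> README.md
echo "console.log('hi')" > index.js
git add .
git commit -q -m "wip"

# Move the branch back to the first commit; files and staging area are left alone.
git reset --soft "$initial_commit"

# HEAD is the initial commit again...
head_after_reset="$(git rev-parse HEAD)"
# ...but the later work is still on disk and shows up as staged changes.
staged_files="$(git diff --cached --name-only)"
echo "$staged_files"
```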
Stage all files
Next, use git add . to stage all files. Given that your .gitignore file will remain up-to-date and include any sensitive files you don't want to commit, you should be good to go. But at this point it's worth double-checking that running git status doesn't list any files you don't want to add to the repository.
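Two standard git commands can help with that double-check: git add --dry-run previews what would be staged without staging it, and git check-ignore reports whether a path is covered by an ignore rule. A minimal sketch in a throwaway repository, where .env and app.txt are hypothetical file names chosen for the demo:

```shell
# Throwaway repository (path comes from mktemp).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q

# Hypothetical files: one secret that should stay out, one ordinary file.
echo ".env" > .gitignore
echo "API_KEY=not-a-real-key" > .env
echo "hello" > app.txt

# Preview what `git add .` would stage, without staging anything.
would_add="$(git add --dry-run .)"
echo "$would_add"

# Ask git whether a specific path is matched by an ignore rule.
# It prints the path (and exits 0) when the path is ignored.
ignored_path="$(git check-ignore .env)"
echo "$ignored_path"
```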
Make a commit
Next it's time to overwrite that history with a nice, tidy commit message. Use Initial commit or whatever commit message you'd like.
git commit -m 'I am rewriting history'
Force push to the origin
It's time to push your new commit to rewrite the history using git push --force. Force pushing tells git to prioritise the changes in your local branch over the changes pushed to the remote. And this is exactly what we want to do in this case! Run the following command in your terminal, and enjoy the power it brings.
Full disclaimer that you should only force push in git when you absolutely know what you're doing, or if your project is tiny and you're the only developer working on it — like most of mine!
git push --force
And just like that, you've pushed a forced update to the repository.
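If plain --force feels too blunt, git also has a --force-with-lease flag, which only overwrites the remote branch if it still matches what your local clone last saw. This is not what the four commands above use; it's a slightly more cautious variant, sketched here end to end against a throwaway bare repository standing in for the remote (all paths, file names and messages are made up for the demo).

```shell
# Throwaway "remote" and clone, both under a mktemp directory.
work_dir="$(mktemp -d)"
git init -q --bare "$work_dir/origin.git"
git clone -q "$work_dir/origin.git" "$work_dir/local" 2>/dev/null
cd "$work_dir/local"
git config user.name "Demo User"
git config user.email "demo@example.com"

# Publish a messy three-commit history.
for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git add notes.txt
  git commit -q -m "wip $n"
done
git push -q -u origin HEAD

# Squash everything after the first commit into one new commit.
initial_commit="$(git rev-list --max-parents=0 HEAD)"
git reset --soft "$initial_commit"
git commit -q -m "Tidy history"

# Overwrite the remote branch, but only if it still matches what we last saw.
git push -q --force-with-lease

# Count the commits the remote still references.
remote_commit_count="$(git --git-dir="$work_dir/origin.git" rev-list --count --all)"
echo "$remote_commit_count"
```

The remote ends up with the first commit plus the new tidy one, which matches the two-commit result described in the next section.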
The result
Go back to GitHub and refresh the list of commits, and you'll see your shiny new commit in the commit list!
The keen-eyed perfectionists out there might notice that I now have two commits in the history rather than one single initial commit. However, for me and my purposes this is a quick and friendly way to clean up my experimental git history, and it gives us all a reminder that nothing in code is ever really perfect.
If you are interested in completely resetting a git repository to a single initial commit, the process involves a few more steps (which I think are a little less friendly). One way you can achieve this is by creating an orphan branch from main that contains all the current files, creating a single commit to that branch, and eventually replacing the original main branch with the new branch. Engineering Manager Candost Dagdeviren has a great blog post which lays out the steps.
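As a rough idea of what the orphan branch route looks like, here is a minimal sketch in a throwaway repository. It is one common sequence for this technique and not necessarily the exact steps in the linked post; the branch name fresh-start and the file names are placeholders. On a real project you would still need to force push main afterwards.

```shell
# Throwaway repository with a branch called main (path comes from mktemp).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

# A messy three-commit history on main.
for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git add notes.txt
  git commit -q -m "wip $n"
done

# Start a branch with no history; the current files stay in place.
git checkout -q --orphan fresh-start
git add -A
git commit -q -m "Initial commit"

# Replace the old main with the new single-commit branch.
git branch -q -D main
git branch -m main

commit_count="$(git rev-list --count HEAD)"
current_branch="$(git symbolic-ref --short HEAD)"
echo "$commit_count commit(s) on $current_branch"
```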
So go forth and have a little spring clean of your git history — but please remember to do it with caution and only when you really, really need to!
Top comments (25)
Souvenir souvenir, this was my first article on this website actually :) dev.to/jmfayard/pull-requests-a-si...
git reset --soft is still my go-to solution, there is so much less opportunity to shoot yourself in the foot than git rebase --shit (not an actual git flag).
In practice, I almost always do that against origin/main to clean up the pull request.
In this case, the complete workflow would be this:
That's it, the PR contains only one commit my commit message and the workflow is pretty much idiot proof.
Oh, nice!
I’m getting a lot of people all over the place say “why not rebase?” 😅 and the fact of the matter is this method is less scary and much more suited to my usual use cases. As you say, you’re much less likely to shoot yourself in the foot, meaning you won’t end up in a weird state of “in an interactive rebase and not know how to get out if you make a mistake”.
It's a difference in mindset.
Some people like powerful tools that can do everything. If something goes wrong they will read the reference docs and the best practices and make sure they get it right the next time.
For me the most important feature is simplicity, which for practical purposes I would define as focusing on choosing good enough solutions that minimize the probability that things could go wrong.
The first set will always prefer git rebase and the second git reset --soft.
Absolutely! Well said!
This seems excessively complicated; you can just do git rebase --interactive <last good commit> and git will give you a nice description of all the things you can do, like rewording a commit message, merging two commits together, etc. You can even move commits around to put a commit fixing a bug right after the one that introduced it and have them be merged.
It is four commands, vs your single command. Those four commands, though, leave very little margin for error, whereas an interactive rebase is a minefield. There are ways to go wrong when re-ordering commits that can result in cascading merge conflicts, confusion, and frustration.
This is a nice shorthand for squashing everything into a single commit if you're not quite ready to learn interactive rebase.
Much appreciated Vincent!
Interactive rebase literally tells you what to do and if all you do is squash commits, there is nothing to shoot yourself in the foot with. This is a complete misrepresentation.
Put even more simply, rebase is 1 command that holds your hand the whole time, it doesn't get more fool-proof than that.
I don't know what tools you use to interact with rebase. Perhaps they're more beginner friendly than my first experience, where an editor containing a script that would be run by git appeared. The hand-holding was minimal.
As for minefield, create a series of commits where each commit depends on changes introduced in the one prior.
Now try moving the last commit first. As git replays each of the commits, it will raise a merge conflict, as the diff will not apply cleanly to the code. The interactive rebase drops you to a command shell where you must merge and continue. This will happen, though, on every subsequent commit, dropping you back to shell every time.
This is not a misrepresentation, it is my first-hand experience learning the tools. You may be an expert, but you have zero authority over others' actual experiences.
/Edit
Excuse me, it looks like I missed your craftily placed qualification:
The fact that you have many other options when performing an interactive rebase proves that it is less simple than the 4 commands by OP. No matter how hand-holdy your tools are, the tools don't know you need to squash all commits. They won't tell you to squash all. Again, there is much more margin for error in an interactive rebase.
Yep. That's what I was talking about. It comes with instructions and all you need to do is make your changes, save and close. It doesn't get simpler than that.
This is EXACTLY my experience and why I wrote this article. Thank you for explaining it brilliantly!
I'm getting a lot of "why not use rebase?" 😅 Rebase is a tricky concept for beginners, and it's got me in a mess before in large organisations and when I was less experienced. I'm not saying this is the only way, it's just a way.
To me, this isn't excessively complicated because you're not likely to end up in a situation that you can't escape on the command line if you don't know what you're doing. Moving commits round is a scary concept for people just starting out. Even me now, with many years of experience.
Git is really forgiving in this regard though; interactive rebase gives you an on-screen explanation of all the things you can do, and git status always includes hints about what you're doing and how you can cancel it (usually just the command that started it with --abort).
I agree with you.
I can get why @whitep4nth3r prefers a way she understands.
But for me, this is misleading for anyone who wants to understand the git concepts/features, and very strange for people experienced with git.
Does it work? Yes
Is it a way to achieve what git rebase interactive does? Yes, in a way
Is it simpler? Maybe
Will I use it? No
Would I ever suggest doing it like this? No
Would I discourage people from doing this? No, but I will encourage them to use git rebase interactive.
Give a man a fish... you know
But yes, I remember git rebase interactive was something I had to learn before mastering it
A very uncommon way to do an interactive rebase and fixup.
I've had some bad professional experiences on teams who rebased everything which ended in some wild consequences. I'm scared to go back to that place 😅
Elaboration? The basic rule is that branches have a single owner and don't force push main.
This is my preferred way to do it with my personal projects. Rebase is a more involved concept and this article is meant to show git in a friendly light for personal projects where you might want to clean up your commits.
“The basic rule” you speak of isn’t always followed … which is why people can end up in a mess with rebase. Anyway this is really just about my personal experience and some people requested that I share what I do after seeing me do it on stream.
I think you've got many fans here 😅 in the comments section.
Thanks for sharing this, after reading all the comments I decided to try both ways.
Interesting. Thanks for the post
git rebase -i
git push --force-with-lease
I am relatively new to Git and as per my knowledge, anything you do with git is still undoable. I think the git reset is also tracked in the git history.
I did not understand when you say "the --soft flag leaves all files in their most recent up-to-date state"
Did you mean that the working directory is not reset?
By the way, as I was learning Git's architecture, I also wrote an article about workings of git like: the working directory, The commit hashes, ...
understand Git
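On the "still undoable" point: git records movements of HEAD in the local reflog, so a soft reset can usually be walked back on the machine where it was run. A minimal sketch in a throwaway repository (file names and messages are made up); note that the reflog is local only and its entries eventually expire, so it is not a backup of what was on the remote before a force push.

```shell
# Throwaway repository (path comes from mktemp).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git add notes.txt
  git commit -q -m "commit $n"
done
tip_before_reset="$(git rev-parse HEAD)"

# Rewind the branch to the first commit, as in the article.
initial_commit="$(git rev-list --max-parents=0 HEAD)"
git reset --soft "$initial_commit"

# The reflog shows the reset and the commit HEAD pointed at before it.
git --no-pager reflog -n 3

# HEAD@{1} is where HEAD pointed immediately before the reset, so this undoes it.
git reset --soft "HEAD@{1}"
restored_tip="$(git rev-parse HEAD)"
echo "$restored_tip"
```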
Using --soft ensures the working directory is not reset to the state it was in during your initial commit. So your local files are in the same state as the most recent commit. It's like you're winding back the commit history in terms of the hash, but keeping the latest file changes.
There's a git reflog so the true history is still in your repo, you're just hiding it.
If I understand correctly, your goal is to have a single commit that contains all the work you've done while making other commits along the way.
Can't you just remove the .git folder and git init to create a new clean repo?
If (for whatever reason) you want your new local repo to override the previous origin repo you can just preserve the config file from your .git folder and reuse it after git init so you don't need to manually set origin.
Then the repo will be truly clean without the whole bloat of history available in reflog.
Good one 👍