Introduction
Git is an incredibly powerful version control system that tracks every change in your project. But what happens when you need to modify past commits? Maybe you accidentally committed a password, a giant log file, or need to clean up metadata like author names.
This is where git filter-branch comes in: a powerful but dangerous tool that lets you rewrite Git history. Unlike git reset, which only moves a branch pointer back over recent commits, filter-branch walks through every commit you select and can rewrite each one.
Why Rewrite Git History?
Before diving into git filter-branch, let’s understand why you might need it:
- Remove Sensitive Data – Accidentally committed passwords or API keys.
- Delete Large Files – Reduce repository size by purging big binaries.
- Change Commit Metadata – Fix incorrect author names or emails.
- Extract a Subdirectory – Split a repo into smaller ones.

git filter-branch vs. git reset
Many beginners confuse git reset with git filter-branch. Here's how they differ:
| Feature | git reset | git filter-branch |
|---|---|---|
| Scope | Only affects recent commits | Rewrites entire history | 
| Use Case | Undo local changes | Permanently modify past commits | 
| Impact on Hashes | Does not change old commit IDs | Changes the hash of every rewritten commit and all of its descendants | 
| Collaboration Impact | Safe if not pushed yet | Requires --force push | 
| Best For | Fixing last few commits | Deep cleanup (files, authors, etc.) | 
  
  
When to Use git reset
You haven't pushed yet and want to undo recent commits.
Example:
git reset --hard HEAD~3  # Moves the branch back 3 commits and discards uncommitted changes
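To make the "reset does not change old commit IDs" point concrete, here is a minimal sketch that you can run as a script. It works in a throwaway directory, and the identity, file name, and commit messages are all placeholders invented for the demo. It suggests that reset only moves the branch pointer: the discarded commit still exists as an object until Git garbage-collects it.

```shell
set -eu

# Scratch repository so nothing real is touched (location is arbitrary).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# Four small commits to play with.
for n in 1 2 3 4; do
  echo "change $n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

old_tip=$(git rev-parse HEAD)      # "commit 4"
first=$(git rev-parse HEAD~3)      # "commit 1"

git reset -q --hard HEAD~3         # branch now points at "commit 1"
new_tip=$(git rev-parse HEAD)

echo "tip before reset: $old_tip"
echo "tip after reset:  $new_tip"

# The discarded commit is unreachable from the branch but not yet deleted.
old_type=$(git cat-file -t "$old_tip")
echo "old tip is still a stored object of type: $old_type"
```

Until the reflog entry expires, `git reset --hard <old hash>` can bring those commits back, which is part of why reset is the gentler tool.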
  
  
When to Use git filter-branch
You need to modify old commits (even if pushed).
Example:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch passwords.txt' -- --all
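Here is a rough end-to-end sketch of that command in a scratch repository (the file names, identity, and commit messages are made up for illustration). It passes `--ignore-unmatch` to `git rm`, because otherwise the filter aborts on any commit that does not contain the file. It also shows that filter-branch keeps the original history under `refs/original/`, so the check below looks at branches only.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "add README"             # this commit has no passwords.txt

echo "hunter2" > passwords.txt
git add passwords.txt
git commit -q -m "oops: add passwords"

echo "more" >> README.md
git commit -q -am "update README"

# Count commits on branches that touch the file, before and after.
before=$(git log --oneline --branches -- passwords.txt | wc -l | tr -d ' ')

# The environment variable only silences filter-branch's deprecation notice.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch passwords.txt' \
  -- --all >/dev/null 2>&1

after=$(git log --oneline --branches -- passwords.txt | wc -l | tr -d ' ')
backup=$(git for-each-ref --format='%(refname)' refs/original/)

echo "commits touching passwords.txt: before=$before after=$after"
echo "original history is still referenced by: $backup"
```

Because of that backup ref, the secret is still stored in the repository at this point. It only disappears once the backup refs and reflogs are cleared and Git prunes the unreachable objects.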
  
  
Introducing git filter-repo - A Better Alternative
While git filter-branch works, it has several drawbacks:
- Very slow on large repositories
- Complex syntax
- Can leave behind "dangling" commits
- Officially discouraged in Git's own documentation
git filter-repo is a modern replacement that:
- Is often 10-100x faster, especially on large repositories
- Has simpler, more intuitive commands
- Handles edge cases better
- Automatically expires reflogs and runs garbage collection after a rewrite
Installing filter-repo
# For Python users:
pip install git-filter-repo
# For Homebrew (Mac/Linux):
brew install git-filter-repo
Step-by-Step Examples
Example 1: Remove a File from Entire History
Scenario: You committed secrets.txt a year ago and need to erase it.
Using filter-branch:
git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch secrets.txt' \
  --prune-empty --tag-name-filter cat -- --all
Using filter-repo (better):
git filter-repo --path secrets.txt --invert-paths
Explanation:
- --path specifies the file to target
- --invert-paths means "keep everything except these paths"
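If you want to try the filter-repo version safely first, a sketch along these lines may help. It builds a scratch repository (all names and contents are invented for the demo) and runs the rewrite only if `git-filter-repo` is found on your PATH. The extra `--force` is there only because filter-repo normally refuses to run on anything that does not look like a fresh clone, and a scratch repo made with `git init` does not.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "hello" > README.md
echo "api_key=12345" > secrets.txt        # fake secret for the demo
git add README.md secrets.txt
git commit -q -m "initial commit"
echo "more docs" >> README.md
git commit -q -am "update README"

if command -v git-filter-repo >/dev/null 2>&1; then
  # Keep everything except secrets.txt, across all branches and tags.
  git filter-repo --force --path secrets.txt --invert-paths >/dev/null 2>&1
  remaining=$(git log --all --oneline -- secrets.txt | wc -l | tr -d ' ')
  echo "commits still touching secrets.txt: $remaining"
else
  remaining="skipped"
  echo "git-filter-repo not found on PATH; nothing was rewritten"
fi
```

On a real project, the usual advice is to work in a fresh clone rather than reach for `--force`.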
Example 2: Change Author Email in Old Commits
Scenario: Your old commits show the wrong email (old@example.com).
Using filter-branch:
git filter-branch --commit-filter '
  if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ];
  then
    GIT_AUTHOR_NAME="Your Name";
    GIT_AUTHOR_EMAIL="new@example.com";
    git commit-tree "$@";
  else
    git commit-tree "$@";
  fi' HEAD
Using filter-repo (better):
Create mailmap.txt (the corrected identity comes first, followed by the identity recorded in the old commits):
New Name <new@example.com> Old Name <old@example.com>
Run:
git filter-repo --mailmap mailmap.txt
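The order of the two identities in a mailmap line is easy to get backwards, so it can be worth testing the file before rewriting anything. The sketch below uses `git check-mailmap`, which ships with Git, to ask how an old identity would be mapped. The names and addresses are the placeholders from this example, and the scratch directory is arbitrary.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q          # check-mailmap needs to run inside a repository

# Mailmap format: proper identity first, then the identity found in commits.
cat > mailmap.txt <<'EOF'
New Name <new@example.com> Old Name <old@example.com>
EOF

# Point Git at the file for this one command and look up the old identity.
mapped=$(git -c mailmap.file="$demo/mailmap.txt" \
  check-mailmap "Old Name <old@example.com>")

echo "Old Name <old@example.com> would become: $mapped"
```

If the output still shows the old identity, the line is probably reversed or the email does not match what is in your commits.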
Advanced Tips & Tricks
  
  
1. Use --replace-text to Modify File Contents
git filter-repo --replace-text replacements.txt
Where replacements.txt contains:
OLD_PASSWORD==>NEW_PASSWORD
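Before replacing text, it may help to see which commits actually introduced or removed the string. The sketch below does that with Git's pickaxe search (`git log -S`) in a scratch repository, then writes the replacements file. The file name `config.ini`, the identity, and the commit messages are hypothetical, and no history is rewritten here.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo 'password = "OLD_PASSWORD"' > config.ini
git add config.ini
git commit -q -m "add config"

echo 'password = "from-env"' > config.ini
git commit -q -am "stop hardcoding password"

# Pickaxe: list commits where the number of occurrences of the string changed.
hits=$(git log --all --oneline -S'OLD_PASSWORD' | wc -l | tr -d ' ')
echo "commits adding or removing OLD_PASSWORD: $hits"

# One rule per line; the text after ==> is what filter-repo substitutes.
cat > replacements.txt <<'EOF'
OLD_PASSWORD==>NEW_PASSWORD
EOF
```

Note that the string is gone from the latest version of the file but still present in history, which is exactly the case `--replace-text` is meant for.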
2. Analyze Before Making Changes
git filter-repo --analyze
Creates a .git/filter-repo/analysis directory with statistics
3. Clean Up After Filtering
git reflog expire --expire=now --all && git gc --prune=now --aggressive
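(Expires reflogs and prunes unreachable objects to reclaim space. git filter-repo already does this after a rewrite. After git filter-branch, the backup refs under refs/original/ must also be deleted before the old objects can be pruned.)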
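As a rough illustration of why cleanup matters after filter-branch, the sketch below removes a file from a scratch repository's history and checks whether its blob is still stored before and after cleanup. The file name `big.bin` and the identity are invented for the demo. The step that deletes `refs/original/` follows the recipe in the git filter-branch documentation.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "keep me" > README.md
git add README.md
git commit -q -m "add README"
echo "pretend this is huge" > big.bin
git add big.bin
git commit -q -m "add big.bin"
blob=$(git rev-parse HEAD:big.bin)        # object ID of the file's contents

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch big.bin' \
  --prune-empty -- --all >/dev/null 2>&1

# Backup refs and reflogs still keep the old commits, and the blob, alive.
still_there=no
git cat-file -e "$blob" 2>/dev/null && still_there=yes

# Drop the backup refs, expire reflogs, then prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ |
  xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then gone=no; else gone=yes; fi
echo "blob present before cleanup: $still_there, removed after cleanup: $gone"
```

This only cleans your local copy. Other clones and the remote keep the old objects until they are cleaned up as well.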
Dangers & Best Practices
⚠ 1. Always Back Up First!
git clone --mirror repo.git repo-backup
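A backup is only useful if it really contains everything, so here is a small sketch that makes a mirror clone of a scratch repository and compares every ref in the two copies. The directory names `project` and `project-backup.git`, the tag, and the identity are placeholders for the demo.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# A small source repository with one branch and one tag.
git init -q project
cd project
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "initial commit"
git tag v1.0
cd ..

# --mirror copies all refs (branches, tags, and more) into a bare repository.
git clone -q --mirror project project-backup.git

src_refs=$(git -C project for-each-ref --format='%(objectname) %(refname)')
bak_refs=$(git -C project-backup.git for-each-ref --format='%(objectname) %(refname)')

[ "$src_refs" = "$bak_refs" ] && echo "backup matches source refs"
```

Keep the backup somewhere the rewrite cannot touch it until you are sure the new history is what you wanted.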
⚠ 2. Warn Your Team
Rewriting history breaks everyone's local copies.
They'll need to:
git fetch --all && git reset --hard origin/main  # overwrites the local branch; unpushed work is lost
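Since `reset --hard` throws away local work, a teammate may want to look before leaping. The sketch below simulates the situation with a local bare "origin" and two clones. The names `alice` and `bob`, the branch name `main`, and the use of `commit --amend` as a stand-in for a real filter run are all assumptions for the demo. Bob checks how far his branch has diverged and keeps a backup branch before resetting.

```shell
set -eu

root=$(mktemp -d)
cd "$root"

# A bare "origin" whose default branch is main.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/main

# Alice creates the initial history and pushes it.
git clone -q origin.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"
git push -q origin main
cd ..

# Bob clones before the rewrite happens.
git clone -q origin.git bob
git -C bob config user.name "Bob"
git -C bob config user.email "bob@example.com"

# Alice rewrites the last commit and force-pushes.
cd alice
git commit -q --amend -m "second (rewritten)"
git push -q --force origin main
cd ..

# Bob fetches, inspects the divergence, backs up, then resets.
cd bob
git fetch -q --all
# Output is "<only on origin/main><TAB><only on local main>".
counts=$(git rev-list --left-right --count origin/main...main)
echo "commits only on origin/main, only on local main: $counts"
git branch backup-before-reset            # old history stays reachable here
git reset -q --hard origin/main
```

If the second number is larger than expected, some of those local commits may be real work that needs to be reapplied on top of the new history.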
Conclusion
When working with Git history, you have three main options:
- git reset - Best for undoing recent, local changes. Simple but limited to your current branch.
- git filter-branch - Rewrites entire commit history (including pushed changes). Powerful but slow and complex - use with caution.
- git filter-repo (Recommended) - Modern replacement for filter-branch. Faster, safer, and more efficient for permanent history changes.
Simple Rule:
- Recent mistakes? Use reset
- Need to permanently modify history? Use filter-repo
- Avoid filter-branch unless absolutely necessary
Remember to always back up your repository before making permanent changes, and warn your team before force-pushing rewritten history.
For most Git history cleanup tasks today, git filter-repo is the best choice - it combines power with better safety and performance.
Further Reading:
Up Next in the Series: git revert --no-commit – Revert multiple commits without auto-committing