What is nbwipers?
nbwipers is a CLI tool that strips outputs and metadata from Jupyter notebooks before git commit.
- Written in Rust - faster than nbstripout
- Supports git clean filter
- Works with .ipynb files
Why use it?
Jupyter notebooks store cell outputs inside the .ipynb file (JSON). This causes problems:
- Noisy diffs - output changes pollute every commit
- Repo size - images and large outputs bloat the repo
- Security - sensitive data can leak in outputs (API keys, query results)
The solution: strip outputs automatically on git add via a clean filter.
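As a rough illustration of what a clean filter does to the notebook JSON, here is a minimal Python sketch. It assumes the nbformat 4 layout (a top-level cells list in which code cells carry outputs and execution_count). strip_outputs is a hypothetical helper written for this post, not nbwipers' actual implementation, and it ignores the metadata stripping that the real tool also performs.

```python
import copy


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook with code-cell outputs removed.

    Hypothetical helper for illustration only; nbwipers itself is written in
    Rust and also strips selected metadata.
    """
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop text, images, tracebacks
            cell["execution_count"] = None  # avoids diffs caused by re-running
    return cleaned
```

Git pipes the working-tree file through the clean filter on git add, so only the stripped version is staged while the outputs stay in the local file.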
Why not nbstripout?
nbstripout is written in Python, and on this repo it made git status, git diff, and git add noticeably slow because git invoked it once for every .ipynb file.
The main cause is Python startup time. With 100+ notebooks, nbstripout can take 40+ seconds where a Rust-based tool takes ~1 second.
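The arithmetic behind that figure is simple: 40 seconds over 100 notebooks is roughly 0.4 seconds per invocation. The sketch below is one way to get a feel for the per-process cost on your own machine. It times a bare Python interpreter only, which is an assumption on my part: it is a lower bound, since nbstripout also has to import its dependencies. Both function names are hypothetical.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs: int = 5) -> float:
    """Average wall-clock time to start and exit a bare Python interpreter."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs


def estimated_overhead_seconds(notebooks: int, per_start: float) -> float:
    """Total time if the filter process is started once per notebook."""
    return notebooks * per_start
```

For example, estimated_overhead_seconds(100, interpreter_startup_seconds()) gives a floor for what a Python-based filter costs on a 100-notebook repo.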
Faster alternatives:
| Tool | Language | Notes |
|---|---|---|
| nbstripout-fast | Rust | Up to 200x faster; no git filter install support |
| nbwipers | Rust | Inspired by nbstripout-fast; adds git filter + pyproject.toml config |
nbwipers is essentially nbstripout-fast with better git integration. Switching to nbwipers fixed the slowness.
Setup
1. Install
felixgwilliams/nbwipers has been available in the aqua registry since v4.517.0.
Using aqua, add to aqua.yaml:
packages:
- name: felixgwilliams/nbwipers@v0.6.2
Then run:
aqua install
2. Configure git filter
Run once per repo (writes to .git/config):
git config filter.nbwipers.clean "nbwipers clean -"
git config filter.nbwipers.smudge cat
git config filter.nbwipers.required true
Or edit .git/config directly:
[filter "nbwipers"]
clean = nbwipers clean -
smudge = cat
required = true
required = true makes git abort with an error, instead of silently staging the unfiltered file, if nbwipers is not installed or the filter fails. This prevents accidentally committing outputs.
3. Add .gitattributes
In the repo root, add .gitattributes:
*.ipynb filter=nbwipers
**/.ipynb_checkpoints/*.ipynb !filter
**/.virtual_documents/*.ipynb !filter
The !filter lines reset the filter attribute to unspecified for checkpoint and virtual document files, which excludes them from filtering.
4. Verify
git check-attr filter path/to/notebook.ipynb
# Expected: path/to/notebook.ipynb: filter: nbwipers
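If you want to run this check from a script (for example in CI), a minimal Python sketch could look like the following. It relies on git check-attr printing one line in the form path: attribute: value. parse_check_attr and active_filter are hypothetical helper names, and active_filter assumes git is on PATH and that the current directory is inside the repo.

```python
import subprocess


def parse_check_attr(line: str) -> tuple[str, str, str]:
    """Split one line of `git check-attr` output into (path, attribute, value)."""
    # Split from the right so a path containing ": " stays intact.
    path, attribute, value = line.rstrip("\n").rsplit(": ", 2)
    return path, attribute, value


def active_filter(path: str) -> str:
    """Return the filter git would apply to `path` (assumes git is installed)."""
    result = subprocess.run(
        ["git", "check-attr", "filter", path],
        capture_output=True, text=True, check=True,
    )
    return parse_check_attr(result.stdout)[2]
```

A script can then fail fast when active_filter("path/to/notebook.ipynb") returns anything other than "nbwipers".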
Today's issue: nbwipers was not working
Symptom
Committed a notebook in example2/. Outputs were not stripped.
Debug
git check-attr filter example2/pipeline.ipynb
# Output: example2/pipeline.ipynb: filter: nbstripout ← wrong!
The filter was nbstripout, not nbwipers, even though .gitattributes says nbwipers.
Root cause
Found the culprit in .git/info/attributes:
*.ipynb filter=nbstripout
*.zpln filter=nbstripout
*.ipynb diff=ipynb
This file was written by nbstripout --install in the past.
Git attribute priority (highest to lowest):
- .git/info/attributes ← highest priority
- .gitattributes in the same directory as the file
- .gitattributes in parent directories (up to repo root)
- Global core.attributesfile
Because .git/info/attributes has the highest priority, nbstripout was winning over nbwipers in .gitattributes.
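To make the lookup order concrete, here is a small Python sketch that lists, for a given repo-relative path, the per-repo attribute files git consults, highest priority first. attribute_sources is a hypothetical helper. It assumes a standard checkout where .git is a directory, and it leaves out the global core.attributesfile, which comes after everything listed.

```python
from pathlib import PurePosixPath


def attribute_sources(rel_path: str) -> list[str]:
    """List attribute files that can apply to `rel_path`, highest priority first.

    `rel_path` is relative to the repo root and uses forward slashes.
    The global core.attributesfile (lowest priority) is not included.
    """
    parts = PurePosixPath(rel_path).parent.parts
    sources = [".git/info/attributes"]  # always consulted first
    # Walk from the file's own directory up to the repo root.
    for depth in range(len(parts), -1, -1):
        directory = PurePosixPath(*parts[:depth])
        sources.append(str(directory / ".gitattributes"))
    return sources
```

When a filter looks wrong, opening these files in order is usually the quickest way to find the entry that wins.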
Fix
Remove the nbstripout lines from .git/info/attributes:
Before:
*.ipynb filter=nbstripout
*.zpln filter=nbstripout
*.ipynb diff=ipynb
After:
*.ipynb diff=ipynb
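I edited the file by hand, but with many clones to fix the same cleanup could be scripted. The following is a minimal Python sketch under a few assumptions: drop_filter_entries is a hypothetical helper, it treats each line as a whitespace-separated pattern followed by attributes, and it does not handle quoted patterns that contain spaces.

```python
def drop_filter_entries(attributes_text: str, filter_name: str = "nbstripout") -> str:
    """Remove `filter=<filter_name>` from gitattributes-style text.

    Lines that only set the stale filter are dropped entirely; other
    attributes on the same line (for example diff=ipynb) are kept.
    """
    stale = f"filter={filter_name}"
    kept = []
    for line in attributes_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)  # keep blank lines and comments as-is
            continue
        pattern, *attrs = stripped.split()
        remaining = [attr for attr in attrs if attr != stale]
        if len(remaining) == len(attrs):
            kept.append(line)  # line never mentioned the stale filter
        elif remaining:
            kept.append(" ".join([pattern, *remaining]))
        # otherwise the line only set the stale filter, so drop it
    return "\n".join(kept) + ("\n" if kept else "")
```

Applied to the Before content above, this yields the After content; writing the result back to .git/info/attributes is left to the caller.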
Verify fix
git check-attr filter example2/pipeline.ipynb
# Output: example2/pipeline.ipynb: filter: nbwipers ← correct
Key takeaway
When switching from nbstripout to nbwipers, check .git/info/attributes.
nbstripout --install writes there, not to .gitattributes.
If both tools have entries, .git/info/attributes always wins.