Every team that adopts mutation testing hits the same wall around week two.
The tool generates a mutation. Something like flipping a > to >= in a boundary check you know is dead code. Or injecting a fault into a logging call that doesn't affect behavior. The mutation survives, your score drops, and someone on the team asks the reasonable question: can we just ignore this one?
That's not the wrong question. Some mutations genuinely aren't worth killing. The wrong answer is making suppression easy and invisible.
I've been building pytest-gremlins, a mutation testing plugin for pytest, and this week we shipped v1.5.0 with a feature I've been thinking about for a while: inline mutation pardoning with mandatory reasoning and enforcement ceilings.
It works like this. You add a comment to the line:
logger.debug(f"Processing {item.id}") # gremlin: pardon[logging-only, no behavioral impact]
The pardon keyword tells pytest-gremlins to skip mutations on that line. The brackets force you to write a reason. Not "skip" or "ignore" or "TODO." An actual explanation of why this mutation isn't worth your time.
That alone would be fine but incomplete. The ceiling is what makes it honest. You set a ceiling, either an absolute count (--max-pardons=15) or a percentage of total mutations (--gremlin-max-pardons-pct=5), configure it in your pyproject.toml, and pytest-gremlins enforces it. Exceed the limit and your suite fails. It's a ratchet: you can use pardons, but you can't use them to quietly erode your mutation score back to meaninglessness.
This matters because mutation testing adoption fails when teams feel like they're fighting the tool instead of improving their tests. A blanket "fix every surviving mutation" policy burns people out. A blanket "ignore what you want" policy makes the tool decorative. The pardon system sits in the middle: acknowledge the exception, explain it, and keep the total bounded.
The same release also shipped two-phase xdist integration. If you use pytest -n auto for parallel testing, pytest-gremlins now works alongside it without any changes on your part. The plugin runs your normal test suite in parallel first (using xdist as usual), then runs the mutation phase sequentially. Earlier mutation testing tools either couldn't coexist with xdist at all or forced you to choose between parallel tests and mutation analysis.
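The two-phase shape is easy to picture in miniature: a parallel phase for the fast baseline work, then a strictly sequential phase for the mutants. This toy example uses concurrent.futures to mimic the pattern; it's an illustration of the architecture, not plugin code:

```python
from concurrent.futures import ThreadPoolExecutor

def run_two_phase(tests, mutations, run_test, run_mutation, workers=4):
    """Illustrative two-phase runner (not pytest-gremlins code).

    Phase 1: run the normal test suite in parallel, like pytest -n auto.
    Phase 2: run mutation analysis sequentially, one mutant at a time.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        test_results = list(pool.map(run_test, tests))
    # Only bother mutating if the baseline suite is green;
    # surviving mutants mean nothing against failing tests.
    if not all(test_results):
        return test_results, []
    mutation_results = [run_mutation(m) for m in mutations]
    return test_results, mutation_results
```

Keeping phase two sequential sidesteps the worker-coordination problems that make mutation runs and xdist awkward to mix.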
Getting started takes about thirty seconds:
pip install pytest-gremlins
pytest --gremlins
Configure it in pyproject.toml if you want to tune things:
[tool.pytest-gremlins]
workers = 4
cache = true
report = "json,html"
max_pardons = 15
The HTML reports now include trend charts so you can see your mutation score over time, which matters more than any single run's number. And as of v1.5.1 (shipped the next day), you can generate JSON and HTML reports in the same run for both human review and CI pipeline consumption.
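If you consume the JSON reports in CI, computing a trend yourself takes only a few lines. The record shape below is made up for illustration; check the actual report schema before relying on field names:

```python
def mutation_score(run: dict) -> float:
    """Percentage of mutations killed in one run (hypothetical record shape)."""
    total = run["killed"] + run["survived"]
    return 100.0 * run["killed"] / total if total else 0.0

def score_trend(runs: list[dict]) -> list[float]:
    """Score per run, oldest first, so you can spot drift over time."""
    return [round(mutation_score(r), 1) for r in runs]

runs = [
    {"killed": 80, "survived": 20},
    {"killed": 85, "survived": 15},
    {"killed": 90, "survived": 10},
]
print(score_trend(runs))  # [80.0, 85.0, 90.0]
```

A CI gate on the trend (score must not drop between runs) catches erosion that any single run's number hides.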
We also updated the comparison page this week after getting feedback from the mutmut maintainer about architecture claims we'd gotten wrong. I'd rather be accurate than flattering to our own tool. The mutation testing space is small enough that good faith matters.
If you're using pytest and you've ever wondered whether your tests actually catch the bugs they're supposed to catch, give it a try. The zero-config experience is intentional. And if you hit a mutation that feels unfair, now you can pardon it, as long as you explain yourself.
GitHub: https://github.com/mikelane/pytest-gremlins
PyPI: pip install pytest-gremlins