In 2026, I want my AI coding agents to have one more rule: know when to stop.
AI agents do not always fail by stopping.
Sometimes they fail by continuing.
I ran into this while building a custom Cyrillic font extension for a real brand system. The task looked concrete: make Cyrillic letters, Latin letters, numerals, and special symbols feel like one editorial type family.
Claude Code and Codex kept working. They generated files, exported proofs, reported progress, and fixed the last visible complaint.
But the same defect class kept returning.
That is the dangerous version of an AI-agent failure loop: the workflow looks productive while the real quality problem survives.
What is a failure loop?
A failure loop is a repeated pattern where an agent keeps producing new candidate fixes while the same underlying defect remains unresolved.
It usually has five steps:
- The user rejects the same kind of defect again.
- The agent patches the latest symptom.
- The proof gate is too weak to catch the issue.
- The agent asks for another manual review.
- Everyone spends another cycle on the same problem.
One mistake is normal.
The real process bug appears when the agent keeps patching after its validation has already failed to catch the defect.
Why normal proof loops can fail
Proof loops are useful. Tests, screenshots, build checks, linting, diffs, and generated reports all matter.
But proof loops can also become theater if they measure the wrong thing.
In my font project, the agent could prove that the font compiled, the PDF rendered, the screenshot existed, bounding boxes changed, and a numeric score improved.
That did not prove the letters looked right.
The rejections were about something else: visual consistency.
Some Cyrillic glyphs felt too short, too thick, too loosely spaced, or structurally wrong next to Latin letters.
If the gate cannot see the defect the human keeps seeing, the gate is not allowed to declare the task done.
The rule I now use
After the same visible defect class appears twice, stop normal implementation.
Do not make one more speculative patch.
Do not relax the threshold.
Do not ask the user to inspect another candidate artifact.
Switch into failure-loop breaker mode.
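As a rough illustration, the trigger can be written as a counter keyed by defect class. This is a minimal sketch, not the code from any published tool: the names `LoopBreaker`, `record_rejection`, and `REPEAT_LIMIT`, and the defect-class labels, are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field

# Assumption: "twice" from the rule above, expressed as a constant.
REPEAT_LIMIT = 2


@dataclass
class LoopBreaker:
    """Counts rejections per defect class and picks the working mode."""

    rejections: Counter = field(default_factory=Counter)

    def record_rejection(self, defect_class: str) -> str:
        # Count by defect class, not by individual symptom or artifact.
        self.rejections[defect_class] += 1
        if self.rejections[defect_class] >= REPEAT_LIMIT:
            return "breaker"  # stop normal patching
        return "normal"  # one mistake is normal


breaker = LoopBreaker()
first = breaker.record_rejection("cyrillic-too-short")   # "normal"
other = breaker.record_rejection("loose-spacing")        # "normal"
second = breaker.record_rejection("cyrillic-too-short")  # "breaker"
```

The hard part in practice is deciding that two complaints belong to the same class. The counter only works if someone, human or agent, labels them consistently.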
What a failure-loop breaker does
A failure-loop breaker is a hard mode switch for AI-agent work.
In this mode, the next output is a diagnostic package, not another candidate fix.
It should include:
- the repeated failure class;
- a rejected corpus of known-bad examples;
- a red-first gate that fails on those examples;
- a fix that turns the gate green;
- blind or independent validation when the author has seen the answer;
- a clear continue, stop, or human-decision recommendation.
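One way to keep that list honest is to treat the package as a record with a completeness check. The sketch below is an assumption about how such a record could look. The field names and the `missing_parts` helper are hypothetical, and for simplicity it always requires independent validation, although the list above only asks for it when the author has seen the answer.

```python
from dataclasses import dataclass, field
from typing import List, Optional

VALID_RECOMMENDATIONS = {"continue", "stop", "human-decision"}


@dataclass
class DiagnosticPackage:
    """Hypothetical record of what a failure-loop breaker should deliver."""

    failure_class: str = ""
    rejected_corpus: List[str] = field(default_factory=list)
    gate_fails_on_corpus: Optional[bool] = None   # None means "not run yet"
    fix_turns_gate_green: Optional[bool] = None   # None means "not run yet"
    independent_validation: Optional[str] = None  # e.g. "blind reviewer"
    recommendation: Optional[str] = None

    def missing_parts(self) -> List[str]:
        """List the parts that are absent or not yet demonstrated."""
        missing = []
        if not self.failure_class:
            missing.append("failure_class")
        if not self.rejected_corpus:
            missing.append("rejected_corpus")
        if self.gate_fails_on_corpus is not True:
            missing.append("red_first_gate")
        if self.fix_turns_gate_green is not True:
            missing.append("green_fix")
        if not self.independent_validation:
            missing.append("independent_validation")
        if self.recommendation not in VALID_RECOMMENDATIONS:
            missing.append("recommendation")
        return missing


draft = DiagnosticPackage(
    failure_class="cyrillic-too-short",
    rejected_corpus=["proof_a.pdf", "proof_b.pdf"],
)
print(draft.missing_parts())
```

A package with only the failure class and the corpus filled in still reports four missing parts, which is the point: naming the problem is not the same as having a gate for it.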
This is not only a retry limit.
A retry limit stops cost growth. A failure-loop breaker changes the work itself.
The red-first gate matters
A useful gate must fail before the fix, because otherwise it has not proven that it can see the old failure.
If the agent cannot make the new checker fail on previous bad artifacts, it has not built a checker for the real problem.
Many agent workflows skip this part.
They add a new metric, see the new candidate score higher, and call it progress. The metric was never forced to reject the old failure.
For subjective or visual tasks, this matters even more because the rejected corpus becomes the bridge between human taste and deterministic validation.
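The red-first requirement can be checked mechanically: run the gate over the rejected corpus and list anything it lets through. The sketch below uses toy data. The `cyr_to_latin_height` measurement, the 0.02 tolerance, and the sample values are invented for illustration and are not measurements from the font project.

```python
from typing import Callable, Dict, List

Artifact = Dict[str, float]
Gate = Callable[[Artifact], bool]  # returns True when the artifact passes


def escaped_rejects(gate: Gate, rejected_corpus: List[Artifact]) -> List[Artifact]:
    """Return known-bad artifacts that the gate wrongly passes."""
    return [artifact for artifact in rejected_corpus if gate(artifact)]


def compiled_gate(artifact: Artifact) -> bool:
    # Weak gate: only proves that the build produced something.
    return artifact["compiled"] == 1.0


def height_gate(artifact: Artifact) -> bool:
    # Hypothetical stricter gate: Cyrillic height must match Latin within 2%.
    return compiled_gate(artifact) and abs(artifact["cyr_to_latin_height"] - 1.0) <= 0.02


# Toy stand-ins for artifacts a human has already rejected.
rejected_corpus = [
    {"compiled": 1.0, "cyr_to_latin_height": 0.93},
    {"compiled": 1.0, "cyr_to_latin_height": 0.95},
]

weak_escapes = escaped_rejects(compiled_gate, rejected_corpus)
strict_escapes = escaped_rejects(height_gate, rejected_corpus)

print(len(weak_escapes))    # both rejects pass the weak gate
print(len(strict_escapes))  # no reject passes the stricter gate
```

A gate is red-first only when the list of escapes is empty. The weak gate passes both known-bad artifacts, so a green result from it says nothing about the defect the human keeps rejecting.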
When the agent is contaminated
Another trap is contaminated validation: the same agent writes the fix, knows the target, and grades the result.
That can be useful during iteration, but it is not independent validation.
If the agent has already seen the expected answer, the final check needs one of the following: a deterministic gate with withheld examples, a blind reviewer, a separate model that does not receive the author's reasoning, or a human decision when the requirement is taste rather than computation.
Same-author validation is often self-consistency, not proof.
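For the withheld-examples option, one possible approach is to split the rejected corpus deterministically, so the authoring agent only ever sees one part. This is a sketch under assumptions: the hash-based split, the 30% holdout, and the example IDs are illustrative choices, not part of the published skill.

```python
import hashlib
from typing import Iterable, List, Tuple


def is_withheld(example_id: str, holdout_percent: int = 30) -> bool:
    """Assign an example to the holdout set based on a stable hash of its ID."""
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    bucket = digest[0] * 100 // 256  # maps the first byte to 0..99
    return bucket < holdout_percent


def split_corpus(
    example_ids: Iterable[str], holdout_percent: int = 30
) -> Tuple[List[str], List[str]]:
    """Return (visible, withheld); the same IDs always land on the same side."""
    visible: List[str] = []
    withheld: List[str] = []
    for example_id in example_ids:
        if is_withheld(example_id, holdout_percent):
            withheld.append(example_id)
        else:
            visible.append(example_id)
    return visible, withheld


# Hypothetical IDs for rejected artifacts.
ids = [f"reject-{n:03d}" for n in range(20)]
visible, withheld = split_corpus(ids)
```

Because the split depends only on the ID, rerunning it never moves an example from the withheld side to the visible side. That keeps the final check from drifting toward examples the author has already tuned against.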
I packaged this as a small public skill
The rule now lives in a small public repo:
https://github.com/g-shevchenko/agent-failure-loop-breaker
It installs a compact skill and repo-local rules for Claude Code, Codex, Cursor, and Windsurf.
Its installed rule is deliberately simple:
If the same defect class appears twice, the agent must stop normal patching and build a rejected corpus plus a red-first gate before continuing.
This package is not meant to make the model smarter.
It makes the workflow less willing to confuse motion with progress.
Where companies go wrong
Teams often treat agent persistence as an asset by default.
That is reasonable for well-scoped implementation tasks with strong tests. It is risky for work where the acceptance criterion is visual, editorial, architectural, or operational.
If Claude Code, Codex, Cursor, or Windsurf keeps failing the same class of review, the next investment should go into the validation contract.
The best prompt in the world will still loop when the gate rewards the wrong artifact.
Where this helps
This pattern is useful for UI polish loops, visual regression work, PDF and presentation generation, typography systems, content QA, and agentic coding tasks where the same bug returns.
Here is the signal:
If the user says “this is still the same problem” twice, the process should change.
Practical takeaway
Do not ask an AI agent to “keep trying” forever.
Ask it to prove that its checker can catch the last failed attempt.
If it cannot, the next task is not implementation.
It is building a better gate.
Full write-up:
https://gregshevchenko.com/notes/ai-agent-failure-loop-breakers/