Yes, the title says "5 strategies" like every other listicle. The number isn't a framework. It's just how many I got through before my API bill made me pause. There are plenty more approaches worth testing. If you've benchmarked others or have a strategy that works well for you, I'd genuinely like to hear about it.
Telling an agent to "edit the file" is easy. Being sure the result is correct is hard.
I've been using Claude Code daily for months. One pattern kept showing up: the agent says "done," I commit, and later I find lines missing from the middle of the file. Or a formatter runs between edits and the next match fails silently.
So I tested it systematically. 5 strategies, 20 scenarios, two file sizes (378 and 1053 lines), with 5 and 10 changes each.
## The 5 Strategies
- Sequential Edit: One Edit call per change, top to bottom. Simple, but line numbers drift after insertions.
- Atomic Write: Read once, rewrite entire file. Fewest tool calls, but token cost explodes on large files and middle content can silently disappear (the "lost-in-the-middle" problem).
- Bottom-up Edit: Same as Sequential, but changes applied from bottom to top. Eliminates line drift because lower edits don't shift upper line numbers.
- Script Generation: Agent writes a shell script with sed commands. File content never enters the token stream.
- Unified Diff: Agent generates a patch file, applied with patch. Standard format, reversible.
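To make the two cheapest strategies concrete, here's a toy end-to-end run. The file, the edits, and the paths are invented for illustration; they're not taken from the benchmark.

```shell
set -eu
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmp/config.txt"

# Script Generation: the agent emits sed commands instead of streaming file
# content through its context. Applying them bottom-up keeps the remaining
# line numbers valid. (Redirect+mv instead of sed -i for portability.)
sed '3s/gamma/GAMMA/' "$tmp/config.txt" > "$tmp/config.tmp" && mv "$tmp/config.tmp" "$tmp/config.txt"
sed '1s/alpha/ALPHA/' "$tmp/config.txt" > "$tmp/config.tmp" && mv "$tmp/config.tmp" "$tmp/config.txt"

# Unified Diff: the agent emits a patch file, applied (and reversible with
# patch -R) by patch(1).
cat > "$tmp/change.patch" <<'EOF'
--- config.txt
+++ config.txt
@@ -2 +2 @@
-beta
+BETA
EOF
(cd "$tmp" && patch -s config.txt change.patch)

cat "$tmp/config.txt"   # ALPHA / BETA / GAMMA, one per line
```

In both cases the agent only generates the commands or the hunks, so the file's full contents never pass through the token window.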
## Results
1053-line file, 10 changes:
| Strategy | Tokens | Duration | Tool Calls |
|---|---|---|---|
| Script Generation | 7,000 | 10s | 2 |
| Unified Diff | 8,500 | 12s | 2 |
| Sequential Edit | 25,000 | 65s | 11 |
| Bottom-up Edit | 25,000 | 65s | 11 |
| Atomic Write | 43,000 | 50s | 2 |
Script Generation: 3.5x cheaper and 6.5x faster than Sequential Edit on the same task.
## The Decision Table
| File size | 1-2 changes | 3-5 changes | 6+ changes |
|---|---|---|---|
| < 300 lines | Edit | Script / Diff | Script |
| 300-1000 lines | Edit | Script / Diff | Script |
| > 1000 lines | Edit | Script | Script |
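The table collapses to a couple of comparisons. As a sketch (`choose_strategy` is a made-up helper for illustration, not part of any tool mentioned here):

```shell
# Pick an edit strategy from file size and number of changes,
# mirroring the decision table above.
choose_strategy() {
  local lines=$1 changes=$2
  if [ "$changes" -le 2 ]; then
    echo "edit"                 # 1-2 changes: plain Edit, any file size
  elif [ "$changes" -le 5 ] && [ "$lines" -le 1000 ]; then
    echo "script-or-diff"       # 3-5 changes on small/medium files
  else
    echo "script"               # 6+ changes, or 3-5 on a >1000-line file
  fi
}

choose_strategy 1053 10   # prints: script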
## The Missing Piece: Deterministic Protection
Strategy choice helps, but agents still pick wrong sometimes. I built edit-guard, a hook that runs after every Edit/Write call and catches three failure modes:
- Consecutive edit counter: Warns at 3, blocks at 5 sequential edits on the same file
- Line count verification: Flags unexpected line count changes after Write
- Lost-in-the-middle detection: Catches empty blocks and repeated patterns from truncation
It's a Claude Code PostToolUse hook, and that's the point: the agent choosing the right strategy is probabilistic, but the hook catching a bad outcome is deterministic.
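For a sense of what the line-count check looks like, here's a minimal sketch of that one failure mode as a shell function. This is not edit-guard's actual code. It assumes `jq` is installed, and it assumes (per the Claude Code hooks docs) that a hook receives the tool payload as JSON on stdin and that exiting with status 2 feeds stderr back to the agent as a blocking error.

```shell
# Sketch of a line-count guard for a PostToolUse hook (hypothetical code,
# not edit-guard's implementation). Requires jq.
edit_guard_check() {
  local state_dir="${TMPDIR:-/tmp}/edit-guard-state"
  mkdir -p "$state_dir"

  local file key new old
  file=$(jq -r '.tool_input.file_path // empty')   # hook payload on stdin
  [ -f "$file" ] || return 0

  key=$(printf '%s' "$file" | tr '/' '_')
  new=$(wc -l < "$file")
  old=$(cat "$state_dir/$key" 2>/dev/null || echo "$new")
  echo "$new" > "$state_dir/$key"

  # Flag a Write that silently dropped more than half the file.
  if [ "$new" -lt $((old / 2)) ]; then
    echo "edit-guard: $file shrank from $old to $new lines" >&2
    return 2
  fi
}
```

In a real setup this would be wired up as a `PostToolUse` command in `.claude/settings.json`, matched against the Edit and Write tools; the threshold here (half the file) is arbitrary and would need tuning.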
Source code and full benchmark data: github.com/ceaksan/edit-guard
## Top comments
Really appreciate the actual numbers here — most "AI coding" posts skip the benchmarks and go straight to opinions.
The Script Generation results are eye-opening. 3.5x cheaper and 6.5x faster than Sequential Edit is a massive difference when you're running agents at scale. I manage a bunch of automated tasks that touch config files and template code across a large static site, and the "lost-in-the-middle" problem with Atomic Write has bitten me more than once — the agent confidently says "done" and a chunk of the file just vanished.
The PostToolUse hook as a deterministic safety net is the real gem here. Probabilistic strategy selection + deterministic validation is a pattern more people should adopt. Going to check out edit-guard — thanks for open-sourcing it.
Thanks @apex_stack. The lost-in-the-middle problem gets especially nasty with config and template files. Repetitive structures (similar key-value blocks, repeated sections) make the model treat parts of the file as redundant and silently drop them.
For large-scale automated tasks, Script Generation largely avoids this by keeping file content out of the token window. That said, it struggles with context-aware rewrites, so it really depends on the type of changes your agents are making.
If you run into edge cases with edit-guard, feel free to open an issue. The thresholds may need tuning for things like dynamic imports or generated sections.
Good point about config and template files — that's exactly the type of content where repetitive structures trip up the model. I've seen similar issues with i18n translation files where the model merges or drops locale keys because they all look structurally identical.
The tradeoff between Script Generation (keeps file content out of token window) and context-aware edits is a useful mental model. For my use case most automated changes are structural — updating metadata fields, inserting sections into templates — so Script Generation sounds like the better fit for the bulk operations. I'll save the smarter strategies for the one-off refactors that need reasoning about surrounding code.
Thanks for the offer on edit-guard — will definitely open an issue if I hit edge cases. Great work on this.