The fix is five rules in your CLAUDE.md, not a settings change
Note: This is a personal workaround based on my own experience. Your mileage may vary.
If you use Claude Code for anything longer than a short conversation, you have probably seen this:
API Error: Stream idle timeout - partial response received
It cuts off mid-response. Your work disappears. The retry often fails the same way. There is no recovery button. As of April 2026, it is one of the most reported bugs on the Claude Code GitHub repo, with multiple open issues going back months.
The issue has been more common since the launch of Claude Opus 4.7. Several GitHub issues filed since mid-April specifically name Opus 4.7 and the 1M context variant as triggers, and recent Claude Code changelogs show stream-handling improvements shipping. The bug also shows up in regular Claude chat sessions during long outputs, but Claude Code is where it hits hardest because of the heavy tool-call chains.
I hit it repeatedly while using Claude Code for multi-file projects. After losing work three or four times in a row, I started experimenting with prompt-level instructions that prevent the timeout from firing in the first place. The trick is not to fix the timeout. The trick is to never trigger it.
Why it happens
The timeout fires when Claude Code's streaming connection goes idle for too long during a single response. Long outputs are the trigger. If Claude tries to write a 300-line file in one tool call, or runs a grep that dumps hundreds of lines, or chains multiple heavy tool calls without pausing, the stream stalls and the connection drops.
The bug is worse in longer sessions. After 20 or more tool calls in a single conversation, the probability of hitting it goes up noticeably.
The fix: add these instructions to your CLAUDE.md
Create or open a CLAUDE.md file in your project root. Add this block:
## Stream Timeout Prevention
1. Do each numbered task ONE AT A TIME. Complete one task fully,
confirm it worked, then move to the next.
2. Never write a file longer than ~150 lines in a single tool call.
If a file will be longer, write it in multiple append/edit passes.
3. Start a fresh session if the conversation gets long (20+ tool calls).
The error gets worse as the session grows.
4. Keep individual grep/search outputs short. Use flags like
`--include` and `-l` (list files only) to limit output size.
5. If you do hit the timeout, retry the same step in a shorter form.
Don't repeat the entire task from scratch.
That is it. Claude Code reads CLAUDE.md at the start of every session and treats its contents as standing instructions. These five rules keep each response small enough that, in my experience, the idle timeout does not fire.
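If you keep several projects around, pasting the block by hand gets repetitive. Here is a minimal sketch of a helper that appends the block to a CLAUDE.md only when the heading is not already present. It is not part of Claude Code; the name `ensure_block` and the shortened `BLOCK` text are hypothetical placeholders, so paste the full five rules where indicated.

```python
from pathlib import Path

HEADING = "## Stream Timeout Prevention"

# Abbreviated for illustration (assumption): paste the full five rules here.
BLOCK = HEADING + "\n1. Do each numbered task ONE AT A TIME.\n"


def ensure_block(claude_md: Path, block: str = BLOCK) -> bool:
    """Append `block` to CLAUDE.md unless its heading is already there.

    Returns True if the file was changed, False if it was left alone.
    """
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if HEADING in existing:
        return False  # already installed, so do nothing
    # Make sure earlier content ends with a newline before appending.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    separator = "\n" if existing else ""  # blank line between old and new text
    claude_md.write_text(existing + separator + block, encoding="utf-8")
    return True
```

Calling `ensure_block(Path("CLAUDE.md"))` from the project root creates the file if it is missing. Running it a second time leaves the file untouched, so it is safe to rerun.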
Why this works
Each rule targets a specific trigger:
Rule 1 prevents Claude from batching multiple tasks into one giant response. Instead of "create three files, run tests, and fix the errors" in a single output, it does one step, confirms, then moves on. Smaller outputs, no stall.
Rule 2 is the most important one. A 300-line file write is the single most common trigger for the timeout. Splitting it into two 150-line passes keeps each chunk under the threshold.
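To make the multi-pass idea concrete, here is a small sketch of what "write it in several passes" amounts to. Claude Code does this through its own write and edit tools, not through this code; the function names are hypothetical, and the default of 150 mirrors the rough figure in rule 2 and is not a documented limit.

```python
from pathlib import Path


def split_into_passes(text: str, max_lines: int = 150) -> list[str]:
    """Split text into consecutive chunks of at most max_lines lines each."""
    lines = text.splitlines(keepends=True)
    return ["".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def write_in_passes(path: Path, text: str, max_lines: int = 150) -> int:
    """Write text to path one chunk at a time; return the number of passes."""
    chunks = split_into_passes(text, max_lines)
    path.write_text("", encoding="utf-8")  # start from an empty file
    for chunk in chunks:
        # Each append stands in for one small write/edit tool call.
        with path.open("a", encoding="utf-8") as handle:
            handle.write(chunk)
    return len(chunks)
```

A 300-line file becomes two passes and a 301-line file becomes three. The final file is identical to a single write.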
Rule 3 addresses session degradation. I have not seen Anthropic document this publicly, but in my experience the timeout becomes almost guaranteed after about 20 tool calls in a single session. Starting fresh resets whatever internal state is accumulating.
Rule 4 catches the other common trigger: unbounded search output. A recursive grep that returns 500 lines of matches will stall the stream just as badly as a long file write.
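The bounded-search idea can be sketched in Python as a rough analogue of `grep -rl --include=GLOB PATTERN DIR`: report only the names of matching files, and stop after a fixed number of hits. The function name `files_matching` and the default cap of 50 are assumptions made for illustration.

```python
import fnmatch
import os
import re


def files_matching(root: str, pattern: str, include: str = "*", limit: int = 50) -> list[str]:
    """Return up to `limit` paths under root whose contents match `pattern`.

    Only file names are reported, never the matching lines themselves.
    """
    regex = re.compile(pattern)
    hits: list[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not fnmatch.fnmatch(name, include):
                continue  # same role as grep's --include filter
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    matched = any(regex.search(line) for line in handle)
            except OSError:
                continue  # unreadable file: skip it
            if matched:
                hits.append(path)
                if len(hits) >= limit:
                    return hits  # hard cap keeps the output bounded
    return hits
```

The output grows with the number of matching files, capped at `limit`, not with the number of matching lines. That is the property rule 4 is after.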
Rule 5 saves you from the retry death spiral. When you hit the timeout and retry the exact same prompt, you get the exact same stall. Retrying with a shorter version of the same step usually works on the first try.
What I tried that did not work
Before landing on the CLAUDE.md approach, I tried several other things:
- Increasing `CLAUDE_STREAM_IDLE_TIMEOUT_MS`: This is a terminal CLI environment variable and does not always resolve the issue.
- Switching browsers: Same behavior in Chrome, Firefox, and Safari.
- Switching models: Happens on both Opus and Sonnet.
- Shorter prompts: The prompt length is not the issue. The output length is.
The CLAUDE.md approach works because it constrains the output at the source. Claude follows the instructions before it starts generating, so the stream never gets long enough to stall.
Worth noting
Recent Claude Code changelogs show stream-handling improvements shipping regularly. This CLAUDE.md workaround is a bridge for the meantime, not a permanent solution. Once the platform-level fix ships, you can remove the block.
If you are using Claude Code for real work today, adding these five rules saves a lot of frustration while improvements are in progress.
