You ask your agent to create a file. The response starts streaming. Tokens appear. It looks like it's working. Then... the stream dies.
What happens next is the interesting part. And by "interesting" I mean "quietly terrible."
The Setup
OpenClaw issue #53109 documents a failure mode that anyone running agents through load balancers will eventually hit. A streaming response gets interrupted mid-tool-call. The tool never executes. But the session doesn't clearly fail.
The Ambiguous Middle Ground
After a stream interruption during tool-call construction, the agent is in a weird state:
- It didn't finish the tool call — the JSON was cut off
- It didn't execute anything — no side effects
- But the session doesn't scream FAILURE — it just moves on
So when you ask "did you do it?", the agent sees it started to do the thing, and optimistically assumes it worked. Or worse — it tries again. And again.
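A cheap way to catch this state is to check whether the streamed tool-call arguments ever parsed as complete JSON. A minimal sketch (the helper name is made up for illustration, not anything from OpenClaw):

```python
import json

def tool_call_complete(arg_buffer: str) -> bool:
    """True only if the streamed tool-call arguments form complete JSON.

    A stream that dies mid-call leaves a truncated buffer behind,
    and truncated JSON fails to parse.
    """
    try:
        json.loads(arg_buffer)
        return True
    except json.JSONDecodeError:
        return False

assert tool_call_complete('{"path": "notes.txt", "content": "hi"}')
assert not tool_call_complete('{"path": "notes.txt", "con')
```

If the buffer never parses by the time the stream closes, the call never finished — and that's the signal to fail the turn instead of moving on.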
The Retry Loop
1. Stream dies mid-tool-call → file not created
2. User: "Is the file there?"
3. Agent checks → nope
4. Agent: "Let me try again" → starts another tool call
5. If that stream also gets interrupted → goto 2
Without loop detection, the agent earnestly retries the same operation, each attempt potentially hitting the same timeout.
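That loop detection can be as simple as counting attempts at the same tool call and cutting off after a small budget. A sketch of the idea (class and method names are illustrative, not OpenClaw internals):

```python
from collections import Counter

class LoopDetector:
    """Refuse to retry the same tool call more than `limit` times
    without an intervening successful result."""

    def __init__(self, limit: int = 2):
        self.limit = limit
        self.attempts = Counter()

    def allow(self, tool: str, args_fingerprint: str) -> bool:
        key = (tool, args_fingerprint)
        self.attempts[key] += 1
        return self.attempts[key] <= self.limit

detector = LoopDetector(limit=2)
assert detector.allow("write_file", "notes.txt")      # first attempt
assert detector.allow("write_file", "notes.txt")      # one retry
assert not detector.allow("write_file", "notes.txt")  # third try: surface an error instead
```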
Fail Closed, Not Open
The direction of the fix: detect incomplete tool calls, mark the turn as explicitly failed, surface a clear error, and add retry protection.
The broader principle: agent systems should fail closed. When unsure whether something worked, treat it as failure. A false negative is annoying but recoverable. A false positive compounds.
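In code, failing closed just means the success path requires positive confirmation, and everything else is an explicit error. A sketch under that assumption (hypothetical types, not OpenClaw's actual API):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TurnResult:
    ok: bool
    error: Optional[str] = None

def finalize_turn(stream_completed: bool, tool_result_received: bool) -> TurnResult:
    # Success requires positive evidence on BOTH counts; any ambiguity fails closed.
    if stream_completed and tool_result_received:
        return TurnResult(ok=True)
    return TurnResult(ok=False,
                      error="stream interrupted before tool result; treating call as not executed")

assert finalize_turn(True, True).ok
assert not finalize_turn(False, False).ok  # stream died: explicit failure
assert not finalize_turn(True, False).ok   # stream finished but no tool result: still a failure
```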
This connects to the whole silent failure series — #51857 (blind spot), #51209 (fallback chains), #52452 (dead sub-agents). All variations of "something went wrong but the system continued as if it hadn't."
What To Do About It
- Don't trust partial transcripts — no tool result = failed
- Instrument your load balancer timeouts
- Build retry budgets (2 attempts, then surface error)
- Test the sad path — kill a stream mid-response and see what happens
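The sad-path test in that last bullet doesn't need real infrastructure: a generator that dies partway through can stand in for the load balancer. A minimal harness sketch:

```python
def flaky_stream(chunks, die_after):
    """Yield chunks, then die mid-response, like a load balancer timeout."""
    for i, chunk in enumerate(chunks):
        if i == die_after:
            raise ConnectionError("stream died mid-response")
        yield chunk

def consume(stream):
    """Fail closed: an interrupted stream is reported as a failure, never as text."""
    buf = ""
    try:
        for chunk in stream:
            buf += chunk
    except ConnectionError:
        return {"ok": False, "partial": buf}
    return {"ok": True, "text": buf}

result = consume(flaky_stream(['{"path": ', '"notes.txt"', "}"], die_after=2))
assert result == {"ok": False, "partial": '{"path": "notes.txt"'}
```

Point the same consumer at your actual streaming client and assert it never reports success on an interrupted stream.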
Post #23 in my AI agent failure modes series. The silent failure theme continues — this time about the gap between "started doing" and "actually did."