MCP tools freeze AI agents when external APIs are slow, causing 424 errors. The async handleId pattern returns immediately with a job ID and polls for results without blocking.
MCP tool timeout occurs when an AI agent calls a Model Context Protocol (MCP) tool that depends on a slow external API. The tool blocks the agent indefinitely instead of returning an error. The result is a 424 (Failed Dependency) error or a frozen workflow with no user feedback. This post shows the problem with real scenarios and how the async handleId pattern provides immediate responses.
This demo uses Strands Agents with MCP (Model Context Protocol). The async pattern is framework-agnostic and applies to any agent that calls external APIs through MCP.
Working code: github.com/aws-samples/sample-why-agents-fail
Series: Why AI Agents Fail
- Context Window Overflow — Memory Pointer Pattern for large data
- MCP Tools That Never Respond (this post) — Async pattern for slow external APIs
- AI Agent Reasoning Loops — Detect and block repeated tool calls
The Problem: MCP Tools That Never Respond
The Model Context Protocol (MCP) enables AI agents to call external tools. But when those tools depend on slow APIs, the entire agent workflow freezes. The agent waits. The user waits. Nothing happens.
Community observation from Octopus (Resilient AI Agents With MCP, 2025) identifies the core issue: as external system integrations increase, so does the likelihood of failure. Systems become unavailable, slow to respond, or return errors. Agents have no built-in strategy to handle this.
OpenAI Community reports confirm the real-world impact:
- 424 errors when MCP tools take too long
- Unresponsive states where requests neither succeed nor fail
- Tools that pass handshake validation but timeout during execution
Why This Happens
MCP expects tools to respond quickly. When a tool calls a slow external API, the tool blocks until that API responds, and the agent blocks with it.
The MCP protocol has implicit timeout expectations. If the tool doesn't respond within ~7-10 seconds, the connection may drop with a 424 (Failed Dependency) error. The agent receives an error instead of data, and the user gets no useful response.
Three failure modes:
- Slow API — Tool waits 15+ seconds, poor UX but eventually responds
- Failing API — External service unavailable, 424 error after timeout
- Unresponsive state — Request accepted but never returns, requires session restart
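Even before restructuring the server, a client-side guard can turn a frozen workflow into a fast, explicit failure. A minimal sketch using `asyncio.wait_for` (not from the demo repo; function names and the short timeout values are illustrative):

```python
import asyncio

async def call_with_timeout(tool_coro, timeout: float = 7.0) -> str:
    """Wrap a tool call so it fails fast instead of hanging the agent."""
    try:
        return await asyncio.wait_for(tool_coro, timeout=timeout)
    except asyncio.TimeoutError:
        # The agent gets an actionable message instead of a frozen session
        return f"Tool timed out after {timeout}s; consider the async handleId pattern"

# Stand-in for a slow MCP tool: sleeps longer than the timeout allows
async def slow_tool(query: str) -> str:
    await asyncio.sleep(0.5)
    return f"Slow result for: {query}"

print(asyncio.run(call_with_timeout(slow_tool("sales report"), timeout=0.1)))
```

This only bounds how long the agent blocks; it does not fix the slow dependency itself, which is what the async handleId pattern addresses.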
The Demo: Simulating Real Timeout Scenarios
We built an MCP server that simulates these real-world scenarios:
```python
from mcp.server import FastMCP
import asyncio

# FastMCP is a lightweight MCP server framework; tools are registered with @mcp.tool()
mcp = FastMCP("Timeout Demo Server")

# Baseline: responds in 1s, well within MCP's implicit timeout threshold (~7-10s)
@mcp.tool(description="Fast API - responds in 1 second")
async def fast_api(query: str) -> str:
    await asyncio.sleep(1)
    return f"Fast result for: {query}"

# Problem case: 15s delay exceeds the MCP timeout; the agent freezes waiting for this
@mcp.tool(description="Slow API - responds in 15 seconds")
async def slow_api(query: str) -> str:
    await asyncio.sleep(15)  # Simulates a slow external service (data pipeline, batch job)
    return f"Slow result for: {query}"

# Failure case: 7s delay triggers the timeout, then raises Failed Dependency (424)
@mcp.tool(description="Failing API - returns 424 after delay")
async def failing_api(query: str) -> str:
    await asyncio.sleep(7)
    raise Exception("Failed Dependency: External service unavailable")
```
The Async HandleId Solution
Instead of waiting for slow operations, return immediately with a tracking ID:
```python
import asyncio
import uuid

# In-memory job store: maps job_id -> {status, query, result}
# For production, replace with a persistent store (Redis, DynamoDB) for durability across restarts
JOBS = {}

# Background worker: does the slow work and records the result when done
async def do_work(job_id: str, query: str):
    await asyncio.sleep(15)  # Stand-in for the slow external API call
    JOBS[job_id]["result"] = f"Slow result for: {query}"
    JOBS[job_id]["status"] = "completed"

# The handleId pattern: return a tracking ID immediately instead of blocking
@mcp.tool(description="Start a long-running job, returns immediately with job ID")
async def start_async_job(query: str) -> str:
    job_id = str(uuid.uuid4())[:8]  # Short ID the LLM can pass in follow-up calls
    JOBS[job_id] = {"status": "processing", "query": query}
    # Fire-and-forget: slow work runs in the background, tool returns before it finishes
    asyncio.create_task(do_work(job_id, query))
    # The agent receives this in < 1s: no timeout, no frozen UI
    return f"Job started: {job_id}. Use check_job_status to poll for results."

# Polling endpoint: the agent calls this repeatedly until status is "completed"
@mcp.tool(description="Check status of a running job")
async def check_job_status(job_id: str) -> str:
    job = JOBS.get(job_id)
    if not job:
        return f"Job {job_id} not found"
    if job["status"] == "completed":
        return f"COMPLETED: {job['result']}"  # Return the actual result to the agent
    return f"PROCESSING: Job {job_id} still running"  # Agent polls again after a short wait
```
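From the agent's perspective, the interaction becomes start-then-poll. The sketch below simulates that loop end to end using plain asyncio stand-ins for the two tools above (the poll interval, poll cap, and shortened delays are illustrative choices, not MCP requirements):

```python
import asyncio
import uuid

JOBS = {}

# Stand-ins for the MCP tools above, with delays shortened for demonstration
async def do_work(job_id: str, query: str):
    await asyncio.sleep(0.2)  # Stand-in for the slow external API call
    JOBS[job_id] = {"status": "completed", "result": f"Result for: {query}"}

async def start_async_job(query: str) -> str:
    job_id = str(uuid.uuid4())[:8]
    JOBS[job_id] = {"status": "processing", "query": query}
    asyncio.create_task(do_work(job_id, query))
    return f"Job started: {job_id}. Use check_job_status to poll for results."

async def check_job_status(job_id: str) -> str:
    job = JOBS.get(job_id)
    if not job:
        return f"Job {job_id} not found"
    if job["status"] == "completed":
        return f"COMPLETED: {job['result']}"
    return f"PROCESSING: Job {job_id} still running"

async def run_job_and_wait(query: str, poll_interval: float = 0.1, max_polls: int = 50) -> str:
    """Start an async job, then poll until it completes or the poll budget runs out."""
    reply = await start_async_job(query)
    job_id = reply.split("Job started: ")[1].split(".")[0]  # Parse the 8-char handle
    for _ in range(max_polls):
        status = await check_job_status(job_id)
        if status.startswith("COMPLETED"):
            return status
        await asyncio.sleep(poll_interval)  # Agent stays responsive between polls
    return f"TIMEOUT: job {job_id} never completed"

print(asyncio.run(run_job_and_wait("quarterly report")))
```

In a real deployment the LLM itself drives this loop by calling the two tools in turn; the explicit loop here just makes the control flow visible.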
Demo Results
We tested all four scenarios with a Strands Agent connected to the MCP server:
| Scenario | Response Time | User Experience | Research Finding |
|---|---|---|---|
| Fast API (1s delay) | 3.2s total | ✅ Good UX | Baseline |
| Slow API (15s delay) | 17.8s total | ❌ Poor UX — agent waits | Octopus: "agent waits indefinitely" |
| Failing API (424) | 7.7s total | ❌ Error after wait | OpenAI Community: 424 errors |
| Async pattern (handleId) | 3.7s total | ✅ Immediate response | Solution: "respond ASAP with handleId" |
The async pattern transforms a 17.8s wait into a 3.7s immediate response. The agent tells the user "job started" and can check status later, with no frozen UI and no timeout errors.
Why Strands Agents for MCP Integration?
Strands' MCPClient connects to any MCP server in two lines. The agent discovers available tools at runtime through list_tools_sync(), so you don't maintain a hardcoded tool list. When the MCP server implements the async handleId pattern, the agent polls automatically without extra orchestration code.
Strands supports multiple model providers (OpenAI, Amazon Bedrock, Anthropic, Ollama). The MCP timeout patterns shown here work identically across all providers.
When to Use Each Pattern
Direct call (fast tools < 5s):
- Lookups, calculations, small API calls
- No timeout risk
Async handleId (slow tools > 5s):
- External API calls with unpredictable latency
- Data processing, report generation
- Any operation that might exceed MCP timeout
Retry with backoff (intermittent failures):
- Services that occasionally fail but recover
- Network-dependent operations
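For the intermittent-failure case, the standard remedy is exponential backoff with jitter. A minimal sketch (not from the demo repo; the retry count and delays are shortened for demonstration, production values would be larger):

```python
import asyncio
import random

async def call_with_retry(tool, *args, retries: int = 3, base_delay: float = 0.2):
    """Retry an intermittently failing tool with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return await tool(*args)
        except Exception:
            if attempt == retries - 1:
                raise  # Out of retries: surface the error to the agent
            # Double the delay each attempt; jitter avoids synchronized retry storms
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)

# Simulated flaky service: fails twice, then recovers
calls = {"n": 0}

async def flaky_api(query: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise Exception("Failed Dependency: External service unavailable")
    return f"Recovered result for: {query}"

print(asyncio.run(call_with_retry(flaky_api, "inventory")))
```

Note that retries still block the agent while they run, so they suit short, occasionally failing calls; for long operations, combine retries with the handleId pattern inside the background worker.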
Try It Yourself
You need Python 3.9+, uv, and an OpenAI API key. The MCP server runs locally as a subprocess, so no external services are needed.
```bash
git clone https://github.com/aws-samples/sample-why-agents-fail
cd sample-why-agents-fail/stop-ai-agents-wasting-tokens/02-mcp-timeout-demo
uv venv && uv pip install -r requirements.txt
export OPENAI_API_KEY="your-key-here"
uv run python test_mcp_timeout.py  # Runs all 4 scenarios
```
Or open test_mcp_timeout.ipynb in Jupyter, JupyterLab, VS Code, or your preferred notebook environment.
Key Takeaways
- MCP tools time out silently — 424 errors with no recovery
- Slow APIs freeze the entire agent — 17.8s wait with no feedback
- Async handleId pattern solves it — immediate response, poll for results
- Design for failure — every external call can timeout, plan accordingly
Frequently Asked Questions
What causes 424 errors in MCP tool calls?
A 424 (Failed Dependency) error occurs when an MCP tool takes longer than the implicit timeout threshold (typically 7-10 seconds) to respond. The MCP protocol expects tools to return quickly. When an external API blocks the tool beyond this threshold, the connection drops and the agent receives a 424 error instead of data.
When should I use the async handleId pattern instead of a direct MCP tool call?
Use the async handleId pattern for any tool that calls an external API with unpredictable latency: data processing, report generation, third-party service calls, or any operation that might exceed 5 seconds. For fast lookups, calculations, and small API calls under 5 seconds, direct calls work fine.
Does the async handleId pattern work with any MCP server, not only Strands?
Yes. The async handleId pattern is an MCP server design pattern, not a framework feature. Any MCP-compatible agent can call the start_async_job and check_job_status tools. The pattern works with OpenAI Agents, LangChain MCP integrations, and any client that supports the Model Context Protocol.
References
Research
- Resilient AI Agents With MCP: Timeout And Retry Strategies — Octopus blog (community observation), May 2025
- Call remote MCP server tool timed out, error 424 — OpenAI Community (community forum)
- Handling Timeouts with Long-Running MCP Connectors — OpenAI Community (community forum), Dec 2025
- Build Timeout-Proof MCP Tools — Arsturn (community observation)
Implementation
- Strands MCP Tools — Connect any MCP server
- Strands Model Providers — Swap to Amazon Bedrock, Anthropic, Ollama
Thanks!


