Why MCP?
MCP stands for Model Context Protocol. Think of it as a standardised USB-C port for AI tools. Instead of every AI application inventing its own plugin system, MCP defines a common interface: a server exposes tools, a client (our agent here) discovers and calls them. The beauty is that the same MCP server can be reused by Claude Desktop, by a Strands agent, by any MCP-compatible client — without changing a single line of server code. This is the interoperability that the ecosystem has been missing. Here, we use FastMCP, a Python library that reduces an MCP server to almost nothing:
```python
from mcp.server import FastMCP

mcp = FastMCP("Calculator Server")

@mcp.tool(description="Add two numbers together")
def add(x: float, y: float) -> float:
    return x + y

if __name__ == "__main__":
    mcp.run(transport="stdio")
```
That is a fully functional MCP server. The decorator @mcp.tool is all you need to expose a function to any MCP client.
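When a client connects and lists tools, it receives a structured description for each one. For the add function above, the schema FastMCP derives from the type hints can be sketched as a Python dict (an illustrative sketch following the MCP tool schema, not literal SDK output):

```python
# Sketch of the tool description an MCP client receives for `add`.
# Field names follow the MCP tool schema; the exact payload FastMCP
# emits may differ in detail.
add_tool_description = {
    "name": "add",
    "description": "Add two numbers together",
    "inputSchema": {
        "type": "object",
        "properties": {
            "x": {"type": "number"},
            "y": {"type": "number"},
        },
        "required": ["x", "y"],
    },
}
```

This is what lets the agent decide on its own when and how to call the tool: the name, description, and typed parameters are all machine-readable.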
Review: Our Analyst (Sub-agent)
We start by creating our first toolkit: an analyst. Running two terminals — one for the server, the other for the client — we immediately see the protocol in action.
The client (our agent here) connects, discovers the available tools (addition, subtraction, multiplication, and division), and calls them through the protocol. From the agent's perspective it is simply calling a function; it does not need to know that the toolkit lives in a separate process.
In fact, the toolkit does reside in a server. The client receives a structured description of each tool and autonomously decides when and how to invoke it. No prompt engineering is needed to explain what addition does.
Review: Our Weather Forecaster (Sub-agent)
The second toolkit introduces a weather forecaster that speaks to the National Weather Service (NWS) API.
The weather forecaster server exposes three tools:
- get_alerts(state) - active weather alerts by US state code
- get_forecast(latitude, longitude) - multi-period forecast for a location
- get_current_weather(latitude, longitude) - live conditions from the nearest observation station
The implementation of the toolkit is fully async, using httpx for HTTP calls. We can call any external API, read from a database, run a subprocess - whatever our agent needs. The MCP layer is just the communication contract between the agent (client) and the server where the toolkits reside.
```python
@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location."""
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)
    # format and return
```
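The final `# format and return` step can be sketched as a small pure helper, assuming the standard api.weather.gov payload shape (where `properties.periods` holds the forecast periods):

```python
def format_forecast(forecast_data: dict, max_periods: int = 3) -> str:
    """Render NWS forecast periods into a readable string.

    Assumes the standard api.weather.gov payload: each entry in
    forecast_data["properties"]["periods"] has "name", "temperature",
    "temperatureUnit", and "detailedForecast" fields.
    """
    periods = forecast_data["properties"]["periods"][:max_periods]
    lines = [
        f"{p['name']}: {p['temperature']}°{p['temperatureUnit']} - "
        f"{p['detailedForecast']}"
        for p in periods
    ]
    return "\n".join(lines)
```

Keeping the formatting in a pure function like this also makes the tool easy to unit-test without hitting the network.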
Supervisor Agent (Head)
Now we connect the dots. A supervisor-mode Strands agent, built on Amazon Bedrock (Claude Sonnet), is given access to multiple toolkits simultaneously via MCP. When we ask the supervisor something like “What should I have for lunch given today’s weather in New York?”, it uses MCP to route the work to the appropriate toolkits (sub-agents).
The supervisor figures out that it needs to:
- Call the weather forecaster to get current conditions in New York
- Reason about which food delivery service makes sense given the current temperature and traffic conditions
- Return a recommendation
Now the agent is not just completing prompt text. Instead, it is orchestrating tool calls, chaining results, and synthesising an answer. The Strands SDK handles the entire tool-call loop:
```python
from strands import Agent
from strands.models import BedrockModel
from strands.tools.mcp import MCPClient

agent = Agent(
    model=BedrockModel(model_id="anthropic.claude-sonnet-4-5"),
    tools=[*calculator_tools, *weather_tools],
)
response = agent("What should I eat for lunch in New York today?")
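Conceptually, the loop the SDK runs for us can be sketched like this (a simplified illustration of the pattern, not Strands internals; `model_step` and `run_tool_loop` are names we introduce here):

```python
def run_tool_loop(model_step, tools: dict, user_message: str) -> str:
    """Minimal sketch of an agent tool-call loop.

    `model_step` stands in for a model invocation: given the running
    message list, it returns either ("tool", name, args) when the model
    wants to call a tool, or ("answer", text) when it is done. Real SDKs
    drive this via the model's native tool-calling API.
    """
    messages = [{"role": "user", "content": user_message}]
    while True:
        step = model_step(messages)
        if step[0] == "answer":
            return step[1]
        # The model requested a tool: call it and feed the result back
        _, name, args = step
        result = tools[name](**args)
        messages.append({"role": "tool", "name": name, "content": str(result)})
```

The key point is that tool results are appended back into the conversation, so the model can chain calls until it has enough information to answer.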
Build A Text Analyzer (Sub-agent) From Scratch
We now build a text analyzer that can, among other things, shorten long URLs. More importantly, building it from scratch internalizes the pattern of writing an MCP server.
The server exposes four tools: shorten_url, expand_url, get_url_stats, and analyze_text. The implementation uses an in-memory JSON store (with a note that production code would use a real database). Here is the shortener:
```python
import hashlib
from datetime import datetime
from typing import Dict, Optional

@mcp.tool(description="Shorten a long URL into a compact format")
def shorten_url(url: str, custom_code: Optional[str] = None) -> Dict:
    # Use the caller's custom code if given; otherwise hash the URL plus
    # a timestamp so repeated calls yield distinct codes
    content = f"{url}{datetime.now().isoformat()}"
    short_code = custom_code or hashlib.md5(content.encode()).hexdigest()[:8]
    # store and return
    return {"short_url": f"short.ly/{short_code}"}
```
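The reverse lookup can be sketched the same way, assuming an in-memory dict as the store (`_url_store`, `store_url`, and this `expand_url` body are illustrative names, not the article's actual code; in the real server, `expand_url` would carry an `@mcp.tool` decorator):

```python
_url_store: dict = {}  # short_code -> original URL (hypothetical in-memory store)

def store_url(short_code: str, url: str) -> None:
    """Record a mapping produced by shorten_url."""
    _url_store[short_code] = url

def expand_url(short_code: str) -> dict:
    """Resolve a short code back to the original URL."""
    url = _url_store.get(short_code)
    if url is None:
        return {"error": f"Unknown short code: {short_code}"}
    return {"url": url}
```

Returning a structured dict (rather than raising) keeps errors legible to the calling agent, which can reason about the `"error"` field in plain language.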
Key Takeaway: Any Python function can become an AI tool in minutes. Have a legacy internal API? Wrap it. Have a spreadsheet formula? Wrap it. The barrier is almost zero.
Launch Our Agentic App (Research Analyzer) via Bedrock AgentCore
During development, our agent accessed toolkits via MCP locally over the stdio transport. Real production systems, however, need:
- Persistent endpoints accessible over HTTPS
- IAM-based authentication (no open endpoints)
- Container packaging for reproducible deployments
- Auto-scaling based on load
Amazon Bedrock AgentCore Runtime handles all of this. The deployment script is deceptively short:
```python
from bedrock_agentcore_starter_toolkit import Runtime

agentcore_runtime = Runtime()
response = agentcore_runtime.configure(
    entrypoint="text_utils_server.py",
    auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    protocol="MCP",
    agent_name="text_utils_mcp",
)
launch_result = agentcore_runtime.launch()
print(f"Agent ARN: {launch_result.agent_arn}")
```
Under the hood, the bedrock_agentcore_starter_toolkit:
- Packages your server and its dependencies into a Docker container
- Pushes it to ECR (auto-created if it does not exist)
- Creates an IAM execution role with least-privilege permissions
- Deploys to AgentCore Runtime and returns a live ARN
After deployment (around 3–5 minutes), we update our agentic app configuration to point to the remote endpoint.
The remote MCP server is protected by AWS IAM via SigV4 signatures — the same signing mechanism used by every other AWS API. Here, we include a custom httpx auth class that handles this transparently:
```python
import httpx
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

class SigV4HTTPXAuth(httpx.Auth):
    def __init__(self, credentials, service, region):
        self.signer = SigV4Auth(credentials, service, region)

    def auth_flow(self, request):
        # Re-create the request in botocore's format, sign it, then
        # copy the signed headers back onto the outgoing httpx request
        aws_request = AWSRequest(
            method=request.method,
            url=str(request.url),
            data=request.content,
            headers=dict(request.headers),
        )
        self.signer.add_auth(aws_request)
        request.headers.update(dict(aws_request.headers))
        yield request
```
This extends HTTPX’s auth interface and plugs directly into the MCP StreamableHTTPTransport. Our agentic app continues to call tools with the exact same code - authentication is completely transparent at the transport layer. No API keys in environment variables, no token rotation to manage.
Once the remote server is alive, activating it in the agentic app is a single toggle in the configuration.
Putting Everything Together - Our Agentic App (Research Analyzer)
With the full stack in place — local servers, a remote AgentCore server, SigV4 auth — our agentic app that combines all the tools is complete.
The "Research Analyzer" can:
- Fetch and summarise web content
- Analyse and transform text (via the remote AgentCore server)
- Shorten URLs from its research
- Check the weather for context-aware suggestions
All from natural language. All with tools that any developer could have written in an afternoon.
The Multi-Tool Orchestration Layer
Now, let's explore advanced implementation patterns — how to structure agents that manage many tools gracefully, handle tool failures, and maintain context across a conversation.
A few things worth highlighting from main.py:
- Tool cache with TTL: tools are loaded once at startup and refreshed on demand, avoiding re-negotiation on every request
- Session-scoped agents: each user session gets its own agent instance with isolated memory
- Structured logging per server: each MCP server has its own log stream, invaluable for debugging multi-server interactions
```python
def refresh_tools_cache():
    global cached_tools, tools_last_updated
    cached_tools = mcp_manager.get_all_tools(active_only=True)
    tools_last_updated = datetime.now()
```
This pattern, loading tools once and reusing them, becomes critical at scale. Negotiating the MCP protocol on every request adds latency; caching eliminates it.
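The expiry check on top of such a cache can be sketched as follows (`TOOLS_TTL_SECONDS` and `get_tools` are names we introduce for illustration, not from main.py; the loader is injected so the logic is testable without an MCP connection):

```python
from datetime import datetime, timedelta

TOOLS_TTL_SECONDS = 300  # hypothetical TTL; tune per deployment

cached_tools = None
tools_last_updated = None

def get_tools(loader):
    """Return cached tools, reloading via `loader` when the TTL expires."""
    global cached_tools, tools_last_updated
    now = datetime.now()
    expired = (
        tools_last_updated is None
        or now - tools_last_updated > timedelta(seconds=TOOLS_TTL_SECONDS)
    )
    if cached_tools is None or expired:
        cached_tools = loader()
        tools_last_updated = now
    return cached_tools
```

Requests within the TTL window never touch the MCP servers; only the first request after expiry pays the re-negotiation cost.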
Key Takeaways
MCP is the right abstraction. The separation between tool definition (server) and tool consumption (client/agent) is clean and powerful. It enables a genuine ecosystem of reusable tools.
FastMCP makes it trivial to start. Going from idea to working MCP server takes minutes, not days. The barrier to contributing tools to the ecosystem is essentially zero.

AgentCore closes the last mile. The hardest part of agentic AI in production is not the model or the tools — it is infrastructure: endpoints, auth, containers, scaling. AgentCore Runtime handles all of it with a few lines of configuration.

SigV4 is the right auth for AWS-native stacks. No secret management overhead, no token rotation, automatic credential refresh. If your infrastructure is already on AWS, this is the obvious choice.