A single RAG agent is easy to ship.
But the moment you need multiple domains (HR, IT, …), different knowledge sources, and clean ownership boundaries, a single “do‑everything” agent turns into:
- an unmaintainable prompt,
- a tangled toolset,
- and a retrieval strategy that mixes unrelated documents.
This post shows a production-shaped approach:
- HR Agent: RAG over HR documents (AWS Knowledge Base ID configured via env)
- IT Agent: RAG over IT documents (different KB ID)
- Orchestrator Agent: routes queries by calling HR/IT agents over A2A, then synthesizes the final answer
All three are deployed as AgentCore Runtime apps, which serve as the deployment unit: each handler is a BedrockAgentCoreApp entrypoint.
What we are building
Components
- HR Agent
  - Strands Agent
  - Tool: strands_tools.retrieve
  - Config: KNOWLEDGE_BASE_ID (HR KB)
  - Exposed as an A2A endpoint via AgentCore Runtime
- IT Agent
  - Same as HR agent
  - Config: KNOWLEDGE_BASE_ID (IT KB)
- Orchestrator Agent
  - Strands Agent
  - Tools: ask_hr_agent, ask_it_agent (A2A client calls)
  - Config: HR_AGENT_URL, IT_AGENT_URL
  - Decides routing automatically (tool choice) and synthesizes a final answer
Runtime request flow
- User sends a question to the Orchestrator (AgentCore Runtime endpoint)
- Orchestrator LLM decides whether to use:
  - ask_hr_agent()
  - ask_it_agent()
  - or both
- Orchestrator calls the specialized agent(s) via A2A
- Specialized agent uses the retrieve tool (RAG) against its Knowledge Base
- Orchestrator produces a final, human‑readable answer
1 — Configuration (env vars)
This architecture is intentionally configuration-first. No hardcoded URLs or KB IDs.
The Knowledge Bases are backed by an S3 Vectors store. Creating them is out of scope for this post, since it only takes about five clicks in the console. Just collect the IDs once they are created.
HR Agent env
- KNOWLEDGE_BASE_ID: The Knowledge Base ID for HR content.
IT Agent env
- KNOWLEDGE_BASE_ID: The Knowledge Base ID for IT content.
Orchestrator env
- HR_AGENT_URL: A2A endpoint for the HR Agent (for example, an AgentCore Runtime service URL).
- IT_AGENT_URL: A2A endpoint for the IT Agent.
- AWS_REGION (optional): Defaults to eu-central-1 in the handler.
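If you would rather fail at startup than discover a missing URL at request time, the orchestrator settings can be loaded and validated in one place. This is a minimal sketch, not part of the handlers in this post: `OrchestratorConfig` and `load_orchestrator_config` are hypothetical names, and only the three variable names and the eu-central-1 default come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OrchestratorConfig:
    """Hypothetical container for the orchestrator's env-based settings."""
    hr_agent_url: str
    it_agent_url: str
    aws_region: str


def load_orchestrator_config(env=None) -> OrchestratorConfig:
    """Read settings from the environment and fail fast if a URL is missing."""
    env = os.environ if env is None else env
    missing = [name for name in ("HR_AGENT_URL", "IT_AGENT_URL") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return OrchestratorConfig(
        hr_agent_url=env["HR_AGENT_URL"],
        it_agent_url=env["IT_AGENT_URL"],
        aws_region=env.get("AWS_REGION", "eu-central-1"),
    )
```

The handlers below take the softer route instead: they default to empty strings and report "not configured" when a tool is called. Which one fits depends on whether you prefer a container that refuses to start or one that degrades gracefully.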
2 — HR Agent handler (AgentCore Runtime + RAG)
This is the complete HR agent handler.
Key design choices:
- Lazy initialization: create the Strands agent only once per runtime container
- Single tool: retrieve
- Strict behavior: if retrieval doesn’t find anything relevant, say so
"""
HR Agent handler for AgentCore Runtime deployment with A2A support.
Requirements: 1.1, 1.2, 1.4
"""
import os
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Configuration
KNOWLEDGE_BASE_ID = os.getenv("KNOWLEDGE_BASE_ID", "")
os.environ["KNOWLEDGE_BASE_ID"] = KNOWLEDGE_BASE_ID

HR_SYSTEM_PROMPT = """
You are an expert HR assistant. You answer questions related to human resources,
HR policies, leave, benefits, recruitment, and people management.
Use the retrieval tool to search the HR knowledge base.
If you cannot find relevant information, clearly state it.
Always respond in a professional and concise manner.
"""

# Lazy agent initialization
_hr_agent = None


def get_hr_agent():
    """Lazy load HR agent."""
    global _hr_agent
    if _hr_agent is None:
        # Imported here so the cost is paid on first use, not at module import
        from strands import Agent
        from strands_tools import retrieve
        _hr_agent = Agent(
            tools=[retrieve],
            system_prompt=HR_SYSTEM_PROMPT
        )
        logger.info("HR agent initialized")
    return _hr_agent


def handle_request(prompt) -> str:
    """Handle HR agent requests."""
    try:
        prompt_str = str(prompt)
        agent = get_hr_agent()
        response = agent(prompt_str)
        # str() yields the reply text; response.message is the structured message
        return str(response)
    except Exception as e:
        logger.error(f"HR agent error: {e}")
        return f"Error HR agent: {str(e)}"


# For AgentCore Runtime with A2A
try:
    from bedrock_agentcore.runtime import BedrockAgentCoreApp

    app = BedrockAgentCoreApp()
    logger.info("HR BedrockAgentCoreApp initialized")

    @app.entrypoint
    def handler(prompt: str) -> str:
        # str() keeps the log line safe if the payload is not a plain string
        logger.info(f"HR received: {str(prompt)[:100]}...")
        return handle_request(prompt)
except ImportError as e:
    logger.warning(f"bedrock_agentcore.runtime not available: {e}")
    app = None
Why os.environ["KNOWLEDGE_BASE_ID"] = KNOWLEDGE_BASE_ID?
Some tooling layers (including retrieval integrations) resolve configuration directly from process environment. By forcing it into os.environ, you ensure:
- consistent behavior in local runs
- consistent behavior in AgentCore Runtime containers
What “RAG” means here
In this handler, RAG is not manual (no explicit vector DB queries in your code).
Instead, the Strands agent uses the retrieve tool:
- the LLM decides when to call retrieve
- the tool pulls relevant chunks from your Knowledge Base
- the response is generated with that context
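To make the "pulls relevant chunks" step concrete, here is roughly what a manual version could look like against the Bedrock Agent Runtime `retrieve` API. This is a sketch under assumptions, not the `strands_tools.retrieve` implementation: `retrieve_chunks` and `build_context` are hypothetical helpers, and the client is passed in so the functions can be exercised without AWS access (in real use it would be `boto3.client("bedrock-agent-runtime")`).

```python
def retrieve_chunks(client, knowledge_base_id: str, query: str, max_results: int = 5):
    """Query a Bedrock Knowledge Base and return (score, text) pairs.

    `client` is assumed to be boto3.client("bedrock-agent-runtime").
    """
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": max_results}
        },
    )
    return [
        (item.get("score", 0.0), item["content"]["text"])
        for item in response.get("retrievalResults", [])
    ]


def build_context(chunks, min_score: float = 0.0) -> str:
    """Join the retrieved chunk texts (above a score threshold) into one context block."""
    kept = [text for score, text in chunks if score >= min_score]
    return "\n\n".join(kept)
```

With the tool-based approach in this post you write none of this: the agent decides when to retrieve and folds the chunks into its own context.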
3 — IT Agent handler (AgentCore Runtime + RAG)
This is the complete IT agent handler.
Same architecture as HR:
- lazy init
- tool = retrieve
"""
IT Agent handler for AgentCore Runtime deployment with A2A support.
Requirements: 2.1, 2.2, 2.4
"""
import os
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Configuration
KNOWLEDGE_BASE_ID = os.getenv("KNOWLEDGE_BASE_ID", "")
os.environ["KNOWLEDGE_BASE_ID"] = KNOWLEDGE_BASE_ID

IT_SYSTEM_PROMPT = """
You are an expert IT assistant. You answer questions related to IT,
systems, software, technical support, cybersecurity, and infrastructure.
Use the retrieval tool to search the IT knowledge base.
If you cannot find relevant information, clearly state it.
Always respond in a professional and concise manner.
"""

# Lazy agent initialization
_it_agent = None


def get_it_agent():
    """Lazy load IT agent."""
    global _it_agent
    if _it_agent is None:
        # Imported here so the cost is paid on first use, not at module import
        from strands import Agent
        from strands_tools import retrieve
        _it_agent = Agent(
            tools=[retrieve],
            system_prompt=IT_SYSTEM_PROMPT
        )
        logger.info("IT agent initialized")
    return _it_agent


def handle_request(prompt) -> str:
    """Handle IT agent requests."""
    try:
        prompt_str = str(prompt)
        agent = get_it_agent()
        response = agent(prompt_str)
        # str() yields the reply text; response.message is the structured message
        return str(response)
    except Exception as e:
        logger.error(f"IT agent error: {e}")
        return f"Error IT agent: {str(e)}"


# For AgentCore Runtime with A2A
try:
    from bedrock_agentcore.runtime import BedrockAgentCoreApp

    app = BedrockAgentCoreApp()
    logger.info("IT BedrockAgentCoreApp initialized")

    @app.entrypoint
    def handler(prompt: str) -> str:
        # str() keeps the log line safe if the payload is not a plain string
        logger.info(f"IT received: {str(prompt)[:100]}...")
        return handle_request(prompt)
except ImportError as e:
    logger.warning(f"bedrock_agentcore.runtime not available: {e}")
    app = None
Why separate agents instead of one big agent with two KBs?
Because the separation buys you:
- independent prompts
- independent retrieval scopes (no “HR policy” leaking into IT answers)
- independent deployment/scaling
- simpler evaluation (HR answers vs IT answers)
4 — Orchestrator handler (AgentCore Runtime + A2A tools)
Now the interesting part: the orchestrator.
Your orchestrator is a Strands agent with two tools:
- ask_hr_agent
- ask_it_agent
Those tools are wrappers around an A2A client call.
Orchestrator key behaviors
- It does not use retrieve itself
- It delegates to specialized agents
- It may call both if needed
"""
Orchestrator Agent handler for AgentCore Runtime deployment.
Uses Strands Agent with HR/IT agents as tools - the model decides routing.
Requirements: 3.1, 3.2, 3.3, 3.4, 3.5
"""
import os
import logging
import asyncio

# @tool registers a function (and its docstring) as a Strands tool
from strands import tool

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Configuration
AWS_REGION = os.getenv("AWS_REGION", "eu-central-1")
HR_AGENT_URL = os.getenv("HR_AGENT_URL", "")
IT_AGENT_URL = os.getenv("IT_AGENT_URL", "")

# Lazy initialization
_orchestrator_agent = None


async def _call_a2a(url: str, query: str) -> str:
    """Internal A2A call."""
    from a2a.client import A2AClient
    async with A2AClient(url) as client:
        response = await client.send_message(query)
        # Return the first text part if the response is a structured message
        if hasattr(response, 'message') and response.message:
            for part in response.message.parts:
                if hasattr(part, 'text'):
                    return part.text
        return str(response)


@tool
def ask_hr_agent(query: str) -> str:
    """
    Ask the specialized HR agent.
    Use this tool for questions about:
    - HR policies, leave, payroll
    - Recruitment, hiring, contracts
    - Training, performance reviews
    - Benefits, insurance, retirement
    - Remote work, absences
    """
    if not HR_AGENT_URL:
        return "Error: HR agent not configured"
    try:
        return asyncio.run(_call_a2a(HR_AGENT_URL, query))
    except Exception as e:
        logger.error(f"HR agent error: {e}")
        return f"Error HR agent: {str(e)}"


@tool
def ask_it_agent(query: str) -> str:
    """
    Ask the specialized IT agent.
    Use this tool for questions about:
    - Computers, software, support
    - Network, Wi‑Fi, VPN, internet
    - Passwords, accounts, access
    - Security, antivirus
    - Email, Teams, apps
    """
    if not IT_AGENT_URL:
        return "Error: IT agent not configured"
    try:
        return asyncio.run(_call_a2a(IT_AGENT_URL, query))
    except Exception as e:
        logger.error(f"IT agent error: {e}")
        return f"Error IT agent: {str(e)}"


ORCHESTRATOR_SYSTEM_PROMPT = """You are an orchestrator assistant for employees.
You have access to two specialized agents:
- HR Agent: for all human resources questions (leave, payroll, contracts, training, etc.)
- IT Agent: for all IT-related questions (software, hardware, technical support, etc.)
Analyze each question and use the appropriate agent(s) to answer.
If a question spans both domains, consult both agents.
After receiving the responses, synthesize a clear and complete final answer.
"""


def get_orchestrator_agent():
    """Get or create the orchestrator agent with HR/IT tools."""
    global _orchestrator_agent
    if _orchestrator_agent is None:
        from strands import Agent
        _orchestrator_agent = Agent(
            tools=[ask_hr_agent, ask_it_agent],
            system_prompt=ORCHESTRATOR_SYSTEM_PROMPT
        )
        logger.info("Orchestrator agent initialized with HR/IT tools")
    return _orchestrator_agent


def handle_request(prompt) -> str:
    """Handle orchestrator requests."""
    try:
        prompt_str = str(prompt)
        agent = get_orchestrator_agent()
        response = agent(prompt_str)
        # str() yields the reply text; response.message is the structured message
        return str(response)
    except Exception as e:
        logger.error(f"Orchestration error: {e}")
        return f"Error: {str(e)}"


# For AgentCore Runtime
try:
    from bedrock_agentcore.runtime import BedrockAgentCoreApp

    app = BedrockAgentCoreApp()
    logger.info("Orchestrator BedrockAgentCoreApp initialized")

    @app.entrypoint
    def handler(prompt: str) -> str:
        return handle_request(prompt)
except ImportError as e:
    logger.warning(f"bedrock_agentcore.runtime not available: {e}")
    app = None
Why implement HR/IT calls as tools?
Because it makes routing model-driven instead of hard-coded.
You’re giving the orchestrator:
- a system prompt describing the domains
- tool docstrings that describe when to use them
- a stable interface (ask_hr_agent(query) -> str)
The LLM chooses the right tool(s), and you keep your Python logic minimal.
5 — A2A call parsing
A2A responses often come as structured messages with parts.
The parsing strategy is resilient:
if hasattr(response, 'message') and response.message:
    for part in response.message.parts:
        if hasattr(part, 'text'):
            return part.text
return str(response)
Why this matters:
- You don’t assume a single “text blob” response
- You gracefully fall back to str(response)
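One thing to be aware of: the snippet above returns the first text part only. If an agent answers in several text parts, a variant that joins them may be preferable. The sketch below assumes the same response shape as the handler (a `message` with `parts`, some of which carry `text`); `extract_text` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for a real A2A response.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Return all text parts of an A2A-style response, joined by newlines."""
    message = getattr(response, "message", None)
    parts = getattr(message, "parts", None) or []
    texts = [part.text for part in parts if getattr(part, "text", None)]
    if texts:
        return "\n".join(texts)
    # No text parts found: fall back to the string form, as the handler does
    return str(response)


# Stand-in for a structured response with two text parts and one non-text part
reply = SimpleNamespace(message=SimpleNamespace(parts=[
    SimpleNamespace(text="Part one."),
    SimpleNamespace(data={"k": 1}),
    SimpleNamespace(text="Part two."),
]))
print(extract_text(reply))
```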
6 — Lazy initialization (production detail that matters)
All three handlers use lazy init (_agent = None).
Why it’s important in AgentCore Runtime:
- cold start cost is paid once per container
- subsequent requests reuse the same agent instance
- lower latency and less overhead
This is the kind of detail that separates a demo from something you can run for real users.
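The `if _agent is None` check in the handlers is not lock-protected. If your runtime ever serves requests from several threads in the same container (an assumption, this post does not rely on it), two requests could both build an agent. A minimal thread-safe sketch, with a hypothetical `get_agent(factory)` helper:

```python
import threading

_agent = None
_agent_lock = threading.Lock()


def get_agent(factory):
    """Create the agent on first use, then reuse it for the container's lifetime."""
    global _agent
    if _agent is None:  # fast path once initialized
        with _agent_lock:
            if _agent is None:  # re-check: another thread may have won the race
                _agent = factory()
    return _agent
```

Here `factory` would be a zero-argument function that builds the Strands agent, for example the body of `get_hr_agent` above.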
7 — How routing actually happens
You don’t have if "vpn" in query: ....
Instead:
- The orchestrator receives the question
- It decides which tool(s) to call based on system prompt + tool descriptions
- It synthesizes a final answer
Example behaviors:
- HR-only question → calls ask_hr_agent
- IT-only question → calls ask_it_agent
- Mixed question → calls both, then merges
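Because routing is decided by the model, it helps to observe which tools were actually called for a given question. One way, sketched here with stub tools, is to wrap each tool so it records its invocations. `record_calls` is a hypothetical helper, and the direct calls at the end stand in for calls the orchestrator's model would make.

```python
import functools


def record_calls(tool_fn, call_log: list):
    """Wrap a tool so each invocation is appended to call_log."""
    @functools.wraps(tool_fn)  # keep the name and docstring the model relies on
    def wrapper(query: str) -> str:
        call_log.append((tool_fn.__name__, query))
        return tool_fn(query)
    return wrapper


# Stub tools standing in for the real A2A-backed ones
def ask_hr_agent(query: str) -> str:
    """Ask the HR agent."""
    return f"HR: {query}"


def ask_it_agent(query: str) -> str:
    """Ask the IT agent."""
    return f"IT: {query}"


log = []
hr_tool = record_calls(ask_hr_agent, log)
it_tool = record_calls(ask_it_agent, log)

# In a real check, these calls would come from the orchestrator agent
hr_tool("How many vacation days do I have?")
it_tool("How do I reset my VPN password?")
print([name for name, _ in log])
```

After running a mixed question through the orchestrator, the log should contain both tool names; an HR-only question should leave only `ask_hr_agent` in it.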
8 — Failure modes and debugging
Your code already includes the essentials:
- INFO logging
- explicit “not configured” messages
- tool-level error handling
Common issues:
Missing URL
If HR_AGENT_URL is empty:
- ask_hr_agent returns Error: HR agent not configured
Network/A2A issues
If A2A fails:
- error is logged
- tool returns Error HR agent: ... / Error IT agent: ...
The orchestrator can still produce a partial answer if only one agent fails.
9 — Practical hardening ideas (optional)
These keep the same architecture, but improve resilience:
- Timeouts & retries for A2A calls (network I/O).
- Avoid asyncio.run() if you ever run inside an existing event loop.
- Return structured tool outputs (citations, metadata) if you need auditability later.
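The first two ideas can be sketched with the standard library alone. `call_with_retry` and `run_sync` are hypothetical helpers, not part of the handlers above; `make_call` is assumed to be a zero-argument function returning a fresh coroutine on each attempt (for example `lambda: _call_a2a(HR_AGENT_URL, query)`), and the default timeout and backoff values are placeholders to tune.

```python
import asyncio
import concurrent.futures


async def call_with_retry(make_call, attempts: int = 3, timeout_s: float = 10.0,
                          backoff_s: float = 0.5):
    """Await make_call() with a per-attempt timeout, retrying on failure.

    Expects attempts >= 1. Re-raises the last error once attempts are exhausted.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout_s)
        except Exception as exc:  # includes asyncio.TimeoutError
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff between attempts
                await asyncio.sleep(backoff_s * (2 ** attempt))
    raise last_error


def run_sync(make_coro):
    """Run a coroutine from sync code, even if an event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: asyncio.run() is safe
        return asyncio.run(make_coro())
    # A loop is already running here, so use a fresh loop in a worker thread.
    # Note: this blocks the calling thread until the coroutine finishes.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(make_coro())).result()
```

Inside a tool, `asyncio.run(_call_a2a(url, query))` would then become something like `run_sync(lambda: call_with_retry(lambda: _call_a2a(url, query)))`.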
Why this architecture scales
Adding a new domain agent (Finance / Legal / Security) is just repetition of the same pattern:
1) new RAG agent handler with retrieve + domain prompt + KNOWLEDGE_BASE_ID
2) new orchestrator tool ask_finance_agent calling A2A
3) add tool to orchestrator’s tools=[...]
No refactor required.
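If the list of domains keeps growing, the repeated `ask_*_agent` functions can be generated from a small registry instead of being copied by hand. This is a sketch under assumptions: `make_agent_tool`, `build_tools`, the registry layout, and the example URL are all hypothetical, and `call_a2a` is assumed to be a synchronous wrapper around the A2A call (like `asyncio.run(_call_a2a(url, query))` in the orchestrator). Each generated function would still have to be registered as a tool with your agent framework.

```python
from typing import Callable, Dict


def make_agent_tool(domain: str, url: str, description: str,
                    call_a2a: Callable[[str, str], str]) -> Callable[[str], str]:
    """Build an ask_<domain>_agent(query) function bound to one A2A endpoint."""
    def ask(query: str) -> str:
        if not url:
            return f"Error: {domain} agent not configured"
        try:
            return call_a2a(url, query)
        except Exception as exc:
            return f"Error {domain} agent: {exc}"

    ask.__name__ = f"ask_{domain.lower()}_agent"
    ask.__doc__ = description  # the model reads this to decide when to call the tool
    return ask


def build_tools(registry: Dict[str, dict], call_a2a) -> list:
    """Create one tool function per registry entry."""
    return [make_agent_tool(name, cfg["url"], cfg["description"], call_a2a)
            for name, cfg in registry.items()]


# Hypothetical registry: in practice the URLs would come from env vars
registry = {
    "HR": {"url": "https://hr.example.internal",
           "description": "Ask the HR agent about leave, payroll, benefits."},
    "Finance": {"url": "",
                "description": "Ask the Finance agent about expenses and budgets."},
}
tools = build_tools(registry, call_a2a=lambda url, query: f"[{url}] {query}")
print([t.__name__ for t in tools])
```

Adding a domain then means adding one registry entry; the error handling and naming stay identical across agents.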
Final thoughts
If you already know how to build RAG (or deterministic tools), the next step is orchestration.
This pattern gives you:
- domain isolation
- deployable units (AgentCore Runtime)
- model-driven routing (tools)
- a clean evolution path
It’s basically microservices:
small, focused services with a clear interface — except the services can reason.