Meidi Airouche for Onepoint

Posted on Jan 14 • Edited on Jan 15

Multi-Agent Platform with A2A, Python, Strands & AWS AgentCore

#python #aws #agents #rag

A single RAG agent is easy to ship.

But the moment you need multiple domains (HR, IT, …), different knowledge sources, and clean ownership boundaries, a single “do‑everything” agent turns into:

an unmaintainable prompt,
a tangled toolset,
and a retrieval strategy that mixes unrelated documents.

This post shows a production-shaped approach:

HR Agent: RAG over HR documents (AWS Knowledge Base ID configured via env)
IT Agent: RAG over IT documents (different KB ID)
Orchestrator Agent: routes queries by calling HR/IT agents over A2A, then synthesizes the final answer

All three are deployed as AgentCore Runtime apps: each handler is a BedrockAgentCoreApp entrypoint.

Strands Agents is a Python SDK for building production-ready LLM-based agents as first-class software components.
It abstracts prompt management, tool invocation, and agent-to-agent communication, so you can compose complex agent systems without hard-coding control flow or glue logic.

What we are building

Components

HR Agent
- Strands Agent
- Tool: strands_tools.retrieve
- Config: KNOWLEDGE_BASE_ID (HR KB)
- Exposed as an A2A endpoint via AgentCore Runtime
IT Agent
- Same as HR agent
- Config: KNOWLEDGE_BASE_ID (IT KB)
Orchestrator Agent
- Strands Agent
- Tools: ask_hr_agent, ask_it_agent (A2A client calls)
- Config: HR_AGENT_URL, IT_AGENT_URL
- Decides routing automatically (tool choice) and synthesizes a final answer

Runtime request flow

User sends a question to the Orchestrator (AgentCore Runtime endpoint)
Orchestrator LLM decides whether to use:
- ask_hr_agent()
- ask_it_agent()
- or both
Orchestrator calls the specialized agent(s) via A2A
Specialized agent uses retrieve tool (RAG) against its Knowledge Base
Orchestrator produces a final, human‑readable answer

1 — Configuration (env vars)

This architecture is intentionally configuration-first. No hardcoded URLs or KB IDs.

The Knowledge Bases are created with S3 Vectors as the vector store. Creating them is outside the scope of this post, since it only takes about five clicks in the console. Just collect the IDs once they are created.

The retrieve tool provided by strands_tools reads KNOWLEDGE_BASE_ID from the environment at runtime, so each agent queries its own Knowledge Base without any ID appearing in code.

HR Agent env

KNOWLEDGE_BASE_ID The Knowledge Base ID for HR content.

IT Agent env

KNOWLEDGE_BASE_ID The Knowledge Base ID for IT content.

Orchestrator env

HR_AGENT_URL A2A endpoint for the HR Agent (for example, an AgentCore Runtime service URL).

IT_AGENT_URL A2A endpoint for the IT Agent.

AWS_REGION (optional) Defaults to eu-central-1 in the handler.
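
The handlers in this post default missing variables to an empty string and report the problem when a tool is called. As an alternative, here is a minimal sketch of a fail-fast loader for the orchestrator settings. The names OrchestratorConfig and load_orchestrator_config are hypothetical and not part of the handlers shown later.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OrchestratorConfig:
    """Hypothetical container for the orchestrator's three settings."""
    hr_agent_url: str
    it_agent_url: str
    aws_region: str


def load_orchestrator_config(env=None) -> OrchestratorConfig:
    """Read the settings from a mapping (os.environ by default), failing fast."""
    env = os.environ if env is None else env
    # Both agent URLs are required; AWS_REGION is optional.
    missing = [name for name in ("HR_AGENT_URL", "IT_AGENT_URL") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required env vars: {', '.join(missing)}")
    return OrchestratorConfig(
        hr_agent_url=env["HR_AGENT_URL"],
        it_agent_url=env["IT_AGENT_URL"],
        aws_region=env.get("AWS_REGION", "eu-central-1"),
    )
```

Failing at startup surfaces a misconfigured deployment immediately, while the empty-string default used in the post keeps the container running and reports the problem per request. Which one fits depends on how you deploy.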

2 — HR Agent handler (AgentCore Runtime + RAG)

This is the complete HR agent handler.

Key design choices:

Lazy initialization: create the Strands agent only once per runtime container
Single tool: retrieve
Strict behavior: if retrieval doesn’t find anything relevant, say so

"""
HR Agent handler for AgentCore Runtime deployment with A2A support.

Requirements: 1.1, 1.2, 1.4
"""

import os
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Configuration
KNOWLEDGE_BASE_ID = os.getenv("KNOWLEDGE_BASE_ID", "")
os.environ["KNOWLEDGE_BASE_ID"] = KNOWLEDGE_BASE_ID

HR_SYSTEM_PROMPT = """
You are an expert HR assistant. You answer questions related to human resources,
HR policies, leave, benefits, recruitment, and people management.

Use the retrieval tool to search the HR knowledge base.
If you cannot find relevant information, clearly state it.

Always respond in a professional and concise manner.
"""

# Lazy agent initialization
_hr_agent = None


def get_hr_agent():
    """Lazy load HR agent."""
    global _hr_agent
    if _hr_agent is None:
        from strands import Agent
        from strands_tools import retrieve

        _hr_agent = Agent(
            tools=[retrieve],
            system_prompt=HR_SYSTEM_PROMPT
        )
        logger.info("HR agent initialized")
    return _hr_agent


def handle_request(prompt) -> str:
    """Handle HR agent requests."""
    try:
        prompt_str = str(prompt)
        agent = get_hr_agent()
        response = agent(prompt_str)
        # str() on the agent result yields the response text;
        # response.message is the structured message, not a plain string.
        return str(response)
    except Exception as e:
        logger.error(f"HR agent error: {e}")
        return f"Error HR agent: {e}"


# For AgentCore Runtime with A2A
try:
    from bedrock_agentcore.runtime import BedrockAgentCoreApp
    app = BedrockAgentCoreApp()
    logger.info("HR BedrockAgentCoreApp initialized")


    @app.entrypoint
    def handler(prompt: str) -> str:
        # The payload may not be a plain string, so convert before slicing.
        logger.info(f"HR received: {str(prompt)[:100]}...")
        return handle_request(prompt)
except ImportError as e:
    logger.warning(f"bedrock_agentcore.runtime not available: {e}")
    app = None

Why `os.environ["KNOWLEDGE_BASE_ID"] = KNOWLEDGE_BASE_ID`?

Some tooling layers (including retrieval integrations) resolve configuration directly from process environment. By forcing it into os.environ, you ensure:

consistent behavior in local runs
consistent behavior in AgentCore Runtime containers
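
To make the effect concrete, here is a small sketch of the same read-then-write-back pattern. The helper pin_env and the function library_reads_kb_id are hypothetical; the second one only stands in for a library that reads its configuration straight from the process environment.

```python
import os


def pin_env(name: str, default: str = "") -> str:
    """Read an env var with a default, then write the value back to os.environ."""
    value = os.getenv(name, default)
    # After this line, a direct os.environ[name] lookup can no longer raise KeyError.
    os.environ[name] = value
    return value


def library_reads_kb_id() -> str:
    """Hypothetical stand-in for a tool that reads its config directly."""
    return os.environ["KNOWLEDGE_BASE_ID"]


KNOWLEDGE_BASE_ID = pin_env("KNOWLEDGE_BASE_ID")
print(repr(library_reads_kb_id()))
```

Note that an empty ID only moves the failure to retrieval time, so it is still worth checking the value before serving traffic.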

What “RAG” means here

In this handler, RAG is not manual (no explicit vector DB queries in your code).
Instead, the Strands agent uses the retrieve tool:

the LLM decides when to call retrieve
the tool pulls relevant chunks from your Knowledge Base
the response is generated with that context
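
For contrast, this is roughly what the manual version of that retrieval step could look like if you called the Knowledge Base yourself with boto3. It is a sketch, not what the retrieve tool does internally: the function names, the top_k value, and the 0.4 score threshold are assumptions made for illustration, and it needs boto3 plus valid AWS credentials to actually run a query.

```python
def retrieve_chunks(query: str, kb_id: str, region: str = "eu-central-1", top_k: int = 5):
    """Query a Bedrock Knowledge Base directly and return the raw results."""
    import boto3  # lazy import, same style as the handlers in this post

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return response["retrievalResults"]


def build_context(results, min_score: float = 0.4) -> str:
    """Keep chunks above an (assumed) score threshold, best first, as one text block."""
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return "\n\n".join(r["content"]["text"] for r in kept)
```

With the tool-based approach, none of this lives in your handler: the model decides when to retrieve, and the tool owns the query and filtering details.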

3 — IT Agent handler (AgentCore Runtime + RAG)

This is the complete IT agent handler.

Same architecture as HR:

lazy init
tool = retrieve

"""
IT Agent handler for AgentCore Runtime deployment with A2A support.

Requirements: 2.1, 2.2, 2.4
"""

import os
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Configuration
KNOWLEDGE_BASE_ID = os.getenv("KNOWLEDGE_BASE_ID", "")
os.environ["KNOWLEDGE_BASE_ID"] = KNOWLEDGE_BASE_ID

IT_SYSTEM_PROMPT = """
You are an expert IT assistant. You answer questions related to IT,
systems, software, technical support, cybersecurity, and infrastructure.

Use the retrieval tool to search the IT knowledge base.
If you cannot find relevant information, clearly state it.

Always respond in a professional and concise manner.
"""

# Lazy agent initialization
_it_agent = None


def get_it_agent():
    """Lazy load IT agent."""
    global _it_agent
    if _it_agent is None:
        from strands import Agent
        from strands_tools import retrieve

        _it_agent = Agent(
            tools=[retrieve],
            system_prompt=IT_SYSTEM_PROMPT
        )
        logger.info("IT agent initialized")
    return _it_agent


def handle_request(prompt) -> str:
    """Handle IT agent requests."""
    try:
        prompt_str = str(prompt)
        agent = get_it_agent()
        response = agent(prompt_str)
        # str() on the agent result yields the response text;
        # response.message is the structured message, not a plain string.
        return str(response)
    except Exception as e:
        logger.error(f"IT agent error: {e}")
        return f"Error IT agent: {e}"


# For AgentCore Runtime with A2A
try:
    from bedrock_agentcore.runtime import BedrockAgentCoreApp
    app = BedrockAgentCoreApp()
    logger.info("IT BedrockAgentCoreApp initialized")


    @app.entrypoint
    def handler(prompt: str) -> str:
        # The payload may not be a plain string, so convert before slicing.
        logger.info(f"IT received: {str(prompt)[:100]}...")
        return handle_request(prompt)
except ImportError as e:
    logger.warning(f"bedrock_agentcore.runtime not available: {e}")
    app = None

Why separate agents instead of one big agent with two KBs?

Because the separation buys you:

independent prompts
independent retrieval scopes (no “HR policy” leaking into IT answers)
independent deployment/scaling
simpler evaluation (HR answers vs IT answers)

4 — Orchestrator handler (AgentCore Runtime + A2A tools)

Now the interesting part: the orchestrator.

Your orchestrator is a Strands agent with two tools:

ask_hr_agent
ask_it_agent

Those tools are wrappers around an A2A client call.

Orchestrator key behaviors

It does not use retrieve itself
It delegates to specialized agents
It may call both if needed

"""
Orchestrator Agent handler for AgentCore Runtime deployment.

Uses Strands Agent with HR/IT agents as tools - the model decides routing.

Requirements: 3.1, 3.2, 3.3, 3.4, 3.5
"""

import os
import logging
import asyncio
from typing import Any

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Configuration
AWS_REGION = os.getenv("AWS_REGION", "eu-central-1")
HR_AGENT_URL = os.getenv("HR_AGENT_URL", "")
IT_AGENT_URL = os.getenv("IT_AGENT_URL", "")

# Lazy initialization
_orchestrator_agent = None


async def _call_a2a(url: str, query: str) -> str:
    """Internal A2A call."""
    from a2a.client import A2AClient

    async with A2AClient(url) as client:
        response = await client.send_message(query)
        if hasattr(response, 'message') and response.message:
            for part in response.message.parts:
                if hasattr(part, 'text'):
                    return part.text
        return str(response)


def ask_hr_agent(query: str) -> str:
    """
    Ask the specialized HR agent.

    Use this tool for questions about:
    - HR policies, leave, payroll
    - Recruitment, hiring, contracts
    - Training, performance reviews
    - Benefits, insurance, retirement
    - Remote work, absences
    """
    if not HR_AGENT_URL:
        return "Error: HR agent not configured"

    try:
        return asyncio.run(_call_a2a(HR_AGENT_URL, query))
    except Exception as e:
        logger.error(f"HR agent error: {e}")
        return f"Error HR agent: {e}"


def ask_it_agent(query: str) -> str:
    """
    Ask the specialized IT agent.

    Use this tool for questions about:
    - Computers, software, support
    - Network, Wi‑Fi, VPN, internet
    - Passwords, accounts, access
    - Security, antivirus
    - Email, Teams, apps
    """
    if not IT_AGENT_URL:
        return "Error: IT agent not configured"

    try:
        return asyncio.run(_call_a2a(IT_AGENT_URL, query))
    except Exception as e:
        logger.error(f"IT agent error: {e}")
        return f"Error IT agent: {e}"


ORCHESTRATOR_SYSTEM_PROMPT = """You are an orchestrator assistant for employees.

You have access to two specialized agents:
- HR Agent: for all human resources questions (leave, payroll, contracts, training, etc.)
- IT Agent: for all IT-related questions (software, hardware, technical support, etc.)

Analyze each question and use the appropriate agent(s) to answer.
If a question spans both domains, consult both agents.

After receiving the responses, synthesize a clear and complete final answer.
"""


def get_orchestrator_agent():
    """Get or create the orchestrator agent with HR/IT tools."""
    global _orchestrator_agent
    if _orchestrator_agent is None:
        from strands import Agent, tool

        # Wrap the plain functions with Strands' tool decorator so the agent
        # registers them; their docstrings become the tool descriptions.
        _orchestrator_agent = Agent(
            tools=[tool(ask_hr_agent), tool(ask_it_agent)],
            system_prompt=ORCHESTRATOR_SYSTEM_PROMPT
        )
        logger.info("Orchestrator agent initialized with HR/IT tools")
    return _orchestrator_agent


def handle_request(prompt) -> str:
    """Handle orchestrator requests."""
    try:
        prompt_str = str(prompt)
        agent = get_orchestrator_agent()
        response = agent(prompt_str)
        # str() on the agent result yields the response text;
        # response.message is the structured message, not a plain string.
        return str(response)
    except Exception as e:
        logger.error(f"Orchestration error: {e}")
        return f"Error: {e}"


# For AgentCore Runtime
try:
    from bedrock_agentcore.runtime import BedrockAgentCoreApp
    app = BedrockAgentCoreApp()
    logger.info("Orchestrator BedrockAgentCoreApp initialized")


    @app.entrypoint
    def handler(prompt: str) -> str:
        return handle_request(prompt)
except ImportError as e:
    logger.warning(f"bedrock_agentcore.runtime not available: {e}")
    app = None

Why implement HR/IT calls as tools?

Because it makes routing model-driven instead of hard-coded.

You’re giving the orchestrator:

a system prompt describing the domains
tool docstrings that describe when to use them
a stable interface (ask_hr_agent(query) -> str)

The LLM chooses the right tool(s), and you keep your Python logic minimal.
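
To see why the docstrings matter, here is a small sketch of the information a routing model has to work with: the tool name, its description, and its parameters. The describe_tool helper is hypothetical and only illustrates the idea; Strands builds its own tool specification, and the stub below shortens the real docstring.

```python
import inspect


def ask_hr_agent(query: str) -> str:
    """
    Ask the specialized HR agent.

    Use this tool for questions about:
    - HR policies, leave, payroll
    """
    return f"(stub) HR answer for: {query}"


def describe_tool(func) -> dict:
    """Summarize a function the way a routing model might see it."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": list(signature.parameters),
    }


spec = describe_tool(ask_hr_agent)
print(spec["name"])
print(spec["description"])
```

If two tools have vague or overlapping descriptions, the model has nothing else to separate them, so the docstring is effectively part of your routing logic.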

5 — A2A call parsing

A2A responses often come as structured messages with parts.
The parsing strategy is resilient:

if hasattr(response, 'message') and response.message:
    for part in response.message.parts:
        if hasattr(part, 'text'):
            return part.text
return str(response)

Why this matters:

You don’t assume a single “text blob” response
You gracefully fall back to str(response)
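
The same idea can be exercised without a network call. The sketch below is a variant, not the code from the handler: it joins every text part instead of returning the first one, and it uses SimpleNamespace objects as stand-ins for real A2A responses. The function name extract_text is hypothetical.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Join all text parts of a message-shaped response, else fall back to str()."""
    message = getattr(response, "message", None)
    parts = getattr(message, "parts", None) or []
    texts = [part.text for part in parts if getattr(part, "text", None)]
    return "\n".join(texts) if texts else str(response)


# Fake response with two text parts and one non-text part.
fake_response = SimpleNamespace(
    message=SimpleNamespace(
        parts=[
            SimpleNamespace(text="You have 25 days of leave."),
            SimpleNamespace(data={"source": "policy.pdf"}),
            SimpleNamespace(text="Carry-over is limited to 5 days."),
        ]
    )
)
print(extract_text(fake_response))
```

Whether to keep only the first part or join them all depends on how the specialized agents structure their replies, so treat the join as a choice rather than a requirement.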

6 — Lazy initialization (production detail that matters)

All three handlers use lazy init (_agent = None).

Why it’s important in AgentCore Runtime:

cold start cost is paid once per container
subsequent requests reuse the same agent instance
lower latency and less overhead

This is the kind of detail that separates a demo from something you can run for real users.

7 — How routing actually happens

There is no hard-coded branch such as if "vpn" in query: ... anywhere in the code.

Instead:

The orchestrator receives the question
It decides which tool(s) to call based on system prompt + tool descriptions
It synthesizes a final answer

Example behaviors:

HR-only question → calls ask_hr_agent
IT-only question → calls ask_it_agent
Mixed question → calls both, then merges
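
Because routing is decided by the model, it helps to observe which tools were actually called for a given question. One possible way, sketched here with a hypothetical tracked decorator and a stub tool, is to record each invocation while keeping the function name and docstring intact.

```python
import functools


def tracked(calls: list):
    """Decorator factory: record the name of each tool invocation in `calls`."""
    def decorator(func):
        @functools.wraps(func)  # keep __name__ and the docstring intact
        def wrapper(query: str) -> str:
            calls.append(func.__name__)
            return func(query)
        return wrapper
    return decorator


calls = []


@tracked(calls)
def ask_hr_agent(query: str) -> str:
    """Stub HR tool used only for this illustration."""
    return f"hr:{query}"


ask_hr_agent("How many vacation days do I get?")
print(calls)
```

With real tools wrapped this way, a small set of sample questions can be replayed and the recorded calls compared with the routing you expect.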

8 — Failure modes and debugging

Your code already includes the essentials:

INFO logging
explicit “not configured” messages
tool-level error handling

Common issues:

Missing URL

If HR_AGENT_URL is empty:

ask_hr_agent returns Error: HR agent not configured

Network/A2A issues

If A2A fails:

error is logged
tool returns Error HR agent: ... / Error IT agent: ...

The orchestrator can still produce a partial answer if only one agent fails.

9 — Practical hardening ideas (optional)

These keep the same architecture, but improve resilience:

Timeouts & retries for A2A calls (network I/O).
Avoid asyncio.run() if you ever run inside an existing event loop.
Return structured tool outputs (citations, metadata) if you need auditability later.
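
The first two ideas can be sketched with the standard library only. The helper names call_with_retry and run_sync are hypothetical, and the attempt count, timeout, backoff, and the set of retried exceptions are assumptions to adapt to your A2A client.

```python
import asyncio
import concurrent.futures


async def call_with_retry(make_call, attempts=3, timeout_s=10.0, backoff_s=0.5):
    """Await make_call() with a per-attempt timeout and simple linear backoff."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(backoff_s * attempt)
    raise RuntimeError(f"A2A call failed after {attempts} attempts") from last_error


def run_sync(coro_factory):
    """Run a coroutine from sync code, even if an event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(coro_factory())
    # A loop is already running here: use a worker thread with its own loop.
    # This blocks the calling thread until the result is available.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(coro_factory())).result()
```

Inside a tool, this could replace the direct asyncio.run(...) call, for example run_sync(lambda: call_with_retry(lambda: _call_a2a(HR_AGENT_URL, query))).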

Why this architecture scales

Adding a new domain agent (Finance / Legal / Security) is just repetition of the same pattern:

1) new RAG agent handler with retrieve + domain prompt + KNOWLEDGE_BASE_ID

2) new orchestrator tool ask_finance_agent calling A2A

3) add tool to orchestrator’s tools=[...]

No refactor required.
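
If the number of domains grows, the repeated ask_*_agent functions could be generated instead of copied. This is a sketch under assumptions: make_agent_tool is a hypothetical helper, and call_agent is any synchronous callable taking (url, query), for example a thin wrapper around the A2A call used by the orchestrator.

```python
import os


def make_agent_tool(name: str, url_env: str, description: str, call_agent):
    """Build an ask_<domain>_agent function around an injected A2A caller."""
    def ask(query: str) -> str:
        url = os.getenv(url_env, "")
        if not url:
            return f"Error: {name} agent not configured"
        try:
            return call_agent(url, query)
        except Exception as exc:
            return f"Error {name} agent: {exc}"

    # The name and docstring are what the orchestrator model uses for routing.
    ask.__name__ = f"ask_{name.lower()}_agent"
    ask.__doc__ = description
    return ask


ask_finance_agent = make_agent_tool(
    "Finance",
    "FINANCE_AGENT_URL",
    "Ask the specialized Finance agent about budgets, expenses, and invoices.",
    call_agent=lambda url, query: f"(stub) {url} answered: {query}",
)
```

The generated function keeps the same interface as the hand-written tools, so it can be added to the orchestrator's tool list in the same way.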

Final thoughts

If you already know how to build RAG (or deterministic tools), the next step is orchestration.

This pattern gives you:

domain isolation
deployable units (AgentCore Runtime)
model-driven routing (tools)
a clean evolution path

It’s basically microservices:

small, focused services with a clear interface — except the services can reason.

Top comments (7)

Corey Strausman • Jan 14

This is great! Have you considered applying for the AWS Community Builders program? builder.aws.com/community/communit...

Meidi Airouche Onepoint • Jan 14

Thank you very much for your feedback. I've applied twice, I think, but unfortunately I've never been accepted.

Corey Strausman • Jan 14

Well, let's see if we can change that this year! Feel free to email me at coreydst@amazon.com if you want to chat more.

Corey Strausman • Jan 14

Feel free to apply and let me know if you have any questions coreydst@amazon.com

Meidi Airouche Onepoint • Jan 14

That's kind of you, thank you very much. I'll do it for sure.

Kosmik Onepoint • Jan 15

Many thanks for this post. What would be the deciding factor that makes you switch between an explicit workflow and LLM orchestration? If both tools are called, is there good context isolation between them?

Meidi Airouche Onepoint • Jan 15

Good question.

I use explicit workflows when the sequence of steps is deterministic, compliance-driven, or must be fully controlled.

I use LLM orchestration when the problem is about intent understanding, routing, or synthesis and when flexibility matters more than strict determinism.

When multiple tools are called, context isolation is handled by design: each agent runs in its own runtime, with its own system prompt and knowledge base. The orchestrator only receives the agents’ outputs, not their internal retrieval context, which keeps domains cleanly separated.
