Building intelligent conversational agents that can maintain context, integrate with external tools, and scale to production workloads requires careful architectural decisions. This post explores how we built a production-ready AWS DevOps agent using the Strands Agents SDK and Amazon Bedrock AgentCore, demonstrating key patterns for scalable agent development.
Amazon Bedrock AgentCore
Amazon Bedrock AgentCore is a complete set of capabilities for deploying and operating agents securely and at scale, using any agentic framework and any LLM. With it, developers can move AI agents into production quickly, shortening the time to business value.
Amazon Bedrock AgentCore provides tools and capabilities to make agents more effective and capable, purpose-built infrastructure to securely scale agents, and controls to operate trustworthy agents.
Amazon Bedrock AgentCore capabilities are composable and work with popular open-source frameworks and any model, so you don’t have to choose between open-source flexibility and enterprise-grade security and reliability.
Strands Agents SDK
The Strands Agents SDK is an open-source framework for building AI agents that emphasizes a model-driven approach. Instead of hardcoding complex task flows, Strands relies on the reasoning abilities of modern large language models (LLMs) to handle planning and tool usage autonomously. Developers create an agent from a prompt (defining the agent's role or behavior) and a list of tools, and the LLM-powered agent figures out how to chain its reasoning and invoke tools as needed. This dramatically simplifies agent development compared to traditional workflow-based frameworks.
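To make that concrete, here is a minimal sketch using the Strands Agents SDK; the tool itself is a hypothetical placeholder:

# Minimal Strands sketch: the model plans and invokes tools on its own.
# check_service_health is a hypothetical placeholder tool.
from strands import Agent, tool

@tool
def check_service_health(service_name: str) -> str:
    """Return a simple health status for the named service."""
    return f"{service_name}: healthy"  # stand-in for a real check

agent = Agent(
    system_prompt="You are an AWS DevOps assistant.",
    tools=[check_service_health],
)
agent("Is the payments service healthy?")  # the LLM decides when to call the tool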
Technical Architecture Deep Dive
The architecture implements several key patterns:
- Event-Driven Memory: Every interaction generates events stored in AgentCore Memory with semantic embeddings for contextual retrieval.
- Stateless Agent Core: The agent itself maintains no state, relying entirely on external memory and configuration services.
- Gateway Abstraction: The MCP protocol abstracts tool complexity, allowing the agent to focus on conversation flow while tools handle specialized operations.
- Configuration as Code: All configuration is stored in SSM Parameter Store with automatic resource discovery and creation.
Memory-First Architecture
Traditional chatbots lose context between sessions. We implemented persistent memory as a first-class citizen:
# Memory hooks automatically capture interactions
class DevOpsAgentMemoryHooks(MemoryHooks):
    def on_before_query(self, query: str, context: dict) -> dict:
        # Retrieve relevant context before processing
        return self.memory_client.retrieve_context(query)

    def on_after_response(self, query: str, response: str, context: dict):
        # Persist interaction for future sessions
        self.memory_client.create_event(query, response, context)
This approach enables the agent to remember user preferences, past conversations, and build contextual understanding over time.
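Inside the hooks, the memory client does the actual work. A hedged sketch of the kind of AgentCore Memory SDK calls such hooks can wrap (the IDs are assumed to come from configuration):

# Sketch of AgentCore Memory calls (bedrock-agentcore SDK) the hooks can wrap.
# memory_id, actor_id, and session_id are assumed to come from configuration.
from bedrock_agentcore.memory import MemoryClient

memory = MemoryClient(region_name="us-east-1")

# on_after_response: persist the turn as an event
memory.create_event(
    memory_id=memory_id, actor_id=actor_id, session_id=session_id,
    messages=[("How do I check pod status?", "USER"),
              ("Use kubectl get pods -n <namespace>.", "ASSISTANT")],
)

# on_before_query: semantic retrieval of relevant context
memories = memory.retrieve_memories(
    memory_id=memory_id, namespace=f"/users/{actor_id}",
    query="kubernetes troubleshooting preferences",
)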
MCP Gateway Integration Pattern
Instead of direct Lambda integration, we adopted MCP through Bedrock AgentCore Gateway:
# Clean separation: Agent focuses on conversation, tools handle operations
agent = BedrockAgentCoreApp(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    memory_hooks=DevOpsAgentMemoryHooks(),
    gateway_config=gateway_config  # MCP tools via gateway
)
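From the agent's side, the gateway is simply another MCP endpoint. A sketch of consuming its tools from Strands (the gateway URL and bearer token are assumed to come from configuration):

# Sketch: discover and use gateway tools over MCP from a Strands agent.
# gateway_url and access_token are assumed to come from configuration.
from mcp.client.streamable_http import streamablehttp_client
from strands import Agent
from strands.tools.mcp import MCPClient

gateway = MCPClient(lambda: streamablehttp_client(
    gateway_url, headers={"Authorization": f"Bearer {access_token}"}))

with gateway:
    tools = gateway.list_tools_sync()  # Lambda-backed tools behind the gateway
    agent = Agent(tools=tools)
    agent("What is the current CPU usage across the cluster?")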
Benefits:
- Decoupled Architecture: Lambda functions operate independently
- Secure Authentication: JWT tokens and OAuth2 Client Credentials flow
- Scalable Tool Access: New tools added without agent code changes
- Production Ready: Built-in monitoring and error handling
Microservices Lambda Architecture
We demonstrate Lambda best practices with specialized functions:
lambda/
├── websearch/ # DuckDuckGo search integration
├── prometheus/ # Monitoring functions (4 specialized)
│ ├── lambda_query.py # Instant queries
│ ├── lambda_range_query.py # Time series
│ ├── lambda_list_metrics.py # Discovery
│ └── lambda_server_info.py # Config
└── eks/ # Kubernetes management functions
Each function follows the single-responsibility principle, with right-sized resources and independent scaling that can be tuned further for your use cases.
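To illustrate the single-purpose shape, here is a hedged sketch of what one specialized function (an instant Prometheus query) can look like; the environment variable and event shape are assumptions for illustration:

# Hypothetical sketch of a single-purpose function: one instant PromQL query.
# PROMETHEUS_ENDPOINT and the event shape are assumptions, not the repo's code.
import os
import urllib.parse
import urllib.request

def lambda_handler(event, context):
    endpoint = os.environ["PROMETHEUS_ENDPOINT"]
    promql = urllib.parse.quote(event["query"])
    with urllib.request.urlopen(f"{endpoint}/api/v1/query?query={promql}") as resp:
        return {"statusCode": 200, "body": resp.read().decode("utf-8")}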
Implementation Highlights
Multi-Access Deployment
The agent supports four deployment modes:
- Local CLI (agent.py): Interactive development with full feature access
- Local Runtime (agent_runtime.py): HTTP API for testing production behavior (see the sketch after this list)
- Streamlit Web UI (streamlit/streamlit_app.py): Modern web interface for end users
- AgentCore Runtime: Containerized production deployment with auto-scaling
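The local runtime mode exposes AgentCore's standard HTTP contract (POST /invocations on port 8080), which you can exercise directly. A small sketch, with the payload shape assumed:

# Exercise the local runtime over HTTP (AgentCore's /invocations contract).
# The payload key is an assumption; match whatever agent_runtime.py parses.
import requests

resp = requests.post("http://localhost:8080/invocations",
                     json={"prompt": "List my EKS clusters"})
print(resp.json())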
Authentication & Security
OAuth2 Client Credentials flow with Amazon Cognito provides secure service access:
# Secure token exchange for MCP gateway access. Cognito issues
# client-credentials tokens via its OAuth2 token endpoint (not InitiateAuth).
import requests

token_response = requests.post(
    f"https://{cognito_domain}/oauth2/token",  # domain from configuration
    data={"grant_type": "client_credentials",
          "client_id": client_id, "client_secret": client_secret,
          "scope": "gateway/invoke"},
)
access_token = token_response.json()["access_token"]
Configuration Management
SSM Parameter Store centralizes configuration with automatic discovery:
# Self-configuring memory resources
memory_id = utils.get_ssm_parameter('/app/devopsagent/agentcore/memory_id')
if not memory_id:
    memory_resource = memory_client.create_memory_resource()
    utils.put_ssm_parameter('/app/devopsagent/agentcore/memory_id', memory_resource.id)
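The helpers themselves are thin boto3 wrappers; a minimal sketch of what get_ssm_parameter and put_ssm_parameter can look like:

# Minimal sketch of the SSM helpers used above (thin boto3 wrappers).
import boto3

ssm = boto3.client("ssm")

def get_ssm_parameter(name: str):
    # Return the parameter value, or None if it does not exist yet
    try:
        return ssm.get_parameter(Name=name)["Parameter"]["Value"]
    except ssm.exceptions.ParameterNotFound:
        return None

def put_ssm_parameter(name: str, value: str):
    ssm.put_parameter(Name=name, Value=value, Type="String", Overwrite=True)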
Streamlit Web Interface
A modern web UI provides user-friendly access to the agent through Streamlit:
class StreamlitAgentInterface:
    def invoke_agent(self, prompt, session_id=None):
        # Direct integration with AgentCore Runtime
        response = self.client.invoke_agent_runtime(
            agentRuntimeArn=agent_runtime_arn,
            runtimeSessionId=session_id,
            payload=json.dumps(payload).encode('utf-8')
        )
        return json.loads(response['response'].read())
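Wrapped in Streamlit's chat primitives, the interface becomes a few lines; a sketch of the surrounding chat loop (session handling simplified):

# Sketch of the chat loop around invoke_agent, using Streamlit's chat API.
import streamlit as st

interface = StreamlitAgentInterface()

if prompt := st.chat_input("Ask a DevOps question"):
    st.chat_message("user").write(prompt)
    result = interface.invoke_agent(prompt,
                                    session_id=st.session_state.get("session_id"))
    st.chat_message("assistant").write(result)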
Key Features:
- Real-time Chat Interface: Interactive conversation with persistent sessions
- AWS-Themed Styling: Professional UI with custom CSS and responsive design
- Example Prompts: Pre-built queries for common DevOps scenarios
- Session Management: New session creation and chat history clearing
- Mobile-Friendly: Responsive design for various screen sizes
- Direct Runtime Integration: No additional API layer required
Production Deployment
Containerized Runtime
ARM64-optimized Docker container deployed to AgentCore Runtime:
# AgentCore Runtime expects an HTTP server on port 8080, not a Lambda handler
FROM --platform=linux/arm64 python:3.11-slim
COPY requirements.txt agent_runtime.py utils.py ./
RUN pip install -r requirements.txt
EXPOSE 8080
CMD ["python", "agent_runtime.py"]
Performance Characteristics
- Basic Queries: 4-5 seconds response time
- Web Search: 30-60 seconds (external API dependent)
- Memory Retrieval: <1 second with semantic search
- Cold Start: Optimized with shared utilities pattern
Monitoring & Observability
Comprehensive logging and monitoring through CloudWatch:
# Structured logging for production debugging
logger.info("Processing query", extra={
    "session_id": session_id,
    "model": model_id,
    "tools_used": tools_used,
    "response_time": response_time
})
Lessons Learned
1. Memory as Infrastructure
Treating memory as infrastructure rather than an afterthought enables sophisticated conversational experiences. The agent learns user preferences and maintains context across sessions.
2. Gateway Pattern for Tools
MCP through AgentCore Gateway provides cleaner integration than direct Lambda invocation; the pattern scales better and offers stronger security isolation.
3. Right-Sized Functions
Breaking monolithic Lambda functions into specialized microservices (query vs. range query vs. metrics) improves performance and cost efficiency.
4. Configuration Automation
Self-configuring resources through SSM Parameter Store reduces deployment complexity and enables environment-specific configurations.
Results
The production deployment demonstrates:
- ✅ Cross-session memory with 100% persistence success rate
- ✅ Secure authentication with JWT tokens and OAuth2 flow
- ✅ Scalable tool integration through MCP gateway
- ✅ Production performance with 4-5s response times
- ✅ Microservices architecture with specialized Lambda functions
- ✅ Modern web interface with Streamlit providing user-friendly access
- ✅ Multiple deployment modes spanning CLI, web UI, and production runtime
Getting Started
The complete implementation is available on GitHub with comprehensive documentation, deployment scripts, and testing utilities. The project demonstrates production-ready patterns for building scalable conversational agents with persistent memory and secure tool integration.
Repository: aws-devops-strands-agentcore
This architecture provides a solid foundation for building sophisticated conversational agents that can maintain context, integrate with external services, and scale to production workloads while following AWS best practices for security and performance.