Week 4 - Architecting AI Systems: Scaling with Bedrock AgentCore and Designing at Speed
Batch 09 – BeSA Cloud Academy
Disclaimer:
These notes were drafted using AI for clarity, structure, and readability. They are intended solely for learning purposes.
These are the structured notes from Week 4, focused only on the two role plays. They serve as a quick revision for those who attended the session and a concise recap for anyone who couldn’t make it.
Role Play 1 – Architecting for Production with Amazon Bedrock AgentCore
Context
This conversation focused on a healthcare technology company that had successfully built agent prototypes using the Strands framework but was struggling to move those agents into production.
The company’s use case involves AI agents assisting with clinical documentation processing across multiple hospitals.
Prototype to Production Challenge
The customer described several challenges they were facing while scaling their prototype.
High concurrency requirements
- The organization has about 500 physicians across three hospitals.
- Each physician handles 20–30 patients.
- This results in thousands of concurrent agent sessions.
- Their current system cannot handle this scale.
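As a rough sanity check on the "thousands of sessions" figure, the numbers above can be multiplied out. This is only a back-of-the-envelope sketch: it assumes, hypothetically, that every patient context maps to one live agent session, which gives an upper bound rather than a measured concurrency level.

```python
# Back-of-the-envelope sizing from the figures quoted in the session.
PHYSICIANS = 500
PATIENTS_PER_PHYSICIAN = (20, 30)  # low and high ends of the quoted range


def session_upper_bound(physicians: int, patients_each: int) -> int:
    """Worst case (assumption): every patient context is one live agent session."""
    return physicians * patients_each


low = session_upper_bound(PHYSICIANS, PATIENTS_PER_PHYSICIAN[0])
high = session_upper_bound(PHYSICIANS, PATIENTS_PER_PHYSICIAN[1])
print(f"Upper bound: {low:,} to {high:,} sessions")
```

Real peak concurrency would likely be lower, since not every patient is being documented at the same moment, but the bound shows why a single-instance prototype struggles.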
Sensitive healthcare data
- The system handles PHI (Protected Health Information).
- HIPAA compliance requires strict isolation between sessions.
- Each physician should only access data relevant to their patients.
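The isolation requirement can be pictured as a check that runs before any record is read. The sketch below is a minimal illustration, not how any AWS service implements it; `Session`, `AccessDenied`, and `fetch_record` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Session:
    """One agent session, bound to exactly one physician (hypothetical model)."""
    session_id: str
    physician_id: str
    patient_ids: frozenset = field(default_factory=frozenset)


class AccessDenied(Exception):
    """Raised when a session asks for a patient outside its assignment."""


def fetch_record(session: Session, patient_id: str, records: dict) -> str:
    # Check the session's own patient list before touching the data store.
    if patient_id not in session.patient_ids:
        raise AccessDenied(f"{session.physician_id} is not assigned to {patient_id}")
    return records[patient_id]


records = {"p-001": "note A", "p-002": "note B"}
session = Session("s-1", "dr-lee", frozenset({"p-001"}))
print(fetch_record(session, "p-001", records))
```

Because the session object is frozen, code running inside the session cannot widen its own patient list after creation.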
External system integrations
The agents must connect to:
- Insurance systems
- Healthcare databases
- External APIs
These systems require different authentication methods such as:
- API keys
- OAuth-based access.
Observability and auditability
- Healthcare systems require full traceability.
It must be possible to track:
- Who accessed what
- When it was accessed
- What actions were performed.
The customer emphasized that a “black box” AI system is not acceptable in this environment.
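The who, what, and when items listed above map naturally onto a structured audit record. The following is a minimal sketch under stated assumptions: the `AuditEvent` fields and the `record_event` helper are hypothetical, and an in-memory list stands in for whatever durable log store a real deployment would use.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    actor: str      # who accessed
    resource: str   # what was accessed
    action: str     # what was done
    timestamp: str  # when, ISO 8601 in UTC


def record_event(actor: str, resource: str, action: str, log: list) -> AuditEvent:
    """Append one JSON line per event so the log stays easy to search."""
    event = AuditEvent(actor, resource, action, datetime.now(timezone.utc).isoformat())
    log.append(json.dumps(asdict(event), sort_keys=True))
    return event


audit_log = []
record_event("dr-lee", "patient/p-001/notes", "read", audit_log)
print(audit_log[0])
```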
The Prototype-to-Production Gap
The solutions architect explained that this situation is very common.
Many teams successfully build prototypes, but production systems introduce additional challenges.
Four key areas typically create the gap:
Performance
- Managing large numbers of concurrent agent sessions.
Scaling
- Infrastructure bottlenecks appear when workloads grow.
Security
- Ensuring agents only access data allowed for a specific user.
Governance
- Tracking activity, usage, and access for compliance purposes.
Introduction to Amazon Bedrock AgentCore
To address these challenges, the conversation introduced Amazon Bedrock AgentCore.
AgentCore is described as a modular suite of services designed to help move agent applications into production environments.
A key point mentioned was that existing Strands agent code does not need to be rewritten.
Instead, AgentCore services can be integrated alongside existing frameworks.
AgentCore Architecture Components
AgentCore consists of multiple modular services that can be adopted based on requirements.
Runtime
The runtime provides a serverless environment for deploying agents.
Key capabilities include:
- Wrapping the agent in a container.
- Exposing an endpoint for interaction.
- Automatically scaling from zero to thousands of concurrent sessions.
This eliminates the need to manage infrastructure manually.
Session Management
Clinical workflows often involve long interactions.
The runtime supports sessions lasting up to eight hours.
This enables complex medical documentation workflows to complete without interruption.
Identity and Access Management
Two types of identity control were discussed.
Inbound authentication
- Ensures only authorized users can access the agent.
- Can integrate with identity providers such as Cognito or Okta.
Outbound authorization
- Allows the agent to securely access external systems.
- API credentials are stored in a credential provider backed by Secrets Manager.
- This prevents sensitive credentials from being exposed in the application.
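The outbound pattern can be sketched as "look the secret up at call time, never embed it in code". This is an illustrative sketch only: `make_header_builder` and the lookup callable are hypothetical, the header names are common conventions rather than anything stated in the session, and a plain dict stands in for the managed credential provider.

```python
from typing import Callable

# Hypothetical stand-in for a managed secret store lookup: name -> secret value.
SecretLookup = Callable[[str], str]


def make_header_builder(lookup: SecretLookup) -> Callable[[str, str], dict]:
    """Return a function that builds auth headers, fetching the secret per call."""
    def build_headers(secret_name: str, scheme: str) -> dict:
        secret = lookup(secret_name)  # resolved at call time, never hard-coded
        if scheme == "api_key":
            return {"x-api-key": secret}
        if scheme == "oauth":
            return {"Authorization": f"Bearer {secret}"}
        raise ValueError(f"unsupported auth scheme: {scheme}")
    return build_headers


# Fake store for the demo; a real system would call the credential provider.
fake_store = {"insurance-api": "k123", "ehr-oauth": "tok456"}
build_headers = make_header_builder(fake_store.__getitem__)
print(sorted(build_headers("insurance-api", "api_key")))
```

The agent code only ever sees a secret name, so rotating a credential does not require redeploying the agent.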
Gateway and Tool Management
When agents interact with many tools, a new challenge emerges.
If every tool is included in the prompt context, it can:
- Increase token usage.
- Expand the context window unnecessarily.
- Reduce response accuracy.
The AgentCore Gateway addresses this problem.
It indexes tools such as:
- APIs
- Lambda functions
- MCP servers.
Using semantic search, the gateway retrieves only the tools relevant to a specific request.
Benefits include:
- Reduced token usage.
- Lower operational costs.
- Improved response accuracy.
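The retrieval idea described above can be illustrated with a toy ranking function. This is a sketch, not the Gateway's actual mechanism: it swaps learned embeddings for simple bag-of-words cosine similarity, and the tool catalog and function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical tool catalog: tool name -> natural-language description.
TOOLS = {
    "check_insurance_eligibility": "verify patient insurance coverage and eligibility",
    "fetch_lab_results": "retrieve laboratory test results for a patient",
    "schedule_appointment": "book or reschedule a clinic appointment",
    "summarize_clinical_note": "summarize a physician clinical note",
}


def vectorize(text: str) -> Counter:
    """Bag-of-words counts; a real system would use learned embeddings."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_tools(query: str, tools: dict, top_k: int = 2) -> list:
    """Return up to top_k tool names whose descriptions overlap the query."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(desc)), name) for name, desc in tools.items()]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [name for score, name in ranked[:top_k] if score > 0]


print(select_tools("is this patient's insurance coverage still valid", TOOLS, top_k=1))
```

Only the selected tool descriptions would then be placed in the prompt, which is where the token savings come from.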
Memory Capabilities
AgentCore provides two memory layers.
Short-term memory
- Maintains state within an active session.
Long-term memory
- Stores persistent information such as user preferences or conversation history.
This allows agents to maintain context across interactions.
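The difference between the two layers can be shown with a toy model. This is a conceptual sketch only; the `AgentMemory` class and its method names are hypothetical, and in-process dicts stand in for managed storage.

```python
from collections import defaultdict


class AgentMemory:
    """Toy two-layer memory; class and method names are hypothetical."""

    def __init__(self) -> None:
        self._short_term = defaultdict(list)  # keyed by session id
        self._long_term = defaultdict(dict)   # keyed by user id

    def add_turn(self, session_id: str, message: str) -> None:
        self._short_term[session_id].append(message)

    def remember(self, user_id: str, key: str, value: str) -> None:
        self._long_term[user_id][key] = value

    def end_session(self, session_id: str) -> None:
        # Short-term state is discarded; long-term facts survive.
        self._short_term.pop(session_id, None)

    def context(self, session_id: str, user_id: str) -> dict:
        return {
            "turns": list(self._short_term.get(session_id, [])),
            "preferences": dict(self._long_term.get(user_id, {})),
        }


memory = AgentMemory()
memory.add_turn("s1", "Draft the discharge summary")
memory.remember("dr-lee", "note_style", "SOAP")
memory.end_session("s1")
print(memory.context("s2", "dr-lee"))
```

After the first session ends, a new session for the same user starts with no turns but still sees the stored preference.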
Key Takeaways from the Production Discussion
Moving from prototype to production requires solving operational challenges such as scaling, security, and governance.
Production-ready agent systems must support:
- High concurrency workloads
- Secure data isolation
- Integration with external systems
- Auditability and observability
AgentCore provides modular services designed to address these needs without requiring a full redesign of existing agents.
Role Play 2 – Rapid Architecture and Design at Speed
Context
This conversation explored how AI tools can significantly accelerate architecture design workflows for solutions architects.
The scenario involved a new customer engagement requiring a complete architecture proposal within three days.
Traditional Architecture Design Timeline
Previously, preparing a full architecture proposal could take two to three weeks.
Tasks involved:
- Industry research
- Evaluating architecture patterns
- Designing diagrams
- Writing architecture decision documentation.
The discussion demonstrated how AI-assisted workflows can compress this timeline dramatically.
Agentic Architecture Assistant
The architect described using an AI-powered assistant built with several components.
LLM (the brain)
- Responsible for reasoning and generating ideas.
MCP servers (Model Context Protocol)
- Provide access to domain-specific knowledge such as AWS services or pricing.
Agent interface
- Implemented as an IDE extension such as Cline inside VS Code.
Together, these components form an “agentic helper” that assists with architecture design tasks.
Incorporating Real Customer Constraints
While AI can generate architectures quickly, it does not automatically understand real-world constraints.
In the scenario, the customer had several limitations:
Legacy architecture
- A 15-year-old monolithic system.
Team skill set
- Developers are experienced in Java.
- No experience with serverless architectures.
Leadership preferences
- The CTO is risk-averse.
These contextual factors strongly influence architectural choices.
Human Judgment in Architecture
The architect explained how human judgment guides AI outputs.
For example:
AI might suggest a fully serverless architecture.
However, considering the team’s Java expertise and risk tolerance, this approach may not be appropriate.
Instead, the architect may choose a hybrid architecture combining:
- Containers
- Selected serverless components.
This approach balances modernization with practicality.
Generating Architecture Artifacts with AI
The conversation also highlighted how AI can accelerate the creation of design artifacts.
Architecture diagrams
Using specialized MCP tools, the AI can generate diagrams that include:
- Front-end layers
- Web Application Firewall (WAF)
- Content Delivery Network (CDN)
- Containerized backend and persistence layers.
What previously required hours of manual diagramming can now be generated in seconds.
Architecture Decision Records (ADRs)
The AI can also generate ADR documents.
These explain why certain design choices were made.
Example:
Choosing ECS Fargate instead of Lambda based on:
- Java-based development environment
- PCI compliance requirements.
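To make the ADR idea concrete, the example above can be captured in a small structured record and rendered as text. This is a sketch under assumptions: the field layout follows a common ADR shape rather than anything shown in the session, and the status and consequence lines are illustrative placeholders.

```python
from dataclasses import dataclass


@dataclass
class ADR:
    number: int
    title: str
    status: str
    context: str
    decision: str
    consequences: list

    def render(self) -> str:
        """Render the record as plain text, one field per line."""
        lines = [
            f"ADR-{self.number:03d}: {self.title}",
            f"Status: {self.status}",
            f"Context: {self.context}",
            f"Decision: {self.decision}",
            "Consequences:",
        ]
        lines += [f"- {item}" for item in self.consequences]
        return "\n".join(lines)


adr = ADR(
    number=1,
    title="Use ECS Fargate instead of Lambda",
    status="Proposed",  # placeholder status
    context="Java-based development team; PCI compliance requirements.",
    decision="Run backend services as containers on ECS Fargate.",
    consequences=["Illustrative placeholder: team keeps its existing Java tooling."],
)
print(adr.render())
```

Whether the draft comes from an AI assistant or a person, the architect still reviews the context and decision lines, since those carry the judgment calls.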
Generating Infrastructure as Code
The agent can also create starter templates for infrastructure.
Example outputs include:
- Terraform templates
- Deployment configuration samples.
These provide a strong starting point for implementation.
Three-Day Architecture Delivery Plan
The architect outlined a structured workflow for meeting the three-day deadline.
Day 1
- Research e-commerce architecture patterns.
- Generate three viable architectural approaches.
Day 2
- Create detailed architecture diagrams.
- Draft Architecture Decision Records.
- Generate infrastructure code samples.
Day 3
- Refine messaging for leadership stakeholders.
- Focus on cost predictability and risk mitigation.
- Produce an executive summary.
Cost of AI-Assisted Architecture
An interesting detail mentioned was the cost of the automated workflow.
The entire AI-assisted architecture generation process cost approximately $2.30 in compute time.
Key Takeaways from the Architecture Discussion
AI can dramatically accelerate architecture workflows by:
- Automating research
- Generating architecture options
- Producing diagrams and documentation
- Creating infrastructure templates.
However, AI does not replace architectural expertise.
Human architects remain responsible for:
- Interpreting organizational constraints
- Making trade-offs
- Applying context and judgment.
The most effective approach is a partnership between AI capabilities and human architectural thinking.
Week 4 Consolidated Takeaways
From the first role play:
Moving AI agents into production introduces new requirements around scalability, security, governance, and operational visibility.
Services like Amazon Bedrock AgentCore provide modular capabilities that help address these production challenges.
From the second role play:
AI-assisted workflows can significantly accelerate architecture design, but the architect’s role in applying context, judgment, and stakeholder awareness remains critical.
This week highlighted the transition from building agents to operating them at scale and demonstrated how AI tools can also transform how architects design and deliver solutions.
