The Problem: When "Minimal Fix" Means "Complete Refactor"
In my previous blog, I shared how I built a cloud engineer agent using AWS Strands that could automatically detect CloudWatch log errors and raise PRs with potential fixes. While the concept worked, I encountered a critical issue: the agent consistently made drastic changes when simple fixes were needed.
Picture this scenario: Your Lambda function fails because it's missing a single IAM permission. The fix? Add one line to your IAM policy. But instead, your agent decides to refactor your entire infrastructure, reorganise your code structure, and "improve" things you never asked it to touch.
I went through countless iterations of my system prompt, emphasising constraints like these (the agent wiring is sketched after the list):
- Apply ONLY the specific fix needed
- No broader improvements beyond fixing the specific error
- Your role is automated incident response with minimal, targeted fixes only
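For context, the Strands setup looked roughly like this. It's a minimal sketch: the model ID, Lambda name, and prompt wording are illustrative rather than my exact configuration.

```python
from strands import Agent
from strands.models import BedrockModel

# System prompt that repeatedly stressed minimal, targeted fixes.
INCIDENT_PROMPT = """You are an automated incident-response agent.
- Apply ONLY the specific fix needed to resolve the reported error.
- Do NOT make broader improvements, refactors, or style changes.
- Produce the smallest possible diff that restores service."""

agent = Agent(
    model=BedrockModel(model_id="anthropic.claude-sonnet-4-20250514-v1:0"),  # illustrative model ID
    system_prompt=INCIDENT_PROMPT,
)

# The CloudWatch error and relevant repo context go in as the user message.
result = agent(
    "Lambda 'order-processor' failed with AccessDenied publishing to SNS. "
    "Propose the minimal fix."
)
```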
Yet AWS Strands kept making those broad, sweeping changes. The surgical precision I needed just wasn't there.
💡 The Solution: Testing Claude Code SDK with Bedrock AgentCore
Frustrated with this limitation, I decided to test the Claude Code SDK integrated with Bedrock AgentCore. The results were immediately promising – it successfully applied the precise, surgical fixes I was looking for.
But I wanted to be thorough. So I built a comprehensive POC with three different setups to test how Claude Code can work with Bedrock AgentCore:
1. Running the Claude Code command-line interface directly within the Bedrock AgentCore framework.
2. Integrating the Claude Code SDK directly into Bedrock AgentCore for more programmatic control (sketched below).
3. Using the Claude Code SDK as a specialised tool while keeping AWS Strands as the orchestrator agent.
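For the second setup, the integration takes roughly this shape. It's a minimal sketch: the repository path, payload keys, and prompt wording are assumptions, and the runtime needs Claude Code pointed at Bedrock (for example via `CLAUDE_CODE_USE_BEDROCK=1`).

```python
import asyncio

from bedrock_agentcore.runtime import BedrockAgentCoreApp
from claude_code_sdk import ClaudeCodeOptions, query

app = BedrockAgentCoreApp()


async def run_claude_code(error_report: str) -> str:
    # Collect the agent's output while Claude Code edits the checked-out repo.
    chunks = []
    async for message in query(
        prompt=f"Fix only this error, nothing else:\n{error_report}",
        options=ClaudeCodeOptions(
            system_prompt="Automated incident response. Minimal, targeted fixes only.",
            cwd="/workspace/repo",  # hypothetical path to the cloned repository
        ),
    ):
        chunks.append(str(message))
    return "\n".join(chunks)


@app.entrypoint
def handler(payload):
    # payload carries the CloudWatch error details forwarded by the monitoring pipeline.
    return asyncio.run(run_claude_code(payload.get("error_report", "")))


if __name__ == "__main__":
    app.run()
```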
The Architectural Differences That Matter
AWS Strands: The Over-Eager Optimiser
Strands seemed to treat every incident as an opportunity for comprehensive optimisation. Even with explicit constraints in the prompt, it would:
- Refactor code beyond the error scope
- Suggest architectural improvements
- Make style and formatting changes
- Add "helpful" features nobody requested
Claude Code SDK + Bedrock AgentCore: The Surgical Specialist
The Claude Code SDK approach demonstrated much better constraint adherence:
- Identified the exact root cause
- Applied minimal viable fixes
- Respected the existing codebase structure
- Maintained focus on incident response, not improvement
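Part of that adherence can also be enforced mechanically. The SDK's options let you restrict which tools the agent may use and cap how long it can run; a sketch of the kind of configuration I leaned on (the exact tool list and limits here are illustrative):

```python
from claude_code_sdk import ClaudeCodeOptions

options = ClaudeCodeOptions(
    system_prompt="Incident response only: apply the minimal fix for the reported error.",
    allowed_tools=["Read", "Grep", "Edit"],  # read and edit files, no arbitrary shell commands
    permission_mode="acceptEdits",           # auto-accept file edits within the allowed tools
    max_turns=5,                             # hard cap on how far the agent can wander
)
```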
Real-World Example: The IAM Permission Fix
The Scenario: A Lambda function fails with an "Access Denied" error when trying to publish a message to an SNS topic.
The Required Fix: Grant the Lambda function's execution role sns:Publish permission on that topic.
😅 AWS Strands Response:
"Oh, you have a permission issue? Let me fix EVERYTHING!"
Claude Code SDK Response:
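It proposed a single new policy statement and nothing else. Expressed as a boto3 call, the change was roughly equivalent to this (the role, policy, and topic names are placeholders, not the agent's actual output):

```python
import json

import boto3

iam = boto3.client("iam")

# One added statement: allow the Lambda's execution role to publish to the one topic it needs.
iam.put_role_policy(
    RoleName="order-processor-lambda-role",  # placeholder role name
    PolicyName="allow-sns-publish",          # placeholder inline policy name
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sns:Publish",
            "Resource": "arn:aws:sns:us-east-1:123456789012:order-events",  # placeholder topic ARN
        }],
    }),
)
```

No refactoring, no reorganising, no unrequested "improvements".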
Key Learnings: Why This Matters for Production Systems
1. Blast Radius Control
In production incident response, you want the smallest possible change to restore service. Every additional modification increases risk and potential for new issues.
2. Audit and Compliance
When changes are minimal and targeted, they're easier to:
- Review and approve
- Audit for compliance
- Roll back if needed
- Understand the impact
3. Team Confidence
Teams are more likely to trust and adopt automated incident response when they know it won't make unexpected changes to their carefully crafted systems.
Implementation Insights
What Worked Well with Claude Code SDK
- Better constraint understanding: Claude Code SDK consistently respected boundaries
- Contextual awareness: Better at understanding what NOT to change
- Debugging precision: More accurate root cause identification
Challenges with AWS Strands
- Scope creep tendency: Natural inclination to "improve" beyond requirements
- Prompt instruction drift: Difficulty maintaining a narrow focus
- Context bleeding: Unable to separate "fixing" from "improving"
Conclusion: The Right Tool for the Right Job
While AWS Strands excels at comprehensive analysis and broad improvements, the Claude Code SDK with Bedrock AgentCore proved superior for surgical and incident-response scenarios where precision and constraint adherence are paramount.
The key insight? Not all AI agents are created equal for every task. Sometimes you need a scalpel, not a sledgehammer.
What's your experience with AI agent precision in production environments? Have you found similar challenges with scope creep in automated systems? Share your thoughts in the comments below!
Resources
- GitHub Repository
- Claude Code SDK Documentation
- AWS Bedrock AgentCore