
I Connected 12 MCP Servers to Amazon Q. Here's What Broke

πŸ‘‹ Hey there, tech enthusiasts!

I'm Sarvar, a Cloud Architect with a passion for transforming complex technological challenges into elegant solutions. With extensive experience spanning Cloud Operations (AWS & Azure), Data Operations, Analytics, DevOps, and Generative AI, I've had the privilege of architecting solutions for global enterprises that drive real business impact. Through this article series, I'm excited to share practical insights, best practices, and hands-on experiences from my journey in the tech world. Whether you're a seasoned professional or just starting out, I aim to break down complex concepts into digestible pieces that you can apply in your projects.

Let's dive in and explore the fascinating world of cloud technology together! πŸš€


Written from experience building AI agent integrations for AWS infrastructure management. Your mileage may vary, but the principles hold across different use cases.


Do We Still Need MCP When We Have Agent Skills?

As a cloud architect working with AI agents, I've spent the last few months exploring how to extend their capabilities. Specifically, I've been working with Amazon Q Developer Pro - AWS's AI assistant that helps with coding, infrastructure management, and cloud operations through a chat interface.

This article shares what I learned about two different approaches to extending AI agents: Model Context Protocol (MCP) and Agent Skills. While the specific implementation details are still evolving, the architectural patterns I describe here represent where the ecosystem is heading based on current capabilities and community standards.

The Problem I Was Solving

I needed Amazon Q Developer Pro to help me manage AWS infrastructure across multiple accounts. The agent needed to:

  • Check CloudWatch metrics
  • Query RDS databases
  • Send alerts to Slack
  • Generate cost reports
  • Review CloudFormation templates

I tried two approaches, and each had serious problems.

Approach 1: Connecting Everything via MCP

MCP is a standard protocol that lets AI agents connect to external tools. Think of it like USB ports on your computer - one standard interface that works with many devices.

I connected 12 MCP servers to Amazon Q Developer Pro:

  • AWS CloudWatch server
  • RDS query server
  • Slack messaging server
  • Cost Explorer server
  • GitHub server
  • And seven more
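For context, connecting servers like these to Amazon Q is a matter of listing them in an MCP configuration file. The shape below follows the common `mcpServers` convention; the package names and commands are placeholders, so check each server's README and the Q CLI docs for the exact values:

```json
{
  "mcpServers": {
    "aws-cost-explorer": {
      "command": "uvx",
      "args": ["aws-cost-explorer-mcp-server"]
    },
    "slack-notifications": {
      "command": "npx",
      "args": ["-y", "slack-mcp-server"]
    }
  }
}
```

Every entry in this file loads its full tool list into the agent's context at startup, which is exactly where the trouble starts.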

What Went Wrong

Before I even asked a question, 35% of the context window was already full. Each MCP server loaded all its available functions into memory. The CloudWatch server alone exposed 15 different functions with detailed parameter descriptions.

When I asked Amazon Q Developer Pro to "check if our xyz database is healthy," it had to scan through multiple function definitions to figure out which ones to use. Sometimes it picked the wrong ones.

Every function call and response consumed more context. After three or four operations, I was running out of space for the actual conversation.

Approach 2: Using Agent Skills

Agent Skills are different. Instead of connecting to external tools, you give the agent domain knowledge - a guide on how to think about a problem.

I created a skill called "database-health-check" with a simple file structure:

database-health-check/
  SKILL.md          (when to use this, what steps to follow)
  scripts/
    check_rds.py    (Python script to query RDS)
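To make the structure concrete, here is a minimal sketch of what `check_rds.py` could look like. The threshold logic mirrors the SKILL.md shown next; the CloudWatch call is illustrative, and the metric key names are my own invention:

```python
"""Sketch of check_rds.py: pull RDS metrics and grade them against SKILL.md thresholds."""
import datetime

def evaluate(metrics: dict) -> dict:
    """Pure decision logic: map metric values to HEALTHY / WARN / CRITICAL."""
    issues = []
    if metrics.get("CPUUtilization", 0) > 80:
        issues.append(("WARN", "High CPU usage detected"))
    if metrics.get("ConnectionsPct", 0) > 90:
        issues.append(("CRITICAL", "Connection pool nearly exhausted"))
    if metrics.get("ReplicaLagMs", 0) > 1000:
        issues.append(("WARN", "Replication falling behind"))
    if metrics.get("StorageUsedPct", 0) > 85:
        issues.append(("CRITICAL", "Low storage space"))
    if any(level == "CRITICAL" for level, _ in issues):
        status = "CRITICAL"
    elif issues:
        status = "WARN"
    else:
        status = "HEALTHY"
    return {"status": status, "issues": [msg for _, msg in issues]}

def fetch_cpu(instance_id: str) -> float:
    """Fetch average CPU over the last 5 minutes from CloudWatch (illustrative)."""
    import boto3  # the undeclared dependency that broke on my colleague's machine
    cw = boto3.client("cloudwatch")
    now = datetime.datetime.utcnow()
    resp = cw.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        StartTime=now - datetime.timedelta(minutes=5),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    points = resp["Datapoints"]
    return points[0]["Average"] if points else 0.0
```

Note the split: `evaluate()` is pure logic that runs anywhere, while `fetch_cpu()` drags in boto3 and AWS credentials. That split foreshadows the portability problem described below.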

The SKILL.md file contained:

# Database Health Check Skill

## Trigger
Keywords: "database health", "RDS status", "database performance", "DB issues"

## Process
1. Check CPU utilization (threshold: 80%)
2. Check active connections (threshold: 90% of max)
3. Check replication lag (threshold: 1000ms)
4. Check storage space (threshold: 85%)

## Decision Logic
- If CPU > 80%: WARN - "High CPU usage detected"
- If connections > 90%: CRITICAL - "Connection pool nearly exhausted"
- If replication lag > 1000ms: WARN - "Replication falling behind"
- If storage > 85%: CRITICAL - "Low storage space"

## Output Format
Status: [HEALTHY|WARN|CRITICAL]
Issues: [list of problems found]
Recommendations: [suggested actions]

What Went Wrong

The skill worked perfectly on my laptop. Then my colleague tried to use it and got an error: "ModuleNotFoundError: No module named 'boto3'".

The Python script needed boto3, pandas, and psycopg2. My machine had them installed. His didn't. We had no standard way to declare or install these dependencies.

The Real Difference

After working with both, I realized they solve different problems:

MCP answers: "What can I do?"
It provides capabilities - functions the agent can call. Each MCP server is self-contained with its own dependencies and environment.

Skills answer: "How should I think?"
They provide expertise - decision logic, quality standards, and workflows. But they run in whatever environment the agent has.

The Solution: Use Both Together

Here's what actually works in practice, using a real example from last week.

Example: Automated Cost Anomaly Detection

I needed Amazon Q Developer Pro to monitor our AWS costs and alert us when something unusual happens.

The MCP Layer (The Hands)

I set up two MCP servers:

  1. aws-cost-explorer-mcp - exposes functions like get_daily_costs(), get_service_breakdown()
  2. slack-notifications-mcp - exposes send_message(), create_incident()

Each server runs in its own Docker container with all dependencies installed. Amazon Q Developer Pro doesn't need to know about boto3 or the Slack SDK.

The Skill Layer (The Brain)

I created a cost-monitoring skill with this logic in SKILL.md:

Trigger: When user asks about cost anomalies or unusual spending

Process:
1. Get last 30 days of daily costs
2. Calculate average and standard deviation
3. Check if today's cost is more than 2 standard deviations above average
4. If yes, get service breakdown to identify which service spiked
5. If spike is over $500, send high-priority Slack alert
6. If spike is under $500, just report in conversation

Quality checks:
- Always compare against same day of week (Monday vs Monday)
- Exclude known scheduled events (monthly backups, etc.)
- Include percentage change, not just absolute numbers
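The statistical core of that process fits in a few lines of Python. This is a simplified sketch (it skips the weekday matching and scheduled-event exclusion from the quality checks), assuming the MCP server hands back daily costs as plain floats:

```python
"""Sketch of the anomaly check: flag today's spend if it sits more than 2 std devs above the 30-day mean."""
import statistics

def is_anomalous(daily_costs, today, sigmas=2.0):
    """daily_costs: last 30 days of spend; today: today's spend so far."""
    mean = statistics.mean(daily_costs)
    stdev = statistics.pstdev(daily_costs)
    threshold = mean + sigmas * stdev
    spike = today - mean
    return {
        "anomalous": today > threshold,
        "pct_change": round(100 * spike / mean, 1) if mean else 0.0,  # report % change, not just absolutes
        "high_priority": today > threshold and spike > 500,           # the $500 Slack-alert cutoff
    }
```

A $450 day against a quiet $100/day baseline trips `anomalous` but stays below the high-priority cutoff, so it gets reported in conversation rather than paged to Slack.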

How They Work Together

When I ask Amazon Q Developer Pro: "Are our AWS costs normal today?"

  1. Amazon Q Developer Pro matches the question to the cost-monitoring skill
  2. The skill loads its SKILL.md (only 2KB of context)
  3. The skill requires cost data, so Amazon Q Developer Pro connects to the aws-cost-explorer-mcp server
  4. The skill orchestrates: call get_daily_costs(), analyze the data, decide if it's anomalous
  5. If anomalous, the skill calls the slack-notifications-mcp server
  6. After completion, the skill unloads from memory

The MCP servers handle the technical execution. The skill handles the business logic and decision-making.

Why This Architecture Works

Context Efficiency
Only the active skill loads into memory. MCP tools load on-demand when the skill needs them. I went from 35% context consumed at startup to 5%.

Portability
The same skill works on my laptop, my colleague's Windows machine, and our CI/CD pipeline. The MCP servers can run locally, in containers, or as remote services.

Reusability
The aws-cost-explorer-mcp server is used by three different skills:

  • cost-monitoring (detects anomalies)
  • budget-planning (forecasts spending)
  • cost-optimization (finds savings opportunities)

Each skill brings different expertise, but they share the same data source.

Maintainability
When AWS changes their Cost Explorer API, I update one MCP server. All skills continue working. When business rules change (new alert thresholds), I update the skill. The MCP servers don't need to change.

Visual Overview

Here's how the three layers work together:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Agent Layer (Amazon Q Developer Pro)  β”‚
β”‚   Matches tasks β†’ Loads skills          β”‚
β”‚   Connects to MCP servers               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Skill Layer (The Brain)         β”‚
β”‚   cost-monitoring.SKILL.md              β”‚
β”‚   - When to trigger                     β”‚
β”‚   - Decision logic                      β”‚
β”‚   - Quality checks                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         MCP Layer (The Hands)           β”‚
β”‚   aws-cost-explorer-mcp                 β”‚
β”‚   slack-notifications-mcp               β”‚
β”‚   - Tool execution                      β”‚
β”‚   - Environment isolation               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

A Practical Pattern

Here's the decision framework I use:

Put it in an MCP server if:

  • Multiple skills will use it
  • It has heavy dependencies (Python libraries, system tools)
  • It needs credentials or secrets
  • It's a stable, reusable capability

Put it in a skill if:

  • It's domain knowledge or business logic
  • It orchestrates multiple tools
  • It has conditional decision-making
  • It's specific to one workflow

Keep it as a simple script if:

  • It's a one-time operation
  • The entire workflow is tightly coupled
  • Splitting it would add unnecessary complexity

What's Still Missing

The ecosystem isn't mature yet. Here's what I wish existed:

Dependency Declaration
Skills need a standard way to say "I need these capabilities" without hardcoding specific MCP servers. Something like:

requires:
  - cloud_metrics
  - notifications

Then the runtime figures out which MCP servers provide those capabilities.
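No runtime does this today as far as I know, but a resolver could be as simple as a mapping from abstract capabilities to installed servers. Everything here, registry included, is hypothetical:

```python
"""Hypothetical capability resolver: match a skill's 'requires' list to installed MCP servers."""

# What each installed server advertises (invented registry for illustration)
SERVER_CAPABILITIES = {
    "aws-cost-explorer-mcp": {"cloud_metrics", "cost_data"},
    "slack-notifications-mcp": {"notifications"},
}

def resolve(required):
    """Return {capability: server}, raising if any capability has no provider."""
    resolved = {}
    for cap in required:
        providers = [s for s, caps in SERVER_CAPABILITIES.items() if cap in caps]
        if not providers:
            raise LookupError(f"No MCP server provides capability: {cap}")
        resolved[cap] = providers[0]  # naive: first match wins
    return resolved
```

With something like this, a skill declaring `requires: [cloud_metrics, notifications]` would bind to whichever servers happen to be installed, instead of hardcoding server names.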

Dynamic Loading
Right now, when you connect an MCP server, all its tools load immediately. I want skills to control this: "Load only the cost analysis tools for this task."

Graceful Fallback
If an MCP server is unavailable, the skill should automatically fall back to a built-in script or tell me clearly what's missing.
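The fallback pattern itself is easy to sketch, assuming the skill can detect a failed tool call. The names are illustrative:

```python
"""Sketch of graceful fallback: try the MCP tool, fall back to a local script, or fail with a clear message."""

def with_fallback(primary, fallback, capability):
    """Call primary(); on failure call fallback(); if both fail, explain what's missing."""
    try:
        return primary()
    except Exception as mcp_error:
        try:
            return fallback()
        except Exception:
            raise RuntimeError(
                f"Capability '{capability}' unavailable: MCP server failed "
                f"({mcp_error}) and the local fallback also failed"
            )
```

What's missing is for the agent runtime to do this automatically, so skills don't each reinvent the wrapper.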

Conclusion

After six months of exploring these patterns with Amazon Q Developer Pro and other AI agents, here's what I know:

MCP and Agent Skills aren't competing approaches. They're complementary layers of the same system.

MCP gives your agent reliable, isolated capabilities. Skills give your agent the expertise to use those capabilities intelligently.

You need both. MCP without skills is a toolbox with no craftsman. Skills without MCP are expertise with unreliable tools.

The architecture that works:

  • Skills encode what to do and when
  • MCP servers provide how to do it
  • The agent runtime connects them together

This isn't just theoretical. These patterns are being implemented in actual environments, managing real AWS infrastructure.

Getting Started

If you want to explore this architecture:

For MCP:

  • Check the official MCP specification at modelcontextprotocol.io
  • Browse existing MCP servers at github.com/modelcontextprotocol/servers
  • Amazon Q Developer supports MCP through the Q CLI

For Agent Skills:

  • Review the Agent Skills specification by Anthropic
  • Start with simple skills that encode your team's operational knowledge
  • Focus on decision logic and workflows, not heavy computation

Next Steps:

  1. Identify one repetitive task you do with your AI agent
  2. Ask: Does this need external tools (MCP) or decision logic (Skill)?
  3. Build the simplest version that works
  4. Iterate based on what you learn

The ecosystem is still maturing, but the core patterns are solid. Start small, learn from errors, and build up from there.


πŸ“Œ Wrapping Up

Thank you for reading! I hope this article gave you practical insights and a clearer perspective on the topic.

Was this helpful?

  • ❀️ Like if it added value
  • πŸ¦„ Unicorn if you’re applying it today
  • πŸ’Ύ Save for your next optimization session
  • πŸ”„ Share with your team

Follow me for more on:

  • AWS architecture patterns
  • FinOps automation
  • Multi-account strategies
  • AI-driven DevOps

πŸ’‘ What’s Next

More deep dives coming soon on cloud operations, GenAI, agentic AI, DevOps, and data workflows. Follow for weekly insights.


🌐 Portfolio & Work

You can explore my full body of work, certifications, architecture projects, and technical articles here:

πŸ‘‰ Visit My Website


πŸ› οΈ Services I Offer

If you're looking for hands-on guidance or collaboration, I provide:

  • Cloud Architecture Consulting (AWS / Azure)
  • DevSecOps & Automation Design
  • FinOps Optimization Reviews
  • Technical Writing (Cloud, DevOps, GenAI)
  • Product & Architecture Reviews
  • Mentorship & 1:1 Technical Guidance

🀝 Let’s Connect

I’d love to hear your thoughts. Drop a comment or connect with me on LinkedIn.

For collaborations, consulting, or technical discussions, feel free to reach out directly at simplynadaf@gmail.com

Happy Learning πŸš€
