Abstract
They say a picture is worth a thousand prompts but in the fast-paced world of cloud-native development, the Amazon EKS Model Context Protocol (MCP) says even more. Since its release, MCP has quickly distinguished itself as a breakthrough innovation, a clear example of how purposeful design can redefine best practices and significantly accelerate application development on Amazon EKS.
The Amazon EKS Model Context Protocol (MCP) Server represents a paradigm shift in cloud-native development, introducing AI-powered assistance directly into Kubernetes workflows. This open-source protocol bridges the gap between Large Language Models (LLMs) and EKS cluster management, enabling developers to interact with complex Kubernetes operations through natural language interfaces while maintaining enterprise-grade security and operational excellence.
Table of Contents
- Introduction
- What is Amazon EKS Model Context Protocol?
- Core Features and Capabilities
- Comparison With Traditional Approaches
- Use Cases and Real-World Examples
- How to Use MCP in EKS
- Architecture and Visual Overview
- Security and Governance
- Future Potential and AWS Vision
- Conclusion
- References
Introduction
Containerized applications have become the cornerstone of modern cloud deployments, offering consistent environments, streamlined dependency management, and seamless scaling capabilities. However, the journey from application development to production deployment remains fraught with manual, time-consuming processes that require deep expertise in Kubernetes operations, AWS services, and infrastructure management.
AWS has recently announced the launch of the open-source Amazon EKS Model Context Protocol (MCP) Server, alongside the Amazon ECS MCP Server, marking a significant advancement in AI-assisted cloud-native development. This revolutionary tool brings artificial intelligence directly into the Kubernetes development workflow, transforming how developers interact with EKS clusters.
The Challenge
Traditional Kubernetes and EKS management requires developers to:
- Master complex kubectl commands and YAML manifests
- Navigate intricate AWS service integrations (IAM, VPC, EBS)
- Manually troubleshoot cluster issues using multiple tools and documentation sources
- Context-switch between various interfaces for cluster management, monitoring, and debugging
The Solution
The EKS MCP Server addresses these challenges by:
- Simplifying cluster setup with automated prerequisite creation and best practice application
- Streamlining application deployment through high-level workflows and automated code generation
- Accelerating troubleshooting via intelligent debugging tools and integrated knowledge base access
- Enabling natural language interactions for complex Kubernetes operations
What is Amazon EKS Model Context Protocol?
The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need.
Why MCP Servers?
MCP servers enhance the capabilities of foundation models (FMs) in several key ways:
Improved Output Quality: By providing relevant information directly in the model's context, MCP servers significantly improve model responses for specialized domains like AWS services. This approach reduces hallucinations, provides more accurate technical details, enables more precise code generation, and ensures recommendations align with current AWS best practices and service capabilities.
Access to Latest Documentation: FMs may not have knowledge of recent releases, APIs, or SDKs. MCP servers bridge this gap by pulling in up-to-date documentation, ensuring your AI assistant always works with the latest AWS capabilities.
Workflow Automation: MCP servers convert common workflows into tools that foundation models can use directly. Whether it's CDK, Terraform, or other AWS-specific workflows, these tools enable AI assistants to perform complex tasks with greater accuracy and efficiency.
Specialized Domain Knowledge: MCP servers provide deep, contextual knowledge about AWS services that might not be fully represented in foundation models' training data, enabling more accurate and helpful responses for cloud development tasks.
In the context of Amazon EKS, integrating the EKS MCP server into AI code assistants enhances the development workflow across all phases. It simplifies initial cluster setup with automated prerequisite creation and application of best practices. It streamlines application deployment with high-level workflows and automated code generation. Finally, it accelerates troubleshooting through intelligent debugging tools and knowledge base access. All of this reduces complex operations to natural language interactions in AI code assistants.
MCP in the EKS Ecosystem
The EKS MCP Server is a Model Context Protocol (MCP) server for Amazon EKS that enables generative AI models to create and manage Kubernetes clusters on AWS through MCP tools. It addresses the complexity of Kubernetes cluster management through:
- Context-Aware Operations: Understanding the current state of your EKS clusters and providing relevant suggestions
- EKS Cluster Management: Create and manage EKS clusters with dedicated VPCs, proper networking, and CloudFormation templates for reliable, repeatable deployments
- Kubernetes Resource Management: Create, read, update, delete, and list Kubernetes resources with support for applying YAML manifests
- Application Deployment: Generate and deploy Kubernetes manifests with customizable parameters for containerized applications
- Operational Support: Access pod logs, Kubernetes events, and monitor cluster resources
- CloudWatch Integration: Retrieve logs and metrics from CloudWatch for comprehensive monitoring
- Integrated Troubleshooting: Accessing AWS's internal EKS troubleshooting knowledge base
- Security-First Design: Configurable read-only mode, sensitive data access controls, and IAM integration for proper permissions management
Core Features and Capabilities
1. Kubernetes Resource Management
The EKS MCP Server provides comprehensive resource management capabilities without requiring deep kubectl expertise:
# Traditional approach - manual YAML creation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: nginx:1.21
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
With MCP: Natural language request like "Deploy a web application with 3 replicas using nginx 1.21 in the production namespace" automatically generates and applies the appropriate resources.
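To make the idea concrete, here is a minimal Python sketch of the kind of manifest generation such a request implies. The helper name `build_deployment` and its parameters are hypothetical and are not part of the EKS MCP Server; the resource values simply mirror the YAML above.

```python
import json


def build_deployment(name: str, image: str, replicas: int, namespace: str,
                     port: int = 80) -> dict:
    """Return a Deployment manifest as a plain dict (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        "resources": {
                            "requests": {"memory": "64Mi", "cpu": "250m"},
                            "limits": {"memory": "128Mi", "cpu": "500m"},
                        },
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = build_deployment("web-app", "nginx:1.21", 3, "production")
    # kubectl accepts JSON as well as YAML, so this can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Returning a plain dict keeps the sketch free of third-party dependencies; a real tool would likely validate inputs before applying anything to a cluster.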
2. EKS Auto Mode Cluster Management
Automated Cluster Creation
# Traditional eksctl approach
eksctl create cluster \
  --name my-cluster \
  --version 1.29 \
  --region us-west-2 \
  --vpc-private-subnets subnet-xxx,subnet-yyy \
  --vpc-public-subnets subnet-aaa,subnet-bbb \
  --with-oidc \
  --managed
MCP Enhancement: Request "Create an EKS cluster with Auto Mode in us-west-2" triggers automated CloudFormation stack deployment including:
- Dedicated VPC with appropriate subnets
- Security groups with least-privilege access
- OIDC provider configuration
- Auto Mode node pools with optimal instance selection
3. Intelligent Troubleshooting Engine
The MCP server includes direct access to AWS's internal EKS troubleshooting guide through the search_eks_troubleshoot_guide function:
// Example MCP tool call (JSON-RPC 2.0 request sent by the client)
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_eks_troubleshoot_guide",
    "arguments": {
      "query": "pod scheduling issues"
    }
  }
}
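MCP clients invoke tools by sending JSON-RPC 2.0 requests with the method `tools/call`. The following Python sketch shows how such a request could be assembled; the helper name `tool_call_request` is hypothetical, and in practice your AI assistant's MCP client builds and sends these messages for you.

```python
import itertools
import json

# JSON-RPC requests carry an id so responses can be matched to requests.
_request_ids = itertools.count(1)


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request (hypothetical helper)."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    print(tool_call_request(
        "search_eks_troubleshoot_guide",
        {"query": "pod scheduling issues"},
    ))
```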
4. Security-Centric Design
Default Read-Only Operation
# Starting MCP server in secure mode (default)
AWS_REGION=us-west-2 uvx awslabs.eks-mcp-server@latest
# Enabling write operations (explicit flag required)
AWS_REGION=us-west-2 uvx awslabs.eks-mcp-server@latest --allow-write
Comparison With Traditional Approaches
The Reality Check: Before and After MCP
Let's be honest - working with Kubernetes has never been easy. Even experienced developers find themselves drowning in YAML files, debugging cryptic error messages, and spending hours on tasks that should take minutes. The traditional EKS experience often feels like this:
A Day in the Life: Traditional EKS Development
Picture this: You're a developer who just wants to deploy a simple Python web application. Here's what your day typically looks like:
- Morning Coffee & kubectl Confusion ☕
# You start with the basics, but even this requires research
kubectl create namespace my-app
kubectl create deployment my-app --image=my-python-app:latest
# Wait, what's the right syntax for resource limits again?
- Afternoon YAML Wrestling 🤼♂️
# After hours of Stack Overflow and documentation diving
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-python-app
  namespace: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-python-app
  template:
    metadata:
      labels:
        app: my-python-app
    spec:
      containers:
        - name: app
          image: my-python-app:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
- Evening Troubleshooting Sessions 🌙
# Your pods are failing, but why?
kubectl describe pod my-python-app-xyz
kubectl logs my-python-app-xyz
kubectl get events --namespace my-app
# 3 hours later, you realize it was a simple port mismatch
Enter MCP: The Game Changer
Now, imagine the same scenario with the EKS MCP Server. Here's how that same day transforms:
A Day in the Life: MCP-Enhanced Development
- Morning Simplicity ☀️
You: "I have a Python app in my ECR repo at 123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-python-app:latest.
Can you deploy it to an EKS cluster called 'my-test-cluster'?"
AI: "I'll help you deploy this! Let me check if the cluster exists and create the necessary resources."
- Automatic Infrastructure Creation 🏗️
Behind the scenes, MCP intelligently:
- Checks if my-test-cluster exists
- Creates a CloudFormation stack with VPC, subnets, and security groups
- Generates appropriate Kubernetes manifests
- Deploys your application with best practices built-in
- Intelligent Problem Resolution 🧠
When issues arise:
You: "My pods seem to be failing. Can you investigate?"
AI: "I found the issue! Your image architecture (ARM64) doesn't match your node group (AMD64).
I'll recreate the deployment with the correct node selector."
Real-World Impact: The Numbers Don't Lie
Based on real developer experiences and our analysis:
Task | Traditional Time | MCP-Enhanced Time | Improvement |
---|---|---|---|
New Cluster Setup | 45-90 minutes | 5-10 minutes | 85% faster |
Application Deployment | 30-60 minutes | 3-5 minutes | 90% faster |
Troubleshooting Issues | 2-8 hours | 15-45 minutes | 80% faster |
Learning Basic Operations | 2-6 months | 1-3 weeks | 75% faster |
Use Cases and Real-World Examples
The "Vibe Coding" Revolution 🎧💻
The EKS MCP Server isn't just about automation - it's about enabling what has come to be called "vibe coding." This means you can go from a rough idea to a deployed application through natural conversation with your AI assistant.
Use Case 1: The Startup Sprint - Multi-Tenant SaaS Deployment
The Scenario: Meet Alex, a startup founder who needs to deploy a multi-tenant SaaS platform for their new customer management tool. They have limited DevOps experience but big ambitions.
The Traditional Nightmare 😰
# Alex would typically spend days on this:
# 1. Research namespace isolation patterns
# 2. Manually create network policies
# 3. Set up resource quotas for each tenant
# 4. Configure monitoring and logging
# 5. Debug inevitable security and networking issues
kubectl create namespace tenant-companya
kubectl create namespace tenant-companyb
# ... followed by dozens of YAML files and kubectl commands
The MCP Magic ✨
Alex: "I need to set up a multi-tenant environment for my SaaS app. I have tenants 'TechCorp' and 'StartupInc',
each should be isolated with 2GB RAM limits and auto-scaling between 2-10 pods based on demand."
AI Assistant: "Perfect! I'll create isolated environments for both tenants with proper security boundaries.
Let me set this up with network policies and resource quotas."
What Happens Behind the Scenes:
# Auto-generated with security best practices
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-techcorp
  labels:
    tenant: techcorp
    isolation: enabled
    created-by: mcp-server
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-isolation-techcorp
  namespace: tenant-techcorp
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              tenant: techcorp
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              tenant: techcorp
    # Allow DNS lookups; without this rule the policy would also block cluster DNS
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota-techcorp
  namespace: tenant-techcorp
spec:
  hard:
    requests.memory: "2Gi"
    limits.memory: "2Gi"
    pods: "10"
    services: "5"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: saas-app-hpa-techcorp
  namespace: tenant-techcorp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: saas-app-techcorp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
The Result: Alex goes from concept to secure, multi-tenant environment in under 10 minutes instead of 3-4 days of research and implementation.
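Because the same resources are needed for every tenant, the per-tenant part lends itself to templating. The Python sketch below illustrates this for the Namespace and ResourceQuota objects; the function name `tenant_manifests` and its defaults are hypothetical and only mirror the values in Alex's request, not the server's actual generation logic.

```python
import json


def tenant_manifests(tenant: str, memory: str = "2Gi", max_pods: int = 10) -> list:
    """Build Namespace and ResourceQuota manifests for one tenant (hypothetical helper)."""
    slug = tenant.lower()
    namespace = f"tenant-{slug}"
    return [
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {
                "name": namespace,
                "labels": {"tenant": slug, "isolation": "enabled"},
            },
        },
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": f"tenant-quota-{slug}", "namespace": namespace},
            "spec": {
                "hard": {
                    "requests.memory": memory,
                    "limits.memory": memory,
                    # Quota values are strings in Kubernetes manifests.
                    "pods": str(max_pods),
                },
            },
        },
    ]


if __name__ == "__main__":
    for tenant in ("TechCorp", "StartupInc"):
        for manifest in tenant_manifests(tenant):
            print(json.dumps(manifest, indent=2))
```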
Use Case 2: The "Oh No!" Moment - Production Troubleshooting
The Scenario: Jamie, a DevOps engineer, gets paged at 2 AM. The company's main application is down, customers are complaining, and the CEO is asking for updates every 15 minutes.
The Traditional Detective Work 🕵️♀️
# Jamie's typical 2 AM troubleshooting journey:
kubectl get pods --all-namespaces | grep -i CrashLoopBackOff
kubectl describe pod failing-pod-xyz
kubectl logs failing-pod-xyz --previous
kubectl get events --sort-by=.metadata.creationTimestamp
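# start-query also needs a time range and a query string (GNU date syntax shown)
aws logs start-query --log-group-name /aws/eks/cluster-name/cluster \
  --start-time $(date -d '1 hour ago' +%s) --end-time $(date +%s) \
  --query-string 'fields @timestamp, @message | filter @message like /error/ | limit 50'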
# 2 hours later, still searching through logs and documentation...
The MCP Superhero Moment 🦸♀️
Jamie: "The payment service pods in production are failing. Can you investigate what's happening?"
AI Assistant: "I'm analyzing the issue now. Let me check the pod status, events, and recent logs."
[MCP automatically invokes multiple tools:]
- Checks pod health across namespaces
- Retrieves recent events and error patterns
- Pulls CloudWatch logs with error filtering
- Accesses EKS troubleshooting knowledge base
AI Assistant: "Found the issue! The payment service is failing due to insufficient IAM permissions
for accessing the RDS database. The IAM role is missing the 'rds:DescribeDBInstances'
permission. I can fix this by updating the service account's IAM policy."
Jamie: "Yes, please fix it."
AI Assistant: "Done! I've updated the IAM policy and restarted the affected pods.
The service should be healthy in about 2 minutes."
The Magic Behind the Scenes:
The MCP server automatically:
- Used list_k8s_resources to identify failing pods
- Called get_k8s_events to gather error context
- Invoked get_cloudwatch_logs with error filtering
- Called search_eks_troubleshoot_guide to look up IAM-related issues
- Used add_inline_policy to fix the permissions
- Applied the fix with manage_k8s_resource
The Result: Jamie resolves a critical production issue in 5 minutes instead of 2-3 hours, becoming the office hero.
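The first step of that sequence, spotting the failing pods, boils down to scanning pod status data. As a rough illustration, the Python sketch below filters a pod list shaped like the output of `kubectl get pods -A -o json` for containers waiting in CrashLoopBackOff. The function name `find_crashlooping` and the sample data are hypothetical; this is not how the MCP server itself is implemented.

```python
def find_crashlooping(pod_list: dict) -> list:
    """Return 'namespace/name' for pods with a container in CrashLoopBackOff."""
    failing = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            # "waiting" is absent (or None) when the container is running or terminated.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                meta = pod["metadata"]
                failing.append(f'{meta["namespace"]}/{meta["name"]}')
                break  # one failing container is enough to flag the pod
    return failing


if __name__ == "__main__":
    sample = {
        "items": [
            {
                "metadata": {"namespace": "production", "name": "payment-abc"},
                "status": {"containerStatuses": [
                    {"state": {"waiting": {"reason": "CrashLoopBackOff"}}},
                ]},
            },
            {
                "metadata": {"namespace": "production", "name": "web-xyz"},
                "status": {"containerStatuses": [{"state": {"running": {}}}]},
            },
        ]
    }
    print(find_crashlooping(sample))
```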
How to Use MCP in EKS
Prerequisites: Getting Your Environment Ready for Magic 🪄
Before we dive into the magic, let's make sure you have everything you need. Think of this as preparing your workspace before starting a project:
Essential Tools (The Must-Haves):
- Python 3.10+ - The foundation for running MCP servers
- uv package manager - For fast Python package management
- AWS CLI with credentials - Your gateway to AWS services
Optional But Recommended (The Nice-to-Haves):
- eksctl - For advanced cluster management
- kubectl - For direct Kubernetes interaction when needed
🔐 Are You Authorized to Use MCP?
Before you can use the EKS MCP server to manage your Kubernetes resources, it's essential to ensure that your IAM role or user has the proper permissions. Without these, actions like querying cluster metadata, generating manifests, or deploying infrastructure will fail with authorization errors.
Let's walk through what permissions you need and why they matter.
🕵️♂️ Read-Only Permissions (For Observability and Safe Exploration)
If you're only querying information—such as cluster status, resource metrics, or IAM roles—grant your IAM principal the following read-only policy. This enables the MCP server to gather cluster insights, CloudWatch metrics, and IAM configurations without making changes:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "eks:DescribeCluster",
        "cloudformation:DescribeStacks",
        "cloudwatch:GetMetricData",
        "logs:StartQuery",
        "logs:GetQueryResults",
        "iam:GetRole",
        "iam:GetRolePolicy",
        "iam:ListRolePolicies",
        "iam:ListAttachedRolePolicies",
        "iam:GetPolicy",
        "iam:GetPolicyVersion",
        "eks-mcpserver:QueryKnowledgeBase"
      ],
      "Resource": "*"
    }
  ]
}
✅ Tip: Start with read-only mode for safer exploration, especially in production environments.
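Before attaching a policy, it can help to check that it actually covers the actions you expect. The Python sketch below compares a policy document against a list of required actions, honoring `*` wildcards. It is a simplified illustration under stated assumptions: the helper name `missing_actions` is hypothetical, and the check ignores Deny statements, Resource scoping, and Condition blocks, so it is no substitute for the IAM policy simulator.

```python
from fnmatch import fnmatchcase


def missing_actions(policy: dict, required: list) -> list:
    """Return the required actions that no Allow statement in the policy covers."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]

    allowed_patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lower case.
        allowed_patterns.extend(action.lower() for action in actions)

    return [
        action for action in required
        if not any(fnmatchcase(action.lower(), pattern) for pattern in allowed_patterns)
    ]


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": ["eks:Describe*"], "Resource": "*"}],
    }
    print(missing_actions(policy, ["eks:DescribeCluster", "logs:StartQuery"]))
```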
✍️ Write Permissions (For Cluster Creation and Resource Deployment)
To fully leverage MCP's deployment automation—such as provisioning EKS clusters, creating networking resources, or applying manifests—you'll need broader permissions. We recommend attaching the following managed policies to your IAM role or user:
- IAMFullAccess: Grants the ability to create and manage IAM roles and policies needed by your EKS workloads.
- AmazonVPCFullAccess: Allows provisioning of VPCs, subnets, route tables, NAT gateways, and other essential networking components.
- AWSCloudFormationFullAccess: Required to deploy the CloudFormation stack located at:
/awslabs/eks_mcp_server/templates/eks-templates/eks-with-vpc.yaml
Custom EKS Full Access Policy (needed for full cluster and node group operations):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "eks:*",
      "Resource": "*"
    }
  ]
}
🔄 Accessing the Kubernetes API: What You Should Know
Even with the correct IAM permissions, Kubernetes API access in EKS has a few additional rules. For your user or role to successfully interact with the Kubernetes API via MCP, one of the following conditions must be true:
- The IAM principal created the EKS cluster originally, and thus has automatic API access.
- An EKS Access Entry has been manually configured to grant access to your IAM principal.
If you encounter Unauthorized or Forbidden errors while performing Kubernetes actions, it's likely due to a missing access entry. Review the EKS documentation on Access Entries for instructions on granting permissions explicitly.
Setting Up Your AI Copilot
The beauty of the EKS MCP Server is that it works with multiple AI assistants. Here's how to set it up with the most popular options:
Option 1: Cursor IDE Setup (Recommended for Developers)
Cursor IDE has become the go-to choice for developers who want AI assistance integrated directly into their coding workflow.
Step 1: Basic Configuration
- Open Cursor and click the gear icon (⚙️) in the top-right corner
- Navigate to MCP → Add new global MCP server
- Paste this configuration:
For Mac/Linux:
{
  "mcpServers": {
    "awslabs.eks-mcp-server": {
      "autoApprove": [],
      "disabled": false,
      "command": "uvx",
      "args": [
        "awslabs.eks-mcp-server@latest",
        "--allow-write"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-profile",
        "AWS_REGION": "us-west-2"
      },
      "transportType": "stdio"
    }
  }
}
For Windows:
{
  "mcpServers": {
    "awslabs.eks-mcp-server": {
      "autoApprove": [],
      "disabled": false,
      "command": "uvx",
      "args": [
        "--from",
        "awslabs.eks-mcp-server@latest",
        "awslabs.eks-mcp-server.exe",
        "--allow-write"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-profile",
        "AWS_REGION": "us-west-2"
      },
      "transportType": "stdio"
    }
  }
}
After a few minutes, you should see a green indicator if your MCP server definition is valid.
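The two configurations above differ only in how `uvx` is told to launch the server. If you maintain configs for several machines, a small script can generate them consistently. The Python sketch below is one way to do that; the function name `server_config` is hypothetical, and the generated structure simply reproduces the JSON shown above.

```python
import json

PACKAGE = "awslabs.eks-mcp-server@latest"


def server_config(windows: bool, allow_write: bool, profile: str, region: str) -> dict:
    """Build the mcpServers entry shown above for the given platform (hypothetical helper)."""
    if windows:
        # On Windows the executable name is passed explicitly via --from.
        args = ["--from", PACKAGE, "awslabs.eks-mcp-server.exe"]
    else:
        args = [PACKAGE]
    if allow_write:
        args.append("--allow-write")
    return {
        "mcpServers": {
            "awslabs.eks-mcp-server": {
                "autoApprove": [],
                "disabled": False,
                "command": "uvx",
                "args": args,
                "env": {
                    "FASTMCP_LOG_LEVEL": "ERROR",
                    "AWS_PROFILE": profile,
                    "AWS_REGION": region,
                },
                "transportType": "stdio",
            }
        }
    }


if __name__ == "__main__":
    config = server_config(windows=False, allow_write=False,
                           profile="your-profile", region="us-west-2")
    print(json.dumps(config, indent=2))
```

Leaving `allow_write` off by default in your own scripts keeps generated configs read-only unless someone opts in.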
Step 2: Test Your Setup
Open a chat panel in Cursor (Ctrl/⌘ + L) and try:
"Create a new EKS cluster named 'my-test-cluster' in the 'us-west-2' region using Kubernetes version 1.31."
Option 2: Amazon Q Developer CLI Setup
Step 1: Install Q Developer CLI
- Install the Amazon Q Developer CLI.
- The Q Developer CLI supports MCP servers for tools and prompts out of the box. Its MCP configuration lives in a file named mcp.json.
Step 2: Configure MCP
Edit your mcp.json file:
For Mac/Linux:
{
  "mcpServers": {
    "awslabs.eks-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.eks-mcp-server@latest"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "autoApprove": [],
      "disabled": false
    }
  }
}
For Windows:
{
  "mcpServers": {
    "awslabs.eks-mcp-server": {
      "command": "uvx",
      "args": ["--from", "awslabs.eks-mcp-server@latest", "awslabs.eks-mcp-server.exe"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "autoApprove": [],
      "disabled": false
    }
  }
}
Verify your setup by running the /tools command in the Q Developer CLI to see the available EKS MCP tools.
Understanding Security Flags and Configurations 🔒
The EKS MCP Server comes with built-in configurable arguments and environment variables as safety switches:
The args field in your MCP server definition allows you to customize how the EKS MCP server runs by passing specific command-line arguments. These flags control permissions, security behavior, and how the server interacts with Kubernetes and AWS resources.
You can fine-tune the behavior of the EKS MCP server using environment variables defined under the env field. These variables control everything from logging verbosity to AWS authentication settings.
🔧 Common Command Arguments
--allow-write Flag
When the --allow-write flag is enabled, the EKS MCP Server can create missing IAM permissions for EKS resources through the add_inline_policy tool. This tool:
- Only creates new inline policies; it never modifies existing policies.
- Is useful for automatically fixing common permissions issues with EKS clusters.
- Should be used with caution and with properly scoped IAM roles.
What it does: Enables creation, modification, and deletion of resources
When to use: Development environments, trusted automation
When NOT to use: Production clusters without proper review processes
// Conservative approach (read-only)
"args": ["awslabs.eks-mcp-server@latest"]
// Development approach (with write access)
"args": ["awslabs.eks-mcp-server@latest", "--allow-write"]
--allow-sensitive-data-access Flag
Enables access to sensitive data such as logs, events, and Kubernetes Secrets.
- Default: false (Access to sensitive data is restricted by default)
- What it does: Allows access to logs, events, and secrets
- When to use: Troubleshooting, monitoring, development
- When NOT to use: Shared environments or when logs contain sensitive data
// Full access (use carefully)
"args": [
  "awslabs.eks-mcp-server@latest",
  "--allow-write",
  "--allow-sensitive-data-access"
]
Important Security Note: Users should exercise caution when --allow-write and --allow-sensitive-data-access modes are enabled with these broad permissions, as this combination grants significant privileges to the MCP server. Only enable these flags when necessary and in trusted environments. For production use, consider creating more restrictive custom policies.
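The two flags can be read as a small permission table: some tools need one flag, and the rest work in the default mode. The Python sketch below is a hypothetical illustration of that gating logic, not the server's actual implementation. The tool-to-flag mapping follows the requirements listed in this article's tool descriptions, and the function name `is_tool_allowed` is an assumption.

```python
# Tools this article lists as requiring --allow-sensitive-data-access.
SENSITIVE_TOOLS = {"get_pod_logs", "get_k8s_events", "get_cloudwatch_logs"}
# Tools this article lists as requiring --allow-write. Mutating operations of
# other tools (for example create/delete on resources) would need a similar check.
WRITE_TOOLS = {"add_inline_policy"}


def is_tool_allowed(tool: str, allow_write: bool = False,
                    allow_sensitive_data_access: bool = False) -> bool:
    """Return True if the tool may run under the given flags (hypothetical sketch)."""
    if tool in WRITE_TOOLS and not allow_write:
        return False
    if tool in SENSITIVE_TOOLS and not allow_sensitive_data_access:
        return False
    return True


if __name__ == "__main__":
    for tool in ("list_k8s_resources", "get_pod_logs", "add_inline_policy"):
        print(tool, is_tool_allowed(tool))
```

Defaulting both parameters to False mirrors the server's default posture: anything sensitive or mutating has to be opted into explicitly.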
⚙️ Common Environment Variables
Here's a sample configuration snippet:
{
  "mcpServers": {
    "awslabs.eks-mcp-server": {
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "my-profile",
        "AWS_REGION": "us-west-2"
      }
    }
  }
}
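A process launched with this configuration sees the three variables in its environment. As a rough sketch of how they might be read and validated at startup, consider the Python below. The `Settings` class and `load_settings` function are hypothetical names; the accepted log levels and the "WARNING" default follow the variable descriptions in this article.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


@dataclass(frozen=True)
class Settings:
    log_level: str
    aws_profile: Optional[str]  # None means: use the default credentials chain
    aws_region: Optional[str]   # None means: let the AWS SDK pick its default


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the three variables from the environment (hypothetical sketch)."""
    env = os.environ if env is None else env
    level = env.get("FASTMCP_LOG_LEVEL", "WARNING").upper()
    if level not in LOG_LEVELS:
        raise ValueError(f"FASTMCP_LOG_LEVEL must be one of {LOG_LEVELS}, got {level!r}")
    return Settings(level, env.get("AWS_PROFILE"), env.get("AWS_REGION"))


if __name__ == "__main__":
    print(load_settings({"FASTMCP_LOG_LEVEL": "ERROR", "AWS_REGION": "us-west-2"}))
```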
🔊 FASTMCP_LOG_LEVEL (optional)
Controls the verbosity of logs produced by the MCP server.
- Accepted values: "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"
- Default: "WARNING"
- Use case: Set to "ERROR" in production to reduce noise; use "DEBUG" when troubleshooting.
📌 Example:
"FASTMCP_LOG_LEVEL": "ERROR"
🔐 AWS_PROFILE (optional)
Specifies which named AWS CLI profile to use when authenticating with AWS services.
- Default: If not set, the server falls back to the default credentials provider chain (e.g., environment, EC2 metadata).
- Use case: Ideal when running the server locally with multiple profiles configured.
📌 Example:
"AWS_PROFILE": "my-profile"
🌍 AWS_REGION (optional)
Defines the target AWS region where EKS clusters are located. All MCP operations will use this region context.
- Default: If not provided, AWS SDK default behavior will apply (which may vary based on environment).
- Use case: Ensure MCP commands and deployments run in the intended region, especially when managing clusters across multiple environments.
📌 Example:
"AWS_REGION": "us-west-2"
Best Practices for Safe MCP Usage
The "Production Safety" Checklist ✅
- [ ] Start Read-Only: Always begin with read-only mode for evaluation
- [ ] Environment Separation: Use different configurations for dev/staging/prod
- [ ] Access Control: Apply least-privilege IAM policies
- [ ] Audit Everything: Enable comprehensive logging
- [ ] Regular Updates: Keep MCP server updated with security patches
The "Developer Happiness" Checklist 😊
- [ ] Enable Write Mode: For development environments, enable --allow-write
- [ ] Sensitive Data Access: Enable for troubleshooting capabilities
- [ ] Auto-Approve: Consider enabling for trusted, repeated operations
- [ ] Multiple MCP Servers: Combine EKS with other AWS MCP servers as needed
- [ ] Custom Regions: Set appropriate AWS regions for your infrastructure
Quick Troubleshooting Guide
"It's Not Working!" - Common Issues and Solutions
Issue: MCP server shows as disconnected
# Check AWS credentials
aws sts get-caller-identity
# Verify Python and uv installation
python --version
uv --version
# Check MCP server logs
# (Look in your AI assistant's debug/log output)
Issue: Permission denied errors
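# Verify IAM permissions
# Note: --policy-source-arn must be an IAM user or role ARN; if you are using an
# assumed role, pass the role's ARN instead of the STS session ARN
aws iam simulate-principal-policy \
  --policy-source-arn $(aws sts get-caller-identity --query Arn --output text) \
  --action-names eks:DescribeCluster \
  --resource-arns "*"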
Issue: Cluster connection problems
# Update kubeconfig
aws eks update-kubeconfig --region us-west-2 --name my-cluster
# Test connectivity
kubectl cluster-info
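The checks above can be bundled into a quick preflight script. The Python sketch below verifies the Python version and looks for the command-line tools named in the prerequisites. The function names `preflight` and `missing_required` are hypothetical, and the script only checks that the tools are on your PATH, not that your credentials are valid.

```python
import shutil
import sys

REQUIRED = ("uv", "aws")
OPTIONAL = ("kubectl", "eksctl")


def preflight(which=shutil.which, version=sys.version_info) -> dict:
    """Report which prerequisites are present on this machine."""
    report = {"python>=3.10": tuple(version[:2]) >= (3, 10)}
    for tool in REQUIRED + OPTIONAL:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = which(tool) is not None
    return report


def missing_required(report: dict) -> list:
    """List the must-have prerequisites that are absent."""
    missing = [] if report["python>=3.10"] else ["python>=3.10"]
    missing += [tool for tool in REQUIRED if not report[tool]]
    return missing


if __name__ == "__main__":
    result = preflight()
    for name, ok in result.items():
        print(f"{'OK ' if ok else 'MISSING'} {name}")
    if missing_required(result):
        sys.exit(1)
```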
Architecture and Visual Overview
How Everything Connects: The Big Picture
Imagine the EKS MCP Server as a universal translator that sits between your natural language requests and the complex world of AWS and Kubernetes APIs. Here's how the magic happens:
- AI Assistant (e.g., Cursor) at the top
- MCP Protocol layer
- EKS MCP Server in the middle
- AWS Services (EKS, IAM, CloudWatch, VPC) at the bottom
- Bidirectional data flow arrows
- Security boundaries and encryption indicators
The Intelligence Behind the Simplicity
What you see: Simple conversation with your AI assistant
What's actually happening: A sophisticated orchestration of AWS services
Your Input: "Deploy my Python app to EKS"
↓
AI Processing: Understanding intent and context
↓
MCP Translation: Converting to specific tool calls
↓
AWS API Calls: Executing infrastructure operations
↓
Kubernetes Operations: Managing application deployments
↓
Real-time Feedback: Monitoring and reporting status
↓
Human-friendly Response: "Your app is deployed and healthy!"
The Tools Under the Hood
The EKS MCP Server comes with a broad set of tools that automate and simplify management of your Amazon EKS clusters and Kubernetes resources. Each tool performs a targeted operation and can be invoked as part of your workflow for provisioning, managing, observing, and troubleshooting infrastructure.
Cluster Management Tools 🏗️
- manage_eks_stacks - Your cluster lifecycle manager. Automates lifecycle management of EKS CloudFormation stacks.
  Features:
  - Generate CloudFormation templates for EKS clusters.
  - Deploy clusters with all necessary components (VPCs, subnets, IAM roles, etc.).
  - Describe stack metadata, status, outputs.
  - Delete stacks and clean up associated resources.
  - Operates only on stacks originally created by this tool.
  Parameters:
  - operation: generate, deploy, describe, or delete
  - template_file: required for generate/deploy
  - cluster_name: required for all operations
- search_eks_troubleshoot_guide - Your troubleshooting expert. Searches the AWS EKS Troubleshoot Guide for relevant issue resolutions.
  Features:
  - Provides solutions for common EKS issues (bootstrap, node autoscaling, etc.)
  - Suggests short-term fixes and long-term resolutions
  Parameters:
  - query
Kubernetes Resource Tools ⚙️
- manage_k8s_resource - Your Swiss Army knife for Kubernetes objects. Manages any Kubernetes resource directly.
  Features:
  - Supports create, replace, patch, delete, and read
  - Works with both namespaced and non-namespaced resources
  Parameters:
  - operation, cluster_name, kind, api_version, name
  - namespace (optional), body (for create/replace/patch)
- list_k8s_resources - Your resource discovery tool. Lists resources by type in a Kubernetes cluster.
  Features:
  - Filters by namespace, label, or field selectors
  - Outputs metadata for matched resources
  Parameters:
  - cluster_name, kind, api_version
  - namespace, label_selector, field_selector (all optional)
- apply_yaml - Your manifest deployment specialist. Applies multi-resource YAML manifests to a cluster.
  Features:
  - Accepts multi-document YAML files
  - Applies all resources within a specified namespace
  - Can force updates to existing resources
  Parameters:
  - yaml_path, cluster_name, namespace, force
- list_api_versions - Your Kubernetes API reference. Lists all API versions available in a Kubernetes cluster.
  Features:
  - Includes both core (v1) and grouped (apps/v1, etc.) APIs
  - Useful for compatibility checks and YAML generation
  Parameters:
  - cluster_name
Application Support Tools 🚀
- generate_app_manifest - Your deployment template generator. Generates basic Kubernetes manifests for your application.
  Features:
  - Produces Deployment and Service YAML files
  - Configurable replicas, resources, load balancer, etc.
  Parameters:
  - app_name, image_uri, output_dir
  - Optional: port, replicas, cpu, memory, namespace, load_balancer_scheme
- get_pod_logs - Your application debugger. Retrieves logs from a specific pod.
  Features:
  - Filter by time window, line count, or byte size
  - Supports logs from specific containers
  - Requires --allow-sensitive-data-access
  Parameters:
  - cluster_name, pod_name, namespace
  - Optional: container_name, since_seconds, tail_lines, limit_bytes
- get_k8s_events - Your event investigator. Fetches Kubernetes events for a resource.
  Features:
  - Returns detailed info: timestamps, reasons, component, and type
  - Supports both namespaced and cluster-wide resources
  - Requires --allow-sensitive-data-access
  Parameters:
  - cluster_name, kind, name
  - Optional: namespace
CloudWatch Integration Tools 📊
-
get_cloudwatch_logs
- Your centralized logging assistant Fetches CloudWatch logs for specific EKS resources.
Features:
- Query logs by time, resource type, name, filter patterns
- Supports both infrastructure and application logs
- Requires
--allow-sensitive-data-access
Parameters:
-
cluster_name
,log_type
,resource_type
- Optional:
resource_name
,minutes
,start_time
,end_time
,limit
,filter_pattern
,fields
- get_cloudwatch_metrics - Your performance monitoring tool
  Fetches CloudWatch metrics for your workloads.
  Features:
  - Query by metric name, namespace, dimensions
  - Configure range, granularity, and statistic
  - Supports custom dimensions
  Parameters:
  - cluster_name, metric_name, namespace, dimensions
  - Optional: minutes, start_time, end_time, limit, stat, period
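The metric parameters above correspond naturally to the fields of a CloudWatch `GetMetricData` request. As a rough sketch, and assuming the tool is backed by that API (the source does not say), the request could be assembled as follows. `build_metric_request` and its defaults are hypothetical; the metric and dimension names in the usage example come from Container Insights.

```python
from datetime import datetime, timedelta, timezone


def build_metric_request(metric_name, namespace, dimensions, minutes=15,
                         stat="Average", period=60, now=None):
    """Build keyword arguments for a CloudWatch GetMetricData call."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return {
        "MetricDataQueries": [{
            "Id": "m1",  # query ids must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": namespace,
                    "MetricName": metric_name,
                    # CloudWatch expects dimensions as a list of Name/Value pairs.
                    "Dimensions": [
                        {"Name": key, "Value": value}
                        for key, value in sorted(dimensions.items())
                    ],
                },
                "Period": period,  # granularity in seconds
                "Stat": stat,
            },
        }],
        "StartTime": start,
        "EndTime": end,
    }


if __name__ == "__main__":
    request = build_metric_request(
        "pod_cpu_utilization",
        "ContainerInsights",
        {"ClusterName": "demo-cluster", "Namespace": "prod", "PodName": "web"},
        minutes=30,
    )
    print(request)
    # With boto3 installed and AWS credentials configured:
    #   import boto3
    #   print(boto3.client("cloudwatch").get_metric_data(**request))
```

Here `minutes` sets the range, `period` the granularity, and `stat` the statistic, which matches the three knobs listed under Features.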
- get_eks_metrics_guidance
  Lists recommended metrics and dimensions for various EKS resource types.
  Features:
  - Covers supported types: cluster, node, pod, namespace, service
  - Outputs available metrics, descriptions, and dimension mappings
  Parameters:
  - resource_type
  Implementation Note:
  Generated from AWS Container Insights metrics using:
    uv pip install bs4
    python /scripts/update_eks_cloudwatch_metrics_guidance.py
IAM Integration 🔐
- get_policies_for_role
  Retrieves policy details for an IAM role.
  Features:
  - Includes assume role policy, managed policies, and inline policies
  Parameters:
  - role_name
- add_inline_policy
  Attaches a new inline policy to an IAM role.
  Features:
  - Prevents accidental overwrite of existing policies
  - Accepts JSON policy document or list of statements
  - Requires --allow-write
  Parameters:
  - role_name, policy_name, permissions
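Two of the behaviors listed for add_inline_policy, accepting either a full policy document or a list of statements and refusing to overwrite an existing policy, can be sketched in a few lines. The helper names `build_policy_document` and `plan_inline_policy` are hypothetical, and the real server's logic may differ; this only illustrates the idea.

```python
import json

POLICY_VERSION = "2012-10-17"


def build_policy_document(permissions):
    """Accept a JSON string, a full policy document, one statement, or a list."""
    if isinstance(permissions, str):
        permissions = json.loads(permissions)
    if isinstance(permissions, dict) and "Statement" in permissions:
        document = dict(permissions)
        document.setdefault("Version", POLICY_VERSION)
        return document
    statements = permissions if isinstance(permissions, list) else [permissions]
    return {"Version": POLICY_VERSION, "Statement": statements}


def plan_inline_policy(existing_policy_names, policy_name, permissions):
    """Return arguments for an inline policy write, refusing to overwrite."""
    if policy_name in existing_policy_names:
        raise ValueError(f"inline policy {policy_name!r} already exists on this role")
    return {
        "PolicyName": policy_name,
        "PolicyDocument": json.dumps(build_policy_document(permissions)),
    }


if __name__ == "__main__":
    statement = {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"}
    print(plan_inline_policy(["existing-policy"], "read-objects", [statement]))
    # With boto3, the existing names could come from
    #   iam.list_role_policies(RoleName=role_name)["PolicyNames"]
    # and the result could be passed to
    #   iam.put_role_policy(RoleName=role_name, **plan)
```

Checking for an existing name first matters because an IAM inline policy write with an existing name replaces that policy's document.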
The Smart Design Philosophy
Why Unified Tools Instead of Separate Functions?
Traditional approaches would create individual tools for every Kubernetes resource type (pods, services, deployments, etc.). This would quickly overwhelm the AI's context window. Instead, the EKS MCP Server uses a clever approach:
Instead of:
- create_pod_tool
- create_service_tool
- create_deployment_tool
- update_pod_tool
- update_service_tool
- ... (50+ tools)
We have:
- manage_k8s_resource (handles all CRUD operations)
- list_k8s_resources (handles all resource discovery)
- apply_yaml (handles manifest deployment)
This design keeps the context window manageable while providing comprehensive functionality.
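As an illustration of the unified approach, the sketch below uses a plain dictionary as a stand-in for a cluster and routes every operation on every resource kind through two functions. The operation names (`create`, `read`, `patch`, `replace`, `delete`) and the function signatures are assumptions for this example, not the server's documented schema.

```python
OPERATIONS = ("create", "read", "patch", "replace", "delete")


def manage_k8s_resource(cluster, operation, kind, name, namespace="default", body=None):
    """One entry point for all CRUD operations on any resource kind.

    `cluster` is an in-memory dict standing in for the Kubernetes API.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    key = (kind, namespace, name)
    if operation == "create":
        if key in cluster:
            raise ValueError(f"{kind}/{name} already exists in {namespace}")
        cluster[key] = dict(body or {})
        return cluster[key]
    if key not in cluster:
        raise KeyError(f"{kind}/{name} not found in {namespace}")
    if operation == "read":
        return cluster[key]
    if operation == "patch":
        cluster[key].update(body or {})
        return cluster[key]
    if operation == "replace":
        cluster[key] = dict(body or {})
        return cluster[key]
    return cluster.pop(key)  # delete


def list_k8s_resources(cluster, kind, namespace=None):
    """One entry point for discovery across all kinds and namespaces."""
    return sorted(
        name for (k, ns, name) in cluster
        if k == kind and namespace in (None, ns)
    )


if __name__ == "__main__":
    cluster = {}
    manage_k8s_resource(cluster, "create", "Deployment", "web", body={"replicas": 2})
    manage_k8s_resource(cluster, "create", "Service", "web", body={"port": 80})
    manage_k8s_resource(cluster, "patch", "Deployment", "web", body={"replicas": 3})
    print(list_k8s_resources(cluster, "Deployment"))
    print(manage_k8s_resource(cluster, "read", "Deployment", "web"))
```

Adding a new resource kind here requires no new function, only a new value for `kind`, so the number of tool descriptions the model has to read stays constant.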
Security and Governance: Balancing Power with Control
Understanding the Security Paradigm
When we talk about granting AI agents permissions to manage your cloud infrastructure, it's natural to have concerns. The AWS team has designed EKS MCP with a fundamental security principle in mind: MCP servers only have access to what you already have access to. They cannot magically access secrets from other accounts or perform actions beyond your existing permissions.
Think of it this way: the MCP server operates with the same level of access that you, as a developer, would have. It's essentially acting as an intelligent extension of your existing credentials, not as a privilege escalation tool.
Critical Security Considerations in Production
The Reality of AI-Powered Operations
During AWS's internal discussions, the team emphasized a crucial point: these tools are incredibly powerful, and that power requires responsibility. As any Spider-Man fan knows, "with great power comes great responsibility." 🕷️💻 As one AWS engineer put it during their live demo: "We are in some ways making it more powerful for them, making it easier for them to deploy... but again, make sure you check, please." Vibe coding and AI tools can take you far, but if you're flying blind, you might also crash hard.
Production Environment Safeguards
The Golden Rule: When running MCP servers on production clusters, ⚠️🛑 always turn off auto-approvals for write operations. Here's why this matters:
In live demonstrations, AWS engineers showed scenarios where:
- An incorrect API endpoint was automatically corrected ✅ (helpful)
- But in another case, when an endpoint was wrong, the system also changed the container image saying "maybe use another image" and patched the deployment ❌ (potentially dangerous)
Recommendation: Approve write operations one by one in production environments to maintain control over what gets deployed.
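In practice, approval is normally handled by the MCP client (the IDE or assistant) rather than by code you write yourself, but the underlying idea is simple enough to sketch. In the example below, reads run immediately while every write waits for an explicit decision. The `ToolCall` shape, the set of write operations, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed set of mutating operations for this illustration.
WRITE_OPERATIONS = frozenset({"create", "replace", "patch", "delete", "apply"})


@dataclass
class ToolCall:
    tool: str
    operation: str
    arguments: dict = field(default_factory=dict)


def execute_with_approval(call, execute, approve):
    """Run reads directly; run writes only when `approve(call)` returns True."""
    if call.operation in WRITE_OPERATIONS and not approve(call):
        return {"status": "rejected", "tool": call.tool}
    return {"status": "executed", "result": execute(call)}


if __name__ == "__main__":
    def fake_execute(call):
        return f"{call.operation} ok"

    def ask_human(call):
        answer = input(f"Allow {call.tool} {call.operation} {call.arguments}? [y/N] ")
        return answer.strip().lower() == "y"

    read = ToolCall("manage_k8s_resource", "read", {"kind": "Deployment", "name": "web"})
    patch = ToolCall("manage_k8s_resource", "patch", {"kind": "Deployment", "name": "web"})
    print(execute_with_approval(read, fake_execute, ask_human))
    print(execute_with_approval(patch, fake_execute, ask_human))
```

With a gate like this, the image swap described in the demo would have surfaced as a visible patch request instead of a silent change.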
Data Protection and Privacy
Redacting Sensitive Information
One of the most significant security features being implemented is automatic redaction of PII and sensitive data. This includes:
- Passwords and secret keys
- API tokens and credentials
- Personally identifiable information
- Sensitive configuration data
This data is redacted from both logs and AI model outputs, addressing concerns about secure data being passed to LLMs.
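The source does not describe how this redaction is implemented. As a rough illustration only, a pattern-based filter applied to log text before it reaches the model might look like the sketch below. The patterns are deliberately simple assumptions and would miss many real-world secret formats.

```python
import re

# Illustrative patterns only; a production redactor needs far broader coverage.
_PATTERNS = [
    # AWS access key ids: "AKIA" followed by 16 uppercase letters or digits.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_ACCESS_KEY_ID]"),
    # key=value or key: value pairs whose key suggests a credential.
    (re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b(\s*[=:]\s*)(\S+)"),
     r"\1\2[REDACTED]"),
    # Email addresses as a simple stand-in for PII.
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[REDACTED_EMAIL]"),
]


def redact(text):
    """Replace likely secrets and PII in `text` with placeholders."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    line = "login ok password=hunter2 user=bob@example.com key AKIAABCDEFGHIJKLMNOP"
    print(redact(line))
```

Redacting before the text leaves the server, rather than asking the model to ignore secrets, keeps the sensitive values out of the prompt entirely.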
IAM Integration and Best Practices
Principle of Least Privilege in Practice
The MCP server follows AWS security best practices through:
- Dedicated IAM roles designed specifically for MCP operations with minimal required permissions
- Separate roles for read-only versus write operations
- Resource tagging strategies to limit actions to MCP-managed resources
- Regular permission audits using IAM Access Analyzer to identify and remove unused permissions
Kubernetes RBAC: Your Safety Net
Remember that even with proper IAM permissions, Kubernetes API access must be correctly configured. The MCP server operates within the same RBAC constraints that govern your manual kubectl operations.
Operational Security: The Human Element
The Importance of Vigilance
As AWS's product manager candidly shared: "I'm not an engineer by trade... I'm not exactly sure all of the guidelines that I need to make sure that I check. Since I'm not an engineer, I don't know what I don't know."
This honest admission highlights a critical point: monitoring and vigilance are essential. Whether you're a pro or new to the Kubernetes world, always:
- Review what's being deployed to your account
- Understand the changes before approving them
- Set up proper monitoring and alerting
- Implement resource limits and quotas
Guardrails and Control Mechanisms
The MCP server includes several built-in safety features:
- Resource validation before deploying infrastructure
- Template verification to prevent arbitrary stack deletion
- Allowlists and denylists for specific resources
- Consent requirements for sensitive operations
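To illustrate the allowlist and denylist idea, here is a small sketch in which resources are identified by a `namespace/kind/name` string and matched against glob patterns, with deny rules taking precedence. The identifier format, the precedence rule, and the function name are assumptions for this example rather than the server's documented behavior.

```python
from fnmatch import fnmatchcase


def is_resource_permitted(resource, allowlist, denylist):
    """Return True if `resource` matches the allowlist and no deny pattern."""
    if any(fnmatchcase(resource, pattern) for pattern in denylist):
        return False  # deny rules always win
    return any(fnmatchcase(resource, pattern) for pattern in allowlist)


if __name__ == "__main__":
    allow = ["dev/*", "staging/Deployment/*"]
    deny = ["*/Secret/*", "kube-system/*"]
    for resource in ("dev/Deployment/web", "dev/Secret/db-password",
                     "staging/Service/web", "kube-system/Deployment/coredns"):
        print(resource, is_resource_permitted(resource, allow, deny))
```

Evaluating the denylist first means a broad allow pattern such as `dev/*` can never accidentally expose something that was explicitly excluded.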
Future Potential and AWS Vision: The Evolution of AI-Driven Infrastructure
Where We Are Today vs. Tomorrow
Currently, we're in a "supervised state" with AI integrations, as AWS calls it. As one AWS engineer noted: "We're not quite there yet for unsupervised agents just monitoring your clusters and making actions. It'll be some time before we fully trust agents."
But the trend is evident and the opportunities are vast.
Near-Term Evolution
Improved Remote Features
Obstacle: Some AI tooling doesn't work well with remote MCP hosts.
Solution: The industry is trending toward:
- Improved remote MCP server design
- Predefined best-practice templates
- Automatic updating and maintenance
- Enhanced reliability for distributed deployments
Agent-to-Agent Communication
One of the more promising directions is agent-to-agent communication. Imagine agents that can:
- Communicate with one another without direct user action
- Collaborate on complicated deployment scenarios
- Discuss ideas and help each other troubleshoot issues
- Keep audit trails of all inter-agent operations
The Problem: What guardrails should you put in place so that agents stay responsive while you still get to see the final results?
Addressing the Context Window Problem
The Current Limitation
As of now, there is a limit on the number of MCP tools an IDE can work with at one time. This poses a problem when you need the right tool for each job.
The Future Solution
AWS is exploring:
- Dynamic tool switching: Automatically selecting the correct MCP server for the current context, including workflows that spin up a dev environment outside an IDE where you can directly modify project files
- Smart tool routing: Selecting the most appropriate tool based on context
- Standardized interfaces: Making MCP servers easier to interchange and more reliable
The Long-Term Vision: The Path from Supervised to Autonomous
Today: Supervised AI Assistance
- AI suggests actions
- Humans review and approve
- Clear audit trails
- Safety nets and guardrails
Tomorrow: Smart Autonomous Operations
- Proactive monitoring of the cluster health
- Self-healing infrastructure
- Predictive issue resolution
- Human oversight with intervention by exception
The Big Idea: Trust via Transparency
The journey to autonomy is not about eliminating human oversight, but about designing AI systems so trustworthy, transparent, and predictable that human oversight becomes strategic rather than merely tactical.
Industry Benefits: Better Practices and Accelerated Innovation
The Feedback Loop Effect
Early feedback to AWS has revealed that customers working in supervised mode are adopting better practices when setting up MCP. This forms a positive feedback cycle:
- AI recommends the right actions → Better practices
- Better practices → Better applications
- More reliable systems → More trust in AI assistance
- More trust → Greater acceptance of automation
Innovation Acceleration
Developers spend less time on infrastructure complexity and more time on:
- Business logic and functionality
- User experience enhancements
- Creative problem solving
- Quick prototyping & iteration
Challenges and Reflections
The Summarization Challenge
As AWS engineers said while testing: "When the LLM is trying to diagnose the problem, it is asking multiple things and trying to summarize the result. Sometimes the summarization isn't a match of what we intended to do."
Example: In an EKS Auto Mode investigation, the AI correctly identified which policies were needed but initially tried to add them to the node role rather than the cluster role. On a second pass, it corrected this.
The Challenge Ahead: Striking the right balance in how much data AI models receive: enough for accurate troubleshooting without clogging up the context window.
The Inconsistent Installation Problem
Current problem: MCP servers do not all install the same way, even with the same server configuration. The industry is heading toward standardization to make these interactions more predictable and reliable.
The Bigger Picture: Democratizing Cloud Expertise
The ultimate vision extends beyond just making Kubernetes easier. It's about:
- Democratizing cloud expertise: Making advanced cloud capabilities accessible to developers regardless of their infrastructure background
- Reducing the expertise gap: Helping junior developers learn through AI-guided practice
- Improving security posture: Making security best practices the default, not the exception
- Accelerating innovation: Removing infrastructure complexity as a barrier to creativity
The convergence of AI and cloud infrastructure management represents one of the most significant shifts in how we build and operate systems. Amazon EKS MCP is positioned at the forefront of this transformation, providing both the power to accelerate development and the guardrails to do so safely.
Conclusion
The Amazon EKS Model Context Protocol represents a transformative advancement in cloud-native development, fundamentally changing how developers interact with Kubernetes infrastructure. By bridging the gap between natural language and complex cluster operations, MCP democratizes access to enterprise-grade container orchestration while maintaining the security and operational excellence that AWS customers demand.
Key Benefits Realized
- Accelerated Development Cycles: Reducing deployment times from hours to minutes
- Lowered Barrier to Entry: Making Kubernetes accessible to developers of all skill levels
- Enhanced Operational Excellence: Integrating best practices into every interaction
- Improved Security Posture: Implementing security-by-default with granular controls
- Cost Optimization: Intelligent resource management reducing unnecessary expenses
Strategic Implications
The introduction of MCP signals AWS's commitment to AI-driven infrastructure management, positioning the platform for the next generation of cloud-native applications. Organizations adopting MCP early will gain competitive advantages through:
- Faster Time-to-Market: Reduced complexity in deployment pipelines
- Improved Developer Satisfaction: Focus on business logic rather than infrastructure management
- Enhanced Reliability: AI-assisted troubleshooting and preventive maintenance
- Future-Proof Architecture: Foundation for emerging AI and ML workloads
What's Next?
To explore Amazon EKS MCP in your environment:
- Start with Evaluation: Deploy MCP in read-only mode for risk-free exploration
- Pilot Project: Choose a non-critical application for initial testing
- Team Training: Invest in AI-assisted development practices
- Gradual Adoption: Expand usage based on success metrics and team confidence
- Community Engagement: Contribute feedback and use cases to shape future development
The convergence of artificial intelligence and cloud infrastructure management is no longer a future possibility—it's today's reality. Amazon EKS MCP provides the foundation for this transformation, enabling organizations to harness the full potential of AI-assisted development while maintaining the reliability, security, and scalability that modern applications demand.