From Kubernetes Chaos to Clarity: How EKS MCP Server Simplifies Cluster Management

Rohith Reddy Anugu

Last week, I came across this official AWS article announcing the launch of the Amazon EKS Model Context Protocol (MCP) Server, and I’ll be honest—I was blown away.

As a software engineer who has spent the last couple of years working on Kubernetes across multiple projects, I’ve seen firsthand how complex and time-consuming EKS management can be. Between juggling kubectl commands, configuring IAM roles, writing endless YAML files, and debugging obscure errors, it's often felt like more of an operational obstacle course than a developer-friendly platform.

So, when I read that AWS is now empowering us to create and manage EKS clusters using natural language, I knew I had to drop everything and experience this for myself. This wasn't just a new feature; it felt like a paradigm shift.

In this article, I want to take you on a journey—a "before and after" comparison that vividly illustrates the traditional approach to EKS management versus the revolutionary conversational simplicity and intelligence the MCP Server brings to the table. Prepare to see a significant leap forward in developer experience.

Let's start with our star application for this demonstration: a classic "Hello World" Python Flask app.

app.py

from flask import Flask
import os

app = Flask(__name__)

@app.route('/')
def hello_world():
    return f'Hello World from Python! Pod: {os.environ.get("HOSTNAME", "unknown")}'

@app.route('/health')
def health_check():
    return {'status': 'healthy'}, 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)

Dockerfile

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

EXPOSE 8080

CMD ["python", "app.py"]


requirements.txt

Flask==2.3.3

Nothing fancy - just a basic web app that returns "Hello World" and includes a health check endpoint.
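
Before touching any cloud resources, a quick local smoke test confirms the app behaves as expected (assuming Python 3.11 and pip are available):

pip install -r requirements.txt
python app.py

# In a second terminal:
curl http://localhost:8080/         # Hello World from Python! Pod: ...
curl http://localhost:8080/health   # {"status":"healthy"}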

The Traditional Way (Before MCP Server)

Step 1: Infrastructure Setup

# Step 1: Complete Infrastructure Setup (90 minutes)
# Create VPC and Networking

# 1. Create VPC and capture VPC ID
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=eks-vpc}]' \
  --query 'Vpc.VpcId' --output text)

echo "Created VPC: $VPC_ID"

# 2. Create Internet Gateway
IGW_ID=$(aws ec2 create-internet-gateway \
  --query 'InternetGateway.InternetGatewayId' --output text)

# 3. Attach Internet Gateway to VPC
aws ec2 attach-internet-gateway --vpc-id $VPC_ID --internet-gateway-id $IGW_ID

# 4. Create Public Subnets (for NAT Gateways and Load Balancers)
PUBLIC_SUBNET_1=$(aws ec2 create-subnet --vpc-id $VPC_ID \
  --cidr-block 10.0.1.0/24 --availability-zone us-west-2a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=eks-public-subnet-1}]' \
  --query 'Subnet.SubnetId' --output text)

PUBLIC_SUBNET_2=$(aws ec2 create-subnet --vpc-id $VPC_ID \
  --cidr-block 10.0.2.0/24 --availability-zone us-west-2b \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=eks-public-subnet-2}]' \
  --query 'Subnet.SubnetId' --output text)

# 5. Create Private Subnets (for EKS worker nodes)
PRIVATE_SUBNET_1=$(aws ec2 create-subnet --vpc-id $VPC_ID \
  --cidr-block 10.0.3.0/24 --availability-zone us-west-2a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=eks-private-subnet-1}]' \
  --query 'Subnet.SubnetId' --output text)

PRIVATE_SUBNET_2=$(aws ec2 create-subnet --vpc-id $VPC_ID \
  --cidr-block 10.0.4.0/24 --availability-zone us-west-2b \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=eks-private-subnet-2}]' \
  --query 'Subnet.SubnetId' --output text)

# 6. Enable auto-assign public IP for public subnets
aws ec2 modify-subnet-attribute --subnet-id $PUBLIC_SUBNET_1 --map-public-ip-on-launch
aws ec2 modify-subnet-attribute --subnet-id $PUBLIC_SUBNET_2 --map-public-ip-on-launch

# 7. Allocate Elastic IPs for NAT Gateways
EIP_1=$(aws ec2 allocate-address --domain vpc --query 'AllocationId' --output text)
EIP_2=$(aws ec2 allocate-address --domain vpc --query 'AllocationId' --output text)

# 8. Create NAT Gateways in public subnets
NAT_GW_1=$(aws ec2 create-nat-gateway --subnet-id $PUBLIC_SUBNET_1 \
  --allocation-id $EIP_1 \
  --tag-specifications 'ResourceType=nat-gateway,Tags=[{Key=Name,Value=eks-nat-gateway-1}]' \
  --query 'NatGateway.NatGatewayId' --output text)

NAT_GW_2=$(aws ec2 create-nat-gateway --subnet-id $PUBLIC_SUBNET_2 \
  --allocation-id $EIP_2 \
  --tag-specifications 'ResourceType=nat-gateway,Tags=[{Key=Name,Value=eks-nat-gateway-2}]' \
  --query 'NatGateway.NatGatewayId' --output text)

# 9. Wait for NAT Gateways to be available (this can take 2-5 minutes)
echo "Waiting for NAT Gateways to be available..."
aws ec2 wait nat-gateway-available --nat-gateway-ids $NAT_GW_1 $NAT_GW_2

# 10. Create Route Table for Public Subnets
PUBLIC_RT=$(aws ec2 create-route-table --vpc-id $VPC_ID \
  --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=eks-public-rt}]' \
  --query 'RouteTable.RouteTableId' --output text)

# 11. Create Route Tables for Private Subnets (one per AZ for high availability)
PRIVATE_RT_1=$(aws ec2 create-route-table --vpc-id $VPC_ID \
  --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=eks-private-rt-1}]' \
  --query 'RouteTable.RouteTableId' --output text)

PRIVATE_RT_2=$(aws ec2 create-route-table --vpc-id $VPC_ID \
  --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=eks-private-rt-2}]' \
  --query 'RouteTable.RouteTableId' --output text)

# 12. Create routes in public route table (Internet Gateway)
aws ec2 create-route --route-table-id $PUBLIC_RT \
  --destination-cidr-block 0.0.0.0/0 --gateway-id $IGW_ID

# 13. Create routes in private route tables (NAT Gateways)
aws ec2 create-route --route-table-id $PRIVATE_RT_1 \
  --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $NAT_GW_1

aws ec2 create-route --route-table-id $PRIVATE_RT_2 \
  --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $NAT_GW_2

# 14. Associate subnets with route tables
aws ec2 associate-route-table --subnet-id $PUBLIC_SUBNET_1 --route-table-id $PUBLIC_RT
aws ec2 associate-route-table --subnet-id $PUBLIC_SUBNET_2 --route-table-id $PUBLIC_RT
aws ec2 associate-route-table --subnet-id $PRIVATE_SUBNET_1 --route-table-id $PRIVATE_RT_1
aws ec2 associate-route-table --subnet-id $PRIVATE_SUBNET_2 --route-table-id $PRIVATE_RT_2

# 15. Tag subnets for EKS discovery (required for Load Balancer Controller)
aws ec2 create-tags --resources $PUBLIC_SUBNET_1 $PUBLIC_SUBNET_2 \
  --tags Key=kubernetes.io/role/elb,Value=1

aws ec2 create-tags --resources $PRIVATE_SUBNET_1 $PRIVATE_SUBNET_2 \
  --tags Key=kubernetes.io/role/internal-elb,Value=1

aws ec2 create-tags --resources $PUBLIC_SUBNET_1 $PUBLIC_SUBNET_2 $PRIVATE_SUBNET_1 $PRIVATE_SUBNET_2 \
  --tags Key=kubernetes.io/cluster/python-hello-cluster,Value=shared

# 16. Create Security Group for EKS Control Plane
CONTROL_PLANE_SG=$(aws ec2 create-security-group \
  --group-name eks-control-plane-sg \
  --description "Security group for EKS control plane" \
  --vpc-id $VPC_ID \
  --query 'GroupId' --output text)

# 17. Create Security Group for EKS Worker Nodes
WORKER_NODE_SG=$(aws ec2 create-security-group \
  --group-name eks-worker-node-sg \
  --description "Security group for EKS worker nodes" \
  --vpc-id $VPC_ID \
  --query 'GroupId' --output text)

# 18. Configure security group rules for control plane
aws ec2 authorize-security-group-ingress \
  --group-id $CONTROL_PLANE_SG \
  --protocol tcp \
  --port 443 \
  --source-group $WORKER_NODE_SG

# 19. Configure security group rules for worker nodes
aws ec2 authorize-security-group-ingress \
  --group-id $WORKER_NODE_SG \
  --protocol all \
  --source-group $WORKER_NODE_SG

aws ec2 authorize-security-group-ingress \
  --group-id $WORKER_NODE_SG \
  --protocol tcp \
  --port 1025-65535 \
  --source-group $CONTROL_PLANE_SG

aws ec2 authorize-security-group-ingress \
  --group-id $WORKER_NODE_SG \
  --protocol tcp \
  --port 443 \
  --source-group $CONTROL_PLANE_SG

# Output important values for next steps
echo "=== Infrastructure Setup Complete ==="
echo "VPC ID: $VPC_ID"
echo "Public Subnets: $PUBLIC_SUBNET_1, $PUBLIC_SUBNET_2"
echo "Private Subnets: $PRIVATE_SUBNET_1, $PRIVATE_SUBNET_2"
echo "Control Plane Security Group: $CONTROL_PLANE_SG"
echo "Worker Node Security Group: $WORKER_NODE_SG"
echo "=== Save these values for EKS cluster creation ==="

# Create IAM Roles

# 20. Create EKS Cluster Service Role trust policy
cat > cluster-trust-policy.json << EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# 21. Create EKS Node Group trust policy
cat > node-trust-policy.json << EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# 22. Create and configure cluster role
aws iam create-role --role-name EKSClusterRole --assume-role-policy-document file://cluster-trust-policy.json
aws iam attach-role-policy --role-name EKSClusterRole --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy

# 23. Create and configure node group role
aws iam create-role --role-name EKSNodeGroupRole --assume-role-policy-document file://node-trust-policy.json
aws iam attach-role-policy --role-name EKSNodeGroupRole --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
aws iam attach-role-policy --role-name EKSNodeGroupRole --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
aws iam attach-role-policy --role-name EKSNodeGroupRole --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly


Step 2: Create EKS Cluster

# Create the cluster
aws eks create-cluster \
  --name python-hello-cluster \
  --version 1.28 \
  --role-arn arn:aws:iam::ACCOUNT:role/EKSClusterRole \
  --resources-vpc-config "subnetIds=$PRIVATE_SUBNET_1,$PRIVATE_SUBNET_2,$PUBLIC_SUBNET_1,$PUBLIC_SUBNET_2,securityGroupIds=$CONTROL_PLANE_SG"

# Wait for cluster to be active
aws eks wait cluster-active --name python-hello-cluster

# Update kubeconfig
aws eks update-kubeconfig --region us-west-2 --name python-hello-cluster

Step 3: Create Node Group

# Create node group
aws eks create-nodegroup \
  --cluster-name python-hello-cluster \
  --nodegroup-name python-hello-nodes \
  --node-role arn:aws:iam::ACCOUNT:role/EKSNodeGroupRole \
  --subnets $PRIVATE_SUBNET_1 $PRIVATE_SUBNET_2 \
  --instance-types t3.medium \
  --scaling-config minSize=1,maxSize=3,desiredSize=2

# Wait for node group to be active
aws eks wait nodegroup-active --cluster-name python-hello-cluster --nodegroup-name python-hello-nodes

Step 4: Build and Push Container

# Create ECR repository
aws ecr create-repository --repository-name python-hello-world

# Get login token
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.us-west-2.amazonaws.com

# Build and push image
docker build -t python-hello-world .
docker tag python-hello-world:latest ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/python-hello-world:latest
docker push ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/python-hello-world:latest

Step 5: Create Kubernetes Manifests

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-hello-deployment
  labels:
    app: python-hello
spec:
  replicas: 2
  selector:
    matchLabels:
      app: python-hello
  template:
    metadata:
      labels:
        app: python-hello
    spec:
      containers:
      - name: python-hello
        image: ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/python-hello-world:latest
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 250m
            memory: 256Mi
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: python-hello-service
spec:
  selector:
    app: python-hello
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: LoadBalancer

Step 6: Deploy and Debug

# Apply manifests
kubectl apply -f deployment.yaml

# Check pod status
kubectl get pods
# Output: python-hello-deployment-xxxxx-xxxxx   0/1     ImagePullBackOff   0          2m

# Debug ImagePullBackOff
kubectl describe pod python-hello-deployment-xxxxx-xxxxx
# Error: Failed to pull image... unauthorized

# Fix IAM permissions for ECR access
# Update node group IAM role
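# For example, re-attaching the registry read policy from Step 1,
# assuming that was the missing piece:
aws iam attach-role-policy --role-name EKSNodeGroupRole \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly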
# Wait for pods to restart

# Check again
kubectl get pods
# Output: python-hello-deployment-xxxxx-xxxxx   0/1     CrashLoopBackOff   3          5m

# Debug CrashLoopBackOff  
kubectl logs python-hello-deployment-xxxxx-xxxxx
# Error: app was listening on Flask's default port 5000, not the 8080 the probes and Service expect

# Fix port configuration in app.py
# Rebuild image, push again
# Update deployment

# Finally working after multiple iterations
kubectl get service python-hello-service
# Get LoadBalancer URL and test
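# For example (hostname will differ):
LB_URL=$(kubectl get service python-hello-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl http://$LB_URL/
curl http://$LB_URL/health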

The New Way (With EKS MCP Server and Cline)

Prerequisites

Before we dive into the magic, a few simple setup steps:

  • Install Cline (an AI coding assistant extension) in your VS Code environment.
  • Configure the EKS MCP server within Cline's settings; this establishes the connection to the MCP backend. A sample entry is shown after this list.
  • Ensure your AWS CLI is already configured with appropriate credentials; this is a one-time step, and standard practice anyway.
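
For reference, here is roughly what my MCP server entry looked like, written out the same way we created the trust-policy files earlier. Treat it as a sketch: the settings file location varies by OS, and the fields and flags follow the awslabs EKS MCP server README, so verify against the official docs.

# Hedged example: an entry for Cline's MCP settings file
# (file path varies by OS; the "default" profile and us-west-2 are my choices)
cat > cline_mcp_settings.json << EOF
{
  "mcpServers": {
    "awslabs.eks-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.eks-mcp-server@latest", "--allow-write"],
      "env": {
        "AWS_PROFILE": "default",
        "AWS_REGION": "us-west-2"
      }
    }
  }
}
EOF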

The Magic Happens

I simply opened the Cline interface in VS Code and posed my request in plain English:

Create an EKS cluster and deploy my Python Flask hello world app.
The app is in app.py, uses port 8080, and needs a load balancer.
Make it production-ready with proper health checks and resource limits.

What Cline, powered by the EKS MCP Server, did automatically was nothing short of brilliant:

1. Understood My Application Context:

Cline: "I can see you have a Flask app with a health endpoint. I'll create an EKS cluster and deploy this with proper configuration."

The MCP Server intelligently analyzed my app.py and Dockerfile, recognizing the Flask framework, the health endpoint (/health), and the exposed port (8080). This context was vital for generating accurate Kubernetes manifests.

2. Orchestrated EKS Cluster Creation:

Cline: "Creating EKS cluster with Auto Mode enabled and best practices... Using manage_eks_stacks tool to generate CloudFormation template..."

Behind the scenes, the MCP Server:

  • Generated a complete CloudFormation template: This wasn't a few lines; it was a comprehensive template encompassing all the VPC, subnet, route table, NAT Gateway, and security group configurations we painstakingly scripted earlier.
  • Created VPC with public/private subnets: Automatically provisioned the necessary network topology for a resilient EKS cluster.
  • Set up all IAM roles and policies: Handled the complex IAM roles for both the EKS service and worker nodes, attaching the correct managed policies (like AmazonEKSClusterPolicy, AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and yes, AmazonEC2ContainerRegistryReadOnly without me explicitly asking!).
  • Enabled EKS Auto Mode for simplified management: This is a fantastic detail; Auto Mode streamlines node group management and scaling.
  • Applied security best practices: Inherently configured security groups and network ACLs according to AWS recommendations.
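
None of this is hidden, either. Once Cline reported success, I could verify the result with the same CLI we used in the traditional flow (the stack-listing query below is illustrative; the MCP server reports the actual stack name it creates):

# Confirm the cluster is up and find the CloudFormation stack behind it
aws eks describe-cluster --name python-hello-cluster \
  --query 'cluster.status' --output text   # ACTIVE
aws cloudformation list-stacks \
  --stack-status-filter CREATE_COMPLETE \
  --query 'StackSummaries[].StackName' --output text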

3. Managed Container Registry and Image Build:

Cline: "Creating ECR repository and building your container image... Building multi-architecture image for better compatibility..."

The MCP Server orchestrated the creation of my ECR repository, then seamlessly handled the Docker build process, even opting for a multi-architecture image for broader compatibility – a detail I might have forgotten or deferred. It then pushed the image to the newly created ECR repository.
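
For a sense of what that saves, here is roughly the equivalent multi-architecture build done by hand (a sketch assuming Docker Buildx is installed; the tag mirrors the ECR naming from the traditional flow):

# Manual multi-arch build and push with Buildx
docker buildx create --name multiarch --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/python-hello-world:latest \
  --push .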

4. Crafted Perfect Kubernetes Manifests:

Cline: "Generating Kubernetes manifests with proper resource limits and health checks... Using apply_yaml tool to deploy your application..."

This was truly impressive. The generated YAML wasn't just functional; it was optimized and contained best practices that I would often manually refine over several iterations. It demonstrated a deep understanding of my request for "production-ready" deployment.

# Auto-generated by Cline via MCP server
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-hello-world
  labels:
    app: python-hello-world
spec:
  replicas: 3
  selector:
    matchLabels:
      app: python-hello-world
  template:
    metadata:
      labels:
        app: python-hello-world
    spec:
      containers:
      - name: python-hello-world
        image: ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/python-hello-world:latest
        ports:
        - containerPort: 8080
          name: http
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi" 
            cpu: "200m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
        env:
        - name: PORT
          value: "8080"


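The Deployment was paired with a LoadBalancer Service. I didn't capture the exact YAML Cline generated for it, but it was equivalent to this reconstruction:

apiVersion: v1
kind: Service
metadata:
  name: python-hello-world
spec:
  selector:
    app: python-hello-world
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8080
  type: LoadBalancer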

5. Deployed and Verified Everything Worked:

Cline: Application deployed successfully!

Load balancer URL: http://xxxxx.us-west-2.elb.amazonaws.com

Health check: ✅ Passing
All pods: ✅ Running
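
I double-checked with the returned load balancer URL (elided here, as in Cline's output):

curl http://xxxxx.us-west-2.elb.amazonaws.com/
curl http://xxxxx.us-west-2.elb.amazonaws.com/health
kubectl get pods   # all replicas Running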

Total Time: A Stunning ~12 Minutes

  • 2 minutes: Initial cluster creation overhead (waiting time).
  • 8 minutes: Actual cluster provisioning (fully automated, including all the network and IAM resources).
  • 2 minutes: Application deployment and immediate verification.

Compare this to the hours, or even days, of iteration and debugging I'd typically spend. It's a game-changer.

Code Comparison

Traditional Approach Files

You'd need to create and manage:

project/
├── app.py
├── Dockerfile  
├── requirements.txt
├── cluster-trust-policy.json
├── node-trust-policy.json
├── cluster-cloudformation.yaml (200+ lines)
├── deployment.yaml
├── service.yaml
├── ingress.yaml (if needed)
├── setup-cluster.sh
├── deploy-app.sh
└── cleanup.sh

With the EKS MCP Server, you only need:

project/
├── app.py
├── Dockerfile
├── requirements.txt
└── (everything else handled by AI)


Conclusion: The Future of Cloud Operations is Conversational and Intelligent

The AWS EKS Model Context Protocol Server isn't just another service launch; it represents a fundamental philosophical shift in how we interact with our infrastructure. It's moving us away from the tedious, error-prone world of memorizing CLI commands and wrestling with YAML files, towards a future where we simply articulate our needs and let intelligent systems handle the complex implementation details.

This isn't about replacing skilled engineers. Far from it. It's about amplifying our capabilities. The time I used to spend meticulously configuring VPCs, troubleshooting IAM permissions, or debugging subtle YAML syntax errors, I can now dedicate to higher-value activities: designing robust application architectures, optimizing performance, exploring innovative solutions, and most importantly, solving actual business problems.

For organizations or teams who've been hesitant to adopt Kubernetes due to its notorious complexity, the EKS MCP Server effectively dismantles that barrier.

The future of infrastructure management is undeniably conversational and context-aware. And based on my own eye-opening experience, that future isn't a distant dream—it's already here, transforming the way we build and deploy on AWS.
