Production-ready EKS deployment with Terraform: Karpenter autoscaling, self-healing nodes, pod security standards, and multi-AZ high availability.
EKS is one of the most widely used managed Kubernetes services, but most deployments I've seen in production audits are dangerously under-configured. Missing node auto-remediation, no pod security standards, manual scaling: the list goes on.
This guide covers how EKS compares with AKS and GKE, a Terraform module for a self-healing cluster, and the practices I treat as the baseline for production EKS.
EKS vs AKS vs GKE
| Feature | EKS | AKS | GKE |
|---|---|---|---|
| Control Plane Cost | $0.10/hr | Free tier; $0.10/hr on Standard tier | $0.10/hr (free-tier credit covers one zonal or Autopilot cluster) |
| Autopilot Mode | No (use Karpenter) | No | Yes |
| Node Auto-Repair | Manual/Lambda | Built-in | Built-in |
| Service Mesh | App Mesh / Istio | Istio | Anthos / Istio |
| GPU Support | p4d, g5 | NC, ND series | T4, A100 |
Terraform Module
```hcl
module "eks" {
  source = "github.com/kogunlowo123/terraform-aws-auto-healing-eks"

  cluster_name    = "production-cluster"
  cluster_version = "1.29"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids

  node_groups = [{
    name           = "general"
    instance_types = ["m6i.xlarge", "m6i.2xlarge"]
    min_size       = 3
    max_size       = 20
    desired_size   = 5
  }]

  enable_karpenter                = true
  enable_cluster_autoscaler       = false # Use Karpenter instead
  enable_node_termination_handler = true
  enable_auto_remediation         = true
}
```
Best Practices
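- Prefer Karpenter to Cluster Autoscaler: it provisions nodes directly instead of resizing node groups, which gives faster scale-up and tighter bin-packing
- Define a PodDisruptionBudget for every production workload so node drains and consolidation cannot evict all replicas at once
- Run a node termination handler so Spot interruptions drain pods gracefully before the instance is reclaimed
- Enforce network policies with Calico or Cilium; by default, every pod can reach every other pod
- Enable control plane logging to CloudWatch (API server, audit, authenticator, controller manager, scheduler)
- Use IRSA (IAM Roles for Service Accounts) instead of node-level IAM, so each workload gets only the permissions it needs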
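Two of these practices, IRSA and pod disruption budgets, are easy to express in Terraform alongside the cluster. The block below is a minimal sketch, not the module's own implementation. It assumes the `hashicorp/aws` and `hashicorp/kubernetes` providers are already configured against the cluster. The `app` namespace, the `payments-api` workload, and both input variables are hypothetical placeholders, since the outputs the module exposes are not shown here.

```hcl
# Sketch only: the variables, the "app" namespace, and the "payments-api"
# workload are hypothetical placeholders, not part of the module above.

variable "oidc_provider_arn" {
  description = "ARN of the cluster's IAM OIDC provider"
  type        = string
}

variable "oidc_issuer_host" {
  description = "Cluster OIDC issuer URL without the https:// prefix"
  type        = string
}

# IRSA trust policy: only the named service account may assume the role.
data "aws_iam_policy_document" "irsa_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_issuer_host}:sub"
      values   = ["system:serviceaccount:app:payments-api"]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_issuer_host}:aud"
      values   = ["sts.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "payments_api" {
  name               = "payments-api-irsa"
  assume_role_policy = data.aws_iam_policy_document.irsa_trust.json
}

# The annotation tells EKS which role to inject credentials for.
resource "kubernetes_service_account_v1" "payments_api" {
  metadata {
    name      = "payments-api"
    namespace = "app"
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.payments_api.arn
    }
  }
}

# PDB: keep at least two replicas running during voluntary disruptions
# such as node drains and Karpenter consolidation.
resource "kubernetes_pod_disruption_budget_v1" "payments_api" {
  metadata {
    name      = "payments-api"
    namespace = "app"
  }

  spec {
    min_available = "2"

    selector {
      match_labels = {
        app = "payments-api"
      }
    }
  }
}
```

The role above has a trust policy but no permissions yet; attach only the policies the workload actually needs. For the PDB, `min_available = "2"` only makes sense if the workload runs three or more replicas, otherwise node drains will block.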
Open Source
- terraform-aws-auto-healing-eks — Self-healing EKS
- terraform-aws-eks — Standard EKS module
- terraform-aws-vpc-complete — VPC for EKS
Full guide: kogunlowo123.github.io