EKS Node Groups (Where Your Cluster Actually Gets Compute)
In the previous part, we created the EKS control plane.
At that point:
- Kubernetes API exists
- Cluster is reachable
But:
👉 There are no machines to run workloads
That’s where Node Groups come in.
This module creates the actual EC2 instances that:
- join the cluster
- run pods
- execute your applications
📂 Module Files
```
modules/eks-nodes/
├── main.tf
├── variables.tf
└── outputs.tf
```
📄 variables.tf
```hcl
variable "cluster_name" {
  description = "Name of the EKS cluster"
  type        = string
}

variable "node_group_name" {
  description = "Name of the EKS node group"
  type        = string
}

variable "node_role_arn" {
  description = "ARN of the EKS node group IAM role"
  type        = string
}

variable "subnet_ids" {
  description = "List of subnet IDs"
  type        = list(string)
}

variable "instance_types" {
  description = "List of instance types"
  type        = list(string)
  default     = ["t3.large"]
}

variable "desired_size" {
  description = "Desired number of nodes"
  type        = number
  default     = 1
}

variable "max_size" {
  description = "Maximum number of nodes"
  type        = number
  default     = 2
}

variable "min_size" {
  description = "Minimum number of nodes"
  type        = number
  default     = 1
}

variable "max_unavailable" {
  description = "Maximum number of nodes unavailable during update"
  type        = number
  default     = 1
}

variable "labels" {
  description = "Key-value map of Kubernetes labels"
  type        = map(string)
  default     = {}
}
```
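Taken together, a call to this module might look like the sketch below. The module path matches the layout shown earlier, but the referenced IAM and VPC module outputs (`module.eks_iam.node_role_arn`, `module.vpc.private_subnet_ids`) are assumptions for illustration — substitute whatever your own modules expose.

```hcl
# Hypothetical root-module usage — the eks_iam/vpc module outputs
# are illustrative names, not part of this module.
module "eks_nodes" {
  source = "./modules/eks-nodes"

  cluster_name    = "demo-cluster"
  node_group_name = "demo-nodes"
  node_role_arn   = module.eks_iam.node_role_arn   # from your IAM module
  subnet_ids      = module.vpc.private_subnet_ids  # private subnets only

  instance_types = ["t3.large"]
  desired_size   = 2
  min_size       = 1
  max_size       = 5

  labels = {
    env = "dev"
  }
}
```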
🧠 What these variables control
This module is completely configurable:
- cluster → which EKS cluster to join
- instance_types → what machines to use
- scaling → how many nodes
- labels → Kubernetes scheduling
Example thinking
dev → small nodes (t3.medium)
prod → bigger nodes (m5.large)
👉 Same code, different behavior.
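One way to realize "same code, different behavior" is a per-environment `.tfvars` file. The file names here are just a convention, not a requirement:

```hcl
# dev.tfvars — small and cheap
instance_types = ["t3.medium"]
desired_size   = 1
max_size       = 2

# prod.tfvars would instead set, e.g.:
# instance_types = ["m5.large"]
# desired_size   = 3
# max_size       = 6
```

Then select the file at apply time: `terraform apply -var-file="dev.tfvars"`.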
📄 main.tf
1. Node Group Resource
```hcl
resource "aws_eks_node_group" "nodes" {
  cluster_name    = var.cluster_name
  node_group_name = var.node_group_name
  node_role_arn   = var.node_role_arn
  subnet_ids      = var.subnet_ids
```
What this does
Creates:
👉 EC2 instances managed by EKS
These instances:
- automatically join the cluster
- register as Kubernetes nodes
Important inputs
- `cluster_name` → which cluster to join
- `node_role_arn` → permissions for nodes
- `subnet_ids` → where nodes are created
Subnet choice
You are passing:
👉 private subnets
This means:
- nodes do NOT get public IPs
- more secure (no direct inbound access from the internet)
- outbound traffic (image pulls, updates) goes through a NAT gateway
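If you want to look up those private subnets rather than hard-code IDs, a data source can do it. This is a sketch that assumes your subnets carry a `Tier = "private"` tag and that a `vpc_id` value is available — adjust the filter to match how your VPC is actually tagged:

```hcl
# Hypothetical lookup — the "Tier" tag and var.vpc_id are assumptions.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id]
  }

  tags = {
    Tier = "private"
  }
}

# Then pass them to the module:
# subnet_ids = data.aws_subnets.private.ids
```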
2. Instance Types
```hcl
  instance_types = var.instance_types
```
What this controls
Defines:
- CPU
- memory
- cost
Example: `t3.large` → 2 vCPU, 8 GiB RAM
3. Scaling Configuration
```hcl
  scaling_config {
    desired_size = var.desired_size
    max_size     = var.max_size
    min_size     = var.min_size
  }
```
Meaning
- desired_size → current running nodes
- min_size → minimum nodes
- max_size → maximum nodes
Example
```hcl
min_size     = 1
desired_size = 2
max_size     = 5
```
👉 The cluster can scale between 1 and 5 nodes
4. Update Configuration
```hcl
  update_config {
    max_unavailable = var.max_unavailable
  }
```
Why this matters
Controls rolling updates.
Example:
```hcl
max_unavailable = 1
```
👉 Only 1 node can be down at a time during an update
Why important
- prevents downtime
- controls deployment safety
5. Labels
```hcl
  labels = var.labels
```
What labels do
Labels are used by Kubernetes for:
- scheduling
- targeting workloads
Example:
```hcl
labels = {
  env  = "dev"
  type = "backend"
}
```
👉 Later you can target these nodes in a pod spec:
```yaml
nodeSelector:
  type: backend
```
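For context, here is where that selector sits in a full (hypothetical) Deployment — the workload name, labels, and image are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend                 # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      nodeSelector:
        type: backend           # matches the node group label above
      containers:
        - name: app
          image: backend:latest # placeholder image
```

The scheduler will only place these pods on nodes carrying the `type: backend` label.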
6. Dependency
Why ordering matters
The IAM role (and its policy attachments) must exist before the node group is created, or provisioning fails.

Because the resource references `var.node_role_arn`, Terraform already has an implicit dependency on whatever value the caller passes in.

One caveat:
👉 `depends_on = [var.node_role_arn]` is not valid Terraform — `depends_on` only accepts references to resources or modules, never variables.

So if you need an explicit guarantee (for example, that the policy attachments finish before node creation), declare the dependency where the module is called.
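A common call-site pattern is a module-level `depends_on` pointing at the IAM policy attachments. The attachment resource names below are assumptions based on a typical IAM module — use whatever your IAM module actually defines:

```hcl
# Hypothetical call-site ordering — attachment names are illustrative.
module "eks_nodes" {
  source = "./modules/eks-nodes"
  # ... module inputs ...

  depends_on = [
    aws_iam_role_policy_attachment.worker_node,   # AmazonEKSWorkerNodePolicy
    aws_iam_role_policy_attachment.cni,           # AmazonEKS_CNI_Policy
    aws_iam_role_policy_attachment.ecr_read_only, # AmazonEC2ContainerRegistryReadOnly
  ]
}
```

Note that `depends_on` on a `module` block requires Terraform 0.13 or later.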
📄 outputs.tf
```hcl
output "node_group_arn" {
  description = "Amazon Resource Name (ARN) of the EKS Node Group"
  value       = aws_eks_node_group.nodes.arn
}

output "node_group_status" {
  description = "Status of the EKS Node Group"
  value       = aws_eks_node_group.nodes.status
}
```
🧠 Why outputs matter
These outputs help in:
- debugging
- monitoring
- integration with other modules
🔥 What You Actually Built
```
EKS Control Plane
        │
        ▼
Managed Node Group (EC2 instances)
        │
        ▼
Kubernetes Pods run here
```
⚠️ Real Issues People Face
- Wrong subnets → nodes can't join the cluster
- Missing IAM role or policy attachments → node group creation fails
- No NAT gateway → nodes in private subnets can't pull images
- Undersized instances → pods get OOM-killed or stuck Pending
🧠 Key Takeaways
- Node group = actual compute layer
- Control plane alone is useless without nodes
- Scaling config controls capacity
- Labels help in workload placement
🚀 Next
In Part 5:
👉 Addons + CSI Driver
👉 How storage works in EKS
👉 Why IRSA becomes critical
At this point, your cluster is alive — now we make it usable.