Phu Hoang

Amazon EKS From The Ground Up - Part 2: Worker Nodes with AWS Managed Nodes

Introduction

In Part 1, we finished building the EKS Control Plane and the infrastructure around it:

  • VPC, Subnets, NAT
  • Kubernetes API Server
  • An IAM role with enough permissions for the Control Plane
  • kubectl already connected to the cluster

But if you run:

kubectl get nodes

you’ll see… nothing.

That’s the real state of the cluster we created: it has a brain (Control Plane), but no hands and feet (Worker Nodes). Without compute capacity, the Kubernetes Scheduler can only stare at Pods stuck in the Pending state.

In Part 2, we’ll “grow limbs” for your EKS cluster by deploying Worker Nodes using AWS Managed Nodes - for most teams, the most practical balance between control and operational effort.

This post won’t stop at “click and it works.” We’ll also explore what AWS quietly builds for you in a Managed Node Group, and contrast it with the Self-managed way so you understand the fundamentals and can debug with confidence.

📝A Quick Primer on Worker Nodes

What is a Worker Node, really?

In the most accurate - and simplest - definition:

A Worker Node is an EC2 instance configured to run Kubernetes workloads (Pods).

At minimum, every worker node runs:

  • kubelet — the agent that talks to the Kubernetes API Server
  • a container runtime — typically containerd
  • kube-proxy — manages service/network rules
  • AWS VPC CNI — the plugin that allocates Pod IPs from your VPC
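
If you want to see these components on a live node once Step 4 is done, a couple of kubectl commands are enough; the node name below is a placeholder to replace with one of your own nodes:

# kube-proxy and aws-node (the VPC CNI) run as DaemonSet pods on every worker node
kubectl get pods -n kube-system -o wide

# kubelet and container runtime versions are reported in the node details
kubectl describe node <your-node-name> | grep -i "container runtime"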

Following production best practices, EKS worker nodes usually live inside private subnets without public IPs. That significantly reduces exposure: workloads are not directly reachable from the Internet, and outbound connectivity is handled through a NAT Gateway.

Worker Node Deployment Models

AWS offers three main ways to run compute for EKS, each with a different balance of control, operational cost, and “serverless-ness”:

Criteria                | Managed Nodes                         | Self-managed Nodes                   | AWS Fargate
Infrastructure          | EC2 Instances (partly managed by AWS) | EC2 Instances (fully managed by you) | Serverless (no nodes to manage)
Operational cost (OpEx) | Low                                   | High                                 | Very low
Control                 | Medium                                | Highest                              | Lowest
Billing                 | Per EC2 instance                      | Per EC2 instance                     | Per Pod (CPU/Mem)

Fargate hides almost everything—including the “node layer.” That’s convenient, but if your goal is to understand EKS fundamentals, the interesting engineering happens in Managed Nodes vs Self-managed Nodes.

Criteria                       | Managed Nodes (Managed Node Group)                   | Self-managed Nodes
Node creation                  | EKS Console or CLI command creates a Node Group      | EC2 Launch Template + ASG + User Data (bootstrap)
ASG management                 | AWS manages the Auto Scaling Group lifecycle         | You manage the ASG entirely
Cluster join                   | AWS automatically handles the wiring and credentials | You must provide user data calling /etc/eks/bootstrap.sh
Node auth mapping (IAM → RBAC) | AWS typically maps the Node Role automatically       | You must manually update aws-auth to map the Node Role
Upgrades/updates               | Built-in, managed rolling update workflows           | You design and manage drain / replace strategies
Debug level                    | Fewer common traps, higher abstraction               | More control, but more responsibility for low-level configuration

What AWS does for you in a Managed Node Group

When you click Create Node Group, AWS typically handles a long checklist that Self-managed nodes would require you to build manually:

  1. Creates an Auto Scaling Group
  2. Picks an EKS-Optimized AMI compatible with your cluster version
  3. Creates/manages a Launch Template (or uses one you provide)
  4. Attaches an IAM Instance Profile using your Node IAM Role
  5. Injects the bootstrap configuration so nodes can join the cluster
  6. Automates the node registration path
  7. Provides a rolling update workflow for node group upgrades
  8. Typically handles node role mapping into the cluster auth mechanism

Recommendation: Use Managed Nodes for most production setups to reduce operational overhead. Choose Self-managed only when you have very specific requirements (custom OS hardening, special bootstrap, deep control of the node lifecycle).

Cluster IAM Role vs Node IAM Role

This is one of the most common points of confusion, so let’s make it crystal clear.

Cluster IAM Role

  • Used by the EKS Control Plane
  • Allows the Control Plane to manage ENIs and interact with your VPC resources

This role is not meant for workloads.

Node IAM Role

Worker nodes need a Node IAM Role to:

  • Join the cluster
  • Allow the VPC CNI to attach ENIs and allocate Pod IPs
  • Pull images from ECR
  • Access other required AWS APIs (later: secrets, parameters, logs, etc.)

Your worker nodes won’t become Ready without (at least) these managed policies:

Policy                             | Purpose
AmazonEKSWorkerNodePolicy          | Join cluster, talk to the API Server
AmazonEKS_CNI_Policy               | Attach ENIs, allocate Pod IPs
AmazonEC2ContainerRegistryReadOnly | Pull images from ECR

AWS IAM & Kubernetes Authentication

Access to an EKS cluster is a combination of two layers: AWS IAM & Kubernetes RBAC.

IAM = Authentication

IAM answers: “Who are you?”

When a principal (an EC2 node or a human user) calls the Kubernetes API Server, EKS uses IAM authentication to verify:

  • Which IAM principal (Role/User) the request comes from
  • Whether the request has a valid SigV4 signature

✅ This is why worker nodes must have a proper Node IAM Role attached.

Kubernetes RBAC = Authorization

RBAC answers: “What are you allowed to do?”

Even if IAM authentication succeeds, the API call can still fail with Forbidden if RBAC doesn’t grant the required permissions.

Bridging the two worlds: mapping IAM → Kubernetes identity

After IAM authentication, EKS maps IAM identity into Kubernetes users/groups so RBAC can evaluate permissions. Two common mechanisms exist:

  • aws-auth ConfigMap (classic, still widely used), example:

    mapRoles: |
      - rolearn: arn:aws:iam::<account-id>:role/EKSNodeRole
        username: system:node:{{EC2PrivateDNSName}}
        groups:
          - system:bootstrappers
          - system:nodes
    
  • EKS Access Entries / Access Policies (the newer mechanism, configurable via the Console, CLI, or API)

For this article:

  • Nodes must be mapped into groups like system:bootstrappers and system:nodes
  • Humans/admins are commonly mapped to system:masters or granted an equivalent Access Policy
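
If you're curious which mechanism your cluster uses, both can be inspected directly once kubectl is connected (the cluster name is the one from Part 1):

# Classic mapping: the aws-auth ConfigMap in kube-system (may not exist on clusters using only Access Entries)
kubectl get configmap aws-auth -n kube-system -o yaml

# Newer mechanism: EKS Access Entries, listed via the AWS CLI
aws eks list-access-entries --cluster-name demo-eks-cluster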

Hands-on Time

The AWS architecture after completing the steps below:

part_2_aws_architecture.png

Step 0 — Preparation

Make sure all resources from Part 1 are created correctly and your cluster is ACTIVE.

Step 1 — Create the IAM Role for Worker Nodes

Open the AWS Console:

→ Go to IAM

→ Choose Roles → Create role

→ Configure Trusted entity:

  • Trusted entity type: AWS service
  • Service or use case: EC2
  • Use case: EC2

    part_2_IAM_Role_create_step_1.png

→ Click Next

→ Attach policies:

  • AmazonEKSWorkerNodePolicy
  • AmazonEKS_CNI_Policy
  • AmazonEC2ContainerRegistryReadOnly

→ Role name: EKSNodeRole

→ Click Create role

part_2_IAM_Role_create_step_3.png
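
If you prefer the CLI over the Console, a roughly equivalent sketch looks like this (the trust policy file name is just an example):

# trust-policy.json: allow EC2 instances to assume this role
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam create-role \
  --role-name EKSNodeRole \
  --assume-role-policy-document file://trust-policy.json

# Attach the three managed policies worker nodes need
for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy AmazonEC2ContainerRegistryReadOnly; do
  aws iam attach-role-policy \
    --role-name EKSNodeRole \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done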

Step 2 — Create a Managed Node Group

→ Open the EKS service

→ Select cluster demo-eks-cluster

→ Go to Compute → Add node group

part_2_eks_compute_add.png

Configure node group

  • Name: eks-mng-general
  • Node IAM role: EKSNodeRole

    part_2_eks_add_node_group_step_1.png

→ Click Next.

Configure compute & scaling

  • AMI type: EKS optimized (Amazon Linux / Bottlerocket)
  • Capacity type: On-Demand
  • Instance type: t3.medium
  • Disk size: 20 GiB
  • Scaling:
    • Desired: 2
    • Min: 1
    • Max: 3

→ Keep other settings as default.

part_2_eks_add_node_group_step_2.png

→ Click Next.

Configure networking

Select only the two private subnets.

This is critical: subnet selection here determines where your worker nodes live. Private subnets are ideal for production worker nodes because they don’t expose instances to the Internet.

part_2_eks_add_node_group_step_3.png

→ Click Next → Create.

Wait until the node group status becomes Active.

part_2_eks_node_group_active.png
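
For reference, a roughly equivalent CLI call for this step is shown below; the subnet IDs and account ID are placeholders for your own values:

aws eks create-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --node-role arn:aws:iam::<account-id>:role/EKSNodeRole \
  --subnets subnet-aaaa1111 subnet-bbbb2222 \
  --instance-types t3.medium \
  --disk-size 20 \
  --capacity-type ON_DEMAND \
  --scaling-config minSize=1,maxSize=3,desiredSize=2

# Block until the node group reaches Active
aws eks wait nodegroup-active \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general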

Step 3 — Join the Cluster (AWS handles it)

You don’t need to manually configure bootstrap steps for a Managed Node Group - but understanding the join flow is what makes you effective at troubleshooting.

3.1 Node join flow

When you create a Managed Node Group, AWS launches EC2 worker nodes. Each instance gets an Instance Profile containing your Node IAM Role (EKSNodeRole). The join flow looks like this:

  1. The EC2 instance boots using an EKS-Optimized AMI
  2. Bootstrap config provides kubelet with:
    • cluster name
    • API endpoint
    • cluster CA certificate
  3. kubelet calls the Kubernetes API Server to register the node
  4. EKS performs IAM authentication and identifies the IAM Role from the instance profile
  5. EKS maps IAM identity → Kubernetes identity via aws-auth / Access Entries
  6. If mapping is valid (node is in system:nodes), the node becomes Ready

In short: nodes don’t “join by Kubernetes magic.” They join because:

IAM authentication proves identity + Kubernetes group mapping allows the node role to function.
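
One practical way to see part of this wiring from the AWS side: managed node groups tag their EC2 instances with the node group name, so you can list each worker and the instance profile it carries (it should reference EKSNodeRole) with a query like this sketch:

# List the node group's instances and the instance profile attached to each one
aws ec2 describe-instances \
  --filters "Name=tag:eks:nodegroup-name,Values=eks-mng-general" \
  --query "Reservations[].Instances[].{Id:InstanceId,Profile:IamInstanceProfile.Arn,State:State.Name}" \
  --output table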

3.2 What Managed Node Group eliminates compared to Self-managed

With Self-managed nodes, you must build the join path yourself:

  • Create Launch Template (AMI, instance profile, user data)
  • Ensure bootstrap via /etc/eks/bootstrap.sh <cluster-name>
  • Create ASG and subnet placement
  • Manually update aws-auth to map the node role into:
    • system:bootstrappers
    • system:nodes

Managed Node Groups remove most of this plumbing.
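
For contrast, here is a minimal user-data sketch for a self-managed Amazon Linux 2 node; the endpoint and CA values are placeholders you would template in, and newer AL2023 AMIs use nodeadm instead of this script:

#!/bin/bash
# Self-managed bootstrap: register this instance's kubelet with the cluster.
# Managed node groups inject equivalent configuration for you.
set -o errexit
/etc/eks/bootstrap.sh demo-eks-cluster \
  --apiserver-endpoint "https://<your-cluster-endpoint>" \
  --b64-cluster-ca "<base64-encoded-cluster-ca>"

Even with correct user data, a self-managed node only becomes Ready after its role is mapped in aws-auth, which is exactly the step Managed Node Groups take off your plate.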

Step 4 — Verify

This is the moment your EKS cluster finally starts to feel alive.

4.1 Verify with kubectl

→ Open a terminal and run:

kubectl get nodes -o wide

→ Then check system pods:

kubectl get pods -n kube-system


Expected results:

  • You should see two nodes (matching the desired size of 2), both in the Ready state
  • coredns, kube-proxy, and aws-node (the VPC CNI) should be Running; metrics-server appears as well if it was installed as an add-on

part_2_verify_kubectl.png
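
Managed node groups also label the nodes they register, which makes it easy to confirm which group and instance type each node came from; the label names below are the ones EKS managed node groups and Kubernetes apply:

# Show the owning node group and instance type as extra columns
kubectl get nodes -L eks.amazonaws.com/nodegroup,node.kubernetes.io/instance-type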

4.2 What AWS resources were created behind the scenes?

Now let’s satisfy curiosity and confirm what AWS created inside your account.

(1) Auto Scaling Group

→ Open EKS Service

→ Select EKS cluster demo-eks-cluster

→ Click Compute tab

→ Select node group eks-mng-general

part_2_verify_node_group.png

→ In Details, click the Auto Scaling Group

part_2_verify_asg.png

Inside the ASG page, you’ll find:

  • Desired / Min / Max configuration
  • EC2 instances in InService
  • Launch Template reference
  • Security Groups

part_2_verify_asg_detail.png
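
The same details are available without clicking through the Console; the node group API response includes the name of the ASG it owns, which you can then describe:

# The node group response lists the Auto Scaling Group it manages
aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.resources.autoScalingGroups[].name" \
  --output text

# Describe that ASG (substitute the name returned above)
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names <asg-name-from-previous-command>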

(2) Launch Template

From the ASG page:

→ Click the Launch template link.

You’ll see:

  • AMI ID
  • Instance type
  • Security groups attached
  • User data/bootstrap wiring (partly hidden, but it’s there)

part_2_verify_launch_template.png

(3) Security Group for worker nodes

From the ASG details page:

→ Click the Security group IDs.

Review inbound/outbound rules applied to worker nodes.

Common Pitfalls

1) Node group is Active, but kubectl get nodes shows nothing

Likely causes:

  • The node group is using the wrong IAM role (not EKSNodeRole)
  • Node IAM role is missing required policies
  • Wrong subnet selection or private subnet route tables are incorrect
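
A good first check for this symptom is the node group's own health report, which EKS populates when nodes fail to join; a minimal query using the names from this guide:

# Join-blocking problems usually show up here as health issues
aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.health.issues"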

2) Instances keep launching and terminating in the ASG

Likely causes:

  • Instance type capacity shortage → try a more common type (t3.large, m5.large, etc.)
  • Subnet/AZ constraints → expand to more AZs/subnets
  • EC2 quota limits → request quota increase

3) Pods stuck in Pending

Likely causes:

  • Insufficient node resources (CPU/memory) → choose a larger instance type
  • Taints/labels preventing scheduling → remove taints or adjust selectors
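
The scheduler records why a Pod cannot be placed, so checking the Pod's events usually tells a resource shortage apart from a taint mismatch (the pod name is a placeholder):

# The Events section explains Pending, e.g. "Insufficient cpu" or an untolerated taint
kubectl describe pod <pending-pod-name>

# Cluster-wide events, most recent last
kubectl get events --sort-by=.lastTimestamp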

4) ImagePullBackOff / ErrImagePull

Likely causes:

  • Private subnets have no NAT gateway, or routes are wrong
  • DNS resolution is broken → check VPC settings (DNS resolution and DNS hostnames)

Summary

In Part 2, we:

  • Added production-style worker nodes (private subnets) so workloads finally have somewhere to run
  • Clearly separated Cluster Role vs Node Role
  • Covered the IAM → Kubernetes authentication story
  • Explored what a Managed Node Group creates behind the scenes

Next, in Part 3, we’ll go deeper into EKS networking: VPC CNI, ENIs, Pod IP allocation, and traffic flow debugging.
