Phu Hoang

Amazon EKS From The Ground Up - Part 2: Worker Nodes with AWS Managed Nodes

Introduction

In Part 1, we finished building the EKS Control Plane and the infrastructure around it:

  • VPC, Subnets, NAT
  • Kubernetes API Server
  • An IAM role with enough permissions for the Control Plane
  • kubectl already connected to the cluster

But if you run:

kubectl get nodes

you’ll see… nothing.

That’s the real state of the cluster we created: it has a brain (Control Plane), but no hands and feet (Worker Nodes). Without compute capacity, the Kubernetes Scheduler can only stare at Pods stuck in the Pending state.

In Part 2, we’ll “grow limbs” for your EKS cluster by deploying Worker Nodes using AWS Managed Nodes - for most teams, the most practical balance between control and operational effort.

This post won’t stop at “click and it works.” We’ll also explore what AWS quietly builds for you in a Managed Node Group, and contrast it with the Self-managed way so you understand the fundamentals and can debug with confidence.

📝A Quick Primer on Worker Nodes

What is a Worker Node, really?

In the most accurate - and simplest - definition:

A Worker Node is an EC2 instance configured to run Kubernetes workloads (Pods).

At minimum, every worker node runs:

  • kubelet — the agent that talks to the Kubernetes API Server
  • a container runtime — typically containerd
  • kube-proxy — manages service/network rules
  • AWS VPC CNI — the plugin that allocates Pod IPs from your VPC
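
If you want to see these components on a live node once Step 4 is done, a couple of kubectl commands are enough; the node name below is a placeholder to replace with one of your own nodes:

# kube-proxy and aws-node (the VPC CNI) run as DaemonSet pods on every worker node
kubectl get pods -n kube-system -o wide

# kubelet and container runtime versions are reported in the node details
kubectl describe node <your-node-name> | grep -i "container runtime"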

Following production best practices, EKS worker nodes usually live inside private subnets without public IPs. That significantly reduces exposure: workloads are not directly reachable from the Internet, and outbound connectivity is handled through a NAT Gateway.

Worker Node Deployment Models

AWS offers three main ways to run compute for EKS, each with a different balance of control, operational cost, and “serverless-ness”:

Criteria                | Managed Nodes                         | Self-managed Nodes                   | AWS Fargate
Infrastructure          | EC2 Instances (partly managed by AWS) | EC2 Instances (fully managed by you) | Serverless (no nodes to manage)
Operational cost (OpEx) | Low                                   | High                                 | Very low
Control                 | Medium                                | Highest                              | Lowest
Billing                 | Per EC2 instance                      | Per EC2 instance                     | Per Pod (CPU/Mem)

Fargate hides almost everything—including the “node layer.” That’s convenient, but if your goal is to understand EKS fundamentals, the interesting engineering happens in Managed Nodes vs Self-managed Nodes.

Criteria                       | Managed Nodes (Managed Node Group)                   | Self-managed Nodes
Node creation                  | EKS Console or CLI command creates a Node Group      | EC2 Launch Template + ASG + User Data (bootstrap)
ASG management                 | AWS manages the Auto Scaling Group lifecycle         | You manage the ASG entirely
Cluster join                   | AWS automatically handles the wiring and credentials | You must provide user data calling /etc/eks/bootstrap.sh
Node auth mapping (IAM → RBAC) | AWS typically maps the Node Role automatically       | You must manually update aws-auth to map the Node Role
Upgrades/updates               | Built-in, managed rolling update workflows           | You design and manage drain / replace strategies
Debug level                    | Fewer common traps, higher abstraction               | More control, but more responsibility for low-level configuration

What AWS does for you in a Managed Node Group

When you click Create Node Group, AWS typically handles a long checklist that Self-managed nodes would require you to build manually:

  1. Creates an Auto Scaling Group
  2. Picks an EKS-Optimized AMI compatible with your cluster version
  3. Creates/manages a Launch Template (or uses one you provide)
  4. Attaches an IAM Instance Profile using your Node IAM Role
  5. Injects the bootstrap configuration so nodes can join the cluster
  6. Automates the node registration path
  7. Provides a rolling update workflow for node group upgrades
  8. Typically handles node role mapping into the cluster auth mechanism

Recommendation: Use Managed Nodes for most production setups to reduce operational overhead. Choose Self-managed only when you have very specific requirements (custom OS hardening, special bootstrap, deep control of the node lifecycle).

Cluster IAM Role vs Node IAM Role

This is one of the most common points of confusion, so let’s make it crystal clear.

Cluster IAM Role

  • Used by the EKS Control Plane
  • Allows the Control Plane to manage ENIs and interact with your VPC resources

This role is not meant for workloads.

Node IAM Role

Worker nodes need a Node IAM Role to:

  • Join the cluster
  • Allow the VPC CNI to attach ENIs and allocate Pod IPs
  • Pull images from ECR
  • Access other required AWS APIs (later: secrets, parameters, logs, etc.)

Your worker nodes won’t become Ready without (at least) these managed policies:

Policy                             | Purpose
AmazonEKSWorkerNodePolicy          | Join cluster, talk to the API Server
AmazonEKS_CNI_Policy               | Attach ENIs, allocate Pod IPs
AmazonEC2ContainerRegistryReadOnly | Pull images from ECR

AWS IAM & Kubernetes Authentication

Access to an EKS cluster is a combination of two layers: AWS IAM & Kubernetes RBAC.

IAM = Authentication

IAM answers: “Who are you?”

When a principal (an EC2 node or a human user) calls the Kubernetes API Server, EKS uses IAM authentication to verify:

  • Which IAM principal (Role/User) the request comes from
  • Whether the request has a valid SigV4 signature

✅ This is why worker nodes must have a proper Node IAM Role attached.

Kubernetes RBAC = Authorization

RBAC answers: “What are you allowed to do?”

Even if IAM authentication succeeds, the API call can still fail with Forbidden if RBAC doesn’t grant the required permissions.

Bridging the two worlds: mapping IAM → Kubernetes identity

After IAM authentication, EKS maps IAM identity into Kubernetes users/groups so RBAC can evaluate permissions. Two common mechanisms exist:

  • aws-auth ConfigMap (classic, still widely used), example:

    mapRoles: |
      - rolearn: arn:aws:iam::<account-id>:role/EKSNodeRole
        username: system:node:{{EC2PrivateDNSName}}
        groups:
          - system:bootstrappers
          - system:nodes
    
  • EKS Access Entries / Access Policies (the newer mechanism, configurable via the Console, CLI, or API)

For this article:

  • Nodes must be mapped into groups like system:bootstrappers and system:nodes
  • Humans/admins are commonly mapped to system:masters or granted an equivalent Access Policy
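
If you're curious which mechanism your cluster uses, both can be inspected directly once kubectl is connected (the cluster name is the one from Part 1):

# Classic mapping: the aws-auth ConfigMap in kube-system (may not exist on clusters using only Access Entries)
kubectl get configmap aws-auth -n kube-system -o yaml

# Newer mechanism: EKS Access Entries, listed via the AWS CLI
aws eks list-access-entries --cluster-name demo-eks-cluster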

Hands-on Time

The AWS architecture after completing the steps below:

part_2_aws_architecture.png

Step 0 — Preparation

Make sure all resources from Part 1 are created correctly and your cluster is ACTIVE.

Step 1 — Create the IAM Role for Worker Nodes

Open the AWS Console:

→ Go to IAM

→ Choose Roles → Create role

→ Configure Trusted entity:

  • Trusted entity type: AWS service
  • Service or use case: EC2
  • Use case: EC2

    part_2_IAM_Role_create_step_1.png

→ Click Next

→ Attach policies:

  • AmazonEKSWorkerNodePolicy
  • AmazonEKS_CNI_Policy
  • AmazonEC2ContainerRegistryReadOnly

→ Role name: EKSNodeRole

→ Click Create role

part_2_IAM_Role_create_step_3.png
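
If you prefer the CLI over the Console, a roughly equivalent sketch looks like this (the trust policy file name is just an example):

# trust-policy.json: allow EC2 instances to assume this role
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam create-role \
  --role-name EKSNodeRole \
  --assume-role-policy-document file://trust-policy.json

# Attach the three managed policies worker nodes need
for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy AmazonEC2ContainerRegistryReadOnly; do
  aws iam attach-role-policy \
    --role-name EKSNodeRole \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done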

Step 2 — Create a Managed Node Group

→ Open the EKS service

→ Select cluster demo-eks-cluster

→ Go to Compute → Add node group

part_2_eks_compute_add.png

Configure node group

  • Name: eks-mng-general
  • Node IAM role: EKSNodeRole

    part_2_eks_add_node_group_step_1.png

→ Click Next.

Configure compute & scaling

  • AMI type: EKS optimized (Amazon Linux / Bottlerocket)
  • Capacity type: On-Demand
  • Instance type: t3.medium
  • Disk size: 20 GiB
  • Scaling:
    • Desired: 2
    • Min: 1
    • Max: 3

→ Keep other settings as default.

part_2_eks_add_node_group_step_2.png

→ Click Next.

Configure networking

Select only the two private subnets.

This is critical: subnet selection here determines where your worker nodes live. Private subnets are ideal for production worker nodes because they don’t expose instances to the Internet.

part_2_eks_add_node_group_step_3.png

→ Click Next → Create.

Wait until the node group status becomes Active.

part_2_eks_node_group_active.png
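
For reference, a roughly equivalent CLI call for this step is shown below; the subnet IDs and account ID are placeholders for your own values:

aws eks create-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --node-role arn:aws:iam::<account-id>:role/EKSNodeRole \
  --subnets subnet-aaaa1111 subnet-bbbb2222 \
  --instance-types t3.medium \
  --disk-size 20 \
  --capacity-type ON_DEMAND \
  --scaling-config minSize=1,maxSize=3,desiredSize=2

# Block until the node group reaches Active
aws eks wait nodegroup-active \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general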

Step 3 — Join the Cluster (AWS handles it)

You don’t need to manually configure bootstrap steps for a Managed Node Group - but understanding the join flow is what makes you effective at troubleshooting.

3.1 Node join flow

When you create a Managed Node Group, AWS launches EC2 worker nodes. Each instance gets an Instance Profile containing your Node IAM Role (EKSNodeRole). The join flow looks like this:

  1. The EC2 instance boots using an EKS-Optimized AMI
  2. Bootstrap config provides kubelet with:
    • cluster name
    • API endpoint
    • cluster CA certificate
  3. kubelet calls the Kubernetes API Server to register the node
  4. EKS performs IAM authentication and identifies the IAM Role from the instance profile
  5. EKS maps IAM identity → Kubernetes identity via aws-auth / Access Entries
  6. If mapping is valid (node is in system:nodes), the node becomes Ready

In short: nodes don’t “join by Kubernetes magic.” They join because:

IAM authentication proves identity + Kubernetes group mapping allows the node role to function.
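
One practical way to see part of this wiring from the AWS side: managed node groups tag their EC2 instances with the node group name, so you can list each worker and the instance profile it carries (it should reference EKSNodeRole) with a query like this sketch:

# List the node group's instances and the instance profile attached to each one
aws ec2 describe-instances \
  --filters "Name=tag:eks:nodegroup-name,Values=eks-mng-general" \
  --query "Reservations[].Instances[].{Id:InstanceId,Profile:IamInstanceProfile.Arn,State:State.Name}" \
  --output table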

3.2 What Managed Node Group eliminates compared to Self-managed

With Self-managed nodes, you must build the join path yourself:

  • Create Launch Template (AMI, instance profile, user data)
  • Ensure bootstrap via /etc/eks/bootstrap.sh <cluster-name>
  • Create ASG and subnet placement
  • Manually update aws-auth to map the node role into:
    • system:bootstrappers
    • system:nodes

Managed Node Groups remove most of this plumbing.
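
For contrast, here is a minimal user-data sketch for a self-managed Amazon Linux 2 node; the endpoint and CA values are placeholders you would template in, and newer AL2023 AMIs use nodeadm instead of this script:

#!/bin/bash
# Self-managed bootstrap: register this instance's kubelet with the cluster.
# Managed node groups inject equivalent configuration for you.
set -o errexit
/etc/eks/bootstrap.sh demo-eks-cluster \
  --apiserver-endpoint "https://<your-cluster-endpoint>" \
  --b64-cluster-ca "<base64-encoded-cluster-ca>"

Even with correct user data, a self-managed node only becomes Ready after its role is mapped in aws-auth, which is exactly the step Managed Node Groups take off your plate.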

Step 4 — Verify

This is the moment your EKS cluster finally starts to feel alive.

4.1 Verify with kubectl

→ Open a terminal and run:

kubectl get nodes -o wide

→ Then check system pods:

kubectl get pods -n kube-system


Expected results:

  • You should see two nodes (matching the desired size of 2), both in the Ready state
  • coredns, kube-proxy, and aws-node (the VPC CNI) should be Running; metrics-server appears as well if it was installed as an add-on

part_2_verify_kubectl.png
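
Managed node groups also label the nodes they register, which makes it easy to confirm which group and instance type each node came from; the label names below are the ones EKS managed node groups and Kubernetes apply:

# Show the owning node group and instance type as extra columns
kubectl get nodes -L eks.amazonaws.com/nodegroup,node.kubernetes.io/instance-type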

4.2 What AWS resources were created behind the scenes?

Now let’s satisfy curiosity and confirm what AWS created inside your account.

(1) Auto Scaling Group

→ Open EKS Service

→ Select EKS cluster demo-eks-cluster

→ Click Compute tab

→ Select node group eks-mng-general

part_2_verify_node_group.png

→ In Details, click the Auto Scaling Group

part_2_verify_asg.png

Inside the ASG page, you’ll find:

  • Desired / Min / Max configuration
  • EC2 instances in InService
  • Launch Template reference
  • Security Groups

part_2_verify_asg_detail.png
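
The same details are available without clicking through the Console; the node group API response includes the name of the ASG it owns, which you can then describe:

# The node group response lists the Auto Scaling Group it manages
aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.resources.autoScalingGroups[].name" \
  --output text

# Describe that ASG (substitute the name returned above)
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names <asg-name-from-previous-command>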

(2) Launch Template

From the ASG page:

→ Click the Launch template link.

You’ll see:

  • AMI ID
  • Instance type
  • Security groups attached
  • User data/bootstrap wiring (partly hidden, but it’s there)

part_2_verify_launch_template.png

(3) Security Group for worker nodes

From the ASG details page:

→ Click the Security group IDs.

Review inbound/outbound rules applied to worker nodes.

Common Pitfalls

1) Node group is Active, but kubectl get nodes shows nothing

Likely causes:

  • The node group is using the wrong IAM role (not EKSNodeRole)
  • Node IAM role is missing required policies
  • Wrong subnet selection or private subnet route tables are incorrect
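
A good first check for this symptom is the node group's own health report, which EKS populates when nodes fail to join; a minimal query using the names from this guide:

# Join-blocking problems usually show up here as health issues
aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.health.issues"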

2) Instances keep launching and terminating in the ASG

Likely causes:

  • Instance type capacity shortage → try a more common type (t3.large, m5.large, etc.)
  • Subnet/AZ constraints → expand to more AZs/subnets
  • EC2 quota limits → request quota increase

3) Pods stuck in Pending

Likely causes:

  • Insufficient node resources (CPU/memory) → choose a larger instance type
  • Taints/labels preventing scheduling → remove taints or adjust selectors
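
The scheduler records why a Pod cannot be placed, so checking the Pod's events usually tells a resource shortage apart from a taint mismatch (the pod name is a placeholder):

# The Events section explains Pending, e.g. "Insufficient cpu" or an untolerated taint
kubectl describe pod <pending-pod-name>

# Cluster-wide events, most recent last
kubectl get events --sort-by=.lastTimestamp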

4) ImagePullBackOff / ErrImagePull

Likely causes:

  • Private subnets have no NAT gateway, or routes are wrong
  • DNS resolution is broken → check VPC settings (DNS resolution and DNS hostnames)

Summary

In Part 2, we:

  • Added production-style worker nodes (private subnets) so workloads finally have somewhere to run
  • Clearly separated Cluster Role vs Node Role
  • Covered the IAM → Kubernetes authentication story
  • Explored what a Managed Node Group creates behind the scenes

Next, in Part 3, we’ll go deeper into EKS networking: VPC CNI, ENIs, Pod IP allocation, and traffic flow debugging.
