EKS Networking Explained: Why am I running out of IPs? (Part 1)

This is a two-part series: Part 1 explains WHY IP exhaustion happens, and Part 2 covers solutions and prevention.

It's a casual morning. Your staging environment is working perfectly, and you've just deployed 5 new microservices to test a feature. Then suddenly, pods are stuck in the 'Pending' state. You describe a pod and see:

0/3 nodes are available: 3 too many pods

But you only have 20 pods total and 3 m5.large nodes. What gives?

Welcome to EKS IP exhaustion: one of the most confusing problems for beginners. You have plenty of CPU and memory available, but Kubernetes refuses to schedule pods. The culprit? IP addresses.

In this two-part series, we'll demystify EKS networking completely. Part 1 (this post) helps you understand WHY this happens and how to diagnose it. Part 2 covers solutions and prevention.

The Apartment Building Analogy

Let's make this sink in instantly with an easy analogy. Think of your EKS cluster as a city with apartment buildings:

  • Nodes = Apartment Buildings
    Each building (EC2 instance) can hold multiple tenants and has specific infrastructure capacity.

  • Pods = Apartments/Units
    Where your containerized applications actually live.

  • IP Addresses = Mailing Addresses
    Each apartment needs its own unique address to receive mail (network traffic).

  • ENIs = Mailbox Clusters
    Only a limited number of mailbox clusters can be installed per building (Elastic Network Interfaces).

  • VPC CIDR = City's Zip Code Range
    Finite pool of available addresses for the entire city.

The problem in simpler terms

Your city (VPC) only has so many mailing addresses. Each building (node) can only install a limited number of mailbox clusters (ENIs), and each cluster serves a specific number of apartments (IPs per ENI).

Here is the kicker: Even if your building has empty apartments (available CPU/Memory), you can't rent them out if you've run out of mailboxes! That's exactly what happens with EKS.

Key Insight: In EKS, IP address limits often become the bottleneck before CPU or memory. This is counterintuitive for beginners.
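
You can see this imbalance on your own cluster. The command below is a minimal sketch using the standard node status fields; it prints each node's allocatable CPU, memory, and pod count side by side, and on EKS the PODS column is often the smallest limit:

kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory,PODS:.status.allocatable.pods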

How EKS Actually Assigns IPs

Let's peek behind the curtain and see what really happens when you create a pod.

The Pod Creation Flow

  1. You create a pod (via a YAML manifest or kubectl apply).
  2. Kubernetes scheduler finds a node with available capacity.
  3. VPC CNI plugin needs to assign the pod an IP from your VPC.
  4. VPC CNI checks: do I have a free IP on this node?
     • YES: Assign an IP immediately - the pod starts running.
     • NO: Check whether we can attach another ENI.
       • Under the ENI limit: Request a new ENI from AWS - assign an IP.
       • At the ENI limit: The pod stays Pending forever.
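
If you want to watch this flow from the AWS side, you can list the ENIs attached to a worker node and how many IPs each one holds. This is a sketch: the instance ID is a hypothetical placeholder (substitute one of your node's instance IDs), and the query assumes the standard describe-network-interfaces output shape:

# i-0123456789abcdef0 is a placeholder instance ID
aws ec2 describe-network-interfaces \
  --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
  --query 'NetworkInterfaces[*].{ENI:NetworkInterfaceId,IPCount:length(PrivateIpAddresses)}' \
  --output table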

What is VPC CNI?

The VPC Container Network Interface (CNI) is AWS's networking plugin for EKS. Unlike other Kubernetes CNI plugins that use overlay networks, VPC CNI gives pods real IP addresses from your VPC subnets.
Why? So your pods can communicate directly with other AWS resources without network translation. It's operationally brilliant, but it comes with IP limits.

Technical detail: VPC CNI uses 'secondary IP mode' by default. Each pod gets a secondary IP from an ENI attached to the node, while the node itself uses the ENI's primary IP.
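
You can see the primary/secondary split for yourself by inspecting a single ENI. The ENI ID below is a hypothetical placeholder; the Primary flag comes straight from the EC2 API, and the ENI's own primary address is the one row showing Primary: true:

# eni-0123456789abcdef0 is a placeholder ENI ID
aws ec2 describe-network-interfaces \
  --network-interface-ids eni-0123456789abcdef0 \
  --query 'NetworkInterfaces[0].PrivateIpAddresses[*].{IP:PrivateIpAddress,Primary:Primary}'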

The math behind it all

Time to get mathematical. Let's calculate exactly how many pods you can run on a node.

The formula:

Max pods = (number of ENIs x (IPs per ENI - 1)) + 2

Let's break it down:

  • Number of ENIs: Max network interfaces for your instance type.

  • IPs per ENI: Max IPs each interface supports.

  • Why -1?: One IP per ENI is the ENI's primary address.

  • Why +2?: Two system pods, kube-proxy and the VPC CNI's aws-node, run in host network mode (they don't consume pod IPs).

Quick Example: m5.large instance type

Specification:

  • Max ENIs: 3

  • IPs per ENI: 10

Common wrong calculation: 3 x 10 = 30 pods

Correct calculation using our formula:
(3 x (10-1) + 2)
= (3 x 9) + 2
= 27 + 2 ===> 29 pods (max)
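
If you'd rather script this, here's a tiny shell helper for the same formula (a minimal sketch; you supply the ENI and IPs-per-ENI limits for your instance type, which you can look up in the EC2 documentation):

# Usage: max_pods <max-enis> <ips-per-eni>
max_pods() {
  local enis=$1 ips=$2
  echo $(( enis * (ips - 1) + 2 ))
}

max_pods 3 10   # m5.large: prints 29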

Troubleshooting: Is this the error I am facing?

Step 1: Check pod status

kubectl describe pod <pod-name>

Look for these in events:

  • "0/X nodes available: X too many pods"
  • "Unable to allocate IP address"
  • "InsufficientFreeAddressesInSubnet"

Step 2: Check node capacity

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.capacity.pods

Step 3: Count pods per node

kubectl get pods -A -o wide --no-headers | awk '{print $8}' | sort | uniq -c

Step 4: Check VPC CNI logs

kubectl logs -n kube-system -l k8s-app=aws-node | grep NetworkInterfaceLimitExceeded

Step 5: Check subnet availability

aws ec2 describe-subnets --subnet-ids <id-of-your-subnet> --query 'Subnets[0].AvailableIpAddressCount'

If fewer than 100 IPs are available, you're running low.
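
To check all of your cluster's subnets in one go, you can filter on the cluster tag. This is a sketch: the kubernetes.io/cluster/my-cluster tag key and the cluster name 'my-cluster' are assumptions about how your subnets are tagged, so adjust the filter for your setup:

# 'my-cluster' is a placeholder cluster name
aws ec2 describe-subnets \
  --filters "Name=tag-key,Values=kubernetes.io/cluster/my-cluster" \
  --query 'Subnets[*].{Subnet:SubnetId,FreeIPs:AvailableIpAddressCount}' \
  --output table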

What's next: Part 2

You now understand WHY this happens and HOW to diagnose it.

In part 2, we'll cover:

  • Quick fixes (10-30 minutes).

  • Prefix delegation (29 ---> 110 pods per node).

  • Advanced solutions and prevention.

  • The cost connection (how this reveals wasteful spending).

Coming next: Part 2, 'Solving EKS IP exhaustion', will show you exactly how to fix this and prevent it forever.

Happy clustering...
