iapilgrim
Understanding AKS Networking: Underlay Network

If you’ve ever tried to curl a Kubernetes Service IP from a VM and it just… hangs — this guide is for you.

We’ll break down:

  • AKS network design
  • CIDR layout (VNet, Subnet, Service CIDR, Pod CIDR)
  • Why ClusterIP fails from a VM
  • Why NodePort works
  • Step-by-step packet flow
  • Full Azure CLI setup

All tested on Azure Kubernetes Service (AKS) in Microsoft Azure.


🧱 1️⃣ Network Design Overview

We’ll use this lab topology:

  • VNet: 10.0.0.0/16
  • AKS Subnet: 10.0.1.0/24
  • VM Subnet: 10.0.2.0/24
  • Service CIDR: 10.240.0.0/16
  • Underlay mode (Azure CNI)

🗺️ Architecture Diagram (PlantUML)


🧠 Understanding the CIDRs

| CIDR | Purpose |
|---|---|
| 10.0.0.0/16 | Azure VNet |
| 10.0.1.0/24 | AKS nodes |
| 10.0.2.0/24 | Test VM |
| 10.240.0.0/16 | Kubernetes Services (virtual) |
| 192.168.0.0/16 | Overlay Pods (only if overlay mode is enabled) |

Critical concept:

Service CIDR is NOT part of Azure VNet routing.


⚙️ 2️⃣ Full Azure CLI Setup

Variables

```bash
LOCATION=eastus2
RG=aks-networking-lab
VNET_NAME=aks-vnet
UNDERLAY_SUBNET=aks-underlay-subnet
VM_SUBNET=vm-subnet
AKS_NAME=aks-underlay
```

Create Resource Group

```bash
az group create \
  --name $RG \
  --location $LOCATION
```

Create VNet + AKS Subnet

```bash
az network vnet create \
  --resource-group $RG \
  --name $VNET_NAME \
  --address-prefix 10.0.0.0/16 \
  --subnet-name $UNDERLAY_SUBNET \
  --subnet-prefix 10.0.1.0/24
```

Create VM Subnet

```bash
az network vnet subnet create \
  --resource-group $RG \
  --vnet-name $VNET_NAME \
  --name $VM_SUBNET \
  --address-prefix 10.0.2.0/24
```

Get Subnet ID

```bash
SUBNET_ID=$(az network vnet subnet show \
  --resource-group $RG \
  --vnet-name $VNET_NAME \
  --name $UNDERLAY_SUBNET \
  --query id -o tsv)
```

Create AKS Cluster

```bash
az aks create \
  --resource-group $RG \
  --name $AKS_NAME \
  --network-plugin azure \
  --vnet-subnet-id $SUBNET_ID \
  --service-cidr 10.240.0.0/16 \
  --dns-service-ip 10.240.0.10 \
  --node-count 2 \
  --generate-ssh-keys
```

Connect to Cluster

```bash
az aks get-credentials \
  --resource-group $RG \
  --name $AKS_NAME
```

🚀 3️⃣ Deploy Test Application

```bash
kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=2
```

Expose as ClusterIP:

```bash
kubectl expose deployment nginx \
  --name nginx-svc \
  --port 80 \
  --type ClusterIP
```
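The imperative `kubectl expose` above is roughly equivalent to this declarative manifest (a sketch; the selector assumes the default `app: nginx` label that `kubectl create deployment` sets):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: ClusterIP
  selector:
    app: nginx          # label set by `kubectl create deployment nginx`
  ports:
    - port: 80
      targetPort: 80
```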

Check:

```bash
kubectl get svc
```

Example:

```
nginx-svc   ClusterIP   10.240.225.54   80/TCP
```

That IP comes from the Service CIDR.
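You can confirm that the assigned address sits inside the `--service-cidr` we passed to `az aks create` (your cluster will almost certainly hand out a different IP):

```shell
# Check that the example Service IP from above is inside the Service CIDR
python3 -c 'import ipaddress; print(ipaddress.ip_address("10.240.225.54") in ipaddress.ip_network("10.240.0.0/16"))'
```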


🔥 4️⃣ Packet Flow: ClusterIP (Why VM Access Fails)

If you try this from the VM:

```bash
curl 10.240.225.54
```

It hangs.

Why?

Because Azure routing checks:

```
Is 10.240.0.0/16 in the VNet?
No.
→ Drop packet.
```

The packet never reaches the node.
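Azure's decision can be modeled in a few lines: the fabric only forwards destinations that fall inside a known VNet (or peered/routed) prefix. A toy version of that lookup, using the lab's addresses:

```shell
# Toy model of the Azure fabric's routing decision for two destinations
python3 - <<'EOF'
import ipaddress
vnet = ipaddress.ip_network("10.0.0.0/16")
for dst in ("10.0.1.4", "10.240.225.54"):       # node IP vs Service IP
    ok = ipaddress.ip_address(dst) in vnet
    print(f"{dst}: {'forwarded to node' if ok else 'dropped (unknown prefix)'}")
EOF
```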


🧭 Packet Flow Diagram


🧪 5️⃣ Convert to NodePort

```bash
kubectl patch svc nginx-svc \
  -p '{"spec":{"type":"NodePort"}}'
```

Check:

```bash
kubectl get svc
```

Example:

```
nginx-svc   NodePort   10.240.225.54   80:31598/TCP
```

✅ 6️⃣ Correct Way to Test from VM

First get node IP:

```bash
kubectl get nodes -o wide
```

Example output:

```
NAME                                STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-42091994-vmss000000   Ready    <none>   34m   v1.33.6   10.0.1.33     <none>        Ubuntu 22.04.5 LTS   5.15.0-1103-azure   containerd://1.7.30-2
aks-nodepool1-42091994-vmss000001   Ready    <none>   34m   v1.33.6   10.0.1.4      <none>        Ubuntu 22.04.5 LTS   5.15.0-1103-azure   containerd://1.7.30-2
```

Then:

```bash
curl http://10.0.1.4:31598
```

Example:

```
azureuser@test-vm:~$ curl -s 10.0.1.33:31598 | grep -i "welcome"
<title>Welcome to nginx!</title>
<h1>Welcome to nginx!</h1>
```

Flow:

  1. VM → Node IP
  2. Node receives traffic
  3. kube-proxy matches NodePort rule
  4. DNAT to Pod IP
  5. Response returned
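The five steps above can be sketched as a toy DNAT lookup (the pod endpoint IPs here are illustrative, not taken from the lab output):

```shell
# Toy sketch of kube-proxy's NodePort DNAT (step 4); endpoint IPs are made up
python3 - <<'EOF'
nodeport_endpoints = {31598: [("10.0.1.10", 80), ("10.0.1.25", 80)]}  # NodePort -> pod endpoints
dst_port = 31598
pod_ip, pod_port = nodeport_endpoints[dst_port][0]  # real kube-proxy load-balances; we just pick the first
print(f"DNAT node:{dst_port} -> {pod_ip}:{pod_port}")
EOF
```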

🧠 Deep Technical Breakdown

When the packet hits the node, kube-proxy's iptables rules take over. The relevant NAT chains look like:

```
KUBE-NODEPORTS
KUBE-SERVICES
KUBE-SEP-XXXX
```

DNAT example:

```
10.0.1.4:31598 → 10.0.1.10:80
```

ClusterIP works from inside the cluster (Pods and nodes) because:

  • Packet reaches node first
  • kube-proxy rewrites destination

From the VM:

  • Packet never reaches node
  • Azure routing drops it

🎯 Key Takeaways

  • ClusterIP = virtual internal Kubernetes IP
  • NodePort = node listens on real VNet IP
  • Service CIDR must NOT overlap with VNet
  • Azure only routes VNet CIDRs
  • kube-proxy handles Service IP translation

🏁 Final Mental Model

Azure handles:

```
10.0.0.0/16
```

Kubernetes handles:

```
10.240.0.0/16
```

Different routing domains.
