Test Your Knowledge
Ready to test your understanding? Take the Networking Quiz — 34 questions covering everything from ARP to service mesh.
Part 1: Kubernetes Networking
Q: What are the three fundamental requirements of Kubernetes networking?
Kubernetes has a simple networking model with three fundamental requirements:
- Pod IP addresses: Every pod gets its own IP address
- Pod-to-Pod communication: All pods can communicate with all other pods without NAT
- Node-to-Pod communication: All cluster nodes must be able to communicate with all pods without NAT
This model is implemented by CNI (Container Network Interface) plugins.
Q: What are the basic connectivity requirements to run k8s?
- Static IP Addresses: All nodes must be assigned static IP addresses or DHCP reservations. Using dynamic IPs that change can break cluster communication and etcd quorum.
- Full L2/L3 Connectivity: Every node must have full network connectivity to every other node in the cluster. This can be over a private or public network, provided there is no NAT between nodes.
- Unique Identifiers: Each node must have a unique hostname, MAC address, and product_uuid (found in /sys/class/dmi/id/product_uuid)
Q: What are the OS requirements to run k8s?
- Disable Swap: Swap must be disabled on all nodes for the kubelet to function correctly.
- Kernel Modules: Ensure br_netfilter and overlay modules are loaded to allow bridged traffic to be processed by iptables.
- Time Synchronization: Highly accurate time sync (e.g., via Chrony or NTP) is required across all nodes to prevent certificate validation failures and etcd instability.
Q: What is CNI?
CNI (Container Network Interface) is the standard for Kubernetes networking plugins. When a pod starts, it's like a new apartment being built. The CNI plugin is like the city planning department that assigns the new apartment an address (IP address), connects it to the street (creates veth pair), and gives the resident a mailbox (network namespace). The pod can now send and receive letters (packets) just like any other apartment in the city!
Q: What are the CNI plugin responsibilities?
- Create network namespace for the pod
- Create veth pair (one end in pod, one in host)
- Assign IP address from IPAM (IP Address Management) — IPAM allocates IPs from a configured CIDR range
- Configure routes so pod can reach other pods and services
- Set up overlay network (if needed) for cross-node communication
Q: What are popular CNI plugins?
| CNI Plugin | Approach | Overlay Protocol | Key Features |
|---|---|---|---|
| Cilium | eBPF-based routing | Geneve, VXLAN, or native | Network policies, observability, service mesh integration |
| Calico | BGP routing | VXLAN or native | Network policies, BGP integration |
| Flannel | Simple overlay | VXLAN | Simple, minimal configuration |
| Weave | Mesh overlay | Custom (sleeve/fastdp) | Automatic mesh networking |
For enterprise and multi-tenant Kubernetes deployments, Cilium and Calico are often preferred due to their robust network policy enforcement, performance benefits (eBPF for Cilium), and integration with BGP for native routing, which are critical for security and scalability.
Q: What makes Cilium special?
Cilium uses eBPF (extended Berkeley Packet Filter) for high-performance networking:
Key features:
- eBPF-based routing: Faster than iptables, no kernel bypass needed
- Network policies: Enforced at the kernel level using eBPF programs
- Observability: Built-in metrics and tracing
- Kube-proxy replacement and service mesh integration: Cilium can replace kube-proxy for Service load balancing and provides sidecar-free service mesh features
Routing modes:
- VXLAN overlay — Cilium's default tunneling protocol. Uses VXLAN encapsulation for cross-node pod communication when pod IPs aren't routable in the underlying network.
- Geneve overlay — Alternative to VXLAN that carries rich metadata in TLV options, which can be used for advanced network policy enforcement.
- Native routing — No overlay encapsulation. Routes pod IPs directly through the underlying network. Requires routable pod IPs and BGP or static routes.
- BGP routing — Uses BGP to advertise pod CIDR routes to network infrastructure. Enables native routing with dynamic route distribution. Works with routers, cloud provider route tables, and other BGP-speaking devices.
Q: What is IPAM (IP Address Management)?
IPAM is the component of CNI plugins that manages IP address allocation. Each CNI plugin includes an IPAM plugin (or uses a standalone one like host-local) that:
- Allocates IP addresses from a configured CIDR range (e.g., 10.244.0.0/16)
- Tracks which IPs are assigned to which pods
- Releases IPs when pods are deleted
- Prevents IP conflicts by ensuring each pod gets a unique IP
Q: How do pods on the same node communicate?
When two pods are on the same node, they communicate through the node's bridge. It's like two apartments in the same building: you write a letter to your neighbor, drop it in the building's mailroom (bridge), and the mailroom immediately delivers it to your neighbor's apartment. No need for the postal service—it's all handled within the building!
Q: How do pods on different nodes communicate?
When pods are on different nodes, the CNI plugin uses an overlay network (VXLAN or Geneve). It's like sending a letter from one building to another across town. You write your letter (original packet) and put it in an inner envelope addressed to your friend's apartment (destination pod IP). The building's mailroom (CNI plugin) puts that inner envelope inside an outer envelope addressed to the destination building (node IP). The postal service (underlay network) delivers the outer envelope to the destination building, where the mailroom there opens it and delivers the inner envelope to your friend's apartment. Your friend never sees the outer envelope—they just receive your letter!
Q: Why do we need Kubernetes Services?
Pods are ephemeral — they can be created, destroyed, and moved. Services provide a stable endpoint for pods. Instead of tracking individual pod IPs (which change constantly), applications use a Service IP that remains constant. kube-proxy maintains the mapping between Service IPs and pod IPs, automatically updating it as pods are created or destroyed.
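As a minimal sketch (the `backend` name, labels, and ports are illustrative), a Service selects pods by label and exposes them behind one stable virtual IP and DNS name:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend            # becomes backend.default.svc.cluster.local in DNS
spec:
  type: ClusterIP          # default type; internal virtual IP only
  selector:
    app: backend           # matches pods labeled app=backend, whatever their current IPs
  ports:
    - name: http
      port: 80             # stable port clients connect to on the Service IP
      targetPort: 8080     # container port on the selected pods
```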
Q: What are the different Service types?
| Service Type | Use Case | How It Works |
|---|---|---|
| ClusterIP | Internal communication | Virtual IP (10.96.0.0/12) that kube-proxy routes to pod IPs |
| NodePort | External access via node IP | Opens a port (30000-32767) on all nodes, routes to ClusterIP |
| LoadBalancer | Cloud provider integration | Creates external load balancer, routes to NodePort |
| ExternalName | External service alias | DNS CNAME to external service |
| Headless | Direct pod access | No ClusterIP, DNS returns pod IPs directly |
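For example, a headless Service is declared exactly like a ClusterIP Service but with `clusterIP: None`, so DNS returns the pod IPs directly (a small sketch; the name is illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-headless
spec:
  clusterIP: None          # headless: no virtual IP, DNS returns the matching pod IPs
  selector:
    app: backend
  ports:
    - port: 8080
```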
Q: How does ClusterIP work?
- When a Service is created, Kubernetes assigns it a ClusterIP (a virtual IP, e.g., 10.96.0.100) from the cluster's service CIDR (typically 10.96.0.0/12)
- kube-proxy on each node creates iptables rules that map the ClusterIP → Pod IPs (based on Endpoints/EndpointSlices)
- When traffic arrives at a node destined for the ClusterIP, kube-proxy's iptables rules perform DNAT (Destination NAT), rewriting the destination IP from the ClusterIP to a selected pod IP (load balanced across available pods)
- If the selected pod is on a different node, the overlay network (VXLAN/Geneve) handles routing the packet to the destination pod's node
Q: What is kube-proxy?
Kube-proxy manages Service-to-Pod connectivity. It is a network proxy that runs on each node and maintains network rules (usually via iptables or IPVS) to map Kubernetes Service virtual IPs to the actual Pod IPs assigned by the CNI. It has three modes:
- iptables mode (default): Creates iptables rules for Service IP → Pod IP mapping. Fast and efficient, but rules can become large with many services.
- ipvs mode: Uses Linux IPVS (IP Virtual Server) for load balancing. Better performance and scalability than iptables for large clusters.
- userspace mode (deprecated): Proxy runs in userspace. Slower and rarely used.
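As a sketch, the mode is selected in the kube-proxy configuration (in kubeadm clusters this typically lives in the `kube-proxy` ConfigMap in `kube-system`; the scheduler value here is illustrative):

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"               # "iptables" is the default; "ipvs" scales better with many Services
ipvs:
  scheduler: "rr"          # round-robin across the endpoints of each Service
```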
Q: Can you show an example of how kube-proxy redirects Service traffic?
```
# Service 10.96.0.100:80 → Pods 10.244.1.5:8080, 10.244.2.7:8080
$ iptables -t nat -L KUBE-SERVICES
Chain KUBE-SERVICES
KUBE-SVC-XXX  tcp  --  anywhere  10.96.0.100  tcp dpt:80

$ iptables -t nat -L KUBE-SVC-XXX
Chain KUBE-SVC-XXX
KUBE-SEP-AAA  all  --  anywhere  anywhere  statistic mode random probability 0.5
KUBE-SEP-BBB  all  --  anywhere  anywhere  # remaining 50%
```
Q: Why do we need Endpoints?
Endpoints solve a fundamental problem in Kubernetes: Services have stable IPs, but Pods have ephemeral IPs that change constantly.
The problem:
- A Service provides a stable virtual IP (e.g., 10.96.0.100) that applications can use
- But pods are ephemeral—they get new IPs every time they start, restart, or move to a different node
- When a pod is created, destroyed, or scaled, its IP changes
- kube-proxy needs to know which actual pod IPs to route traffic to
Without Endpoints:
- kube-proxy would have to constantly query the Kubernetes API to find pod IPs
- This would be inefficient and slow
- There would be no single source of truth for "which pods belong to this Service?"
With Endpoints:
- Kubernetes automatically creates and maintains an Endpoints resource for each Service
- The Endpoints resource lists all current pod IPs that match the Service selector
- kube-proxy watches the Endpoints resource (not individual pods)
- When pods change, the Endpoints resource is updated automatically
- kube-proxy gets notified of changes and updates its routing rules (iptables/IPVS)
Q: What are EndpointSlices?
- A newer, more scalable alternative to Endpoints (introduced in Kubernetes 1.16, GA in 1.21)
- Splits endpoints across multiple slice resources (100 endpoints per slice by default)
- Reduces the size of individual resources, improving performance in large clusters
- Provides better scalability: a Service with 1000 pods creates ~10 EndpointSlices instead of 1 large Endpoints resource
- Includes additional metadata like topology hints (which zone/node pods are in)
Example EndpointSlice:
```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: backend-abc123
  labels:
    kubernetes.io/service-name: backend
addressType: IPv4
ports:
  - name: http
    port: 8080
    protocol: TCP
endpoints:
  - addresses:
      - "10.244.1.5"
    conditions:
      ready: true
    nodeName: node-1
    zone: us-west-1a
  - addresses:
      - "10.244.2.10"
    conditions:
      ready: true
    nodeName: node-2
    zone: us-west-1b
```
How kube-proxy uses them:
- kube-proxy watches Endpoints or EndpointSlices (EndpointSlices preferred in modern clusters)
- When endpoints change (pod created/destroyed), kube-proxy updates iptables or IPVS rules
- Rules map Service IP → Pod IPs, enabling load balancing across pods
- The watch mechanism ensures rules stay synchronized with actual pod state
Why EndpointSlices matter:
- Performance: Smaller resources mean faster API server processing and less network traffic
- Scalability: Can handle services with thousands of pods without creating massive single resources
- Topology awareness: Includes zone/node information for better routing decisions
- Future-proof: Foundation for advanced features like topology-aware routing
Q: What is CoreDNS?
CoreDNS is the default DNS server in Kubernetes. It:
- Runs as a Deployment in the `kube-system` namespace
- Watches Kubernetes Services and Endpoints
- Automatically creates DNS records for all Services
- Resolves service names to Service IPs (ClusterIP)
- Supports custom DNS entries via ConfigMaps
When a pod queries backend.default.svc.cluster.local, CoreDNS returns the Service IP (e.g., 10.96.0.100), which kube-proxy then routes to an actual pod IP.
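A trimmed sketch of a typical default Corefile (stored in the `coredns` ConfigMap in `kube-system`; the exact plugin list varies by distribution):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf   # anything outside the cluster goes to the upstream resolver
        cache 30
        loop
        reload
    }
```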
Q: How does service discovery work with DNS?
Kubernetes provides DNS for services via CoreDNS. Applications can use service names (e.g., backend.default.svc.cluster.local) instead of IP addresses. CoreDNS resolves service names to Service IPs, which kube-proxy then routes to pod IPs.
Q: What is the DNS naming format?
- Format: `<service>.<namespace>.svc.cluster.local`
- Short form: `<service>.<namespace>` or just `<service>` (same namespace)
- Example: `backend.default.svc.cluster.local` → `10.96.0.100`
Q: What is Ingress and how does it relate to Services?
Ingress provides HTTP/HTTPS routing from outside the cluster to Services. Unlike Services (which provide internal cluster networking), Ingress handles external access.
- Ingress Controller: A reverse proxy (e.g., NGINX, Traefik, Envoy) that runs in the cluster and implements Ingress rules
- Ingress Resource: Defines routing rules (host, path → Service)
- Flow: External request → Ingress Controller → Service → Pod
Ingress works on top of Services—it routes external traffic to the appropriate Service, which then routes to pods. For advanced routing (canary, A/B testing, mTLS), service mesh is often used instead of or alongside Ingress.
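A minimal Ingress sketch (the host, class name, and backend Service are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  ingressClassName: nginx          # which Ingress Controller should implement these rules
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: backend      # external request → controller → this Service → pods
                port:
                  number: 80
```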
Q: What is the Gateway API and how is it different from Ingress?
The Gateway API replaces the "one-size-fits-all" Ingress object with three distinct resources, each designed for a specific organizational role:
GatewayClass (Infrastructure Provider): A cluster-scoped resource that defines a specific type of load balancer or proxy implementation (e.g., an AWS NLB, NGINX, or Istio).
Gateway (Cluster Operator): An instantiation of a GatewayClass. It defines the actual entry point where traffic is received, including configuration for specific listeners (ports and protocols like HTTP, HTTPS, TCP, or UDP) and TLS termination.
HTTPRoute / GRPCRoute (App Developer): Protocol-specific Route resources that define how traffic should be routed from a Gateway to backend Services based on hostnames, paths, or headers.
Key Benefits over Ingress:
Built-in Advanced Routing: Natively supports features like header-based matching, traffic splitting (canary rollouts), and request mirroring without requiring custom, non-portable annotations.
Broader Protocol Support: Beyond HTTP/HTTPS, it officially supports gRPC, TCP, UDP, and WebSockets.
Separation of Concerns: Teams can manage their own routing rules (via HTTPRoute) independently from the shared infrastructure (via Gateway), reducing the risk of accidental misconfigurations across a cluster.
Portability: As a standardized specification, configurations are portable between different vendors (e.g., from Envoy Gateway to Traefik) without needing to rewrite complex vendor-specific rules.
Cross-Namespace Routing: Allows a single Gateway to route traffic to Services in different namespaces securely through the use of ReferenceGrant objects.
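A sketch of how the roles split across resources (class, gateway, namespace, and route names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway             # owned by the cluster operator
  namespace: infra
spec:
  gatewayClassName: example-lb     # provided by the infrastructure provider
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All                # app teams in other namespaces may attach routes
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend-route              # owned by the app developer
  namespace: prod
spec:
  parentRefs:
    - name: shared-gateway
      namespace: infra
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: backend
          port: 8080
```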
Q: What are Network Policies?
Network Policies allow you to control traffic between pods using label selectors. They act as pod-level firewalls, allowing or denying traffic based on source pod labels, destination pod labels, and ports. Network Policies are enforced by CNI plugins (Cilium, Calico) at the kernel level, before packets reach the pod.
Q: Can you show an example NetworkPolicy?
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```
This policy says: "Only pods labeled app: frontend can talk to pods labeled app: backend on port 8080."
Q: How are Network Policies implemented?
- Cilium: Uses eBPF and Geneve TLV options for policy enforcement
- Calico: Uses iptables rules and BGP for policy distribution
- Flannel: Does not support Network Policies (needs Calico or Cilium)
NetworkPolicy vs service mesh authorization (multi-tenant isolation)
| Aspect | NetworkPolicy (CNI) | Service mesh auth (sidecar) |
|---|---|---|
| Scope | L3/L4 (IP, port) | L7 (HTTP/gRPC) + identity |
| Enforcer | CNI dataplane (node) | Sidecar proxy (pod) |
| Best for | Blast-radius limits between namespaces/tenants; default-deny | App-level allow/deny, mTLS, per-route rules |
| Debug with | `kubectl get networkpolicies -A`, `cilium monitor`/`calicoctl` | Sidecar logs, `AuthorizationPolicy`/`ServerAuthorization` |
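For the default-deny / blast-radius case, a common sketch is a namespace-wide deny-all ingress policy applied in each tenant namespace (the namespace name is illustrative); workloads then only receive traffic that additional policies explicitly allow:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: tenant-a        # illustrative tenant namespace
spec:
  podSelector: {}            # empty selector = every pod in the namespace
  policyTypes:
    - Ingress                # no ingress rules listed, so all inbound traffic is denied
```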
Q: What is the Kubernetes networking stack?
The Kubernetes networking stack is built in layers, with each layer providing specific functionality:
```
┌─────────────────────────────────────┐
│  Service Mesh (L7)                  │ ← Identity, observability, traffic mgmt
│  (Istio, Linkerd)                   │
├─────────────────────────────────────┤
│  Services (kube-proxy)              │ ← Service discovery, load balancing
│  ClusterIP, NodePort, LoadBalancer  │
├─────────────────────────────────────┤
│  CNI Plugin (L2/L3)                 │ ← Pod networking, overlay networks
│  (Cilium, Calico, Flannel)          │
├─────────────────────────────────────┤
│  Linux Networking Primitives        │ ← Network namespaces, veth, bridges
└─────────────────────────────────────┘
```
Key takeaways:
- CNI plugins provide pod networking using native or overlay networks (VXLAN/Geneve)
- Services abstract pod IPs using virtual IPs and kube-proxy
- Network Policies control traffic between pods
- Service mesh adds Layer 7 capabilities on top of everything
Part 2: Cilium Special
Q: What are the requirements for Cilium native routing mode?
Native routing mode eliminates overlay encapsulation, routing pod IPs directly through the underlying network. This requires:
- Routable pod IPs: Pod IPs (from the pod CIDR, e.g., 10.244.0.0/16) must be routable in your network infrastructure. The underlying network (routers, switches, cloud provider networking) must know how to route traffic to pod IPs.
- BGP or static routes: You need either:
  - BGP: Cilium can use BGP to advertise pod CIDR routes to your network infrastructure (routers, cloud provider route tables)
  - Static routes: Manually configure routes in your network infrastructure pointing pod CIDRs to Kubernetes nodes
- No IP conflicts: Pod IPs must not conflict with existing IPs in your network.
When to use native routing:
- You have control over network infrastructure (on-premises, custom cloud networking)
- You want maximum performance (no encapsulation overhead)
- Your network supports BGP or you can configure static routes
- Pod IPs can be made routable in your network
When to use overlay (Geneve/VXLAN):
- Cloud provider environments where pod IPs aren't routable
- You want simplicity (no BGP/static route configuration)
- Network infrastructure doesn't support routing pod CIDRs
- You need the rich metadata capabilities of Geneve TLV options
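As a sketch of what enabling native routing looks like with Cilium's Helm chart (key names vary between Cilium versions, and the CIDR is illustrative):

```yaml
# values.yaml (sketch; assumes a recent Cilium release)
routingMode: native                  # no VXLAN/Geneve encapsulation
ipv4NativeRoutingCIDR: 10.0.0.0/8    # destinations in this range are routed natively, not masqueraded
autoDirectNodeRoutes: true           # install per-node pod CIDR routes when nodes share an L2 segment
```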
Q: How does Cilium use BGP?
Cilium supports BGP for native routing mode, allowing it to advertise pod CIDR routes to network infrastructure without using overlay encapsulation. Each Cilium node runs a BGP daemon that advertises its pod CIDR to BGP peers (routers, cloud provider route tables), enabling the network infrastructure to learn pod IP routes and route traffic directly to pods without encapsulation overhead.
Use cases for Cilium BGP:
- On-premises deployments: Advertise pod routes to physical routers/switches
- Cloud environments: Integrate with cloud provider route tables (AWS Route Tables, Azure Route Tables, GCP Routes)
- Hybrid cloud: Connect on-premises and cloud networks via BGP
- Large-scale clusters: Native routing performs better than overlay at scale
- Integration with existing BGP infrastructure: Works with existing network equipment
BGP vs overlay in Cilium:
- BGP (native routing): Better performance, no encapsulation overhead, requires BGP-capable infrastructure
- Overlay (Geneve/VXLAN): Works everywhere, simpler setup, adds encapsulation overhead
Configuration example:
Cilium can be configured to use BGP by enabling the BGP control plane and specifying BGP peers (routers or route reflectors). The BGP daemon then advertises pod CIDRs to peers, enabling native routing.
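A sketch using the `CiliumBGPPeeringPolicy` resource (the BGP CRDs have evolved across Cilium releases; the ASNs, labels, and peer address here are illustrative):

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rack0-peering
spec:
  nodeSelector:
    matchLabels:
      rack: rack0                    # nodes this peering policy applies to
  virtualRouters:
    - localASN: 64512                # ASN the Cilium nodes speak BGP as
      exportPodCIDR: true            # advertise each node's pod CIDR to the peers
      neighbors:
        - peerAddress: "10.0.0.1/32" # top-of-rack router
          peerASN: 64512
```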
Q: How does Cilium use Geneve for network policy?
Cilium is a popular CNI plugin that uses Geneve overlay and eBPF for advanced networking. With Cilium and Geneve, when you send a letter, the building's security system (Cilium agent) checks your ID, looks up the security policy, and attaches security stickers to the outer envelope: "From: frontend-workload", "Policy: allow-frontend-to-backend", "Security Clearance: Level 3". When the letter arrives at the destination building, the security guard there reads the stickers, verifies the policy allows this communication, and only then delivers the letter. If the stickers don't match the policy, the letter is rejected!
Part 3: Service Mesh and Workload Identity
Q: What problem does service mesh solve?
As microservices architectures became the norm in the 2010s, developers faced new challenges that infrastructure-layer networking (VLAN, VXLAN, Geneve) couldn't solve. While overlay networks could route packets between hosts using IP addresses, they operated at Layer 2/3 and couldn't help with application-layer concerns.
The fundamental problem: In a microservices world, services are ephemeral—they start, stop, scale, and move constantly. IP addresses change. Network boundaries are fluid. Traditional networking assumptions broke down.
Why this was painful:
Service discovery: You couldn't hardcode IPs because containers/pods get new IPs every time they restart or scale. Every service needed to implement its own service discovery (Consul, etcd, custom solutions), leading to inconsistency across teams. When "backend" had 10 instances, which one should you call? How do you load balance? Infrastructure networking could route packets, but couldn't answer "where is the backend service?"
Security between services: Every team was implementing their own TLS, authentication, and authorization logic. Some services used certificates, others used API keys, some had no security at all. This created security gaps, inconsistent implementations, and maintenance nightmares. When you had 50 microservices, you had 50 different security implementations to maintain.
Observability: When a user request failed, which service was the problem? Was it the frontend? The API gateway? The auth service? The database? There was no way to trace a request as it flowed through multiple services. Each service logged independently, but correlating logs across services was nearly impossible. You couldn't see the "big picture" of how services communicated.
Traffic management: Every service needed to implement retry logic, timeouts, circuit breakers, and load balancing. When backend was slow, frontend would retry—but how many times? With what backoff? What if backend was completely down—should you fail fast or keep retrying? Each team made different decisions, leading to cascading failures and inconsistent behavior.
Zero-trust security: Traditional network security relied on firewalls and network boundaries: "trust everything inside the network, block everything outside." But in microservices, there is no "inside"—services move, IPs change, and the network boundary is meaningless. An attacker who compromised one service could access all services on the same network. You needed identity-based security: "trust based on who you are, not where you are."
The breaking point: Developers were writing the same networking code (retry logic, TLS, metrics, tracing, service discovery) in every microservice. This was expensive, error-prone, and inconsistent. Service mesh emerged around 2016-2018 (with projects like Linkerd and Istio) to solve these problems by moving networking concerns out of application code and into a dedicated infrastructure layer that worked transparently for all services.
Q: What is a service mesh?
A service mesh is a dedicated infrastructure layer that handles service-to-service communication, security, observability, and traffic management for microservices. Unlike VLAN/VXLAN/Geneve which route packets at Layer 2/3 (infrastructure layer) using IP addresses, service mesh routes requests at Layer 7 (application layer) using service names and DNS.
Architecture: A service mesh consists of a control plane (e.g., Istio's istiod) that distributes configuration to sidecar proxies (e.g., Envoy) running alongside each application container. The sidecars handle mTLS, routing, and observability transparently. For detailed architecture diagrams, see the Istio Architecture documentation.
Q: How does service mesh integrate with Kubernetes networking?
Service mesh works on top of Kubernetes networking, adding Layer 7 capabilities. The integration flow:
- App uses the service name (`backend.default.svc.cluster.local` or just `backend`)
- Sidecar resolves it via Kubernetes DNS → Service IP (10.96.0.100)
- Service discovery → Pod IP (10.244.2.10)
- Overlay network (VXLAN/Geneve) routes the packet to the destination pod
- Service mesh adds mTLS, observability, and traffic management
Service mesh leverages Kubernetes service discovery and DNS, then adds identity-based security, request-level observability, and traffic management on top of the existing pod-to-pod networking.
Q: How does service mesh relate to overlay networks?
💡 Key Insight
Service mesh works on top of overlay networks. You still need VXLAN/Geneve to route packets between hosts/containers at the infrastructure layer. Service mesh then adds Layer 7 routing and capabilities transparently to applications.
While VLAN/VXLAN/Geneve are like the postal service that routes envelopes based on addresses (infrastructure layer), service mesh is like a smart assistant who reads your letter and routes it based on what you wrote (application layer). You write "Send this to the accounting department" (service name), and the assistant looks up which building and apartment that is, puts your letter in the right envelope (with security and tracking), and sends it. The assistant also adds a return envelope with your identity certificate, so the recipient knows it's really from you. The postal service (overlay network) still delivers the physical envelope, but the assistant (service mesh) handles the "who talks to whom" and "is this allowed" logic.
Q: What is GAMMA?
GAMMA (Gateway API for Mesh Management and Administration) is the Gateway API initiative for routing internal (East-West) traffic. It repurposes the standard Gateway API Route objects (like HTTPRoute or GRPCRoute) but changes their parent reference: instead of attaching to a Gateway, a Route attaches directly to a Service.
- Graduation: GAMMA's work for supporting service mesh use cases (East-West traffic) graduated to the Standard Channel (GA) starting with Gateway API v1.1.0 in early 2024.
- Core Feature: The primary GAMMA feature—binding a Route (like HTTPRoute) directly to a Service as a parent—is fully stable and supported by major service meshes like Cilium, Istio, and Linkerd.
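A sketch of the GAMMA pattern: the same HTTPRoute kind, but parented to a Service so the mesh applies it to East-West traffic (service names and weights are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend-split
  namespace: prod
spec:
  parentRefs:
    - group: ""            # core API group
      kind: Service        # GAMMA: attach to a Service instead of a Gateway
      name: backend
  rules:
    - backendRefs:
        - name: backend-v1
          port: 8080
          weight: 90       # 90/10 canary split handled by the mesh
        - name: backend-v2
          port: 8080
          weight: 10
```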
Q: What is workload identity?
Workload Identity is a way to identify and authenticate workloads (containers, VMs, processes) using cryptographically verifiable certificates rather than IP addresses. Instead of saying "Allow mail from 10.0.0.5" (IP-based), we now say "Allow mail from the Accounting Department" (identity-based). Each workload gets a certificate (like a company ID badge) that proves who they are. When you send a letter, you include a copy of your ID badge in the envelope. The recipient checks: "Is this person from Accounting? Yes, they're allowed to send me mail." It's like moving from checking return addresses (which can be faked) to checking photo IDs (which can't be easily forged).
Q: What is the evolution of workload identity and what problem was it solving?
Workload identity evolved to solve the fundamental problem that IP addresses are not a reliable way to identify workloads in modern, dynamic environments.
The old way: IP-based security (pre-2010s)
- Firewall rules: "Allow 10.0.0.5 to access database on port 5432"
- Network segmentation: "Everything in subnet 10.0.1.0/24 is trusted"
- This worked when servers had static IPs and rarely moved
Why IP-based security broke down:
- Containers and VMs are ephemeral: A container gets a new IP every time it starts. Your firewall rule for 10.0.0.5 is useless when the container restarts and gets 10.0.0.47.
- Scaling breaks rules: When you scale from 1 backend instance to 10, you can't maintain firewall rules for each IP. You'd need to update rules constantly.
- Workloads move: A pod moves from Node A to Node B? Its IP changes. Your security rules break.
- IPs can be spoofed: An attacker who compromises one workload can spoof IPs to appear as another workload.
- No context: An IP address tells you nothing about what the workload is or who it belongs to. Is 10.0.0.5 the frontend? The backend? A test service? You can't tell from the IP.
The evolution:
- Service accounts (early 2010s): Platforms like Kubernetes introduced service accounts—a step toward identity, but platform-specific. Kubernetes service accounts only work in Kubernetes. AWS IAM roles only work in AWS. No portability.
- Workload identity (2018-present): Standards like SPIFFE emerged to provide portable, verifiable workload identity. Each workload gets a certificate (SVID, SPIFFE Verifiable Identity Document) that proves:
  - Who it is: `spiffe://cluster.local/ns/prod/sa/frontend`
  - Where it came from: The certificate is cryptographically signed, so it can't be forged
  - What it can do: Policies can be written in terms of identity, not IPs
The breakthrough: Instead of "Allow IP 10.0.0.5", you now say "Allow workloads with identity spiffe://cluster.local/ns/prod/sa/frontend". The identity stays the same even when the IP changes. It works across platforms (Kubernetes, VMs, bare metal). It's cryptographically verifiable, so it can't be spoofed.
Q: When did service mesh and workload identity get integrated?
Service mesh and workload identity evolved separately at first, then became tightly integrated:
- 2016-2017: Early service meshes (Linkerd 1.0 in 2016, Istio 0.1 in 2017) initially used platform-specific identities:
  - Kubernetes service accounts in Kubernetes environments
  - No standard identity format
  - Identity was tied to the platform
- 2017-2018: SPIFFE emerges: The SPIFFE project started in 2017 to create a standard for workload identity that works across platforms. SPIFFE provided the foundation, but service meshes weren't using it yet.
- 2019-2020: The integration begins: Service meshes started adopting SPIFFE:
  - Istio 1.4 (2020): Added SPIFFE integration, allowing Istio to issue SPIFFE SVIDs
  - Linkerd 2.7+ (2020): Integrated SPIFFE for workload identity
  - This was the "marriage": service meshes could now use standardized, portable workload identity
- 2020-present: Deep integration: Modern service meshes are built around workload identity:
  - Identity is no longer optional; it's core to how service mesh works
  - mTLS uses workload identity certificates (SVIDs)
  - Authorization policies are written in terms of workload identity
  - Works across Kubernetes, VMs, and bare metal
Why the integration matters: Before SPIFFE integration, service meshes were platform-locked. A Kubernetes service account identity couldn't be verified by a VM-based service. With SPIFFE, the same workload identity works everywhere, enabling true multi-platform service mesh deployments and zero-trust security across heterogeneous environments.
Q: What are the identity models used by service mesh?
Service meshes support three main identity models, each with different trade-offs:
- SPIFFE/SVID: SPIFFE (Secure Production Identity Framework for Everyone) provides a standard, portable workload identity via SVIDs (SPIFFE Verifiable Identity Documents). SPIFFE identities are cryptographically verifiable certificates that work across platforms (Kubernetes, VMs, bare metal). This is the most portable and future-proof approach. Used by Istio (with SPIFFE integration) and Linkerd. Learn more: SPIFFE Documentation, SPIFFE Specification.
- Platform service accounts: Service meshes can use platform-specific service accounts (e.g., Kubernetes service accounts, AWS IAM roles, Azure managed identities) as workload identity. This is simpler to set up but ties you to a specific platform—a Kubernetes service account identity can't be verified by a VM-based service. Good for single-platform deployments. Kubernetes service accounts are the most common example: each pod can be assigned a service account, and the service mesh uses this to identify the workload. The service account name (e.g., `frontend-sa`) becomes part of the workload identity, but it only works within Kubernetes. Learn more: Kubernetes Service Accounts, Using Service Accounts with Istio, Linkerd Service Account Identity.
- Custom identity providers: Some meshes integrate with cloud provider IAM (AWS IAM, Azure AD, GCP IAM) or custom identity systems. This allows leveraging existing identity infrastructure but requires custom integration work and may not be portable across platforms. Learn more: AWS App Mesh IAM Integration, Azure Service Mesh Identity.
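For the Kubernetes service account model, attaching the identity is just a matter of running the workload under a dedicated service account (a sketch; names and image are illustrative):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: frontend-sa
  namespace: prod
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: prod
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      serviceAccountName: frontend-sa            # the mesh derives the workload identity from this
      containers:
        - name: frontend
          image: registry.example.com/frontend:1.0   # illustrative image
```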
Q: What are the constraints of service mesh?
- CPU overhead: Every request goes through a sidecar proxy, consuming additional CPU
- Complexity: Debugging distributed systems with sidecars is harder; requires understanding both application and mesh behavior
- Latency: The extra proxy hops typically add 1-5 ms per request (though often acceptable for the benefits)
- Resource consumption: Each workload requires a sidecar proxy, increasing memory and CPU usage
Q: Do I need service mesh?
Short answer: It depends. Service mesh solves real problems, but it's not always the right solution.
The pragmatic approach:
- Start simple: Use API gateways, load balancers, and basic monitoring first
- Add service mesh when you feel the pain: When you find yourself writing the same networking code in every service, or when observability becomes impossible
- Consider alternatives: API gateways (Kong, Ambassador) can provide some service mesh features without the complexity
- Evaluate the trade-offs: Service mesh adds complexity and overhead. Make sure the benefits (security, observability, traffic management) justify the cost
Remember: Service mesh is infrastructure. Like any infrastructure, it should solve problems you actually have, not problems you might have someday. If you're not experiencing the pain points service mesh solves, you probably don't need it yet.
Q: Can you walk through an example of frontend calling backend using identity?
This walkthrough details the complete packet journey in a Kubernetes cluster augmented with a service mesh, illustrating the interplay of all the components discussed so far, using SPIFFE identities for workload authentication: the frontend app calls `backend`, its sidecar resolves the name via Kubernetes DNS, establishes mTLS and presents the frontend's SVID (`spiffe://cluster.local/ns/prod/sa/frontend`), the backend's sidecar verifies that identity against its authorization policy, and only then forwards the request to the backend container.
Q: What's the difference between IP-based and identity-based security in Kubernetes?
The key principle: identity-based security replaces IP-based security in modern Kubernetes deployments.
Old (IP-based):
- Allow 10.0.0.5
- Deny 10.0.0.6
- Network Policies based on pod IPs (which change constantly)
New (Identity-based):
- Allow spiffe://cluster.local/ns/prod/sa/frontend
- Deny spiffe://cluster.local/ns/dev/*
- Authorization policies based on workload identity (which stays constant)
Q: Where do I create identity-based security rules and who enforces them in Kubernetes?
Identity-based security rules are created as Kubernetes Custom Resources (CRDs) and enforced by the service mesh data plane (sidecar proxies).
Where rules are created:
- Istio: Create `AuthorizationPolicy` resources in Kubernetes:

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: prod
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/prod/sa/frontend"]
```

- Linkerd: Create `Server` and `ServerAuthorization` resources:

```yaml
apiVersion: policy.linkerd.io/v1beta2
kind: ServerAuthorization
metadata:
  name: backend-authz
  namespace: prod
spec:
  server:
    name: backend
  client:
    meshTLS:
      identities:
        - "frontend.prod.serviceaccount.identity.linkerd.cluster.local"
```

- General pattern: Rules are defined as YAML manifests and applied via `kubectl apply`, just like any Kubernetes resource.
Who enforces them:
- Service mesh control plane (istiod, Linkerd control plane): Distributes the rules to all sidecar proxies
- Sidecar proxies (Envoy, Linkerd proxy): Enforce the rules at runtime—they intercept traffic, verify workload identity, and allow/deny requests based on the policies
- No application code changes: The application doesn't know about the rules—the sidecar proxy handles enforcement transparently
Example flow:
- You create an `AuthorizationPolicy` with `kubectl apply`
- The Istio control plane (istiod) reads the policy and distributes it to all Envoy sidecars
- When frontend tries to call backend, the Envoy sidecar checks: "Does this request come from `spiffe://cluster.local/ns/prod/sa/frontend`?"
- Envoy looks up the policy: "Yes, frontend is allowed to talk to backend" → the request proceeds
- If the identity doesn't match, Envoy rejects the request with HTTP 403
Key insight: The rules are Kubernetes resources (like Deployments or Services), but enforcement happens in the service mesh data plane (sidecar proxies), not in Kubernetes itself. This gives you identity-based security without modifying application code.
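To make sure all traffic carrying those identities is actually encrypted and authenticated, Istio policies like the one above are usually paired with a namespace-wide mTLS requirement (a sketch):

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: prod
spec:
  mtls:
    mode: STRICT           # reject plaintext; only mTLS with valid workload certificates is accepted
```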
Q: What is the complete packet flow from app to app via overlay network in Kubernetes with service mesh?
Here's the complete journey of a packet from one pod to another: the application resolves the service name via CoreDNS, the sidecar proxy intercepts the request and establishes mTLS using workload identity, the Service IP is translated to a pod IP (kube-proxy or eBPF), the CNI overlay network (VXLAN/Geneve) encapsulates the packet and delivers it to the destination node, and the receiving node decapsulates it and hands it to the destination pod's sidecar, which verifies identity and forwards it to the application container.
References
Service Mesh and Identity
- SPIFFE: Secure Production Identity Framework for Everyone
- Istio: Connect, Secure, Control, and Observe Services
- Linkerd: Ultra Lightweight Service Mesh for Kubernetes
Container Networking
- CNI Specification: Container Network Interface
- Cilium: eBPF-based Networking, Security, and Observability
This article is part of the "Learning in a Hurry" series, designed to help engineers quickly understand complex technical concepts through analogies and practical examples.