daniel jeong

Posted on May 28 • Originally published at manoit.co.kr

Cilium 1.19 Deep Dive — 10-Year Anniversary: IPsec/WireGuard Strict Mode, Ztunnel Beta, Policy-Default-Local-Cluster, Multi-Pool IPAM Stable

#kubernetes #security #observability #servicemesh

Cilium 1.19 Deep Dive — 10-Year Anniversary Release: IPsec/WireGuard Strict Mode, Ztunnel Transparent Encryption Beta, Policy-Default-Local-Cluster, Multi-Pool IPAM Stable, and Hubble Drop Tagging Redefining the 2026 eBPF Networking, Security, and Observability Standard

TL;DR

Cilium v1.19 (May 13, 2026) is the 10-year anniversary release and flips multiple defaults toward operational safety.

IPsec / WireGuard strict mode drops unencrypted traffic by default, ending best-effort encryption gaps.

ClusterMesh policy-default-local-cluster is now the default — audit your existing NetworkPolicies before upgrading or you will silently cut multi-cluster traffic.

Ztunnel Transparent Encryption (Beta) brings sidecarless workload-identity mTLS to Cilium, interoperable with Istio Ambient.

Multi-Pool IPAM graduates to Stable, Hubble adds drop policy tagging + encrypted-flow filters + Trace IP Options, and Network Policy denials can now return ICMPv4 Destination Unreachable to skip the 30-second TCP retry loop.

Cilium has hit a clean ten years since its first commit, and v1.19 lands as the anniversary release. v1.19.0 shipped in mid-May 2026, and patch releases reached v1.19.4 within two weeks. There is no single flagship feature on the cover. Instead, six axes evolve simultaneously to back the promise of "an eBPF dataplane you can actually operate quarter after quarter":

(1) Strict modes for IPsec and WireGuard turn node-to-node encryption from best-effort into a hard requirement.
(2) Ztunnel Transparent Encryption lands as a beta integration, opening a sidecarless workload-identity encryption path next to the node-level encryption story.
(3) ClusterMesh policy-default-local-cluster becomes the new default, structurally blocking the "I wrote a local policy that quietly fanned out across the mesh" class of incidents.
(4) Multi-Pool IPAM graduates from Beta to Stable and now works with IPsec and direct routing.
(5) Hubble adds drop event policy tagging, encrypted-flow filters, and Trace IP Options so "why was this packet dropped?" is answerable in one command.
(6) Network Policy denials can now return ICMPv4 Destination Unreachable, ending the dumb 30-second TCP retry loop.

This article traces each of the six changes to its root cause at the eBPF datapath, policy compilation, and CRD schema level, and lays out the 16-item upgrade, observation, and rollback checklist ManoIT applied across three internal Kubernetes clusters (prod / stage / dev).

1. Why May 2026's v1.19 is an inflection point for Cilium

Cilium started in April 2016, when Thomas Graf rewrote the Kubernetes dataplane in eBPF instead of iptables. v1.0 followed in 2018, the project joined the CNCF at the Incubating level in October 2021, and it graduated in October 2023. By now Cilium is the dataplane behind or recommended by GKE Dataplane V2, EKS Anywhere, OpenShift, Talos Linux, K3s, and most other major Kubernetes distributions. v1.19 is the inflection point where the 10-year anniversary symbolism meets a deliberate maintainer pivot: "operational safety nets become the default."

Date	Event	Operational meaning
2016.04	Cilium first commit (Thomas Graf)	eBPF-based K8s dataplane launches
2018.04	v1.0 — Production-ready	"L7 visibility + identity-based" model settles
2021.10	CNCF Incubating (entry level; Cilium did not go through Sandbox)	Community governance formalized; Hubble · ClusterMesh stabilization era
2023.10	CNCF Graduated	Enterprise adoption guidelines formalized
2024.04	v1.16 — Gateway API Beta, Multi-Pool IPAM Beta	Service mesh + multi-CIDR operations activated
2025.05	v1.17 — Gateway API GA, BGPv2 Stable	Accelerated Ingress NGINX retirement flow
2025.10	v1.18 — ClusterMesh API server v2, KVStoreMesh stable	Simplified large-scale multi-cluster control plane
2026.05.13	v1.19 — Strict Mode, Ztunnel Beta, policy-default-local-cluster, Multi-Pool IPAM Stable, Hubble drop tagging, ICMP friendly deny	Operational safety nets become the default
2026.05.27	v1.19.4 patch release	Rapid 1.19.x patch stabilization in progress

Two messages matter for operators. (1) "Default changes are the biggest changes." — ClusterMesh's policy-default-local-cluster flipping from false to true is not a feature addition; it is the default safety posture of multi-cluster policy flipping. (2) "Strict mode is the fastest path through a compliance audit." — Once IPsec or WireGuard is in strict mode, unencrypted traffic is dropped on the wire, so the "we encrypted, but some packets leaked in plaintext" audit finding disappears structurally.

2. IPsec/WireGuard Strict Mode — best-effort encryption becomes hard requirement

This is the longest section in the v1.19 release notes. Cilium's transparent encryption has supported IPsec since v1.4 and WireGuard since v1.10, but both modes were best-effort: "encrypt where we can, fall back to plaintext when peer keys aren't established or the protocol can't negotiate." That fallback was the most common finding in security audits.

2.1 Three gaps of the best-effort era

Scenario	v1.18 behavior	Audit verdict
New node joins cluster, key exchange still in progress	Plaintext until key negotiation completes, then encryption	"Plaintext window exists" finding
WireGuard peer key missing on a discovered node	Plaintext fallback	"Cannot enforce encryption" finding
IPsec XFRM policy partially expired (SPI rotation)	Plaintext fallback during renegotiation	"Plaintext traffic visible in audit log" finding

2.2 v1.19 fix — strict mode drops unencrypted traffic

v1.19 adds encryption.strictMode to both IPsec and WireGuard. With it enabled, the following behavior is enforced:

# helm/cilium-values.yaml — IPsec strict mode
# WARNING: Enable only after keys are distributed to every node.
# Partial rollout will drop plaintext and cut communication.
encryption:
  enabled: true
  type: ipsec
  ipsec:
    interface: ""
    keyFile: keys
    mountPath: /etc/ipsec
  strictMode:
    enabled: true                # v1.19 new — best-effort -> hard requirement
    cidr: "10.0.0.0/8"           # CIDR strict applies to (usually covers PodCIDR)
    allowRemoteNodeIdentities: false   # new nodes without keys are dropped immediately
nodeinit:
  enabled: true

# helm/cilium-values.yaml — WireGuard strict mode
encryption:
  enabled: true
  type: wireguard
  nodeEncryption: true
  wireguard:
    persistentKeepalive: "0s"
  strictMode:
    enabled: true                # v1.19 new
    cidr: "10.0.0.0/8"
    allowRemoteNodeIdentities: false

# Verify after applying
helm upgrade cilium cilium/cilium \
  --version 1.19.4 \
  --namespace kube-system \
  -f helm/cilium-values.yaml \
  --reuse-values

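# Strict status of one agent pod (ds/cilium picks a single pod; repeat per node)
# No -it here: a TTY is not needed and interferes with piping into grep
kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -A 3 Encryption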
# Encryption:               Wireguard [strict]
# Strict mode CIDR:         10.0.0.0/8
# Allowed remote identities: 0
# Unencrypted drops (last 1m): 0

# Intentional plaintext blocking check
kubectl exec -it test-pod -- ping -c 3 unencrypted-peer-ip
# PING ... 100% packet loss   ← strict is doing its job
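To make the difference between best-effort and strict behavior concrete, here is a minimal Python sketch of the decision described above. It is a conceptual model only, not Cilium's datapath code: `egress_verdict` and its parameters are hypothetical names, and the `allowRemoteNodeIdentities` exception is deliberately left out.

```python
import ipaddress


def egress_verdict(dst_ip: str, strict_cidr: str,
                   peer_key_ready: bool, strict_enabled: bool) -> str:
    """Toy model of what happens to a packet leaving a node.

    Returns "encrypt", "plaintext", or "drop".
    """
    if peer_key_ready:
        # Keys are negotiated: both modes encrypt.
        return "encrypt"

    in_scope = ipaddress.ip_address(dst_ip) in ipaddress.ip_network(strict_cidr)
    if strict_enabled and in_scope:
        # Strict mode: no plaintext fallback inside the strict CIDR.
        return "drop"

    # Best-effort mode (or destination outside the strict CIDR).
    return "plaintext"
```

With a destination of `10.20.3.7`, a strict CIDR of `10.0.0.0/8`, and no peer key yet, the model returns `plaintext` when strict mode is off and `drop` when it is on, which is exactly the plaintext window the audit findings in 2.1 point at.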

2.3 Operational rollout — 4-step gradient to avoid cluster-wide outage

Strict mode, if flipped at the wrong time, instantly takes the cluster offline. ManoIT's internal standard is a 4-step gradient:

Step	Action	Verification	Rollback trigger
1	Distribute keyFile to every node, restart cilium in plaintext mode	`cilium status` reports keys OK on every node	If any single node lacks keys, abort
2	Set `strictMode.enabled=true` with `allowRemoteNodeIdentities=true`	Hubble drop counters unchanged	Drops appear → flip back to false immediately
3	After 1 week stable, flip `allowRemoteNodeIdentities=false`	Join a fresh node, verify post-key-registration traffic flows	If new nodes must join without keys, temporarily set true
4	Add Prometheus alert on `cilium_encryption_unencrypted_packets_dropped_total` increasing	Zero alert fires for 14 days	On a fire, root-cause first, then re-enable
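The gating logic in the table can be written down as a small decision function, which is handy if the rollout is driven from a pipeline. This is a sketch under assumptions: `ClusterState`, `next_action`, and the returned action names are hypothetical, and the soak periods (7 days before leaving step 2, 14 days in step 4) are read off the table above.

```python
from dataclasses import dataclass

# Days of zero-drop operation required before leaving each step.
SOAK_DAYS = {2: 7, 3: 0, 4: 14}


@dataclass
class ClusterState:
    nodes_total: int
    nodes_with_keys: int
    unencrypted_drops_delta: int  # counter increase since the step began
    stable_days: int              # consecutive days without new drops


def next_action(step: int, state: ClusterState) -> str:
    """Return "proceed", "wait", "abort", "rollback", or "investigate"."""
    if step == 1:
        # Step 1 only passes when every node reports its keys.
        return "proceed" if state.nodes_with_keys == state.nodes_total else "abort"
    if step not in SOAK_DAYS:
        raise ValueError(f"unknown rollout step: {step}")
    if state.unencrypted_drops_delta > 0:
        # Steps 2 and 3 flip the setting back; step 4 root-causes first.
        return "rollback" if step in (2, 3) else "investigate"
    return "proceed" if state.stable_days >= SOAK_DAYS[step] else "wait"
```

For example, `next_action(2, ClusterState(10, 10, 0, 3))` returns `wait`, because the one-week soak before tightening `allowRemoteNodeIdentities` has not elapsed yet.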

3. Ztunnel Transparent Encryption Beta — sidecarless workload authentication

The second big change is aligned with the service-mesh ecosystem's direction. v1.19 ships a beta integration of Ztunnel (zero-trust tunnel), the same primitive Istio Ambient Mode standardized. This is not just "Istio compatibility" — it means the Cilium node agent coordinates directly with ztunnel to run a separate mTLS dataplane wrapping workload-to-workload TCP.

3.1 What is different from IPsec/WireGuard?

Axis	IPsec/WireGuard (node-to-node)	Ztunnel (workload-to-workload)
Scope	Node ↔ Node (L3/L4)	Workload ↔ Workload (L4 / mTLS)
Auth unit	Node ID (Cilium identity)	SPIFFE SVID (workload ID)
Key management	IPsec SA / WG peer key	SPIRE-compatible SDS
Sidecars required	No	No (ztunnel runs as a node DaemonSet)
Granularity	Cluster-wide	Per-namespace enrollment
Mesh interop	—	Works with Istio Ambient L4 or Cilium Ztunnel

3.2 Enabling — namespace-scoped enrollment

# helm/cilium-values.yaml — Ztunnel beta
# WARNING: Beta — recommend 4 weeks of staging validation before production
encryption:
  enabled: true
  type: ztunnel
  ztunnel:
    enabled: true                       # v1.19 new beta gate
    image:
      repository: quay.io/cilium/ztunnel
      tag: v1.19.4
    spire:
      enabled: true                     # SPIFFE SVID issuance — requires SPIRE server
      serverAddress: spire-server.spire-system:8081
      trustDomain: cluster.local

# Enroll a namespace into Ztunnel
kubectl label namespace payments cilium.io/ztunnel-enabled=true
kubectl rollout restart -n payments deploy

# Verify enrollment
kubectl -n kube-system get pods -l app=ztunnel
# NAME            READY   STATUS    AGE
# ztunnel-abc12   1/1     Running   1m
# ztunnel-def34   1/1     Running   1m

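# Verify enrolled-workload mTLS
# Note: ztunnel encrypts outside the pod, so the client still speaks plaintext and
# `curl -v http://...` will not print a TLS handshake. Generate a connection first,
# then look for the peer identity on the ztunnel side (log format may vary by version).
kubectl -n payments exec -it api-pod -- curl -v http://db:5432
kubectl -n kube-system logs -l app=ztunnel --tail=100 | grep 'spiffe://cluster.local/ns/payments/sa/db'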
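The workload identity shown above follows the `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>` layout. When checking logs or certificates in tooling, it helps to parse that string rather than grep for it. A minimal sketch, assuming that layout; `WorkloadIdentity` and `parse_spiffe_id` are hypothetical helper names, not part of any Cilium or SPIRE API.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(spiffe_id: str) -> WorkloadIdentity:
    """Split a Kubernetes-style SPIFFE ID into its three components."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")

    # Expected path: /ns/<namespace>/sa/<service-account>
    segments = parts.path.strip("/").split("/")
    if len(segments) != 4 or segments[0] != "ns" or segments[2] != "sa":
        raise ValueError(f"unexpected path layout: {parts.path!r}")

    return WorkloadIdentity(parts.netloc, segments[1], segments[3])
```

Parsing `spiffe://cluster.local/ns/payments/sa/db` yields trust domain `cluster.local`, namespace `payments`, and service account `db`, which is enough to assert in a smoke test that the peer really belongs to the enrolled namespace.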

4. ClusterMesh policy-default-local-cluster — default change blocks incidents

The quietest but most impactful change in v1.19. When a NetworkPolicy selector did not specify a cluster, v1.18 matched the entire mesh. So if one cluster wrote allow from app=frontend, workloads in another cluster labeled app=frontend were also implicitly allowed. Even when operators meant "only inside my cluster," the policy quietly fanned out through the mesh.

4.1 The accidental cross-cluster exposure pattern

# Pre-v1.19: unintentionally fanned out across the mesh
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-allow-frontend
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend    # WARNING: in v1.18 this matched app=frontend across the entire mesh

4.2 New default — local cluster only

# v1.19 implicitly adds io.cilium.k8s.policy.cluster=<local>
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-allow-frontend
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend    # v1.19 narrows to the local cluster

# Explicit opt-in for cross-cluster matching (this rule admits app=frontend from cluster-east only)
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-allow-frontend-mesh
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
            io.cilium.k8s.policy.cluster: cluster-east   # explicit match
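The default flip is easiest to reason about as a selector rewrite. The sketch below models it in Python: when the cluster label is absent, the new default behaves as if the local cluster had been written into the selector. This is a conceptual model, not Cilium's policy compiler; `effective_selector`, `matches`, and the cluster name `cluster-west` are hypothetical.

```python
CLUSTER_LABEL = "io.cilium.k8s.policy.cluster"


def effective_selector(match_labels: dict, local_cluster: str,
                       default_local: bool) -> dict:
    """Return the selector as it is effectively evaluated."""
    if default_local and CLUSTER_LABEL not in match_labels:
        # New default: an unscoped selector is pinned to the local cluster.
        return {**match_labels, CLUSTER_LABEL: local_cluster}
    return dict(match_labels)


def matches(selector: dict, endpoint_labels: dict) -> bool:
    """Plain matchLabels semantics: every selector label must be present."""
    return all(endpoint_labels.get(k) == v for k, v in selector.items())


# Two frontends with identical app labels, living in different clusters.
frontend_west = {"app": "frontend", CLUSTER_LABEL: "cluster-west"}
frontend_east = {"app": "frontend", CLUSTER_LABEL: "cluster-east"}

old = effective_selector({"app": "frontend"}, "cluster-west", default_local=False)
new = effective_selector({"app": "frontend"}, "cluster-west", default_local=True)
```

Under the old behavior, `old` matches both frontends. Under the new default, `new` matches only `frontend_west`, and a selector that already names `cluster-east` is left untouched.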

4.3 Upgrade action — audit existing mesh policies first

Upgrading to v1.19 may suddenly narrow policies that implicitly traversed the mesh, breaking communication. The maintainers recommend the following procedure in the upgrade guide:

# Step 1: Find CiliumNetworkPolicy ingress rules whose fromEndpoints selector lacks the cluster label
# (any() prints each policy once; "// {}" tolerates selectors without matchLabels)
kubectl get ciliumnetworkpolicy -A -o json \
  | jq -r '.items[]
      | select(any(.spec.ingress[]?.fromEndpoints[]?; (.matchLabels // {}) | has("io.cilium.k8s.policy.cluster") | not))
      | .metadata.namespace + "/" + .metadata.name'

# Step 2: Ask each policy owner whether the intent was mesh or local
# Step 3: For mesh intent, PR explicit cluster labels
# Step 4: Upgrade to v1.19. Mesh-intent policies still lacking an explicit cluster label lose cross-cluster traffic immediately
helm upgrade cilium cilium/cilium --version 1.19.4 --namespace kube-system --reuse-values
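Once the audit needs to cover more than ingress rules, a short script is easier to maintain than a jq one-liner. The sketch below takes the parsed output of `kubectl get ciliumnetworkpolicy -A -o json` and also inspects egress `toEndpoints` and the `specs` list. `unscoped_policies` is a hypothetical helper, and selectors that express the cluster through `matchExpressions` are not handled, so treat the result as a list of candidates for review.

```python
CLUSTER_LABEL = "io.cilium.k8s.policy.cluster"


def unscoped_policies(cnp_list: dict) -> list:
    """Return "namespace/name" for policies with a selector lacking the cluster label."""
    hits = []
    for item in cnp_list.get("items", []):
        # A CiliumNetworkPolicy carries either `spec` or a list of `specs`.
        specs = ([item["spec"]] if item.get("spec") else []) + (item.get("specs") or [])

        selectors = []
        for spec in specs:
            for rule in spec.get("ingress") or []:
                selectors += rule.get("fromEndpoints") or []
            for rule in spec.get("egress") or []:
                selectors += rule.get("toEndpoints") or []

        if any(CLUSTER_LABEL not in (sel.get("matchLabels") or {}) for sel in selectors):
            meta = item["metadata"]
            hits.append(f'{meta["namespace"]}/{meta["name"]}')
    return hits
```

A typical invocation would feed it `json.load(sys.stdin)` from the kubectl pipe and hand the resulting names to the policy owners in Step 2.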

5. Multi-Pool IPAM Stable — works with IPsec and direct routing

Multi-Pool IPAM was introduced as Beta in v1.16, opening operational autonomy to allocate different CIDRs to different workloads in the same cluster. But up to v1.18 it had no stability guarantees on IPsec or direct-routing environments, which limited production use. v1.19 graduates it to Stable, and both environments are officially supported.

5.1 CiliumPodIPPool example

# Payments workload pool — non-overlapping CIDR with corporate VPC
apiVersion: cilium.io/v2alpha1
kind: CiliumPodIPPool
metadata:
  name: payments-pool
spec:
  ipv4:
    cidrs:
      - 10.20.0.0/16
    maskSize: 24
  ipv6:
    cidrs:
      - fd00:10:20::/56   # example ULA prefix (IPv6 groups must be hexadecimal)
    maskSize: 64

# Pod chooses pool via annotation
apiVersion: v1
kind: Pod
metadata:
  name: api-server
  namespace: payments
  annotations:
    ipam.cilium.io/ip-pool: payments-pool   # v1.19 Stable
spec:
  containers:
    - name: api
      image: api:1.0
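Before committing to a pool size, it is worth doing the arithmetic on how many per-node blocks a pool can hand out. A minimal sketch using only the standard library, assuming `maskSize` is the prefix length of each block carved out of the pool CIDR; `pool_capacity` is a hypothetical helper and it reports raw address counts, not usable pod IPs.

```python
import ipaddress


def pool_capacity(cidr: str, mask_size: int) -> tuple:
    """Return (number_of_blocks, addresses_per_block) for a pool CIDR."""
    net = ipaddress.ip_network(cidr)
    if not net.prefixlen <= mask_size <= net.max_prefixlen:
        raise ValueError(f"maskSize {mask_size} does not fit inside {cidr}")

    blocks = 2 ** (mask_size - net.prefixlen)
    addresses_per_block = 2 ** (net.max_prefixlen - mask_size)
    return blocks, addresses_per_block
```

For the IPv4 pool above, `pool_capacity("10.20.0.0/16", 24)` gives 256 blocks of 256 addresses each, so the pool runs out of blocks once 256 allocations have been made, regardless of how many pods each one actually hosts.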

5.2 IPsec strict mode + Multi-Pool combo — set strict CIDR wide enough

# When combining the two, the strict CIDR must cover every pool
encryption:
  enabled: true
  type: ipsec
  strictMode:
    enabled: true
    cidr: "10.0.0.0/8"    # WARNING: must encompass all CiliumPodIPPool CIDRs
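The coverage requirement is easy to check mechanically before enabling strict mode. The sketch below flags every pool CIDR that is not contained in the strict CIDR. `uncovered_pools` and the pool name `logs-pool` are hypothetical, and CIDRs of a different IP family than the strict CIDR are skipped rather than reported.

```python
import ipaddress


def uncovered_pools(strict_cidr: str, pool_cidrs: dict) -> list:
    """Return "pool:cidr" entries that fall outside the strict-mode CIDR."""
    strict = ipaddress.ip_network(strict_cidr)
    missing = []
    for pool, cidrs in pool_cidrs.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            if net.version != strict.version:
                continue  # different IP family: out of scope for this check
            if not net.subnet_of(strict):
                missing.append(f"{pool}:{cidr}")
    return missing


pools = {
    "payments-pool": ["10.20.0.0/16"],
    "logs-pool": ["172.16.0.0/16"],  # hypothetical pool outside 10.0.0.0/8
}
```

Here `uncovered_pools("10.0.0.0/8", pools)` returns `["logs-pool:172.16.0.0/16"]`, which is the signal to either widen the strict CIDR or move the pool before turning strict mode on.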

6. Hubble drop event policy tagging, encrypted-flow filters, Trace IP Options

The three observability additions in v1.19 cut debugging time directly.

6.1 Drop events automatically carry the denying policy name

# v1.18: drop reason only — "which policy denied?" needs manual correlation
hubble observe --verdict DROPPED --since 5m
# Aug 12 12:34:56 default/api-1234 :: default/db-5678 DROPPED (Policy denied)

# v1.19: policy name and namespace attached to the verdict label
hubble observe --verdict DROPPED --since 5m -o json | jq '.flow.dropReasonDesc'
# {
#   "reason": "PolicyDenied",
#   "policy_name": "default-deny-egress",
#   "policy_namespace": "production",
#   "policy_kind": "CiliumNetworkPolicy"
# }
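With the policy attached to each drop, the next step is usually to rank policies by how many flows they deny. The sketch below counts drops per policy from JSON lines such as those produced by `hubble observe --verdict DROPPED -o json`. The field names (`flow.dropReasonDesc.policy_name`, `policy_namespace`) are taken from the sample output above and have not been checked against the Hubble schema, so adjust them to whatever your version emits. `drops_by_policy` is a hypothetical helper.

```python
import json
from collections import Counter


def drops_by_policy(lines) -> Counter:
    """Count dropped flows per "namespace/policy" from JSON-lines input."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        flow = json.loads(line).get("flow", {})
        desc = flow.get("dropReasonDesc")
        if isinstance(desc, dict) and desc.get("policy_name"):
            key = f'{desc.get("policy_namespace", "?")}/{desc["policy_name"]}'
        else:
            # Older agents or non-policy drops carry no policy tag.
            key = "(untagged)"
        counts[key] += 1
    return counts
```

Calling `drops_by_policy(open("drops.jsonl")).most_common(5)` on a saved capture gives a quick shortlist of the policies worth reviewing first.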

6.2 Encrypted vs unencrypted flow filtering

# Show only unencrypted traffic — essential before enabling strict mode
hubble observe --unencrypted --since 1h | tee unencrypted.log

# Show only encrypted traffic for analysis
hubble observe --encrypted --since 1h --output json > encrypted.jsonl

6.3 Trace IP Options — mark specific packets for path tracing

# Mark packets with IPv4 options to trace their datapath hops
# WARNING: some NICs/switches drop packets with IPv4 options — validate in test env
kubectl -n kube-system patch cm cilium-config --type merge -p '{"data":{"trace-ip-options":"true"}}'
kubectl -n kube-system rollout restart ds/cilium

# Show per-hop trace for marked packets
hubble observe --ip-option-marked --output table

7. Network Policy ICMPv4 Destination Unreachable — ending the dumb 30-second retry

In v1.18 and earlier, a Network Policy denial silently dropped the packet, so the client kept retransmitting TCP SYNs for about 30 seconds before giving up. v1.19 adds an option to return ICMPv4 Destination Unreachable (type 3, code 13: Communication Administratively Prohibited). The client's TCP stack then fails the connect immediately instead of waiting out the retries. On Linux, code 13 surfaces as EHOSTUNREACH ("No route to host") rather than "connection refused", which is what a TCP RST or an ICMP port-unreachable would produce. Either way, debugging latency collapses.

# helm/cilium-values.yaml
# WARNING: external firewalls blocking ICMPv4 will swallow the response
policyEnforcementMode: default
policyAuditMode: false
icmpUnreachable:
  enabled: true       # v1.19 new: friendly deny response

# Verify the friendly deny
kubectl exec -it test-pod -- curl -v http://api:8080
# * connect to api port 8080 failed: No route to host   ← terminates immediately, no 30s wait
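For readers who want to recognize the deny response in a packet capture, the sketch below builds the ICMP message by hand. It follows the standard layout (RFC 792 for Destination Unreachable, RFC 1812 for code 13): an 8-byte ICMP header followed by the original IP header and the first 8 bytes of the original datagram. It illustrates the wire format only and is not how Cilium generates the packet; the function names are hypothetical.

```python
import struct

ICMP_DEST_UNREACH = 3            # ICMP type
ICMP_CODE_ADMIN_PROHIBITED = 13  # Communication Administratively Prohibited


def inet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_admin_prohibited(orig_ip_header: bytes, orig_payload: bytes) -> bytes:
    """Build a type 3 / code 13 ICMP message quoting the denied packet."""
    quoted = orig_ip_header + orig_payload[:8]
    # Checksum is computed over the whole message with the field zeroed.
    header = struct.pack("!BBHI", ICMP_DEST_UNREACH, ICMP_CODE_ADMIN_PROHIBITED, 0, 0)
    checksum = inet_checksum(header + quoted)
    header = struct.pack("!BBHI", ICMP_DEST_UNREACH, ICMP_CODE_ADMIN_PROHIBITED, checksum, 0)
    return header + quoted
```

The first 8 bytes of a quoted TCP segment contain the source and destination ports and the sequence number, which is what lets the sender's kernel match the error to the pending connection.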

8. Visualization — how v1.19's six axes combine in the deployment flow

The diagram below shows how the six axes of v1.19 combine when a new workload is deployed.

flowchart LR
    A[New Pod deploy] --> B{Which IP Pool?}
    B -->|payments-pool| C[Multi-Pool IPAM Stable<br/>allocate from 10.20.0.0/16]
    C --> D{Inside strict mode CIDR?}
    D -->|Yes| E[IPsec/WireGuard<br/>strict encryption enforced]
    D -->|No| F[Plaintext blocked → cut traffic]
    E --> G{Namespace enrolled in Ztunnel?}
    G -->|Yes| H[Ztunnel mTLS Beta<br/>SPIFFE SVID issued]
    G -->|No| I[L4 only]
    H --> J[Evaluate CiliumNetworkPolicy]
    I --> J
    J -->|allow| K[Hubble flow OK]
    J -->|deny| L[ICMPv4 friendly deny<br/>Hubble drop + policy name tagged]

9. ManoIT internal checklist: 3 clusters × 16 items

The checklist below extends the seven sections above into an operations procedure. ManoIT runs three clusters (prod / stage / dev) and validates alpha/beta features in staging for 2 weeks and prod for 1 week before progressive rollout.

#	Item	Owner	Completion criteria
1	Inventory Cilium · Hubble · ClusterMesh API server versions across 3 clusters	Platform team	PR listing instances below v1.19
2	Audit CiliumNetworkPolicy — extract rules with no cluster label	Platform team	jq script output + contact each policy owner
3	Add explicit cluster labels to policies whose intent was mesh-wide	Each service owner	All policy PRs merged
4	Upgrade dev to v1.19.4 (strict OFF, Ztunnel OFF)	Platform team	`cilium version` = 1.19.4
5	Validate mesh-policy regression in dev — zero unintended communication breaks	Each service owner	Hubble drop counter delta report
6	Enable Multi-Pool IPAM Stable in staging with v1.19.4	Platform team	Verify allocation from payments-pool for new pods
7	Enable IPsec strict mode in staging via 4-step gradient	Platform team	14-day report with unencrypted drops = 0
8	Enable Ztunnel Beta in staging — only one namespace enrolled	Platform team	SPIRE integration OK, mTLS flow visible in Hubble
9	Verify Hubble drop tagging, encrypted filter, Trace IP Options	Observability team	Operations runbook updated for the 3 features
10	Enable ICMPv4 friendly deny — check external firewall ICMP rules	Network + Platform team	Immediate termination verified (curl/ping tests)
11	Upgrade prod to v1.19.4 (strict OFF, Ztunnel OFF)	Platform team	prod `cilium version` = 1.19.4
12	Enable Multi-Pool IPAM in prod — payments and logs workloads first	Platform team	Per-pool IP usage exported as Prometheus metric
13	Gradually enable IPsec strict mode in prod — 4-step standard procedure	Platform team	30-day unencrypted drops = 0 + compliance audit evidence
14	Enable ICMPv4 friendly deny in prod (paired with step 10)	Platform team	Measured average denial termination time drops from 30s to 1s
15	Add Prometheus alerts — `cilium_encryption_unencrypted_packets_dropped_total` increase, ClusterMesh policy drop spikes, Multi-Pool exhaustion	Observability team	Alert rule PR merged, fire/resolve test passes
16	Operational RFC — Ztunnel Beta enrollment for new workloads only, existing workloads after Beta exits	Platform team	RFC merged, scheduled for quarterly security review

10. Conclusion — the 10-year inflection point that flipped defaults toward safety

Wrap the six changes of v1.19 in one line: "Cilium spent ten years getting to the point where it can ship operational safety nets as defaults." Strict mode for IPsec and WireGuard structurally erases the plaintext window of best-effort encryption. Ztunnel integration brings sidecarless workload authentication to beta and aligns with the Istio Ambient camp. ClusterMesh policy-default-local-cluster inverts the most dangerous default of the past six years. Multi-Pool IPAM Stable hands back CIDR autonomy in a safe form. Hubble drop tagging, encrypted-flow filters, and Trace IP Options answer "why was this dropped?" in one command. ICMPv4 friendly deny collapses 30-second retry loops to 1 second.

Three reminders for operators as we close. (1) Audit ClusterMesh policies before upgrading — the policy-default-local-cluster default flip is the most common v1.19 incident cause, and it can cut traffic without warning. (2) Roll out strict mode in four steps — key distribution → enable strict (allow remote = true) → 1-week soak → allow remote = false → 30-day stability monitoring is the safe progression. (3) Adopt Ztunnel Beta starting from new namespaces — SPIRE / SPIFFE SVID integration is operationally heavy, so enroll payments and high-sensitivity workloads first and revisit the rest after v1.20 GA. The 16-item checklist in §9 is exactly that, expressed as an internal procedure. The shortest one-line recommendation: "Upgrade dev to v1.19.4 today, and open the ClusterMesh policy audit PR this week."


ⓘ This article was produced by ManoIT's automated blogging pipeline (Claude Opus 4.6 + Cowork Agent) by analyzing the Cilium v1.19.0 release notes (GitHub Discussions #44191) published on May 13, 2026, the subsequent v1.19.4 patch (2026-05-27), the Encryption / IPAM / Hubble / ClusterMesh docs at docs.cilium.io, Isovalent's v1.19 release blog, and InfoQ's 10-year retrospective as primary sources. The alpha/beta gate flag names, behaviors, and metrics in this article reflect the official documentation as of the publication date (2026-05-28); Beta features may change in subsequent releases. Verify against cilium/cilium GitHub Releases and docs.cilium.io before applying to production. The internal-adoption examples cite an adapted ManoIT platform-team RFC.

Originally published at ManoIT Tech Blog.
