eBPF in Production Kubernetes: Ditch Your Sidecars in 2026

#kubernetes #cloudnative #devops

How I cut 75GB of sidecar RAM to 12GB using Cilium, Hubble, Pixie, and Tetragon — with zero app code changes.

I'm not going to tell you eBPF is the future. It's already the present. The CNCF Observability TAG survey shows 67% of teams running Kubernetes at scale have adopted at least one eBPF-based observability tool in production. If you're not in that 67%, you're paying for it — literally.

Here's what convinced me to migrate our cluster.

The sidecar tax nobody talks about

We were running Istio. Standard setup — Envoy sidecar in every pod, Jaeger for traces, Prometheus scraping everything. Worked fine until our cluster hit 500 pods.

Each Envoy proxy consumes roughly 50–150MB of RAM at baseline, and that grows with connection count. For a 500-pod cluster, that's as much as 75GB of RAM for sidecars alone at baseline, versus roughly 12GB for the entire eBPF stack.

That's not a rounding error. That's a billing line item.
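
For anyone who wants to plug in their own numbers, here is a minimal sketch of the arithmetic behind that range. The function name `sidecar_gb` is made up for illustration, and it assumes decimal units (1GB = 1000MB) to match the round figures above. It only covers the per-proxy baseline, not growth from connection count.

```shell
#!/bin/sh
# Total sidecar memory in GB for a pod count and a per-proxy footprint in MB.
# Assumes one sidecar per pod and decimal units (1 GB = 1000 MB).
sidecar_gb() {
  pods=$1
  per_proxy_mb=$2
  echo $(( pods * per_proxy_mb / 1000 ))
}

# 500 pods at the low and high ends of the 50-150MB baseline.
echo "low estimate:  $(sidecar_gb 500 50) GB"
echo "high estimate: $(sidecar_gb 500 150) GB"
```

For 500 pods this prints 25 GB and 75 GB, so the 75GB figure is the top of the baseline range.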

The stack I run today

Four tools, all CNCF projects, all production-grade.

Cilium + Hubble: replaces your CNI and gives you L3–L7 network visibility. Run kernel 6.1+ built with BTF (check that /sys/kernel/btf/vmlinux exists). CO-RE depends on BTF, and with it you're not recompiling eBPF programs per node.

helm install cilium cilium/cilium --version 1.15.0 \
  --namespace kube-system \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set kubeProxyReplacement=true

Pixie: zero-instrumentation APM. Attach it to your cluster and immediately get service maps, request traces, and flame graphs. No SDK, no code changes, no redeploy.

px deploy --cluster_name my-cluster

Tetragon: security observability and runtime enforcement at the kernel layer. Falco captures events in the kernel too, but it evaluates rules and responds from userspace. Tetragon filters and enforces in the kernel itself, so it can block an action before it completes, not after.
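
To make "hooks directly into the kernel" concrete, here is a rough sketch of an observe-only TracingPolicy for sensitive file access, wrapped in a shell function so it can be reviewed before it touches a cluster. The policy name, the function name, and the watched paths are assumptions for illustration. The kprobe layout follows the file-monitoring pattern in the Tetragon docs as I understand it, so check it against the Tetragon version you run.

```shell
#!/bin/sh
# Print an observe-only TracingPolicy that reports access to a few sensitive files.
# The name "sensitive-file-access" and the paths below are illustrative assumptions.
render_file_access_policy() {
  cat <<'EOF'
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: sensitive-file-access
spec:
  kprobes:
  - call: security_file_permission
    syscall: false
    args:
    - index: 0
      type: file
    - index: 1
      type: int
    selectors:
    - matchArgs:
      - index: 0
        operator: Prefix
        values:
        - /etc/shadow
        - /etc/sudoers
EOF
}

# Review the output first. To apply it on staging:
#   render_file_access_policy | kubectl apply -f -
render_file_access_policy
```

This version only emits events. Enforcement would mean adding a matchActions entry to the selector, which is worth doing only after the observe-only policy has run quietly on staging for a while.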

Grafana Beyla: emits standard OpenTelemetry spans automatically. Grafana Labs donated it to the OpenTelemetry project, where it continues as OpenTelemetry eBPF Instrumentation (OBI). Your replacement for manual SDK instrumentation.

Kernel version matters more than you think

Run uname -r on every node. You need 5.10 LTS minimum, 6.1+ recommended. On GKE use Container-Optimized OS. Verify before you start.
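
Checking nodes by eye gets old fast, so here is a small sketch that classifies a kernel release string against the two thresholds above. The function name `kernel_tier` and the three labels are made up for this post. It only compares major and minor versions and says nothing about whether BTF is enabled.

```shell
#!/bin/sh
# Classify a kernel release string (as printed by `uname -r`):
#   recommended -> 6.1 or newer
#   minimum     -> 5.10 up to 6.0
#   too-old     -> anything older than 5.10
kernel_tier() {
  release=$1
  major=${release%%.*}        # "6.1.0-18-amd64" -> "6"
  rest=${release#*.}          # "6.1.0-18-amd64" -> "1.0-18-amd64"
  minor=${rest%%[!0-9]*}      # "1.0-18-amd64"   -> "1"

  if [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 1 ]; }; then
    echo "recommended"
  elif [ "$major" -eq 6 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 10 ]; }; then
    echo "minimum"
  else
    echo "too-old"
  fi
}

# Classify the node this script is running on.
kernel_tier "$(uname -r)"
```

To check a whole cluster from one place, kubectl get nodes -o custom-columns=NODE:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion lists each node's kernel release, and each value can be fed through the same function.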

Migration playbook (8 weeks, not 8 months)

Week 1–2: Install Cilium on staging, migrate from Calico/Flannel, verify Hubble UI shows your service graph
Week 3–4: Deploy Tetragon, apply TracingPolicies for sensitive file access and privilege escalation detection
Week 5–6: Deploy Beyla, run parallel with existing instrumentation to verify data consistency
Week 7–8: Build Grafana dashboards, configure OTel Collector pipeline, cut over fully

The sidecar fleet came down in week 6. Our p99 latency dropped 18ms. The platform team stopped asking for bigger node pools.

References

Cilium Documentation — eBPF-based networking and observability reference
Pixie — Open Source Kubernetes Observability — zero-instrumentation APM for Kubernetes
Tetragon Security Observability — kernel-level security enforcement
Grafana Beyla — eBPF-based auto-instrumentation for OpenTelemetry
CNCF eBPF Applications Landscape — production eBPF tooling overview
KubeCon EU 2026: Splunk OBI Beta — zero-code observability announcement