
daniel jeong

Originally published at manoit.co.kr

OpenTelemetry eBPF Instrumentation (OBI) — The Complete Guide: KubeCon EU 2026 Beta Launch, Zero-Code Observability, and the 1.0 GA Roadmap


Published on the ManoIT Tech Blog (Korean original). In April 2026, at KubeCon + CloudNativeCon Europe in Amsterdam, Splunk formally announced the beta launch of OpenTelemetry eBPF Instrumentation (OBI), the OpenTelemetry community's successor to Grafana Beyla. This post walks through the v0.8.0 architecture, Kubernetes Helm deployment, HTTP header enrichment for multi-tenant incident response, the 2026 roadmap toward 1.0 GA, how OBI relates to Beyla/Pixie/Tetragon/Hubble, and a production adoption checklist grounded in what ManoIT ships to customers.


1. Why OBI matters — the zero-code observability inflection point

CNCF's Observability TAG reported in Q1 2026 that 67% of production Kubernetes clusters are already running at least one eBPF-based observability tool. But the existing landscape was fragmented: Pixie was tied to New Relic, Grafana Beyla skewed toward Grafana Cloud, and Cilium Hubble stopped at L3/L4 network flows without application-level tracing. OBI cleans this up using the OpenTelemetry Protocol (OTLP) and an Apache 2.0 license.

| Pain point | Traditional workaround | OBI answer | Operational effect |
|---|---|---|---|
| Go/Rust/C++ binary auto-instrumentation | OTel SDK insertion, rebuild required | eBPF uprobes + kprobes, zero code | Legacy and third-party binaries get visibility immediately |
| TLS-encrypted traffic tracing | Sidecar proxy (Envoy/Istio) injection | Kernel-level SSL_read/SSL_write hooks | HTTPS payloads observable without sidecars |
| Multi-tenant SaaS incident triage | "Error rate up", no idea which tenant | HTTP header enrichment (v0.7.0+) | Filter by x-tenant-id / x-user-segment |
| SQL / Redis / Mongo query analysis | ORM instrumentation + sampling | Native server spans (pgx, mysql, mongo, redis, couchbase) | DB latency linked to app traces |
| OpenAI / Anthropic call tracing | Manual wrappers + custom token counting | GenAI instrumentation with payload extraction | LLM cost and latency collected automatically |

In one line: OBI collects only what the kernel can tell it, and leaves the SDK alone. If you still need custom business events or application-specific attributes, OBI is designed to run alongside the OpenTelemetry SDKs — it fills visibility gaps, it doesn't replace language-level instrumentation.

2. OBI architecture — from Beyla to OBI

OBI's technical lineage is Grafana Beyla. Grafana Labs donated Beyla to OpenTelemetry in 2025; a weekly SIG formed, test pipeline speeds improved 10×, and after a late-2025 alpha release the project reached v0.8.0 on April 16, 2026. It ships as a binary, as a Docker image (otel/ebpf-instrument), and as a Helm chart.

```
┌─────────────────────────── User-Space Agent ───────────────────────────┐
│  (obi binary, written in Go)                                           │
│                                                                        │
│  ┌──────────────┐  ┌───────────────┐  ┌────────────────────────────┐ │
│  │ eBPF Map     │→ │ Span Builder  │→ │ OTLP Exporter (gRPC/HTTP) │ │
│  │ Reader       │  │ (HTTP/gRPC/DB)│  │ → OTel Collector           │ │
│  └──────────────┘  └───────────────┘  └────────────────────────────┘ │
│         ↑                                                              │
│         │ eBPF maps (perf_event_array, ring_buffer)                    │
└─────────┼──────────────────────────────────────────────────────────────┘
          │
┌─────────┼────────────────── Kernel-Space Probes ───────────────────────┐
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │
│   │ uprobes      │  │ kprobes      │  │ tracepoints  │               │
│   │ (SSL_read,   │  │ (tcp_sendmsg,│  │ (sched, fs)  │               │
│   │  SSL_write)  │  │  tcp_recvmsg)│  │              │               │
│   └──────────────┘  └──────────────┘  └──────────────┘               │
│           │                                                            │
│           ▼ Linux 5.8+ kernel (RHEL 4.18+ backport), BTF required      │
└────────────────────────────────────────────────────────────────────────┘
```

Two design decisions drive the low overhead: kernel probes only capture raw events, leaving heavy parsing, filtering, and mapping to the user-space agent — which keeps kernel-side CPU cost minimal. And because the output is OTLP, a single OBI deployment can feed Jaeger, Tempo, Splunk APM, Grafana Cloud, or Honeycomb through the same OpenTelemetry Collector.
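Because OBI emits plain OTLP, the multi-backend fan-out happens in the Collector, not in OBI itself. A minimal Collector pipeline sketch follows; the service addresses and exporter names are placeholders for illustration, not values from the OBI docs:

```yaml
# otel-collector.yaml — hypothetical fan-out pipeline. One OTLP
# receiver accepts OBI's spans; two exporters duplicate them to
# Tempo and Jaeger (endpoints are assumptions for this example).
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlp/tempo:
    endpoint: tempo.observability.svc:4317
    tls:
      insecure: true
  otlp/jaeger:
    endpoint: jaeger-collector.observability.svc:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo, otlp/jaeger]
```

Swapping or adding a backend is then a Collector-side change only; the OBI DaemonSet never needs to be touched.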

| System requirement | Detail | Notes |
|---|---|---|
| Linux kernel | 5.8+ (RHEL/Rocky/Alma 4.18+ with eBPF backport) | BTF (BPF Type Format) required |
| Architecture | amd64, arm64 | Graviton / Ampere supported |
| Privileges | root, or CAP_BPF + CAP_PERFMON + CAP_SYS_PTRACE | Configure via the DaemonSet securityContext |
| Pod settings | hostPID: true recommended | Required to discover host-namespace processes |
| Container image | otel/ebpf-instrument:v0.8.0 | CycloneDX SBOM included |
| License | Apache 2.0 | No commercial restrictions |
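Before rolling anything out, the two kernel-side requirements are worth scripting as a per-node preflight. A sketch, checking only what the table above lists (the 5.8 threshold and the BTF path); distro backports like RHEL 4.18 will still report "too old" here and need a manual check:

```shell
#!/bin/sh
# obi-preflight.sh — sketch of a node eligibility check for OBI.
# Verifies kernel version >= 5.8 and the presence of BTF type info.
kernel="$(uname -r)"
major="${kernel%%.*}"          # e.g. "6" from "6.5.0-1024-aws"
rest="${kernel#*.}"
minor="${rest%%.*}"            # e.g. "5"

if [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 8 ]; }; then
  echo "kernel $kernel: OK"
else
  echo "kernel $kernel: below 5.8 (check for a distro eBPF backport)"
fi

if [ -e /sys/kernel/btf/vmlinux ]; then
  echo "BTF: present"
else
  echo "BTF: missing (OBI requires BTF)"
fi
```

Running this through a one-off DaemonSet (or node-shell) before the Helm install surfaces ineligible nodes early, instead of as CrashLoopBackOff pods later.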

3. Kubernetes Helm deployment — cluster visibility in 15 minutes

The officially recommended deployment topology is Helm + DaemonSet. A DaemonSet is required because OBI must reach every node's process namespace, and hostNetwork / hostPID are wired automatically when deployed this way.

```shell
# Step 1: add the OpenTelemetry Helm repository
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

# Step 2: default install (DaemonSet in the obi namespace)
helm install obi \
  -n obi --create-namespace \
  open-telemetry/opentelemetry-ebpf-instrumentation

# Step 3: verify the install
kubectl -n obi get daemonset
kubectl -n obi logs -l app.kubernetes.io/name=obi --tail=50

# Step 4: confirm probes were loaded on each node
kubectl -n obi exec ds/obi -- ls /sys/fs/bpf/obi/
```

The default install is enough to start collecting RED metrics and traces for every HTTP/gRPC request. In production you'll want to combine OTLP endpoint selection, service discovery, and header enrichment into a custom values file.

```yaml
# helm-obi-prod.yaml — ManoIT production values
config:
  data:
    # OTLP destination (e.g. OTel Collector ClusterIP service)
    otel:
      endpoint: http://otel-collector.observability.svc:4318
      protocol: http/protobuf
    # Automatic process discovery
    discovery:
      services:
        - k8s_namespace: "^(shop|payment|auth)$"
          k8s_pod_labels:
            obi.enabled: "true"
      exclude_services:
        - exe_path_regex: ".*/istio-proxy$"
    # ⚠️ CNIs using eBPF datapaths (Cilium eBPF, Calico eBPF) can collide
    network:
      enabled: true
      cidrs:
        - 10.0.0.0/8
    # Protocol-specific instrumentation
    routes:
      unmatched: heuristic   # generate spans even for unknown HTTP paths
    log_level: info
    log_format: json

# Pod securityContext — required eBPF capabilities
securityContext:
  privileged: false
  capabilities:
    add: ["BPF", "PERFMON", "SYS_PTRACE", "NET_ADMIN"]
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true

# DaemonSet resource limits
resources:
  requests: { cpu: 100m, memory: 256Mi }
  limits:   { cpu: 500m, memory: 512Mi }

# Node selection (exclude ARM-only clusters, GPU nodes, etc.)
nodeSelector:
  kubernetes.io/arch: amd64
tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
```
```shell
helm upgrade --install obi \
  -n obi --create-namespace \
  -f helm-obi-prod.yaml \
  open-telemetry/opentelemetry-ebpf-instrumentation
```

⚠️ Watch out: on clusters running Cilium eBPF mode or the Calico eBPF dataplane, some of OBI's network probes can collide with CNI programs. The official recommendation is to set network.enabled: false and split responsibilities: OBI for L7 application traces, Cilium Hubble or Tetragon for network flows and kernel-level security. This role-split has become the CNCF-recommended pattern in 2026.
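On such clusters, the role-split described above is a small override in the values file (same structure as the production values earlier in this section; the filename is just a suggestion):

```yaml
# helm-obi-cilium.yaml — sketch of the role-split override for
# Cilium/Calico eBPF-dataplane clusters: OBI keeps L7 application
# traces, the CNI keeps L3/L4 flow visibility.
config:
  data:
    network:
      enabled: false   # leave network flows to Hubble / Tetragon
```

Applied with `-f helm-obi-cilium.yaml` on top of the production values, this disables only OBI's network probes while keeping HTTP/gRPC/DB tracing intact.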

4. HTTP header enrichment — multi-tenant SaaS incident triage

Header enrichment, introduced in v0.7.0, is the feature that moves OBI from "monitoring tool" to incident-response platform. Raw traces only tell you "error rate rose to 5% on this endpoint." Header enrichment tells you which tenants and which user segments are affected — without any code change.

```yaml
# OBI values.yaml — header enrichment policy
config:
  data:
    ebpf:
      track_request_headers: true
      payload_extraction:
        http:
          enrichment:
            enabled: true
            policy:
              default_action: exclude         # never collect by default
              obfuscation_string: "***"       # mask sensitive headers
            rules:
              # 1) tenant / segment identifiers → attach as span attributes
              - action: include
                type: headers
                match:
                  patterns:
                    - "x-tenant-id"
                    - "x-user-segment"
                    - "x-org-id"
              # 2) auth tokens → record a hash but mask the value
              - action: obfuscate
                type: headers
                match:
                  patterns:
                    - "authorization"
                    - "cookie"
                    - "x-api-key"
```

The policy design is deliberately explicit allowlist + selective obfuscation. The default is exclude, so PII and tokens can't leak by accident. Pattern matching is case-insensitive and supports wildcards (x-manoit-*).
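To make the policy semantics concrete, here is a tiny shell sketch of how the three rule actions resolve for a given header name. This is an illustration of the policy above, not OBI's actual matcher:

```shell
# classify_header: mimic the enrichment policy — include the
# tenant/segment allowlist, obfuscate auth headers, exclude the rest.
# Names are lowercased first, since matching is case-insensitive.
classify_header() {
  name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    x-tenant-id|x-user-segment|x-org-id) echo include ;;
    x-manoit-*)                          echo include ;;   # wildcard pattern
    authorization|cookie|x-api-key)      echo obfuscate ;;
    *)                                   echo exclude ;;   # default_action
  esac
}

classify_header "X-Tenant-Id"      # → include (case-insensitive match)
classify_header "x-manoit-trace"   # → include (wildcard)
classify_header "Authorization"    # → obfuscate
classify_header "accept-encoding"  # → exclude (default)
```

The key property to preserve in any real policy is the last line: anything not explicitly named falls through to `exclude`, so a newly introduced header can never leak by default.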

| Incident scenario | With raw traces only | With header enrichment |
|---|---|---|
| Latency spike for one tenant | "Payment API latency up" → page the entire support team | Filter by x-tenant-id → only two enterprise customers affected → notify their CSM directly |
| B2B free vs. paid segmentation | Only sees "error rate 1.2%" | Filter x-user-segment=paid → track the SLA cohort separately |
| Regional issue | Needs a separate ALB-log analysis | x-region header + OBI spans on the same Grafana panel |
| Per-API-key rate limiting | 401/429 visible, but which key is unclear | Masked x-api-key hash identifies the top offenders |

5. 2026 roadmap — four axes toward 1.0 GA

The OBI SIG published its official 2026 roadmap with four axes, each with a named sponsor and GitHub milestones — making it possible to back-calculate adoption timelines.

| Goal | Sponsor | Key deliverables | Production impact |
|---|---|---|---|
| ① 1.0 Stable Release | @MrAlias | JSON Schema validation, declarative config standard, telemetry schema, versioning policy, test coverage targets | End of v0 breaking changes; an LTS track becomes possible |
| ② Protocol expansion | @marctc, @NimrodAvni78 | MQTT, AMQP, NATS, Redis Pub/Sub, MongoDB enhancements, GCP/AWS/Azure SDKs, full gRPC context propagation | Coverage of message brokers and cloud SDKs |
| ③ .NET support | @rafaelroquetto | .NET 8+, .NET Framework 4.x, 3.5 SP1 validation, distributed tracing + RED metrics verification | Enterprise Windows workloads covered |
| ④ Hybrid instrumentation | @grcevski | Consistent labels with SDK traces, metric exemplars, multi-language composition | Organizations with existing SDK instrumentation can add OBI on top |

The 1.0 GA checklist itself boils down to six items: complete configuration documentation, JSON Schema validation, per-service and per-process configuration support, an adopted telemetry schema, a formalized versioning policy, and test coverage targets. v0.8.0 today sits at roughly 60% of that checklist; the community is targeting late 2026 for 1.0 GA.

6. OBI vs. Beyla / Pixie / Tetragon / Cilium — drawing the lines

With the 2026 eBPF observability space this crowded, positioning OBI correctly matters. Here's how it relates to the four most common adjacent projects.

| Tool | Primary role | Relationship to OBI | Run side-by-side? |
|---|---|---|---|
| Grafana Beyla | HTTP/gRPC auto-tracing | OBI's direct ancestor; Grafana Cloud users can stay on Beyla, OBI is the upstream successor | New projects → OBI |
| Pixie (New Relic) | Auto APM + scripting (PxL) | Pixie ties to New Relic's backend; OBI is vendor-neutral via OTLP | Existing NR customers stay on Pixie; OTLP-first teams choose OBI |
| Cilium Tetragon | Process / file / network security + real-time enforcement (LSM) | Different role: OBI observes applications, Tetragon detects and enforces security | Run both (two DaemonSets, different jobs) |
| Cilium Hubble | L3/L4 network flows + service map | OBI is L7 (HTTP/gRPC) payloads; Hubble is L3/L4 packets | Run both; layered responsibility |
| Grafana Alloy + Pyroscope | Continuous profiling (CPU, memory) | OBI gives traces + RED metrics; Pyroscope gives function-level profiles | Run together → trace-to-profile drilldown |

In one sentence: OBI is the OTLP-standard implementation of L7 application observability. L3/L4 security belongs to Tetragon, network flows to Hubble, profiling to Pyroscope, and OBI handles application traces, RED metrics, SQL, and GenAI. That role-separated architecture has become the default CNCF observability stack in 2026.

7. Production adoption checklist (8 steps)

| Step | Check | Tool / command |
|---|---|---|
| ① Kernel | All nodes on 5.8+ with BTF enabled | uname -r, ls /sys/kernel/btf/vmlinux |
| ② CNI collision | Audit for Cilium / Calico eBPF mode collisions | cilium status; set OBI network.enabled: false |
| ③ Privileges | DaemonSet has CAP_BPF + CAP_PERFMON + CAP_SYS_PTRACE | Verify against Pod Security Admission |
| ④ OTLP pipeline | OTel Collector receives; backend (Tempo/Jaeger/Splunk) wired | otelcol validate, inspect receiver metrics |
| ⑤ Service discovery | Allowlist of namespaces / labels for instrumentation | Enforce labels via Kyverno / OPA Gatekeeper |
| ⑥ Header enrichment | PII leakage prevented: obfuscate authorization, cookie, api-key | Test obfuscation_string + the allowlist |
| ⑦ Resource tuning | On high-traffic nodes (10k+ RPS) adjust CPU / memory limits | Monitor Prometheus obi_bpf_* metrics |
| ⑧ v0 instability | v0 minor releases may break; pin versions | GitOps: pin v0.8.0, never a floating main |

8. ManoIT production recommendations

  • Three-DaemonSet architecture: OBI (L7 traces) + Cilium Hubble (L3/L4) + Tetragon (security). Realistic per-node budget is about cpu: 800m / mem: 1.5Gi combined.
  • Progressive namespace rollout: attach the obi.enabled=true label only where you need it and expand horizontally over 1–2 weeks of observation. Rolling out to every namespace on day one will bottleneck the OTel Collector.
  • OTLP backend sampling: OBI is designed for 100% capture, which can explode backend storage costs. Always set tail-based sampling at the Collector (latency > 1s, all error traces, 1% of normal traces).
  • GenAI payload policy: the OpenAI / Anthropic instrumentation is powerful, but system prompts and user input can land in traces. Make payload_extraction.genai.redact: true the default, and switch to an allowlist-only model when you need the payloads.
  • Staging verification against v0: v0 minors are allowed to break. Every OBI upgrade should re-validate the full trace field mapping in staging before rolling to prod.
  • Header enrichment governance: every new header addition should go through security review. Add "PII included? / obfuscation needed?" checkboxes to your Git PR template.
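The tail-based sampling recommendation above maps to the Collector's `tail_sampling` processor (shipped in the contrib distribution). A sketch using the thresholds from the bullet; the policy names are arbitrary and the pipeline wiring assumes the OTLP receiver/exporter names used elsewhere in this post:

```yaml
# Collector config fragment — tame OBI's 100% capture before storage.
# tail_sampling is an opentelemetry-collector-contrib processor.
processors:
  tail_sampling:
    decision_wait: 10s              # buffer before the keep/drop decision
    policies:
      - name: keep-slow-requests
        type: latency
        latency:
          threshold_ms: 1000        # keep traces slower than 1s
      - name: keep-all-errors
        type: status_code
        status_code:
          status_codes: [ERROR]     # keep every error trace
      - name: sample-normal-traffic
        type: probabilistic
        probabilistic:
          sampling_percentage: 1    # keep 1% of everything else

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```

Policies are OR-combined: a trace survives if any one of them matches, which is exactly the "all errors, all slow, 1% of the rest" shape the recommendation calls for.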

9. Conclusion — observability moves to the kernel

OBI's beta launch is a signal that observability's center of gravity is moving from SDKs to the kernel and from vendor-specific agents to the OTLP standard. Where OpenTelemetry in the early 2020s solved "vendor-locked instrumentation code," OBI in 2026 is solving "don't write that instrumentation code in the first place." Application teams focus on business logic; platform teams deliver consistent observability at the kernel layer. That separation of responsibilities is closer than ever to what SRE and DevSecOps communities have been chasing for a decade. ManoIT recommends evaluating OBI as the default observability layer for every Kubernetes 1.28+ environment, and we plan to adopt it into our customer standard stack once 1.0 GA lands in late 2026. Grafana Labs donating Beyla, @grcevski leading the SIG, and the Splunk observability team carrying it over the KubeCon finish line — that model of community collaboration is the reason OpenTelemetry is still, in 2026, the healthiest CNCF project in the ecosystem.


This article was co-authored by the ManoIT engineering team together with Anthropic's Claude Opus 4.7, based on: OpenTelemetry eBPF Instrumentation official docs, OBI 2026 Goals, OBI HTTP Header Enrichment, GitHub opentelemetry-ebpf-instrumentation v0.8.0, the Splunk KubeCon EU 2026 announcement, and the OBI First Release Announcement.


