
Nabin Debnath

Zero-Code Observability: Using eBPF to Auto-Instrument Services with OpenTelemetry

Instrumenting services for observability often means sprinkling tracing code across hundreds of files, which is painful to maintain and easy to forget.
Enter eBPF + OpenTelemetry (OTel): a powerful combination that hooks into your running processes and emits traces, metrics, and logs without touching application code.

In this post, you’ll learn how to:

  • Use an eBPF agent to automatically instrument apps
  • Export telemetry data through OpenTelemetry Collector
  • Visualize it with Grafana
  • Control overhead and noise
  • Roll it out safely in production

Why observability shouldn’t require rewriting code

Modern apps are stitched together from dozens of microservices. We push features daily, yet visibility into performance often lags.

You’ve probably heard: “We’ll add tracing later.” …and then it never happens.

Manual instrumentation with OpenTelemetry SDKs gives fine-grained control, but it comes with:

  • Code changes across many repos,
  • Version mismatches between SDKs,
  • Extra CI/CD validation.

Wouldn’t it be nice if the system could observe itself, automatically?

That’s what eBPF (extended Berkeley Packet Filter) delivers. It hooks into the Linux kernel, captures runtime events (like syscalls, network, and process activity), and forwards them all with low overhead. Combine that with OpenTelemetry, and you get a zero-code observability pipeline.


eBPF + OpenTelemetry in plain English

eBPF: Think of eBPF as a programmable microscope for the Linux kernel. It lets you attach tiny programs to events such as network packets or function calls and safely collect data in real-time.

OpenTelemetry: OpenTelemetry (OTel) is a vendor-neutral standard for generating and exporting traces, metrics, and logs. It’s supported by almost every major observability backend (Grafana, Datadog, AWS X-Ray, etc.).

An eBPF agent can auto-discover and instrument running services (HTTP, gRPC, database calls, etc.) and emit OTel-formatted data to your collector.


No SDKs. No code injection. Everything happens at runtime.


Setting up your environment

For this demo, we’ll use a simple Node.js app instrumented by an eBPF agent (Grafana Beyla). You can adapt the same approach for Java, Python, Go, and other runtimes.

Step 1: Create a minimal service

```shell
mkdir ebpf-otel-demo && cd $_
npm init -y
npm install express
```

index.js

```javascript
const express = require("express");
const app = express();

app.get("/orders/:id", async (req, res) => {
  // Simulate variable backend latency (0–200 ms)
  await new Promise(r => setTimeout(r, Math.random() * 200));
  res.json({ orderId: req.params.id, status: "OK" });
});

app.listen(3000, () => console.log("Service running on port 3000"));
```

Dockerfile

```dockerfile
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["node", "index.js"]
```

Build and run

```shell
docker build -t ebpf-otel-demo .
docker run -p 3000:3000 ebpf-otel-demo
```

Your API is now live at http://localhost:3000/orders/123.
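Before attaching the agent, it helps to have steady traffic so spans appear immediately once instrumentation kicks in. Here’s a minimal load generator (plain Node 18+, no dependencies; the `TARGET` env var and request count are just illustrative choices):

```javascript
// load.js — generate traffic against the demo service so the eBPF agent
// has requests to trace. Node 18+ ships a global fetch, so no dependencies.

// Pure helper: summarize latencies so you can eyeball a baseline.
function summarize(latenciesMs) {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  return {
    count: sorted.length,
    avg: sorted.reduce((s, x) => s + x, 0) / sorted.length,
    p95: sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.95))],
  };
}

async function generateLoad(baseUrl, requests = 50) {
  const latencies = [];
  for (let i = 0; i < requests; i++) {
    const started = Date.now();
    await fetch(`${baseUrl}/orders/${Math.floor(Math.random() * 1000)}`);
    latencies.push(Date.now() - started);
  }
  console.log(summarize(latencies));
}

// Only hits the network when a target is given:
//   TARGET=http://localhost:3000 node load.js
if (process.env.TARGET) generateLoad(process.env.TARGET);
```

The latency summary also gives you a pre-instrumentation baseline to compare against later.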

Step 2: Install an eBPF agent
Install Beyla on the host or as a sidecar container. (Requires Linux kernel ≥ 5.8.)

```shell
sudo apt-get install linux-headers-$(uname -r)
curl -sSfL https://github.com/grafana/beyla/releases/latest/download/beyla-linux-amd64.tar.gz | tar xz
sudo mv beyla /usr/local/bin/
```

Step 3: Configure the agent
Create beyla-config.yml:

```yaml
listen:
  interfaces: [eth0]
otlp:
  endpoint: "localhost:4317"
service:
  name: "orders-service"
instrumentation:
  language: "nodejs"
```

Run it:

```shell
sudo beyla run --config beyla-config.yml
```

The agent now attaches to your running container, intercepts HTTP calls, and sends spans to your OTel Collector.


Connect OpenTelemetry Collector

The collector acts as a bridge between producers (Beyla) and your observability backend (Grafana, Tempo, or Jaeger).

Create otel-collector-config.yml:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  debug:  # replaces the "logging" exporter, removed in recent collector releases
  otlp:
    endpoint: "tempo:4317"
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug, otlp]
```

Run the collector (in Docker for simplicity):

```shell
docker run --rm -p 4317:4317 -v $(pwd)/otel-collector-config.yml:/etc/otel/config.yml \
  otel/opentelemetry-collector:latest --config /etc/otel/config.yml
```
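To sanity-check that the collector is listening before wiring up Beyla, you can hand-craft a single OTLP/JSON span and POST it to the collector’s HTTP receiver (port 4318 by default — you’d need to publish it too, e.g. `-p 4318:4318`). The trace and span IDs below are made up; this is a smoke test, not how telemetry is normally produced:

```javascript
// smoke-test.js — build a minimal OTLP/JSON trace payload and (optionally)
// POST it to the collector's OTLP/HTTP receiver.
function buildTestPayload(serviceName) {
  const now = BigInt(Date.now()) * 1_000_000n; // OTLP timestamps are unix nanos
  return {
    resourceSpans: [{
      resource: {
        attributes: [
          { key: "service.name", value: { stringValue: serviceName } },
        ],
      },
      scopeSpans: [{
        scope: { name: "manual-smoke-test" },
        spans: [{
          traceId: "5dfb0e7c16b6f9c15dfb0e7c16b6f9c1", // 16 bytes as hex
          spanId: "8aeb32afaa3e41d9",                   // 8 bytes as hex
          name: "GET /orders/:id",
          kind: 2, // SPAN_KIND_SERVER
          startTimeUnixNano: (now - 50_000_000n).toString(),
          endTimeUnixNano: now.toString(),
        }],
      }],
    }],
  };
}

// Only hits the network when a collector endpoint is given:
//   OTLP_HTTP=http://localhost:4318 node smoke-test.js
if (process.env.OTLP_HTTP) {
  fetch(`${process.env.OTLP_HTTP}/v1/traces`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildTestPayload("orders-service")),
  }).then(r => console.log("collector responded:", r.status));
}
```

A 200 response means the receiver parsed the payload and the pipeline is live.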

Visualize traces in Grafana

If you’re using Grafana Tempo + Loki + Grafana OSS:

```shell
docker run -d --name=grafana -p 3001:3000 grafana/grafana
```

Add Tempo as a data source, pointing it at the Tempo instance your collector exports to. Within seconds, you’ll see spans like:

```json
{
  "traceId": "5dfb0e7c16b6f9c1",
  "spanId": "8aeb32afaa3e41d9",
  "name": "GET /orders/:id",
  "attributes": {
    "http.method": "GET",
    "http.status_code": 200,
    "service.name": "orders-service"
  },
  "duration_ms": 52.8
}
```

Behind the scenes: what eBPF is doing

eBPF attaches probes (kprobes/uprobes) to kernel and user-space events:

  • Socket reads/writes -> network latency
  • HTTP libraries -> method, route, status
  • Syscalls -> file I/O, DNS, etc.

The agent aggregates these into OTel spans, adds attributes (service, method, latency), and exports them asynchronously, typically consuming less than 1–2% CPU.

Here’s a simplified view:

(Diagram: simplified view of eBPF behind the scenes)
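Conceptually, the agent pairs low-level events (the socket read carrying a request with the socket write carrying its response) into one span. A toy version of that pairing, with a made-up event shape purely for illustration:

```javascript
// Toy sketch of agent-side span reconstruction: pair request/response
// events by connection id and emit span-like records.
function eventsToSpans(events) {
  const open = new Map(); // connId -> pending request event
  const spans = [];
  for (const ev of events) {
    if (ev.type === "request") {
      open.set(ev.connId, ev);
    } else if (ev.type === "response" && open.has(ev.connId)) {
      const req = open.get(ev.connId);
      open.delete(ev.connId);
      spans.push({
        name: `${req.method} ${req.path}`,
        "http.status_code": ev.status,
        duration_ms: ev.ts - req.ts, // response time minus request time
      });
    }
  }
  return spans;
}
```

The real agent does this with kernel-captured data and far more bookkeeping (TLS, HTTP/2 streams, connection reuse), but the core idea is the same: correlate raw events into request-scoped spans.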


Controlling overhead and noise

Auto-instrumentation is powerful, but it can produce a lot of data. Here’s how to keep it efficient:

Sampling
In beyla-config.yml:

```yaml
sampling:
  probability: 0.2   # capture 20% of requests
```
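Head sampling like this is just a per-request coin flip. The agent handles it internally; this sketch only makes the math concrete:

```javascript
// Probabilistic head sampling: keep each trace with probability p.
function makeSampler(probability) {
  return () => Math.random() < probability;
}

// At p = 0.2 roughly 1 in 5 traces survives; p = 1.0 keeps everything
// and p = 0.0 drops everything.
const sampleOrder = makeSampler(0.2);
```

Because the decision is made at the start of a request, downstream spans of dropped traces never cost you storage, only the (cheap) coin flip.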

Filtering
Capture only interesting routes:

```yaml
filters:
  include_paths: ["/orders/*"]
```
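Path filters like `/orders/*` are simple globs. Roughly, the agent compiles them to a pattern and checks each request path; here’s a sketch that treats `*` as a single path segment (the real agent’s glob semantics may differ, so check its docs):

```javascript
// Translate a simple glob like "/orders/*" into a RegExp.
// "*" matches within one path segment (no slashes).
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]*") + "$");
}

// A path is captured if any include glob matches it.
function isIncluded(path, includeGlobs) {
  return includeGlobs.some(g => globToRegExp(g).test(path));
}
```

So `/orders/123` would be captured while `/health` or `/metrics` noise is dropped before it ever reaches your backend.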

Resource limits
Run the agent with limited CPU/memory:

```shell
sudo systemd-run --property=CPUQuota=20% beyla run ...
```

Security considerations

  • eBPF programs run with kernel privileges.
  • Always use signed binaries or build from source.
  • Test in staging first. Avoid root unless required.

Production Rollout Checklist

  • Test in staging with representative traffic.
  • Enable sampling (≤ 20%) before full rollout.
  • Run the agent in restricted mode (non-root if possible).
  • Compare baseline latency before/after attach.
  • Use dashboards to monitor agent CPU/memory usage.
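The "compare baseline latency" item boils down to comparing two latency samples. A small sketch of that comparison (using the median for robustness to outliers; the acceptable threshold is a judgment call for your service):

```javascript
// Compare request latencies captured before and after attaching the agent.
function median(values) {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Percentage change in median latency after attach; positive = slower.
function overheadPct(beforeMs, afterMs) {
  const b = median(beforeMs);
  return ((median(afterMs) - b) / b) * 100;
}
```

If the result stays within the 1–2% range discussed earlier, the rollout is on track; a larger jump is a signal to revisit sampling and filters before going wider.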

Why this approach matters

You can onboard dozens of services instantly, a huge win for teams with legacy stacks or microservice sprawl.


What’s next?

  • Combine with Service Mesh: Use eBPF telemetry to enrich service-mesh metrics (Istio, Linkerd).
  • Join Logs + Traces: Since OTel supports logs too, you can correlate application logs with eBPF spans via trace IDs.
  • Build Compliance Dashboards: In regulated industries (finance, healthcare), eBPF traces create immutable audit trails of service interactions without leaking business data.
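The logs-plus-traces idea above comes down to stamping each log line with the active trace ID so the backend can join logs to spans. A minimal structured-logging sketch (in practice the trace ID comes from OTel context propagation rather than being passed by hand):

```javascript
// Emit JSON log lines carrying the current trace id so a backend
// (e.g. Loki + Tempo) can jump from a log line to its trace.
function logWithTrace(traceId, level, message, extra = {}) {
  const line = JSON.stringify({
    ts: new Date().toISOString(),
    level,
    msg: message,
    trace_id: traceId, // join key shared with the eBPF-generated span
    ...extra,
  });
  console.log(line);
  return line;
}
```

With `trace_id` present on both sides, a single click in Grafana takes you from an error log to the exact request that produced it.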

Common problems you may face

  • Kernel version too old: upgrade or use COS/Ubuntu 22+.
  • Container visibility: run agent on host or enable --privileged if sidecar fails to attach.
  • Over-collection: fine-tune filters.
  • Trace backend mismatch: ensure the OTel Collector exporter matches your backend format (Tempo, Jaeger, Zipkin).

Wrapping up

You’ve now built an observability stack that requires zero code changes yet delivers full visibility.

Key takeaways:
✅ eBPF captures runtime events safely and efficiently.
✅ OpenTelemetry unifies data into a portable format.
✅ Together they let developers focus on features.

Start small: pick one service, attach an agent, visualize the traces, and scale gradually.
Once you see that first automatic trace appear in Grafana, you’ll realize: observability doesn’t need to slow you down.


Further reading
  • Grafana Beyla Docs
  • OpenTelemetry Collector
  • eBPF.io Guide
  • CNCF Observability Landscape
