
Aleksey Budaev

Building an eBPF-based SIP Monitor in Go

I recently built a SIP monitoring service that uses eBPF to capture SIP traffic directly in the Linux kernel and export metrics to Prometheus. The userspace part of the pipeline, from raw packet to updated Prometheus metric, takes about 3 μs per packet.

Here's how it works and what I learned along the way.

The Problem

Monitoring SIP/VoIP infrastructure at scale requires tracking call success rates, active dialogs, and response codes — without adding latency to the signaling path.

I wanted something that:

  • Processes packets in kernel space
  • Exports standard Prometheus metrics
  • Runs as a single container
  • Tracks SIP dialogs per RFC 3261
  • Implements RFC 6076 performance metrics (Session Establishment Ratio)

Architecture

SIP Traffic → NIC → eBPF socket filter → ringbuf → Go poller → SIP parser → Prometheus

The eBPF program (written in C) is attached as a socket filter to an AF_PACKET socket. It matches UDP packets on configurable SIP ports (default 5060/5061) and copies them to a ring buffer, which the Go userspace process polls and parses.

The C program does three things:

  1. Parse Ethernet/IP/UDP headers — handles both regular and VLAN-tagged frames
  2. Filter SIP traffic — checks UDP ports (configurable via environment variables)
  3. Copy to ringbuf — pushes matching packets to userspace
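
The three steps above run in C inside the kernel. To make the header walk concrete, here is a minimal sketch of the same logic in Go, which is also handy for unit-testing filter behavior in userspace. It is not the exporter's code: the names `udpPayload`, `isSIPPort`, and `buildFrame` are hypothetical, and it assumes IPv4 and at most one 802.1Q VLAN tag.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	etherTypeIPv4 = 0x0800
	etherTypeVLAN = 0x8100 // 802.1Q tag
	protoUDP      = 17
)

var errNotUDP = errors.New("not an IPv4/UDP frame")

// sipPorts mirrors the default ports from the article (assumed fixed here).
var sipPorts = map[uint16]bool{5060: true, 5061: true}

func isSIPPort(p uint16) bool { return sipPorts[p] }

// udpPayload walks Ethernet -> (optional VLAN) -> IPv4 -> UDP and returns
// the source port, destination port, and UDP payload.
func udpPayload(frame []byte) (src, dst uint16, payload []byte, err error) {
	if len(frame) < 14 {
		return 0, 0, nil, errNotUDP
	}
	off := 12 // skip destination and source MAC
	etherType := binary.BigEndian.Uint16(frame[off:])
	off += 2
	if etherType == etherTypeVLAN { // skip one 4-byte VLAN tag
		if len(frame) < off+4 {
			return 0, 0, nil, errNotUDP
		}
		etherType = binary.BigEndian.Uint16(frame[off+2:])
		off += 4
	}
	if etherType != etherTypeIPv4 || len(frame) < off+20 {
		return 0, 0, nil, errNotUDP
	}
	ihl := int(frame[off]&0x0f) * 4 // IPv4 header length in bytes
	if ihl < 20 || frame[off+9] != protoUDP || len(frame) < off+ihl+8 {
		return 0, 0, nil, errNotUDP
	}
	udp := frame[off+ihl:]
	return binary.BigEndian.Uint16(udp[0:]), binary.BigEndian.Uint16(udp[2:]), udp[8:], nil
}

// buildFrame assembles a minimal test frame (zeroed addresses and checksums).
func buildFrame(vlan bool, src, dst uint16, payload []byte) []byte {
	f := make([]byte, 12, 64) // destination + source MAC
	if vlan {
		f = append(f, 0x81, 0x00, 0x00, 0x0a) // 802.1Q tag, VLAN ID 10
	}
	f = append(f, 0x08, 0x00) // EtherType IPv4
	ip := make([]byte, 20)
	ip[0] = 0x45 // version 4, IHL 5 (20 bytes)
	ip[9] = protoUDP
	f = append(f, ip...)
	udp := make([]byte, 8)
	binary.BigEndian.PutUint16(udp[0:], src)
	binary.BigEndian.PutUint16(udp[2:], dst)
	f = append(f, udp...)
	return append(f, payload...)
}

func main() {
	frame := buildFrame(true, 5060, 5060, []byte("OPTIONS sip:a@b SIP/2.0\r\n"))
	src, dst, payload, err := udpPayload(frame)
	fmt.Println(src, dst, isSIPPort(src) || isSIPPort(dst), len(payload), err)
}
```

In the kernel version every one of these bounds checks also has to satisfy the eBPF verifier, which is why the C code tends to be more verbose than this sketch.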

The program is loaded with cilium/ebpf; the Go library handles BPF map creation, program loading, and ringbuf polling.

Known limitation: The eBPF verifier doesn't allow variable-length bpf_skb_load_bytes, so I copy packets in 64-byte blocks. I'm planning to migrate to AF_PACKET with PACKET_RX_RING (mmap) to handle arbitrary packet sizes.

The Go Part

The Go side is straightforward:

  1. Poll ringbuf for new packets
  2. Parse raw SIP messages (method/status, headers, Call-ID, tags)
  3. Update Prometheus counters
  4. Track SIP dialog lifecycle
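
Step 2 can be sketched with the standard library alone. This is a simplified illustration, not the exporter's parser: the `Message` type and the `parseSIP` and `tagOf` functions are hypothetical names, and header folding and multi-value headers are ignored. The CSeq method is kept because a 200 OK only tells you which request it answers through that header.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Message holds the fields needed for metrics and dialog tracking.
type Message struct {
	Method     string // set for requests, e.g. "INVITE"
	Status     int    // set for responses, e.g. 200
	CallID     string
	FromTag    string
	ToTag      string
	CSeqMethod string // method named in the CSeq header
}

// tagOf extracts the ";tag=" parameter from a From/To header value.
func tagOf(headerValue string) string {
	for _, p := range strings.Split(headerValue, ";")[1:] {
		k, v, ok := strings.Cut(strings.TrimSpace(p), "=")
		if ok && strings.EqualFold(k, "tag") {
			return v
		}
	}
	return ""
}

// parseSIP reads the start line and the headers relevant to dialog tracking.
func parseSIP(raw []byte) (Message, error) {
	var m Message
	head, _, _ := strings.Cut(string(raw), "\r\n\r\n") // drop the body
	lines := strings.Split(head, "\r\n")
	first := strings.Fields(lines[0])
	if len(first) < 2 {
		return m, fmt.Errorf("malformed start line %q", lines[0])
	}
	if strings.HasPrefix(first[0], "SIP/") { // response: SIP/2.0 200 OK
		code, err := strconv.Atoi(first[1])
		if err != nil {
			return m, fmt.Errorf("bad status code %q", first[1])
		}
		m.Status = code
	} else { // request: INVITE sip:bob@example.com SIP/2.0
		m.Method = first[0]
	}
	for _, line := range lines[1:] {
		name, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		value = strings.TrimSpace(value)
		switch strings.ToLower(strings.TrimSpace(name)) {
		case "call-id", "i": // "i", "f", "t" are the RFC 3261 compact forms
			m.CallID = value
		case "from", "f":
			m.FromTag = tagOf(value)
		case "to", "t":
			m.ToTag = tagOf(value)
		case "cseq":
			if f := strings.Fields(value); len(f) == 2 {
				m.CSeqMethod = f[1]
			}
		}
	}
	return m, nil
}

func main() {
	raw := "SIP/2.0 200 OK\r\n" +
		"Call-ID: a84b4c76e66710@pc33.example.com\r\n" +
		"From: Alice <sip:alice@example.com>;tag=1928301774\r\n" +
		"To: Bob <sip:bob@example.com>;tag=a6c85cf\r\n" +
		"CSeq: 314159 INVITE\r\n\r\n"
	m, err := parseSIP([]byte(raw))
	fmt.Printf("%+v %v\n", m, err)
}
```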

Dialog Tracking

SIP dialogs are identified by {Call-ID, From tag, To tag}. The two tags are sorted lexicographically so that messages from either side of the call map to the same dialog ID.

  • Dialog created on 200 OK response to INVITE
  • Dialog terminated on 200 OK response to BYE
  • Expired dialogs cleaned up once per second (expiry taken from the Session-Expires header, default 30 min)
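
The rules above fit in a small tracker. The following is a sketch under assumptions rather than the exporter's implementation: `Tracker`, `dialogID`, and the method names are hypothetical, and the caller is assumed to pass the Session-Expires value as a duration (zero meaning the header was absent).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const defaultExpiry = 30 * time.Minute

// dialogID builds a direction-independent key: the two tags are ordered
// lexicographically so messages from either side map to the same dialog.
func dialogID(callID, fromTag, toTag string) string {
	if fromTag > toTag {
		fromTag, toTag = toTag, fromTag
	}
	return callID + "|" + fromTag + "|" + toTag
}

// Tracker keeps the set of active dialogs with their expiry times.
type Tracker struct {
	mu      sync.Mutex
	dialogs map[string]time.Time // dialog ID -> expiry time
}

func NewTracker() *Tracker {
	return &Tracker{dialogs: make(map[string]time.Time)}
}

// Created is called on a 200 OK to INVITE. A non-positive ttl falls back
// to the 30-minute default.
func (t *Tracker) Created(id string, ttl time.Duration, now time.Time) {
	if ttl <= 0 {
		ttl = defaultExpiry
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	t.dialogs[id] = now.Add(ttl)
}

// Terminated is called on a 200 OK to BYE.
func (t *Tracker) Terminated(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.dialogs, id)
}

// Sweep drops dialogs whose expiry has passed and returns how many remain.
func (t *Tracker) Sweep(now time.Time) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	for id, expiry := range t.dialogs {
		if !now.Before(expiry) {
			delete(t.dialogs, id)
		}
	}
	return len(t.dialogs)
}

func main() {
	t := NewTracker()
	now := time.Now()
	id := dialogID("a84b4c76e66710", "1928301774", "a6c85cf")
	t.Created(id, 0, now) // 200 OK to INVITE, no Session-Expires seen
	fmt.Println("active:", t.Sweep(now))
	t.Terminated(id) // 200 OK to BYE
	fmt.Println("active:", t.Sweep(now))
}
```

In a long-running process, `Sweep` would be driven by a `time.NewTicker(time.Second)` loop, and its return value is a natural source for the active-dialogs gauge.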

Metrics Exported

~30 Prometheus metrics, mostly counters:

  • Per-method: sip_exporter_invite_total, sip_exporter_bye_total, sip_exporter_register_total, etc.
  • Per-status: sip_exporter_200_total, sip_exporter_404_total, sip_exporter_500_total, etc.
  • Session count: sip_exporter_sessions (active dialogs gauge)
  • RFC 6076 SER: sip_exporter_ser — Session Establishment Ratio
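
To illustrate the naming scheme and what a scrape of these counters looks like, here is a standard-library-only sketch. The real exporter presumably uses a Prometheus client library instead; the `Counters` type and the `methodMetric` and `statusMetric` helpers are hypothetical and only mimic the metric names listed above in the Prometheus text exposition format.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// methodMetric maps a SIP method to its counter name, e.g. INVITE ->
// sip_exporter_invite_total.
func methodMetric(method string) string {
	return "sip_exporter_" + strings.ToLower(method) + "_total"
}

// statusMetric maps a SIP status code to its counter name, e.g. 404 ->
// sip_exporter_404_total.
func statusMetric(code int) string {
	return fmt.Sprintf("sip_exporter_%d_total", code)
}

// Counters is a tiny stand-in for a metrics registry.
type Counters struct {
	mu sync.Mutex
	m  map[string]uint64
}

// Inc adds one to the named counter, creating it on first use.
func (c *Counters) Inc(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.m == nil {
		c.m = make(map[string]uint64)
	}
	c.m[name]++
}

// Render writes all counters in the text exposition format, sorted by name.
func (c *Counters) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	names := make([]string, 0, len(c.m))
	for n := range c.m {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "# TYPE %s counter\n%s %d\n", n, n, c.m[n])
	}
	return b.String()
}

func main() {
	var c Counters
	c.Inc(methodMetric("INVITE"))
	c.Inc(statusMetric(200))
	c.Inc(methodMetric("BYE"))
	fmt.Print(c.Render())
}
```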

The SER metric is interesting because it follows RFC 6076 exactly:

SER = (INVITE → 200 OK) / (Total INVITE - INVITE → 3xx) × 100

3xx redirects are excluded from the denominator — they're routing instructions, not failures.
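
As a worked example of the formula, here is a small Go function. The name `ser` and the choice to report "undefined" through a second return value when the denominator is zero are assumptions of this sketch; the article does not say how the exporter handles that case.

```go
package main

import "fmt"

// ser returns the Session Establishment Ratio in percent:
// INVITEs answered with 200 OK divided by total INVITEs minus INVITEs
// answered with a 3xx, times 100. ok is false when the denominator is
// zero and the ratio is undefined.
func ser(totalInvites, answered200, redirected3xx uint64) (ratio float64, ok bool) {
	if totalInvites <= redirected3xx {
		return 0, false
	}
	return float64(answered200) / float64(totalInvites-redirected3xx) * 100, true
}

func main() {
	// 200 INVITEs, 20 of them redirected, 90 answered with 200 OK:
	// 90 / (200 - 20) * 100 = 50.
	ratio, ok := ser(200, 90, 20)
	fmt.Println(ratio, ok)
}
```

Counting the 20 redirected INVITEs as failures would give 90 / 200 = 45% instead, which is the distortion the exclusion avoids.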

Performance

Benchmarks on Intel i7-8665U (userspace only):

Operation                        Latency    Throughput     Memory
Packet parsing (L2→SIP)          ~124 ns    8M pkt/sec     32 B/op
SIP header parsing               ~1.2 μs    800k pkt/sec   350 B/op
Full processing (with metrics)   ~3 μs      300k pkt/sec   1000 B/op

These are userspace numbers. Actual latency depends on kernel eBPF overhead and system load.

E2E Testing

E2E tests use SIPp via testcontainers-go to generate real SIP traffic and verify that metrics match expected values. Tests cover success/failure scenarios and validate proper dialog cleanup.

Quick Start

services:
  sip-exporter:
    image: frzq/sip-exporter:0.5.0
    privileged: true
    network_mode: host
    environment:
      - SIP_EXPORTER_INTERFACE=eth0

Then start the container and check the metrics endpoint:

docker-compose up -d
curl http://localhost:2112/metrics

What's Next

  • More RFC 6076 metrics (Session Setup Time, Response Time)

Happy to answer questions about the eBPF integration, SIP dialog state machine, or Prometheus metric design. Drop a comment below!
