<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aleksey Budaev</title>
    <description>The latest articles on DEV Community by Aleksey Budaev (@aibudaevv).</description>
    <link>https://dev.to/aibudaevv</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1610749%2F1d3c11ea-e759-47ec-8495-b146c62b1d1a.jpg</url>
      <title>DEV Community: Aleksey Budaev</title>
      <link>https://dev.to/aibudaevv</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aibudaevv"/>
    <language>en</language>
    <item>
      <title>SIP Telephony Monitoring with eBPF: Full Observability for VoIP Infrastructure</title>
      <dc:creator>Aleksey Budaev</dc:creator>
      <pubDate>Sat, 20 Jun 2026 14:26:42 +0000</pubDate>
      <link>https://dev.to/aibudaevv/sip-telephony-monitoring-with-ebpf-full-observability-for-voip-infrastructure-2h0</link>
      <guid>https://dev.to/aibudaevv/sip-telephony-monitoring-with-ebpf-full-observability-for-voip-infrastructure-2h0</guid>
      <description>&lt;p&gt;At some point I needed a fast way to get SIP traffic monitoring into Prometheus — without installing agents on servers, configuring SPAN ports on switches, or being locked into specific software. Just connect to a network interface and see everything happening. With minimal latency and zero impact on telephony performance — monitoring shouldn't become the source of problems.&lt;/p&gt;

&lt;p&gt;This article describes how I solved SIP telephony monitoring with eBPF: from packet capture in the Linux kernel to RFC 6076 metrics in Prometheus/VictoriaMetrics, broken down by traffic source and device type.&lt;/p&gt;

&lt;h2&gt;
  
  
  HOW IT WORKS
&lt;/h2&gt;

&lt;p&gt;eBPF (extended Berkeley Packet Filter) allows running small programs directly in the Linux kernel. The eBPF verifier guarantees safety: the program cannot exceed allocated memory, cannot loop indefinitely, cannot modify the kernel.&lt;/p&gt;

&lt;p&gt;My approach is an eBPF socket filter attached to an AF_PACKET socket. It observes network traffic passively:&lt;/p&gt;

&lt;p&gt;SIP Traffic → NIC → eBPF filter → AF_PACKET socket → Go → SIP Parser → Prometheus Metrics&lt;/p&gt;

&lt;p&gt;Key point: the eBPF filter is a socket filter, not a tc/XDP filter. It only decides whether to copy a packet to the application. The packet continues through the network stack to its destination regardless. The filter cannot modify, block, or redirect traffic. Zero impact on call delivery.&lt;/p&gt;

&lt;p&gt;The entire filter is 100 lines of C. The ports are set from Go code through a BPF map and default to 5060/5061. The filter drops 99% of traffic in the kernel, so only SIP packets on the configured ports reach userspace.&lt;/p&gt;

&lt;h2&gt;
  
  
  THE FULL METRICS STACK
&lt;/h2&gt;

&lt;p&gt;The exporter provides not just RFC 6076 metrics, but a complete observability stack for SIP infrastructure.&lt;/p&gt;

&lt;p&gt;Real-time traffic:&lt;/p&gt;

&lt;p&gt;• 14 SIP request counters: INVITE, BYE, REGISTER, OPTIONS, CANCEL, ACK, SUBSCRIBE, NOTIFY, PUBLISH, INFO, PRACK, UPDATE, MESSAGE, REFER&lt;/p&gt;

&lt;p&gt;• 30 response code counters: 100, 180, 181, 182, 183, 200, 202, 300, 302, 400, 401, 403, 404, 405, 407, 408, 480, 481, 486, 487, 488, 500, 501, 502, 503, 504, 600, 603, 604, 606&lt;/p&gt;

&lt;p&gt;• Active sessions gauge — current number of active SIP dialogs. Dialog is created on 200 OK to INVITE, removed on 200 OK to BYE or Session-Expires timeout (default 30 minutes)&lt;/p&gt;

&lt;p&gt;Connection quality — RFC 6076:&lt;/p&gt;

&lt;p&gt;RFC 6076 defines standard SIP performance metrics. All metrics are cumulative, computed from atomic counters, updated on every scrape.&lt;/p&gt;

&lt;p&gt;SER (Session Establishment Ratio) — percentage of successfully established sessions:&lt;/p&gt;

&lt;p&gt;SER = (INVITE → 200 OK) / (Total INVITE - INVITE → 3xx) × 100&lt;/p&gt;

&lt;p&gt;3xx (redirect) are excluded from the denominator — they are neither success nor failure, but a routing instruction. SER = 100 means all non-redirect INVITEs received 200 OK.&lt;/p&gt;

&lt;p&gt;SEER (Session Establishment Effectiveness Ratio) — percentage of "effective" responses:&lt;/p&gt;

&lt;p&gt;SEER = (INVITE → 200, 480, 486, 600, 603) / (Total INVITE - INVITE → 3xx) × 100&lt;/p&gt;

&lt;p&gt;The numerator includes responses with a clear outcome: 200 OK (session established), 480 (temporarily unavailable), 486 (busy), 600 (busy everywhere), 603 (declined). SEER is always ≥ SER.&lt;/p&gt;

&lt;p&gt;ISA (Ineffective Session Attempts) — percentage of infrastructure errors:&lt;/p&gt;

&lt;p&gt;ISA = (INVITE → 408, 500, 503, 504) / Total INVITE × 100&lt;/p&gt;

&lt;p&gt;408 (timeout), 500 (internal error), 503 (unavailable), 504 (gateway timeout) — server errors. ISA rising means infrastructure is degrading. Unlike SER/SEER, 3xx are NOT excluded from the denominator.&lt;/p&gt;

&lt;p&gt;SCR (Session Completion Ratio) — percentage of fully completed sessions:&lt;/p&gt;

&lt;p&gt;SCR = (Completed Sessions) / Total INVITE × 100&lt;/p&gt;

&lt;p&gt;A completed session = INVITE → 200 OK → BYE → 200 OK (or Session-Expires timeout). SCR ≤ SER always: not all established sessions terminate correctly.&lt;/p&gt;

&lt;p&gt;ASR (Answer Seizure Ratio) — classic telephony metric (ITU-T E.411):&lt;/p&gt;

&lt;p&gt;ASR = (INVITE → 200 OK) / Total INVITE × 100&lt;/p&gt;

&lt;p&gt;Unlike SER, 3xx are NOT excluded. ASR ≤ SER when redirect responses are present.&lt;/p&gt;

&lt;p&gt;NER (Network Effectiveness Ratio) — network quality (GSMA IR.42):&lt;/p&gt;

&lt;p&gt;NER = 100 − ISA&lt;/p&gt;

&lt;p&gt;NER = 100 means no infrastructure errors. NER &amp;lt; 95 — time to worry.&lt;/p&gt;
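
&lt;p&gt;To make the formulas above concrete, here is a minimal Go sketch that computes the ratios from plain counters. It is an illustration under stated assumptions, not the exporter's actual code: the &lt;code&gt;InviteStats&lt;/code&gt; type, its field names, and the choice to return 0 for an empty denominator are all hypothetical.&lt;/p&gt;

```go
package main

import "fmt"

// InviteStats holds hypothetical cumulative counters of INVITE outcomes.
// The type and field names are assumptions made for this sketch.
type InviteStats struct {
	Total       float64 // all INVITE requests seen
	OK200       float64 // INVITE answered with 200 OK
	Redirect    float64 // INVITE answered with 3xx
	Effective   float64 // INVITE answered with 480, 486, 600 or 603
	Ineffective float64 // INVITE answered with 408, 500, 503 or 504
}

// ratio returns num/den as a percentage.
// Returning 0 for an empty denominator is an assumption of this sketch.
func ratio(num, den float64) float64 {
	if den == 0 {
		return 0
	}
	return num / den * 100
}

// SER and SEER exclude 3xx redirects from the denominator.
func (s InviteStats) SER() float64  { return ratio(s.OK200, s.Total-s.Redirect) }
func (s InviteStats) SEER() float64 { return ratio(s.OK200+s.Effective, s.Total-s.Redirect) }

// ISA and ASR use all INVITEs as the denominator.
func (s InviteStats) ISA() float64 { return ratio(s.Ineffective, s.Total) }
func (s InviteStats) ASR() float64 { return ratio(s.OK200, s.Total) }

// NER is defined above as 100 minus ISA.
func (s InviteStats) NER() float64 { return 100 - s.ISA() }

func main() {
	s := InviteStats{Total: 1000, OK200: 850, Redirect: 50, Effective: 60, Ineffective: 20}
	fmt.Printf("SER=%.1f SEER=%.1f ISA=%.1f ASR=%.1f NER=%.1f\n",
		s.SER(), s.SEER(), s.ISA(), s.ASR(), s.NER())
}
```

&lt;p&gt;With these sample numbers, 50 redirects leave 950 INVITEs in the SER/SEER denominator, so SER (89.5) comes out higher than ASR (85.0), as described above.&lt;/p&gt;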

&lt;p&gt;Latency at every stage:&lt;/p&gt;

&lt;p&gt;Five histograms cover all SIP transaction phases:&lt;/p&gt;

&lt;p&gt;• RRD — Registration delay: REGISTER → 200 OK&lt;/p&gt;

&lt;p&gt;• TTR — Time to first response: INVITE → first 1xx&lt;/p&gt;

&lt;p&gt;• SPD — Session duration: INVITE 200 OK → BYE 200 OK&lt;/p&gt;

&lt;p&gt;• ORD — OPTIONS response delay: OPTIONS → any response&lt;/p&gt;

&lt;p&gt;• LRD — Registration redirect delay: REGISTER → 3xx&lt;/p&gt;

&lt;p&gt;All histograms support histogram_quantile() for percentile-based alerting: p50, p95, p99.&lt;/p&gt;

&lt;p&gt;Example for VictoriaMetrics / Prometheus:&lt;/p&gt;

&lt;p&gt;95th percentile registration delay:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;histogram_quantile(0.95, sum(rate(sip_exporter_rrd_bucket[5m])) by (le))&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;99th percentile session duration (specific carrier and device type):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;histogram_quantile(0.99, sum(rate(sip_exporter_spd_bucket{carrier="mobile-operator-a",ua_type="yealink"}[5m])) by (le))&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Additional metrics:&lt;/p&gt;

&lt;p&gt;• ISS (Ineffective Session Severity) — absolute count of INVITE→408/500/503/504 responses. Unlike ISA (percentage), ISS enables alerting on absolute error volume: rate(sip_exporter_iss_total[5m]) &amp;gt; 20&lt;/p&gt;

&lt;p&gt;• SDC (Session Duration Counter) — Prometheus Counter of completed sessions. Useful for rate queries: rate(sip_exporter_sdc_total[5m])&lt;/p&gt;

&lt;p&gt;• sip_exporter_packets_total — total parsed SIP packets&lt;/p&gt;
&lt;h2&gt;
  
  
  PER-CARRIER: METRICS BY TRAFFIC SOURCE
&lt;/h2&gt;

&lt;p&gt;Aggregated metrics hide problems of specific traffic sources. If SER = 85%, it's unclear — are all sources at 85%, or is one at 50% while others are at 95%?&lt;/p&gt;

&lt;p&gt;The exporter solves this via CIDR mapping: IP subnets → source name → carrier label on every metric.&lt;/p&gt;

&lt;p&gt;Configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;carriers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mobile-operator-a"&lt;/span&gt;
    &lt;span class="na"&gt;cidrs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10.1.0.0/16"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10.2.0.0/16"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;172.16.0.0/12"&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sip-trunk-provider"&lt;/span&gt;
    &lt;span class="na"&gt;cidrs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;192.168.10.0/24"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;192.168.11.0/24"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;192.168.12.0/24"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How it works:
&lt;/h2&gt;

&lt;p&gt;Carrier is determined at request time (INVITE/REGISTER/OPTIONS) by source IP. If an INVITE comes from 10.1.5.20, the exporter finds that this IP belongs to 10.1.0.0/16 and labels all metrics for this call (including responses and dialog termination) with carrier="mobile-operator-a".&lt;/p&gt;

&lt;p&gt;Responses come from a different IP (the SIP server), but carrier is inherited from the tracker by Call-ID, not determined by response IP. This is correct: metrics belong to the call initiator, not the server.&lt;/p&gt;

&lt;p&gt;Result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight prometheus"&gt;&lt;code&gt;&lt;span class="n"&gt;sip_exporter_invite_total&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;carrier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mobile-operator-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="na"&gt;ua_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"other"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;  &lt;span class="mi"&gt;1523&lt;/span&gt;
&lt;span class="n"&gt;sip_exporter_ser&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;carrier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mobile-operator-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="na"&gt;ua_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"other"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;           &lt;span class="mf"&gt;95.2&lt;/span&gt;
&lt;span class="n"&gt;sip_exporter_ser&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;carrier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sip-trunk-provider"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="na"&gt;ua_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"other"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;          &lt;span class="mf"&gt;87.4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now it's clear: the trunk provider has SER = 87.4%, while the mobile operator has 95.2%. You can build separate dashboards and alerts for each traffic source.&lt;/p&gt;

&lt;p&gt;IPs not matching any CIDR subnet get carrier="other".&lt;/p&gt;

&lt;h2&gt;
  
  
  PER-UA-TYPE: METRICS BY DEVICE TYPE
&lt;/h2&gt;

&lt;p&gt;Carrier shows who is calling, but not with what. And device type is often the key factor in problems.&lt;/p&gt;

&lt;p&gt;If Yealink phones start getting 408 timeouts while Grandstream works fine — without the ua_type label it would look like a general quality drop. With it — the problem is clearly localized to a specific device type.&lt;/p&gt;

&lt;p&gt;Configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;user_agents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;(?i)^Yealink'&lt;/span&gt;
    &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;yealink&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;(?i)^Grandstream'&lt;/span&gt;
    &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;grandstream&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How it works:
&lt;/h2&gt;

&lt;p&gt;The User-Agent header is extracted from each SIP request and matched against the regex patterns. When a phone with "User-Agent: Yealink SIP-T46S 66.15.0.10" sends an INVITE, the exporter matches ^Yealink and labels all call metrics with ua_type="yealink".&lt;/p&gt;

&lt;p&gt;Like carrier, ua_type is determined at request time and inherited by responses through the tracker by Call-ID.&lt;/p&gt;

&lt;p&gt;Result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight prometheus"&gt;&lt;code&gt;&lt;span class="n"&gt;sip_exporter_invite_total&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;carrier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mobile-operator-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="na"&gt;ua_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"yealink"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;      &lt;span class="mi"&gt;1523&lt;/span&gt;
&lt;span class="n"&gt;sip_exporter_ser&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;carrier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mobile-operator-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="na"&gt;ua_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"yealink"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;                &lt;span class="mf"&gt;95.2&lt;/span&gt;
&lt;span class="n"&gt;sip_exporter_ser&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;carrier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mobile-operator-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="na"&gt;ua_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"grandstream"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;            &lt;span class="mf"&gt;87.4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combined queries — both labels work together for two-dimensional analysis:&lt;/p&gt;

&lt;p&gt;SER for Yealink phones on a specific carrier:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sip_exporter_ser{carrier="mobile-operator-a",ua_type="yealink"}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Active sessions by device type:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sum by (ua_type) (sip_exporter_sessions)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;INVITE rate by carrier and device type:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sum by (carrier, ua_type) (rate(sip_exporter_invite_total[5m]))&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  PERFORMANCE
&lt;/h2&gt;

&lt;p&gt;Load testing was done with SIPp via testcontainers-go — real SIP traffic, not mocks.&lt;/p&gt;

&lt;p&gt;Test environment: Debian 12, Linux kernel 6.x, Docker 29.3.1, Intel i7-8665U (4 cores / 8 threads), Go 1.25.9.&lt;/p&gt;

&lt;p&gt;Full call lifecycle — each call is a complete SIP dialog: INVITE → 100 Trying → 180 Ringing → 200 OK → ACK → BYE → 200 OK. On loopback each packet is duplicated (send + receive), so 7 messages → 14 packets per call.&lt;/p&gt;

&lt;p&gt;With GOMAXPROCS=8 (all hardware threads):&lt;/p&gt;

&lt;p&gt;CPS 1,000 → ~11,800 PPS → 8.7% CPU peak, 13 MB RAM, 0% packet loss&lt;/p&gt;

&lt;p&gt;CPS 2,000 → ~23,600 PPS → 12.2% CPU peak, 15 MB RAM, 0% packet loss&lt;/p&gt;

&lt;p&gt;With GOMAXPROCS=1 (single core):&lt;/p&gt;

&lt;p&gt;CPS 2,000 → ~23,600 PPS → 9.2% CPU peak, 12 MB RAM, 0% packet loss&lt;/p&gt;

&lt;p&gt;Summary: 2,000 CPS with 0% packet loss, about 12% peak CPU, and ~15 MB RAM.&lt;/p&gt;

&lt;p&gt;Scrape performance under 2,000 CPS load (~23,600 PPS):&lt;/p&gt;

&lt;p&gt;Min: 1.7 ms | Avg: 4.2 ms | P95: 6.4 ms | Max: 8.4 ms&lt;/p&gt;

&lt;p&gt;Scraping doesn't interfere with packet processing. You can scrape every 5-10 seconds even at maximum load.&lt;/p&gt;

&lt;p&gt;Why it's fast:&lt;br&gt;
• eBPF drops 99% of traffic in kernel — only SIP packets on ports 5060/5061 reach userspace&lt;/p&gt;

&lt;p&gt;• 4 MB socket buffer — fits ~420ms of traffic at 28,000 PPS&lt;/p&gt;

&lt;p&gt;• Go GC pauses &amp;lt;1ms — 400x smaller than buffer capacity, packets never lost due to GC&lt;/p&gt;

&lt;p&gt;• SIP parsing ~1μs — microbenchmarks: INVITE 1.1μs, BYE 860ns, 200 OK 2.0μs&lt;/p&gt;

&lt;p&gt;System requirements:&lt;br&gt;
Up to 500 CPS → 1 core, 128 MB RAM&lt;/p&gt;

&lt;p&gt;Up to 1,000 CPS → 1 core, 128 MB RAM&lt;/p&gt;

&lt;p&gt;Up to 2,000 CPS → 2 cores, 256 MB RAM&lt;/p&gt;

&lt;p&gt;Above 2,000 CPS → 4 cores, 512 MB RAM&lt;/p&gt;

&lt;h2&gt;
  
  
  SECURITY: WHY --privileged IS SAFE
&lt;/h2&gt;

&lt;p&gt;The container requires --privileged and network_mode: host. Here's why this is safe.&lt;/p&gt;

&lt;p&gt;What capabilities are needed:&lt;/p&gt;

&lt;p&gt;• CAP_BPF — loading eBPF program into kernel via bpf() syscall&lt;/p&gt;

&lt;p&gt;• CAP_NET_RAW — creating AF_PACKET raw socket for reading packets&lt;/p&gt;

&lt;p&gt;• CAP_NET_ADMIN — binding eBPF filter to socket, configuring buffer&lt;/p&gt;

&lt;p&gt;These are three specific capabilities for specific operations. Other eBPF-based tools such as Cilium, Falco, and Pixie need similarly elevated privileges, because loading eBPF programs is restricted by the Linux kernel, not by the container runtime.&lt;/p&gt;

&lt;p&gt;What the container does:&lt;br&gt;
• Loads eBPF socket filter into kernel (once, at startup)&lt;/p&gt;

&lt;p&gt;• Creates AF_PACKET raw socket bound to network interface&lt;/p&gt;

&lt;p&gt;• Reads packets from socket into Go channel (10,000 buffer)&lt;/p&gt;

&lt;p&gt;• Parses SIP headers&lt;/p&gt;

&lt;p&gt;• Exports metrics via /metrics endpoint&lt;/p&gt;

&lt;p&gt;What the container does NOT do:&lt;br&gt;
• Does not modify packets — eBPF filter is passive (read-only)&lt;/p&gt;

&lt;p&gt;• Does not send SIP traffic — purely a listener&lt;/p&gt;

&lt;p&gt;• Does not write to host filesystem — all volumes are :ro&lt;/p&gt;

&lt;p&gt;• Does not access other containers, processes, or system resources&lt;/p&gt;

&lt;p&gt;• Does not open ports except /metrics (default 2112)&lt;/p&gt;

&lt;p&gt;• Does not establish outbound connections&lt;/p&gt;

&lt;p&gt;The entire eBPF filter is 100 lines of C — fully auditable. Automated vulnerability scanning (govulncheck + Trivy) runs on every push. Current status: 0 vulnerabilities in code and image.&lt;/p&gt;
&lt;h2&gt;
  
  
  QUICK START
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run --privileged --network host \
  -e SIP_EXPORTER_INTERFACE=eth0 \
  frzq/sip-exporter:latest

curl http://localhost:2112/metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Or with docker-compose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;sip-exporter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;frzq/sip-exporter:latest&lt;/span&gt;
    &lt;span class="na"&gt;privileged&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;network_mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;host&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;SIP_EXPORTER_INTERFACE=eth0&lt;/span&gt;
      &lt;span class="c1"&gt;# Optional: per-carrier metrics&lt;/span&gt;
      &lt;span class="c1"&gt;# - SIP_EXPORTER_CARRIERS_CONFIG=/etc/sip-exporter/carriers.yaml&lt;/span&gt;
      &lt;span class="c1"&gt;# Optional: per-device-type metrics&lt;/span&gt;
      &lt;span class="c1"&gt;# - SIP_EXPORTER_USER_AGENTS_CONFIG=/etc/sip-exporter/user_agents.yaml&lt;/span&gt;
    &lt;span class="c1"&gt;# volumes:&lt;/span&gt;
    &lt;span class="c1"&gt;#   - ./carriers.yaml:/etc/sip-exporter/carriers.yaml:ro&lt;/span&gt;
    &lt;span class="c1"&gt;#   - ./user_agents.yaml:/etc/sip-exporter/user_agents.yaml:ro&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compatible with Prometheus, VictoriaMetrics, and Grafana Cloud — any scraper supporting Prometheus exposition format.&lt;/p&gt;

&lt;p&gt;Project: github.com/aibudaevv/sip-exporter (AGPL-3.0)&lt;/p&gt;

</description>
      <category>sip</category>
      <category>telecom</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Building an eBPF-based SIP Monitor in Go</title>
      <dc:creator>Aleksey Budaev</dc:creator>
      <pubDate>Sun, 05 Apr 2026 11:25:32 +0000</pubDate>
      <link>https://dev.to/aibudaevv/building-an-ebpf-based-sip-monitor-in-go-3igk</link>
      <guid>https://dev.to/aibudaevv/building-an-ebpf-based-sip-monitor-in-go-3igk</guid>
      <description>&lt;p&gt;I recently built a SIP monitoring service that uses eBPF to capture SIP traffic directly in the Linux kernel and export metrics to Prometheus. The entire pipeline from packet to Prometheus metric takes ~3μs in userspace.&lt;/p&gt;

&lt;p&gt;Here's how it works and what I learned along the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Monitoring SIP/VoIP infrastructure at scale requires tracking call success rates, active dialogs, and response codes — without adding latency to the signaling path.&lt;/p&gt;

&lt;p&gt;I wanted something that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processes packets in kernel space&lt;/li&gt;
&lt;li&gt;Exports standard Prometheus metrics&lt;/li&gt;
&lt;li&gt;Runs as a single container&lt;/li&gt;
&lt;li&gt;Tracks SIP dialogs per RFC 3261&lt;/li&gt;
&lt;li&gt;Implements RFC 6076 performance metrics (Session Establishment Ratio)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SIP Traffic → NIC → eBPF socket filter → ringbuf → Go poller → SIP parser → Prometheus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The eBPF program (written in C) attaches as a socket filter via &lt;code&gt;AF_PACKET&lt;/code&gt;. It intercepts UDP packets on configurable SIP ports (default 5060/5061), copies them to a ring buffer, and the Go userspace process polls and parses them.&lt;/p&gt;

&lt;p&gt;The C program does three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parse Ethernet/IP/UDP headers&lt;/strong&gt; — handles both regular and VLAN-tagged frames&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter SIP traffic&lt;/strong&gt; — checks UDP ports (configurable via environment variables)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copy to ringbuf&lt;/strong&gt; — pushes matching packets to userspace&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Loaded via &lt;code&gt;cilium/ebpf&lt;/code&gt; — the Go library handles BPF map creation, program loading, and ringbuf polling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Known limitation:&lt;/strong&gt; The eBPF verifier doesn't allow variable-length &lt;code&gt;bpf_skb_load_bytes&lt;/code&gt;, so I copy packets in 64-byte blocks. Planning to migrate to &lt;code&gt;AF_PACKET&lt;/code&gt; with &lt;code&gt;PACKET_RX_RING&lt;/code&gt; (mmap) for arbitrary sizes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Go Part
&lt;/h2&gt;

&lt;p&gt;The Go side is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Poll ringbuf for new packets&lt;/li&gt;
&lt;li&gt;Parse raw SIP messages (method/status, headers, Call-ID, tags)&lt;/li&gt;
&lt;li&gt;Update Prometheus counters&lt;/li&gt;
&lt;li&gt;Track SIP dialog lifecycle&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Dialog Tracking
&lt;/h3&gt;

&lt;p&gt;SIP dialogs are identified by &lt;code&gt;{Call-ID, From tag, To tag}&lt;/code&gt;. Tags are sorted lexicographically for consistent IDs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dialog &lt;strong&gt;created&lt;/strong&gt; on &lt;code&gt;200 OK&lt;/code&gt; response to &lt;code&gt;INVITE&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Dialog &lt;strong&gt;terminated&lt;/strong&gt; on &lt;code&gt;200 OK&lt;/code&gt; response to &lt;code&gt;BYE&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Expired dialogs cleaned up every 1 second (based on &lt;code&gt;Session-Expires&lt;/code&gt; header, default 30 min)&lt;/li&gt;
&lt;/ul&gt;
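
&lt;p&gt;A minimal sketch of such a dialog key in Go is shown below. The &lt;code&gt;dialogID&lt;/code&gt; function name and the colon-separated format are assumptions made for illustration. The point is only that sorting the two tags yields the same ID no matter which side sent the message.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// dialogID builds a direction-independent key from Call-ID and both tags.
// The ":" separator and the field order are assumptions of this sketch.
func dialogID(callID, fromTag, toTag string) string {
	tags := []string{fromTag, toTag}
	sort.Strings(tags) // lexicographic order, so swapping From/To changes nothing
	return callID + ":" + tags[0] + ":" + tags[1]
}

func main() {
	// The caller's INVITE and the callee's BYE carry the tags in opposite roles.
	fmt.Println(dialogID("a84b4c76e66710", "tag-caller", "tag-callee"))
	fmt.Println(dialogID("a84b4c76e66710", "tag-callee", "tag-caller"))
}
```

&lt;p&gt;Both calls print the same key, so a BYE sent by either party finds the dialog created by the original INVITE.&lt;/p&gt;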

&lt;h3&gt;
  
  
  Metrics Exported
&lt;/h3&gt;

&lt;p&gt;~30 Prometheus metrics, most of them counters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Per-method:&lt;/strong&gt; &lt;code&gt;sip_exporter_invite_total&lt;/code&gt;, &lt;code&gt;sip_exporter_bye_total&lt;/code&gt;, &lt;code&gt;sip_exporter_register_total&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-status:&lt;/strong&gt; &lt;code&gt;sip_exporter_200_total&lt;/code&gt;, &lt;code&gt;sip_exporter_404_total&lt;/code&gt;, &lt;code&gt;sip_exporter_500_total&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session count:&lt;/strong&gt; &lt;code&gt;sip_exporter_sessions&lt;/code&gt; (active dialogs gauge)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RFC 6076 SER:&lt;/strong&gt; &lt;code&gt;sip_exporter_ser&lt;/code&gt; — Session Establishment Ratio&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The SER metric is interesting because it follows RFC 6076 exactly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SER = (INVITE → 200 OK) / (Total INVITE - INVITE → 3xx) × 100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;3xx redirects are excluded from the denominator — they're routing instructions, not failures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Benchmarks on Intel i7-8665U (userspace only):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Packet parsing (L2→SIP)&lt;/td&gt;
&lt;td&gt;~124 ns&lt;/td&gt;
&lt;td&gt;8M pkt/sec&lt;/td&gt;
&lt;td&gt;32 B/op&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SIP header parsing&lt;/td&gt;
&lt;td&gt;~1.2 μs&lt;/td&gt;
&lt;td&gt;800k pkt/sec&lt;/td&gt;
&lt;td&gt;350 B/op&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full processing (with metrics)&lt;/td&gt;
&lt;td&gt;~3 μs&lt;/td&gt;
&lt;td&gt;300k pkt/sec&lt;/td&gt;
&lt;td&gt;1000 B/op&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These are userspace numbers. Actual latency depends on kernel eBPF overhead and system load.&lt;/p&gt;

&lt;h2&gt;
  
  
  E2E Testing
&lt;/h2&gt;

&lt;p&gt;E2E tests use SIPp via &lt;code&gt;testcontainers-go&lt;/code&gt; to generate real SIP traffic and verify that metrics match expected values. Tests cover success/failure scenarios and validate proper dialog cleanup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;sip-exporter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;frzq/sip-exporter:0.5.0&lt;/span&gt;
    &lt;span class="na"&gt;privileged&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;network_mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;host&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;SIP_EXPORTER_INTERFACE=eth0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker-compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
curl http://localhost:2112/metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;More RFC 6076 metrics (Session Setup Time, Response Time)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/aibudaevv/sip-exporter" rel="noopener noreferrer"&gt;https://github.com/aibudaevv/sip-exporter&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker:&lt;/strong&gt; &lt;code&gt;docker pull frzq/sip-exporter:0.5.0&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy to answer questions about the eBPF integration, SIP dialog state machine, or Prometheus metric design. Drop a comment below!&lt;/p&gt;

</description>
      <category>go</category>
      <category>monitoring</category>
      <category>voip</category>
      <category>prometheus</category>
    </item>
  </channel>
</rss>
