<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ivan M</title>
    <description>The latest articles on DEV Community by Ivan M (@ivan-m-tech).</description>
    <link>https://dev.to/ivan-m-tech</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3862729%2F9b0d6e94-bd42-4920-9ede-851ec35debba.jpeg</url>
      <title>DEV Community: Ivan M</title>
      <link>https://dev.to/ivan-m-tech</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ivan-m-tech"/>
    <language>en</language>
    <item>
      <title>Accelerating TURN with eBPF: A Non-Invasive Approach</title>
      <dc:creator>Ivan M</dc:creator>
      <pubDate>Sun, 05 Apr 2026 21:00:11 +0000</pubDate>
      <link>https://dev.to/ivan-m-tech/accelerating-turn-with-ebpf-a-non-invasive-approach-ed1</link>
      <guid>https://dev.to/ivan-m-tech/accelerating-turn-with-ebpf-a-non-invasive-approach-ed1</guid>
      <description>&lt;p&gt;There is much to be said for the merits of eBPF when it comes to the common problems of network filtering, and one cannot help but observe the swift evolution of this splendid technology. However, one must admit that certain undertakings remain decidedly more tiresome than others. While dropping packets comes across as a rather straightforward affair, the pursuit of accelerating a stateful, high-performance userspace relay server presents a most formidable set of challenges.&lt;/p&gt;

&lt;p&gt;Notably, TURN (&lt;a href="https://datatracker.ietf.org/doc/html/rfc8656#name-detailed-example" rel="noopener noreferrer"&gt;RFC 8656&lt;/a&gt;: Traversal Using Relays around NAT) is a textbook specimen of burning CPU cycles on an eye-watering number of switches between kernel and user mode, since every relayed packet travels up to the userspace server and back down again. Yet offloading the channel traffic to the eBPF layer demands a substantial quantity of logic to conduct packet processing with surgical precision.&lt;/p&gt;

&lt;p&gt;The proposition to accelerate TURN using an eBPF bypass scheme has been mooted for a considerable time. There have been feature requests (like &lt;a href="https://github.com/coturn/coturn/issues/759#issue-875557133" rel="noopener noreferrer"&gt;this&lt;/a&gt; one in the &lt;code&gt;coturn&lt;/code&gt; project) and solutions, like the exemplary work of Tamás Lévai et al., entitled "&lt;a href="https://dl.acm.org/doi/10.1145/3609021.3609296" rel="noopener noreferrer"&gt;Supercharge WebRTC: Accelerate TURN Services with eBPF/XDP&lt;/a&gt;". Although such solutions are thoughtfully designed to handle the protocol with utmost care, they require substantial assistance from the server to do their job. While this approach is certainly commendable, I take a rather different view on what the more beneficial arrangement might be.&lt;/p&gt;

&lt;p&gt;When it comes to unencrypted TURN traffic, one may construct a stateful eBPF component that is fully protocol-aware but learns about new TURN channel bindings by "snooping", so that the userspace server remains unmodified and blissfully unaware of the offload. In this post, I should like to present a humble prototype of this very approach, an open-source project named &lt;code&gt;TURN-BPF&lt;/code&gt;.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/ivanmtech" rel="noopener noreferrer"&gt;
        ivanmtech
      &lt;/a&gt; / &lt;a href="https://github.com/ivanmtech/turn-bpf" rel="noopener noreferrer"&gt;
        turn-bpf
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      RFC 8656 channel accelerator
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;TURN-BPF: research into eBPF offloads of RFC 8656 channels&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;
&lt;pre class="notranslate"&gt;&lt;code&gt;TURN-BPF is a personal development effort aiming to reckon the feasibility
of using XDP programs to bypass the userspace for the TURN channel traffic
without the need to tamper with the code of the TURN implementation itself.

These programs conduct NAT (client &amp;lt;&amp;gt; TURN | relay &amp;lt;&amp;gt; peer), strip/add the
TURN channel tag, update the checksums, rewrite the MAC addresses and send
the resulting packets onto the wire via interfaces chosen based on the FIB.

The tool requires no configuration from the user, except for the interface
name(s) and is supposed to snoop on relay allocations and channel bindings
by capturing the said control packet handshakes at the XDP/TC hooks on the
main network interface. For the sake of keeping the channels active in the
userspace TURN server, the tool employs a 'heartbeat' approach, spilling a
small fraction of data packets&lt;/code&gt;&lt;/pre&gt;…&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/ivanmtech/turn-bpf" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        [ CLIENTS ]                                 [ PEERS ]
          |     |                                       |
   (STUN) |     | (Tagged Data)                  (Untagged Data)
          |     |                                       |
 +--------+-----+--------+                     +--------+--------+
 | Interface for Clients |                     | Relay Interface |
 +--------+-----+--------+                     +--------+--------+
          |     |                                       |
          |     +-----+                                 |
          |           |                                 |
 =========|===========|=================================|=========
 KERNEL   |           |                                 |
          |           |                                 |
 +--------+--------+  |   +-------------------------+   |
 | XDP/TC Snoopers |  +--&amp;lt;|&amp;gt; XDP cli2rem / rem2cli &amp;lt;|&amp;gt;--+
 +--------+--------+      +---+---------------------+   |
   (Maps) |                   |   (Fast Path)           |
          |                   |                         ^
          |           (Heartbeat Spill)                 |
          |                   |                         |
 =========|===================|=========================|=========
 USERLAND |                   |                         |
          |                   +----&amp;gt;+-------------+--&amp;gt;--+
          |                         | TURN Server |
          +------------------------&amp;lt;+&amp;gt;------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
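
&lt;p&gt;To make the "tagged" and "untagged" labels in the diagram concrete, here is a minimal Python sketch of the RFC 8656 ChannelData framing that the fast path has to strip or add: a 4-byte header carrying the channel number and the payload length. The function names are hypothetical and chosen purely for illustration. The actual offload does this work in C inside XDP and also fixes up lengths and checksums, so this is a model of the wire format rather than the project's code.&lt;/p&gt;

```python
import struct

CHANNEL_MIN = 0x4000  # first channel number allowed by RFC 8656
CHANNEL_MAX = 0x4FFF  # last channel number allowed by RFC 8656


def add_channel_tag(channel, payload):
    """Prefix a peer payload with the 4-byte ChannelData header."""
    if channel not in range(CHANNEL_MIN, CHANNEL_MAX + 1):
        raise ValueError("channel number outside the RFC 8656 range")
    # "!HH" packs two unsigned 16-bit fields in network byte order.
    return struct.pack("!HH", channel, len(payload)) + payload


def strip_channel_tag(datagram):
    """Return (channel, payload), or None if this is not ChannelData."""
    header = datagram[:4]
    if len(header) != 4:
        return None
    channel, length = struct.unpack("!HH", header)
    if channel not in range(CHANNEL_MIN, CHANNEL_MAX + 1):
        return None  # most likely a STUN message: leave it to the server
    payload = datagram[4:4 + length]
    if len(payload) != length:
        return None  # truncated datagram
    return channel, payload
```

&lt;p&gt;Anything that does not parse as ChannelData is returned as &lt;code&gt;None&lt;/code&gt;, which mirrors the idea that only recognised channel traffic should bypass the server while everything else continues up the stack.&lt;/p&gt;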



&lt;h2&gt;
  
  
  Architecture: The Snooping Approach
&lt;/h2&gt;

&lt;p&gt;The arrangement comprises a set of XDP and TC (Traffic Control) programs written in C, some of which snoop on control packets to commit the channels for an offload, while others perform the offload itself. Depending on the configuration (a single network interface acting as both the client endpoint and the relay, versus a separate client-facing interface and one or more relay interfaces), the eBPF component provides either a separate XDP &lt;code&gt;rem2cli&lt;/code&gt; program or a combined XDP section with &lt;code&gt;cli2rem&lt;/code&gt;, &lt;code&gt;rem2cli&lt;/code&gt;, and the STUN snooper baked in.&lt;/p&gt;

&lt;p&gt;One might inquire why the tool employs a TC hook for snooping on the server's responses rather than keeping everything within the XDP layer. The reasoning is as follows: while XDP is unparalleled for raw speed on ingress, the TC egress hook on the client-facing interface allows one to observe the packets after they have been processed by the userspace stack and the kernel's networking subsystem. At this stage, the ChannelBind success response is fully formed and ready to depart. By intercepting it here, one can ensure that a channel is only committed to the fast path once the server has officially sanctioned the allocation.&lt;/p&gt;
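
&lt;p&gt;The commit-on-success idea can be sketched in Python as follows. The sketch assumes that the request seen on ingress and the success response seen on egress are correlated by the STUN transaction ID, which seems a natural choice because the success response is not required to repeat the channel number, though the project may do it differently. The class and function names are hypothetical. The message type and attribute codes come from the STUN and TURN RFCs, and a real implementation would also key its state by the client's address and port.&lt;/p&gt;

```python
import ipaddress
import struct

MAGIC_COOKIE = 0x2112A442
CHANNEL_BIND_REQUEST = 0x0009
CHANNEL_BIND_SUCCESS = 0x0109
ATTR_CHANNEL_NUMBER = 0x000C
ATTR_XOR_PEER_ADDRESS = 0x0012


def parse_stun(msg):
    """Return (msg_type, transaction_id, attrs), or None if not STUN."""
    header = msg[:20]
    if len(header) != 20:
        return None
    msg_type, length, cookie, txid = struct.unpack("!HHI12s", header)
    body = msg[20:20 + length]
    if cookie != MAGIC_COOKIE or len(body) != length:
        return None
    attrs, off = {}, 0
    # Each attribute: type (2 bytes), length (2 bytes), value, padding to 4.
    while len(body[off:off + 4]) == 4:
        atype, alen = struct.unpack("!HH", body[off:off + 4])
        value = body[off + 4:off + 4 + alen]
        if len(value) != alen:
            return None
        attrs.setdefault(atype, value)
        off += 4 + (alen + 3) // 4 * 4
    return msg_type, txid, attrs


def decode_xor_peer_ipv4(value):
    """Undo the XOR obfuscation of an IPv4 XOR-PEER-ADDRESS value."""
    if len(value) != 8:
        return None
    _, family, xport, xaddr = struct.unpack("!BBHI", value)
    if family != 0x01:
        return None  # IPv4 only in this sketch
    port = xport ^ (MAGIC_COOKIE // 0x10000)  # top 16 bits of the cookie
    return str(ipaddress.IPv4Address(xaddr ^ MAGIC_COOKIE)), port


class Snooper:
    """Commit a channel only after the server has confirmed the binding."""

    def __init__(self):
        self.pending = {}    # maps transaction ID to (channel, peer)
        self.committed = {}  # maps channel to peer

    def on_client_ingress(self, msg):
        parsed = parse_stun(msg)
        if parsed is None or parsed[0] != CHANNEL_BIND_REQUEST:
            return
        _, txid, attrs = parsed
        number = attrs.get(ATTR_CHANNEL_NUMBER, b"")
        peer = decode_xor_peer_ipv4(attrs.get(ATTR_XOR_PEER_ADDRESS, b""))
        if len(number) != 4 or peer is None:
            return
        self.pending[txid] = (struct.unpack("!HH", number)[0], peer)

    def on_server_egress(self, msg):
        parsed = parse_stun(msg)
        if parsed is None or parsed[0] != CHANNEL_BIND_SUCCESS:
            return
        entry = self.pending.pop(parsed[1], None)
        if entry is not None:
            self.committed[entry[0]] = entry[1]
```

&lt;p&gt;A request that the server rejects, for instance with an authentication error, never produces a matching success response, so its entry is never promoted to the committed table.&lt;/p&gt;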

&lt;h2&gt;
  
  
  Under the Hood: FIB Lookups &amp;amp; Heartbeats
&lt;/h2&gt;

&lt;p&gt;Indubitably, there is more to the implementation than meets the eye. In particular, the knowledge of which IP addresses map to which MAC addresses and, more crucially, to which network interfaces, comes from a FIB lookup. The lookup is performed in the control path (in the snoopers) at the moment the channel binding is committed, and relies upon a remarkably helpful and powerful building block from the kernel, the &lt;code&gt;bpf_fib_lookup&lt;/code&gt; helper.&lt;/p&gt;
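
&lt;p&gt;For readers unfamiliar with what such a lookup yields, the following toy model in Python may help. It performs a longest-prefix match over a hand-written route list and then resolves the next hop in a neighbour table, producing the egress interface index plus source and destination MAC addresses. All of the names and tables here are hypothetical stand-ins. The kernel helper consults the real routing and neighbour tables, so this only illustrates the shape of the answer.&lt;/p&gt;

```python
import ipaddress
from collections import namedtuple

Route = namedtuple("Route", "prefix ifindex gateway")
FibResult = namedtuple("FibResult", "ifindex smac dmac")


def fib_lookup(dst, routes, neighbours, if_macs):
    """Longest-prefix match on dst, then resolve the next hop's MAC."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None  # no route: the packet has to go up the stack
    best = max(matches, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)
    # An on-link destination is its own next hop.
    next_hop = best.gateway or dst
    dmac = neighbours.get(next_hop)
    if dmac is None:
        return None  # neighbour not resolved yet
    return FibResult(best.ifindex, if_macs[best.ifindex], dmac)
```

&lt;p&gt;Doing this once per binding and caching the result alongside the channel state would keep the per-packet work down to a map lookup and a header rewrite.&lt;/p&gt;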

&lt;p&gt;Furthermore, to keep the session alive within the userspace daemon, the tool employs a "heartbeat spill" mechanism. By deliberately passing a minute fraction of the data packets up the network stack, it allows the TURN server to perceive the channel as active, thereby preventing the premature expiration of the allocation while the bulk of the throughput enjoys the celerity of the eBPF fast path.&lt;/p&gt;

&lt;p&gt;Regrettably, there are certain limitations, too. The project, which is merely a proof-of-concept, does not handle encrypted channels and supports IPv4 traffic only.&lt;/p&gt;
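
&lt;p&gt;The heartbeat spill can be pictured with a tiny Python sketch. It assumes a simple per-channel packet counter that lets one packet in every N take the slow path. Both the counter-based policy and the ratio of 1 in 64 are assumptions made for illustration, and the project may well use a different rule, such as a time-based one.&lt;/p&gt;

```python
class HeartbeatSpill:
    """Decide, per channel, which data packets still go to userspace."""

    def __init__(self, spill_every=64):
        self.spill_every = spill_every
        self.counters = {}

    def should_spill(self, channel):
        count = self.counters.get(channel, 0)
        self.counters[channel] = count + 1
        # Packets 0, N, 2N, ... of each channel take the slow path.
        return count % self.spill_every == 0
```

&lt;p&gt;With &lt;code&gt;spill_every=64&lt;/code&gt;, roughly 1.6% of a channel's packets would still reach the server, and the rest would stay on the fast path.&lt;/p&gt;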

&lt;p&gt;The tool comes with a loader program written in Rust, which, upon successful activation of the kernel component, blocks until the &lt;code&gt;Ctrl+C&lt;/code&gt; keystroke arrives. This part is a hodgepodge of foundational knowledge from the textbook I have been reading, some AI advice, and the occupational hazards of my ten years of experience in C programming; therefore, one should take the code quality with a grain of salt.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deployment &amp;amp; Verification
&lt;/h2&gt;

&lt;p&gt;On Debian 13 (kernel 6.12), one may build and run the tool as follows, in a separate terminal on the server machine and before launching the TURN server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--yes&lt;/span&gt; cargo clang git libelf-dev pkg-config
git clone https://github.com/ivanmtech/turn-bpf
&lt;span class="nb"&gt;cd &lt;/span&gt;turn-bpf
cargo build
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./target/debug/turn-bpf &amp;lt;main_ifname&amp;gt; &lt;span class="o"&gt;[&lt;/span&gt;relay_ifnames...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My poor man's test rig consisted of a pair of laptops connected via a commodity 100 Mbit/s USB Ethernet link. The server-side &lt;code&gt;coturn&lt;/code&gt; configuration, saved as &lt;code&gt;~/test.cfg&lt;/code&gt;, was:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;listening-port&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;3478&lt;/span&gt;
&lt;span class="py"&gt;listening-ip&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;192.168.47.1&lt;/span&gt;
&lt;span class="py"&gt;relay-ip&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;192.168.47.1&lt;/span&gt;
&lt;span class="py"&gt;user&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;user:password&lt;/span&gt;
&lt;span class="py"&gt;realm&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;turn.test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are my configuration steps on the server-side laptop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--yes&lt;/span&gt; coturn
&lt;span class="nb"&gt;sudo &lt;/span&gt;service coturn stop
&lt;span class="nb"&gt;sudo &lt;/span&gt;nmcli device &lt;span class="nb"&gt;set &lt;/span&gt;enx0 managed no
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip addr add 192.168.47.1/24 dev enx0
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;enx0 up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the client-side laptop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nmcli device &lt;span class="nb"&gt;set &lt;/span&gt;enx0 managed no
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip addr add 192.168.47.2/24 dev enx0
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;enx0 up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running &lt;code&gt;coturn&lt;/code&gt; on the server-side laptop is as simple as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;turnserver &lt;span class="nt"&gt;-c&lt;/span&gt; ~/test.cfg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client-side laptop acts as both the TURN client and the remote peer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;turnutils_peer &lt;span class="nt"&gt;-L&lt;/span&gt; 192.168.47.2
turnutils_uclient &lt;span class="nt"&gt;-u&lt;/span&gt; user &lt;span class="nt"&gt;-w&lt;/span&gt; password &lt;span class="se"&gt;\&lt;/span&gt;
 &lt;span class="nt"&gt;-e&lt;/span&gt; 192.168.47.2 &lt;span class="nt"&gt;-n&lt;/span&gt; 100000 &lt;span class="nt"&gt;-m&lt;/span&gt; 50 &lt;span class="nt"&gt;-g&lt;/span&gt; 192.168.47.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performance Results
&lt;/h2&gt;

&lt;p&gt;In my case, running &lt;code&gt;pidstat&lt;/code&gt; for &lt;code&gt;coturn&lt;/code&gt; shows its CPU usage at a negligible level, virtually 0%, while the offload is active. It jumps swiftly to 20% when &lt;code&gt;Ctrl+C&lt;/code&gt; is pressed in the &lt;code&gt;TURN-BPF&lt;/code&gt; terminal and the offload is thereby removed. System-wide CPU usage is 0-1% with the offload active, versus 6-7% in its absence. Again, my test rig was lacking in many ways, so these figures should not be read as production-grade measurements.&lt;/p&gt;

&lt;p&gt;As I continue to explore the frontiers of high-performance networking, I should like to remain open to communication with peers and would be delighted to converse about any other architectural and system design challenges of the modern age.&lt;/p&gt;

</description>
      <category>ebpf</category>
      <category>performance</category>
      <category>showdev</category>
      <category>systemsprogramming</category>
    </item>
  </channel>
</rss>
