TL;DR
macOS's PKTAP provides process info directly in packet headers (about 10 lines of code to read it), while Linux requires eBPF programs that hook into kernel functions (100+ lines). Both solve the same problem of identifying which process owns a network packet, but with very different levels of complexity.
📚 This is Part 1 of my "Building a Network Monitor" series.
Coming next: Implementing Process Detection in RustNet: The Code
Table of Contents
- The Challenge
- macOS: The PKTAP Approach
- Linux: The Powerful but Complex eBPF Route
- The Trade-offs
- Implementation Notes
The Challenge
Traditional approaches like polling /proc/net/* on Linux or running lsof in a loop on macOS work well for long-lived connections, but they struggle with short-lived processes. By the time you poll, the process might already be gone, leaving you with orphaned connections whose origins remain a mystery. A quick curl request, for example, can open a connection, transfer its data, and exit before the next polling cycle ever sees it.
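To make the polling approach concrete, here is a minimal sketch (not RustNet's actual code) of the classic Linux procfs dance: parse /proc/net/tcp to find each socket's inode, then scan /proc/<pid>/fd/ symlinks to discover which process owns that inode. Error handling and the IPv6/UDP tables are omitted.

use std::fs;

/// Find the PID owning a socket with the given inode by scanning
/// /proc/<pid>/fd/* symlinks, which point to "socket:[<inode>]".
fn pid_for_socket_inode(inode: u64) -> Option<u32> {
    let target = format!("socket:[{inode}]");
    for entry in fs::read_dir("/proc").ok()?.flatten() {
        // Only numeric directory names are processes
        let Ok(pid) = entry.file_name().to_string_lossy().parse::<u32>() else {
            continue;
        };
        let Ok(fds) = fs::read_dir(entry.path().join("fd")) else {
            continue;
        };
        for fd in fds.flatten() {
            if let Ok(link) = fs::read_link(fd.path()) {
                if link.to_string_lossy() == target {
                    return Some(pid);
                }
            }
        }
    }
    None
}

fn main() {
    // Each data line in /proc/net/tcp describes one socket; column 9 is the inode.
    // A short-lived process can exit between two polls, so the fd scan below
    // finds nothing and the connection stays unattributed.
    let table = fs::read_to_string("/proc/net/tcp").expect("read /proc/net/tcp");
    for line in table.lines().skip(1) {
        let cols: Vec<&str> = line.split_whitespace().collect();
        if let Some(inode) = cols.get(9).and_then(|s| s.parse::<u64>().ok()) {
            println!("inode {inode} -> pid {:?}", pid_for_socket_inode(inode));
        }
    }
}

The full /proc scan per connection is exactly why this gets expensive and racy at scale, which is what motivated the kernel-assisted approaches below.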
While working on adding process identification to the network monitoring tool RustNet, I discovered interesting differences in how macOS and Linux tackle this challenge.
How The Data Flows
macOS PKTAP:

Packet → Kernel → [+Process Info] → PKTAP Header → Your App
                         ↑
                    Automatic!

Linux eBPF:

Packet → Kernel Function → eBPF Hook → Map → Userspace
               ↑               ↑         ↑
        tcp_connect()   You write this  You poll this
        udp_sendmsg()
        (and 10 more...)
macOS: The PKTAP Approach
macOS provides PKTAP (Packet Tap), where the kernel automatically includes process information in packet headers. This makes implementation very simple:
// From Apple's darwin-xnu (bsd/net/pktap.h)
struct pktap_header {
    // ... other fields
    pid_t pth_pid;        // Process ID
    char  pth_comm[17];   // Process name (MAXCOMLEN + 1)
    pid_t pth_epid;       // Effective process ID
    char  pth_ecomm[17];  // Effective command name
    // ... more fields
};
You simply read packets and the process info is right there in the header. The kernel handles all the heavy lifting of mapping packets to processes. Want to know which process sent a packet? Just parse the header:
pub fn get_process_info(&self) -> (Option<String>, Option<u32>) {
    // Decode the fixed-size, NUL-terminated pth_comm buffer
    let process_name = extract_process_name_from_bytes(&self.pth_comm);
    // A zero effective PID means no process was attributed to the packet
    let pid = if self.pth_epid != 0 {
        Some(self.pth_epid as u32)
    } else {
        None
    };
    (process_name, pid)
}
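For completeness, here is a minimal sketch of what a helper like extract_process_name_from_bytes might look like (the actual RustNet version may differ, and the FFI struct may expose the field as c_char rather than u8): take the bytes up to the first NUL and interpret them as UTF-8.

/// Decode a fixed-size, NUL-terminated C string buffer (like pth_comm).
fn extract_process_name_from_bytes(bytes: &[u8]) -> Option<String> {
    // Everything up to the first NUL byte (or the whole buffer if there is none)
    let end = bytes.iter().position(|&b| b == 0).unwrap_or(bytes.len());
    let name = std::str::from_utf8(&bytes[..end]).ok()?;
    if name.is_empty() {
        None
    } else {
        Some(name.to_string())
    }
}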
That's it. Clean, simple, and it works for most packets. Interestingly, some packet types (like ICMP and ARP) don't always include process information—likely because they're handled differently by the kernel or lack a clear originating process context.
Linux: The Powerful but Complex eBPF Route
Linux doesn't have an equivalent to PKTAP, so one solution involves using eBPF programs that hook into kernel networking functions:
SEC("kprobe/tcp_connect")
int trace_tcp_connect(struct pt_regs *ctx) {
struct sock *sk = (struct sock *)PT_REGS_PARM1_CORE(ctx);
// Extract network info from socket
key.saddr[0] = BPF_CORE_READ(sk, __sk_common.skc_rcv_saddr);
key.daddr[0] = BPF_CORE_READ(sk, __sk_common.skc_daddr);
// Get process info
info.pid = bpf_get_current_pid_tgid() >> 32;
bpf_get_current_comm(&info.comm, sizeof(info.comm));
// Store in map for userspace retrieval
bpf_map_update_elem(&socket_map, &key, &info, BPF_ANY);
return 0;
}
But here is where it gets interesting (and complicated):
- You need separate kprobes for tcp_connect, inet_csk_accept, udp_sendmsg, tcp_v6_connect, and so on
- The comm field holds the current thread's name, limited to 16 characters, so Firefox's network traffic shows up as "Socket Thread"
- You must understand kernel internals: socket structures, CO-RE relocations, BTF
- Build complexity: requires libelf, clang, LLVM, and kernel headers
The Trade-offs
macOS PKTAP Pros:
- Dead simple API
- Works out of the box
- Full process names (when available)
- Zero kernel programming required
- Automatic process-packet association for most traffic
macOS PKTAP Cons:
- macOS only (possibly other BSDs)
- Requires special interface setup
- Limited to what Apple exposes
- Some packet types (ICMP, ARP) may lack process info
Linux eBPF Pros:
- Incredibly powerful and flexible
- Can hook into virtually any kernel function
- Lower overhead than polling
- Works on most modern kernels
Linux eBPF Cons:
- Steep learning curve
- Complex build requirements
- 16-char process name limit (comm field)
- Must handle kernel version differences
- More moving parts
Quick Comparison
Aspect | macOS PKTAP | Linux eBPF |
---|---|---|
Lines of Code | ~10 | ~100+ |
Setup Complexity | None | Requires kernel headers, LLVM, libelf |
Process Name Limit | 17 bytes (MAXCOMLEN + 1) | 16 bytes (TASK_COMM_LEN) |
Kernel Programming | No | Yes |
Learning Curve | Minutes | Days/Weeks |
Availability | macOS only | Linux 4.x+ |
Implementation Notes
For RustNet, I ended up using libbpf instead of Rust's aya framework, specifically to avoid a dependency on the Rust nightly toolchain. While aya offers more idiomatic Rust, libbpf's stability and broader compatibility made it the better choice for this project.
The contrast really highlights different OS design philosophies: macOS provides high-level, purpose-built APIs, while Linux offers low-level primitives that can be composed into powerful solutions, albeit with significantly more complexity.
Both approaches solve the same problem effectively, but the developer experience is very different. I wonder if Linux could benefit from higher-level networking APIs like PKTAP, though perhaps that's antithetical to the Unix philosophy of composable tools.
Note on the comm field: the kernel stores the current thread's name in a 16-byte buffer, so names are truncated and multithreaded applications report thread names rather than process names. Firefox appears as "Socket Thread", Chrome as "ThreadPoolForeg", etc. You can work around it by combining eBPF with selective procfs lookups, but that defeats some of the performance benefits.
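For illustration, a selective procfs lookup could look roughly like this (a sketch, not RustNet's actual code): given the PID captured by eBPF, read /proc/<pid>/cmdline, falling back to the truncated /proc/<pid>/comm when the command line is empty.

use std::fs;

/// Resolve a fuller process name for a PID reported by eBPF.
fn full_process_name(pid: u32) -> Option<String> {
    // /proc/<pid>/cmdline holds NUL-separated arguments; empty for kernel threads
    let cmdline = fs::read(format!("/proc/{pid}/cmdline")).unwrap_or_default();
    if let Some(arg) = cmdline.split(|&b| b == 0).next().filter(|a| !a.is_empty()) {
        // Use the executable's basename, e.g. "/usr/lib/firefox/firefox" -> "firefox"
        let exe = String::from_utf8_lossy(arg);
        return exe.rsplit('/').next().map(|s| s.to_string());
    }
    // Fall back to the 16-byte (possibly truncated) comm name
    fs::read_to_string(format!("/proc/{pid}/comm"))
        .ok()
        .map(|s| s.trim_end().to_string())
}

In practice you would only do this for PIDs you have not seen before and cache the result; doing it per packet would reintroduce exactly the procfs overhead the eBPF approach is meant to avoid.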
💭 Discussion Points
- Have you worked with PKTAP or eBPF? What was your experience?
- Is Linux's complexity justified by the flexibility it provides?
- Should Linux add a simpler API like PKTAP for common use cases?
- Any Windows developers here? How does Windows handle this problem?
- What other OS-specific networking APIs have you discovered?
🔗 See It In Action
This implementation is part of RustNet, a cross-platform network monitoring TUI where you can see both approaches in action. The eBPF implementation is available in v0.9.0 as an experimental feature (--features=ebpf).
If you're interested in network monitoring, packet inspection, or just want to see how different operating systems handle the same problem, check out the project. I'm always looking for feedback and contributions!
Originally published on my blog. Follow me for more posts about systems programming, Rust, and the interesting quirks of different operating systems.