Parth Shah

Posted on • Originally published at getlinnix.substack.com

10,000 eBPF Events to 1 Alert: Don’t burn the CPU

eBPF lets you observe everything the Linux kernel is doing.

The problem: if you ship every event to user space, your monitoring becomes the outage.

On a busy server, the kernel can generate millions of events per second: file opens, network packets, process forks… everything.

If you try to ship all of that to a database or log system, two things happen:

  • Observer effect: your monitoring agent eats CPU and makes latency worse.
  • Disk death: you fill storage with noise nobody will ever read.

I’ve seen people rack up serious log bills just by flipping debug on in the wrong place.

So the question becomes:

How do we go from 10,000+ raw events

to 1 useful alert,

without burning the CPU?

For me, the answer is an architecture that looks like a funnel.

TL;DR

You cannot send every event to user space. Crossing the boundary (kernel → user) is not free.

The pattern I use is a 3-stage funnel:

  1. In-kernel filtering — throw away as much as possible before waking any agent
  2. Ring buffers — move data efficiently when you do need to send it
  3. User-space windowing — find patterns over time and only then alert

This is the core idea behind the agent I’m building right now with Rust + eBPF.

The architecture: a funnel, not a firehose

If you treat eBPF like a log pipeline, you’ll lose.

Most of the cost is not “eBPF” itself — it’s the work you force by crossing kernel → user space too often:

  • wakeups / context switches
  • per-event allocations in your agent
  • backpressure that turns into dropped events (aka blind spots)

You want a funnel: throw away the boring stuff early, and only ship the interesting tail.

Funnel: in-kernel filtering → ring buffer → user-space windowing

Stage 1: In-kernel filtering (don’t wake the agent)

The fastest code is the code that never runs.

The cheapest event is the event you never send.

Example: detecting slow HTTP requests

Naive approach

  • On every request start and end, send an event to user space.
  • Let the agent compute (end - start) for every request and check if it’s > 500ms.

Result: you send thousands of events per second just to discover that almost all of them are fine.

eBPF approach

  • Keep a small map (hash table) in kernel memory.
  • When a request starts, store the start timestamp in the map.
  • When the request ends, look up the start time, compute duration, and check it.

Logic in kernel:

  • if duration <= 500ms → delete entry, do nothing
  • if duration > 500ms → send one event to user space

So 99% of “healthy” requests never cross the kernel → user boundary. No extra wakeup, no extra allocation in your agent, nothing.

Same idea works for fork storms, short-lived jobs, etc. Do the cheap check in kernel, and only emit “interesting” cases.
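
Roughly, that kernel side looks like this in libbpf-style C. The map names, the uprobe attach points, and the event struct are illustrative assumptions, not my actual code; hook the two probes wherever “request start” and “request end” happen in your stack:

// Sketch: in-kernel filtering for slow requests (illustrative names).
#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>

#define SLOW_NS (500ULL * 1000 * 1000)   /* 500 ms threshold */

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u64);                  /* request id: here, pid_tgid */
    __type(value, __u64);                /* start timestamp in ns */
} request_start SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(__u32));
    __uint(value_size, sizeof(__u32));
} events SEC(".maps");

struct slow_req_event {
    __u64 id;
    __u64 duration_ns;
};

SEC("uprobe/request_start")
int on_request_start(struct pt_regs *ctx)
{
    __u64 id = bpf_get_current_pid_tgid();
    __u64 now = bpf_ktime_get_ns();

    bpf_map_update_elem(&request_start, &id, &now, BPF_ANY);
    return 0;
}

SEC("uprobe/request_end")
int on_request_end(struct pt_regs *ctx)
{
    __u64 id = bpf_get_current_pid_tgid();
    __u64 *start = bpf_map_lookup_elem(&request_start, &id);
    if (!start)
        return 0;

    __u64 duration = bpf_ktime_get_ns() - *start;
    bpf_map_delete_elem(&request_start, &id);

    if (duration <= SLOW_NS)
        return 0;                        /* healthy: never leaves the kernel */

    struct slow_req_event e = { .id = id, .duration_ns = duration };
    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &e, sizeof(e));
    return 0;
}

char LICENSE[] SEC("license") = "GPL";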

Stage 2: Ring buffers (moving data without pain)

When we do find a “bad” event (for example: a fork-bomb pattern), we still need to ship it to user space.

What we don’t want:

  • writing to files on every event
  • sending everything over TCP sockets

Both are too slow and add overhead.

Instead, use a perf ring buffer:

  • it’s a shared buffer between kernel and user space
  • kernel writes events to the head
  • the agent reads from the tail
  • no syscall per event, no per-event allocation on the hot path

The tricky part: falling behind

If the kernel writes faster than you read, the buffer wraps and overwrites older data. That’s dropped events.

To reduce that risk, don’t read one event at a time and process it synchronously. My pattern looks like:

  • read a batch of events from the ring buffer
  • push them into an internal queue/channel
  • process them in a separate worker

Keep the ring buffer as empty as possible. Keep the kernel happy.
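
Sketched against libbpf’s perf buffer API, that reader loop looks roughly like this. My daemon is Rust, but the shape is the same; queue_push() and the worker thread are hypothetical stand-ins for whatever queue/channel you use:

/* Sketch: drain the perf buffer fast, do the real work elsewhere. */
#include <stdio.h>
#include <stdbool.h>
#include <bpf/libbpf.h>

/* Hypothetical thread-safe queue, drained by a separate worker thread. */
void queue_push(void *queue, const void *data, __u32 size);

/* Called by libbpf for each event: do as little as possible here,
   copy the bytes into the queue and return immediately. */
static void on_event(void *ctx, int cpu, void *data, __u32 size)
{
    queue_push(ctx, data, size);
}

/* Called when the kernel overwrote events we were too slow to read. */
static void on_lost(void *ctx, int cpu, __u64 lost)
{
    fprintf(stderr, "lost %llu events on cpu %d\n",
            (unsigned long long)lost, cpu);
}

int run_reader(int map_fd, void *queue)
{
    /* 64 pages per CPU of buffer shared between kernel and user space */
    struct perf_buffer *pb = perf_buffer__new(map_fd, 64, on_event, on_lost,
                                              queue, NULL);
    if (!pb)
        return -1;

    /* Parsing, windowing and alerting happen in the worker thread that
       pops from the queue, not in this loop. */
    while (true)
        perf_buffer__poll(pb, 100 /* ms */);
}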

Stage 3: Windowing (from raw events to a real alert)

Even after filtering, raw events are not the alert.

Example stream:

  • “PID 100 called fork”
  • “PID 100 called fork”
  • “PID 100 called fork”

That’s not actionable. It’s just a list.

To turn this into something useful, use time windows in user space.

Very simplified example in pseudocode:

// on each fork event
if event.type == FORK {
  process_stats[event.pid].fork_count += 1
}

// every 1 second (the tick)
for pid in process_stats {
  if process_stats[pid].fork_count > 50 {
    trigger_alert("fork_bomb_suspected", pid)
  }
  process_stats[pid].fork_count = 0
}

Now you have a metric: forks per second per PID.

The alert becomes:

“PID 1234 called fork 57 times in the last second.”

Much more useful than staring at a wall of single fork events.

Same idea applies to other patterns:

  • “opened N new file descriptors in a short window”
  • “created and exited M child processes in 2 seconds”
  • “network connections to the same IP exploded in the last second”

The missing piece: context

Even with good filtering and windowing, tools often fail on the “why?” question.

You get: “High CPU on PID 555.”

You ask: “What was PID 555 actually doing?”

If the process is already gone, you won’t be able to inspect it later.

That’s why I try to attach context at the moment of the event:

  • stack trace → which function was running
  • parent PID → who launched this
  • container/cgroup ID → which container/pod this belongs to

I grab this data as close as possible to the event (inside the eBPF program or right after it reaches user space), and send it along with the alert.
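
Concretely, the event that leaves the kernel might look something like this. The field names are illustrative, not the actual Linnix schema:

#include <linux/types.h>

/* Shared between the eBPF program and the user-space agent. */
struct alert_event {
    __u32 pid;
    __u32 ppid;        /* parent PID: who launched this */
    __u64 cgroup_id;   /* from bpf_get_current_cgroup_id(): which container/pod */
    __s64 stack_id;    /* from bpf_get_stackid() into a BPF_MAP_TYPE_STACK_TRACE map */
    char  comm[16];    /* process name via bpf_get_current_comm() */
    __u64 value;       /* whatever triggered it: a duration, a count, ... */
};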

So the alert is no longer:

“CPU high on PID 555”

It becomes something like:

“CPU high in container X, process /usr/bin/worker, function handle_batch(), parent PID 42”

Now you have a chance of fixing the real issue, not just staring at numbers.

How I’m using this today

These ideas aren’t just theory for me. I’m building them into a small Rust + eBPF agent I call Linnix:

  • eBPF programs handle the in-kernel filtering and write into perf ring buffers
  • a Rust daemon reads events in batches, does the time-window logic, and raises alerts
  • I keep a hard budget of < 1% CPU overhead, so I’m forced to be careful about what leaves the kernel

If you follow these principles:

  • filter early (in kernel)
  • transport fast (ring buffers)
  • aggregate later (user-space windows)

you can watch systems doing millions of operations per second without your “observability” layer becoming the problem.

Next up, I want to talk about automated remediation — how to safely act on these signals (for example, killing a runaway process) without creating a new class of outages.

If you want to see the code side of this, I’m slowly open-sourcing it here:

https://github.com/linnix-os/linnix
