<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: TooFastTooCurious</title>
    <description>The latest articles on DEV Community by TooFastTooCurious (@toofasttoocurious).</description>
    <link>https://dev.to/toofasttoocurious</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3845394%2Fb319af4c-834d-4d05-ab08-2fecd8e5fa4e.jpeg</url>
      <title>DEV Community: TooFastTooCurious</title>
      <link>https://dev.to/toofasttoocurious</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/toofasttoocurious"/>
    <language>en</language>
    <item>
      <title>Building Runtime Enforcement for Kubernetes with eBPF</title>
      <dc:creator>TooFastTooCurious</dc:creator>
      <pubDate>Tue, 14 Apr 2026 14:54:45 +0000</pubDate>
      <link>https://dev.to/toofasttoocurious/building-runtime-enforcement-for-kubernetes-with-ebpf-25ll</link>
      <guid>https://dev.to/toofasttoocurious/building-runtime-enforcement-for-kubernetes-with-ebpf-25ll</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published on the &lt;a href="https://juliet.sh/blog/building-runtime-enforcement-for-kubernetes-with-ebpf" rel="noopener noreferrer"&gt;Juliet Security blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most Kubernetes security tools stop at scan time. They'll flag a critical CVE in a container image or complain that a pod runs as root. What they won't do is tell you that someone just spawned a shell in your production namespace, opened a connection to a mining pool, or loaded a kernel module to break out of the container sandbox.&lt;/p&gt;

&lt;p&gt;Juliet started as a graph-based security platform. We map attack paths, score blast radius, prioritize findings. Useful stuff. But customers kept circling back to the same ask: can you actually stop the bad thing, or just tell me it happened?&lt;/p&gt;

&lt;p&gt;So we built runtime enforcement. This post walks through the design, the tradeoffs we made, and the production incident that changed how we think about safety.&lt;/p&gt;

&lt;h2&gt;
  
  
  Replacing Falco
&lt;/h2&gt;

&lt;p&gt;We started with Falco as a sidecar. It watches syscalls through eBPF, writes alerts to a FIFO pipe, and our node-agent reads from the other end of that pipe.&lt;/p&gt;

&lt;p&gt;The pipe was the problem. If our agent started before Falco, the pipe didn't exist yet. If Falco restarted, the pipe broke. If the pipe filled up because our reader fell behind, Falco would block. We burned more hours managing that pipe than we spent building actual security features.&lt;/p&gt;

&lt;p&gt;On top of that, Falco's rule language was too coarse for what we needed. We wanted to match events against customer-defined policies with namespace scoping, image pattern matching, and per-process exception lists. Translating between our internal policy model and Falco's YAML rules created a fragile middle layer that broke in subtle ways.&lt;/p&gt;

&lt;p&gt;We ripped it out and embedded the eBPF sensor directly in our Go agent using &lt;a href="https://github.com/cilium/ebpf" rel="noopener noreferrer"&gt;cilium/ebpf&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we trace
&lt;/h2&gt;

&lt;p&gt;We hook 22 syscalls across five categories; the table below shows the main ones:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Syscalls&lt;/th&gt;
&lt;th&gt;What we catch&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Process execution&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;execve&lt;/code&gt;, &lt;code&gt;execveat&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Shells, exploit toolkits, crypto miners&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File access&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;openat&lt;/code&gt;, &lt;code&gt;unlinkat&lt;/code&gt;, &lt;code&gt;memfd_create&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reads of /etc/shadow, log deletion, fileless payloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;connect&lt;/code&gt;, &lt;code&gt;listen&lt;/code&gt;, &lt;code&gt;accept4&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;C2 callbacks, cloud metadata grabs, rogue listeners&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Container escape&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ptrace&lt;/code&gt;, &lt;code&gt;mount&lt;/code&gt;, &lt;code&gt;setns&lt;/code&gt;, &lt;code&gt;unshare&lt;/code&gt;, &lt;code&gt;init_module&lt;/code&gt;, &lt;code&gt;finit_module&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Namespace tricks, host filesystem mounts, module loading&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privilege escalation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;chmod&lt;/code&gt;, &lt;code&gt;fchmodat&lt;/code&gt;, &lt;code&gt;capset&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Setuid flips, capability changes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each tracepoint handler writes a fixed 304-byte struct into a 2MB ring buffer. The struct uses a C union for the syscall-specific payload (file path, network address, or process metadata), so every event is identical in size regardless of type. This keeps the ring buffer math simple and avoids variable-length parsing on the hot path.&lt;/p&gt;
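
&lt;p&gt;A userspace mirror of that struct can be sketched in Go. The field names and the payload split are illustrative (the real layout lives in a C header shared with the BPF program), but the shape shows why fixed-size events keep the ring buffer math trivial:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"unsafe"
)

// rawEvent is a userspace mirror of the fixed-size C struct the BPF program
// writes into the ring buffer. Field names and the payload split are
// illustrative; the real layout is defined in a C header shared with BPF.
type rawEvent struct {
	Timestamp uint64
	CgroupID  uint64
	Pid       uint32
	SyscallNr uint32
	Uid       uint32
	Gid       uint32
	Comm      [16]byte  // process name, NUL-padded
	Payload   [256]byte // union area: file path, sockaddr, or exec metadata
}

// eventSize is the fixed per-event size; with this layout it is 304 bytes.
func eventSize() uintptr {
	return unsafe.Sizeof(rawEvent{})
}

// comm extracts the NUL-terminated process name.
func (e *rawEvent) comm() string {
	for i, b := range e.Comm {
		if b == 0 {
			return string(e.Comm[:i])
		}
	}
	return string(e.Comm[:])
}

func main() {
	var e rawEvent
	copy(e.Comm[:], "bash")
	e.SyscallNr = 59 // execve
	fmt.Println(eventSize(), e.comm()) // prints: 304 bash
}
```

&lt;p&gt;Because every event is the same size, the reader can advance through the buffer by a constant stride and cast each slot directly, with no length prefix to parse.&lt;/p&gt;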

&lt;h2&gt;
  
  
  Filtering where it matters: in the kernel
&lt;/h2&gt;

&lt;p&gt;This was the single best decision we made. Instead of sending every syscall event to userspace and filtering there, we filter inside the BPF program using two maps:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;monitored_syscalls&lt;/code&gt;&lt;/strong&gt;: a hash map of syscall numbers that active policies actually care about. If nobody has a network policy enabled, &lt;code&gt;connect&lt;/code&gt; and &lt;code&gt;listen&lt;/code&gt; events never leave the kernel. When a customer toggles policies on or off, we update this map and the change takes effect on the next syscall.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;container_cgroups&lt;/code&gt;&lt;/strong&gt;: a fast lookup by cgroup ID to decide whether a process belongs to a monitored container. For runtimes we haven't populated the map for, we fall back to checking PID namespace depth (&lt;code&gt;task-&amp;gt;nsproxy-&amp;gt;pid_ns_for_children-&amp;gt;level &amp;gt; 0&lt;/code&gt;). Containers always have level &amp;gt; 0; host processes sit at level 0. This works across Docker, containerd, and CRI-O without any userspace coordination.&lt;/p&gt;
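
&lt;p&gt;The in-kernel check itself is C, but its decision logic is small enough to sketch in Go. Plain maps stand in for the two BPF maps, and the syscall numbers are x86-64 (&lt;code&gt;execve&lt;/code&gt; is 59, &lt;code&gt;execveat&lt;/code&gt; is 322, &lt;code&gt;connect&lt;/code&gt; is 42):&lt;/p&gt;

```go
package main

import "fmt"

// Userspace sketch of the in-kernel filter. The real check runs in the BPF
// program against two BPF maps; plain Go maps stand in for them here.
var (
	monitoredSyscalls = map[uint32]bool{59: true, 322: true} // execve, execveat
	containerCgroups  = map[uint64]bool{7321: true}          // cgroup IDs of monitored containers
)

// shouldEmit decides whether an event ever leaves the kernel. pidNSLevel is
// the PID-namespace depth fallback: containers sit at level 1 or deeper,
// host processes at level 0.
func shouldEmit(syscallNr uint32, cgroupID uint64, pidNSLevel int) bool {
	if !monitoredSyscalls[syscallNr] {
		return false // no active policy cares about this syscall
	}
	if containerCgroups[cgroupID] {
		return true // known monitored container
	}
	return pidNSLevel > 0 // fallback: any containerized process
}

func main() {
	fmt.Println(shouldEmit(59, 7321, 1)) // true: execve in a monitored container
	fmt.Println(shouldEmit(42, 7321, 1)) // false: connect with no network policy active
}
```

&lt;p&gt;Toggling a policy is then just inserting or deleting keys in &lt;code&gt;monitored_syscalls&lt;/code&gt;; nothing needs to be reloaded.&lt;/p&gt;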

&lt;p&gt;The payoff: overhead scales with the number of policies you enable, not the number of syscalls we could theoretically trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turning PIDs into something useful
&lt;/h2&gt;

&lt;p&gt;A raw eBPF event gives you a PID and a 16-character process name. That's not enough to make a security decision. You need the container name, the pod, the namespace, the image, and the service account.&lt;/p&gt;

&lt;p&gt;We use three caches that each pull from a different source:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;PID LRU&lt;/strong&gt; reads &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/cgroup&lt;/code&gt; to get the container ID. 10K entries, 5-minute TTL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRI cache&lt;/strong&gt; talks to containerd over gRPC and watches container start/stop events.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K8s cache&lt;/strong&gt; watches the pod API for the local node.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If one cache goes down, the other two still contribute what they can. If all three are broken, events still carry the PID, container ID, and process name from the kernel. We never stall the pipeline waiting for metadata. An event with partial enrichment moves through and the policy matcher treats it conservatively (no enforcement on events we can't fully identify).&lt;/p&gt;
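
&lt;p&gt;A sketch of the merge, with function arguments standing in for the three caches (the real implementations read &lt;code&gt;/proc&lt;/code&gt;, talk gRPC, and watch the API server; any of them may miss):&lt;/p&gt;

```go
package main

import "fmt"

// Enrichment carries whatever metadata the three caches could resolve for a
// PID. Empty fields mean that layer missed; the event still flows through.
type Enrichment struct {
	ContainerID string // PID LRU: read from /proc/<pid>/cgroup
	PodName     string // CRI cache: containerd over gRPC
	Image       string
	Namespace   string // K8s cache: pod watch for the local node
}

// enrich merges the layers; each one fills in what it knows.
func enrich(pid uint32, containerOf func(uint32) string, podOf func(string) (string, string), nsOf func(string) string) Enrichment {
	var e Enrichment
	e.ContainerID = containerOf(pid)
	if e.ContainerID != "" {
		e.PodName, e.Image = podOf(e.ContainerID)
	}
	if e.PodName != "" {
		e.Namespace = nsOf(e.PodName)
	}
	return e
}

// fullyIdentified gates enforcement: the policy matcher never enforces on an
// event it cannot fully attribute.
func (e Enrichment) fullyIdentified() bool {
	return e.ContainerID != "" && e.Namespace != ""
}

func main() {
	e := enrich(4242,
		func(uint32) string { return "abc123" },
		func(string) (string, string) { return "web-0", "nginx:1.27" },
		func(string) string { return "production" })
	fmt.Println(e.Namespace, e.fullyIdentified())
}
```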

&lt;h2&gt;
  
  
  Matching policies fast
&lt;/h2&gt;

&lt;p&gt;Every two minutes, the agent syncs policies from the API and compiles them into a lookup structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;CompiledPolicy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;SyscallSet&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="m"&gt;59&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;322&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;     &lt;span class="c"&gt;// execve, execveat&lt;/span&gt;
    &lt;span class="n"&gt;ProcessNames&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"bash"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"sh"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;PathPrefixes&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"/tmp/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"/var/run/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;NetCIDRs&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;169.254.169.254&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;32&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;Scope&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;         &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;IncludeNamespaces&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"production"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
    &lt;span class="n"&gt;Exceptions&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="n"&gt;process_name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"nginx"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Policies are bucketed by syscall category. When an event comes in, we look up its category (derived from the syscall number), get the handful of candidate policies (usually 3-8), and check each one. The hot path uses pre-allocated maps and does zero heap allocation.&lt;/p&gt;

&lt;p&gt;If two policies both match and one says "alert" while the other says "kill", the kill wins. We always pick the highest-severity enforce-mode match.&lt;/p&gt;
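
&lt;p&gt;The resolution rule itself is tiny. A sketch, with an ordered action enum so "highest severity wins" is a plain comparison (type names are illustrative):&lt;/p&gt;

```go
package main

import "fmt"

// Action severity, ordered: enforce-mode (Kill) outranks everything.
type Action int

const (
	Audit Action = iota
	Alert
	Kill
)

// Match is one policy that matched the event.
type Match struct {
	PolicyID string
	Action   Action
}

// resolve picks the single action to take when several policies match the
// same event: the highest-severity match always wins.
func resolve(matches []Match) Match {
	best := Match{Action: Audit}
	for _, m := range matches {
		if m.Action > best.Action {
			best = m
		}
	}
	return best
}

func main() {
	m := resolve([]Match{{"shell-in-prod", Alert}, {"miner-exec", Kill}})
	fmt.Println(m.PolicyID) // prints: miner-exec
}
```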

&lt;h2&gt;
  
  
  Why we kill instead of block
&lt;/h2&gt;

&lt;p&gt;We enforce by sending &lt;code&gt;SIGKILL&lt;/code&gt; from userspace. The alternative is BPF LSM, where the eBPF program returns &lt;code&gt;-EPERM&lt;/code&gt; and the kernel refuses the syscall before it completes.&lt;/p&gt;

&lt;p&gt;LSM is objectively better at prevention. But we chose kill for three reasons:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Portability.&lt;/strong&gt; BPF LSM requires kernel 5.7+ with &lt;code&gt;CONFIG_BPF_LSM=y&lt;/code&gt;. A lot of production clusters still run Amazon Linux 2 or RHEL 8. We didn't want to cut out half our addressable market.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure mode.&lt;/strong&gt; If an LSM policy has a bug and matches kubelet or containerd, the node goes down. You can't start new pods, can't pull images, can't recover without SSH access. With SIGKILL, the worst case is that a process dies and the kubelet restarts it. Annoying, but the node stays up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We tested the failure mode on ourselves.&lt;/strong&gt; Not on purpose.&lt;/p&gt;

&lt;h2&gt;
  
  
  How we broke staging
&lt;/h2&gt;

&lt;p&gt;Three weeks into our enforcement beta, we turned on enforce mode in staging. Within minutes, Harbor (our container registry) started throwing 500 errors. Pulls failed. Deployments queued up. The cluster ground to a halt.&lt;/p&gt;

&lt;p&gt;Here's what happened: we had a policy that flags processes running as root. That's a reasonable thing to detect. But our enforcement engine applied it globally, across every namespace on the node. Harbor's Postgres process runs as root. So does Cilium's agent. So does RabbitMQ. The enforcement engine dutifully killed all of them.&lt;/p&gt;

&lt;p&gt;We turned enforcement off, traced the kills in our metrics, and realized the fix was obvious in hindsight: enforcement needs to be scoped to specific namespaces.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;isInScope&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="n"&gt;CompiledScope&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IncludeNamespaces&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IncludeNamespaces&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ExcludeNamespaces&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ExcludeNamespaces&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three rules came out of that incident:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If an event has no namespace metadata (enrichment failed or it's a host process), never enforce. Default to audit.&lt;/li&gt;
&lt;li&gt;If a namespace isn't in the policy's scope, downgrade from kill to audit. Still record the event, just don't act on it.&lt;/li&gt;
&lt;li&gt;The UI now requires you to specify at least one namespace when you set a policy to enforce mode. No more global enforcement.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Seven things we check before every kill
&lt;/h2&gt;

&lt;p&gt;After the Harbor mess, we added layers of protection to the response actor. Every kill request goes through all seven:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No container ID, no kill.&lt;/strong&gt; If we can't confirm it's a container process, we leave it alone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simulate mode.&lt;/strong&gt; Logs what would happen without sending the signal. You should always run a new policy in simulate for a few days first.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protected namespaces.&lt;/strong&gt; &lt;code&gt;kube-system&lt;/code&gt; is off-limits by default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PID 0 and PID 1 are untouchable.&lt;/strong&gt; We will never kill init.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-preservation.&lt;/strong&gt; The agent will not kill its own process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting per pod.&lt;/strong&gt; 10 kills per pod in a 60-second window. After that, we stop and flag it. This prevents kill-restart spirals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Namespace scope.&lt;/strong&gt; The policy must explicitly include the event's namespace.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each attempt gets tagged with a result code: &lt;code&gt;killed&lt;/code&gt;, &lt;code&gt;failed&lt;/code&gt;, &lt;code&gt;skipped_namespace&lt;/code&gt;, &lt;code&gt;skipped_pid1&lt;/code&gt;, &lt;code&gt;suppressed&lt;/code&gt;, or &lt;code&gt;simulated&lt;/code&gt;. All of these show up in Prometheus, so you can see exactly what enforcement did on every node.&lt;/p&gt;
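
&lt;p&gt;A sketch of the guard chain as a single decision function. Types are hypothetical; the result-code strings follow the list above where the post names them, and are otherwise illustrative:&lt;/p&gt;

```go
package main

import "fmt"

// KillRequest is what the response actor receives for an enforce-mode match.
type KillRequest struct {
	Pid         int
	ContainerID string
	Namespace   string
	PodKey      string // namespace/pod, used for rate limiting
}

type guardState struct {
	simulate    bool
	protectedNS map[string]bool
	selfPid     int
	killsPerPod map[string]int // kills within the current 60s window
}

// check runs the guard chain and returns the result code reported to
// Prometheus. Cheap structural checks run first; simulate is checked last so
// a simulated run reports the same skips a real run would.
func (g *guardState) check(r KillRequest, inScope bool) string {
	switch {
	case r.ContainerID == "":
		return "skipped_no_container" // can't confirm it's a container process
	case g.protectedNS[r.Namespace]:
		return "skipped_namespace"
	case r.Pid == 0 || r.Pid == 1:
		return "skipped_pid1" // never kill init
	case r.Pid == g.selfPid:
		return "skipped_self" // self-preservation
	case g.killsPerPod[r.PodKey] >= 10:
		return "suppressed" // kill-restart spiral protection
	case !inScope:
		return "skipped_namespace"
	case g.simulate:
		return "simulated"
	}
	g.killsPerPod[r.PodKey]++
	// here the real agent sends the signal, e.g. syscall.Kill(r.Pid, syscall.SIGKILL)
	return "killed"
}

func main() {
	g := guardState{protectedNS: map[string]bool{"kube-system": true}, selfPid: 999, killsPerPod: map[string]int{}}
	fmt.Println(g.check(KillRequest{Pid: 1234, ContainerID: "abc", Namespace: "production", PodKey: "production/web-0"}, true))
}
```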

&lt;h2&gt;
  
  
  Moving events without drowning
&lt;/h2&gt;

&lt;p&gt;A busy node can produce thousands of syscall events per second. Sending each one to the API individually would saturate the network and hammer ClickHouse. So we built a five-stage pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sensor (ring buffer, polls every 100ms)
  -&amp;gt; Response Actor (kill decisions happen here, &amp;lt; 200ms)
    -&amp;gt; Coalescer (groups by rule+container+process, 5s window)
      -&amp;gt; Batcher (flushes at 500 events or 5s, whichever hits first)
        -&amp;gt; Forwarder (gzip, retry with backoff, disk spool if API is down)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important thing: kills happen in stage 2. We don't batch enforcement. If a process needs to die, it dies within 200ms of the syscall, not after a 5-second batch window.&lt;/p&gt;

&lt;p&gt;Coalescing cuts volume by 10x to 100x on noisy workloads. If &lt;code&gt;bash&lt;/code&gt; keeps spawning in the same container and hitting the same policy, we collapse 100 events into one record with &lt;code&gt;event_count: 100&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If the API goes offline, the forwarder writes batches to a local disk spool (capped at 100MB, oldest files evicted first). When the API comes back, a drain loop picks up the files and replays them. We'd rather lose some events than let backpressure freeze the enforcement path.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling different kernels
&lt;/h2&gt;

&lt;p&gt;eBPF with CO-RE needs BTF data. Modern kernels (5.8+) ship it at &lt;code&gt;/sys/kernel/btf/vmlinux&lt;/code&gt;. Plenty of production kernels don't.&lt;/p&gt;

&lt;p&gt;Our fallback chain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use host kernel BTF if it exists&lt;/li&gt;
&lt;li&gt;Try an embedded BTFhub archive that matches the kernel release&lt;/li&gt;
&lt;li&gt;If nothing works, run in status-only mode. The agent reports its health and syncs policies, but doesn't hook any syscalls.&lt;/li&gt;
&lt;/ol&gt;
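
&lt;p&gt;The chain reduces to a small decision function. A sketch, with the host-BTF check and the BTFhub lookup injected as arguments so the fallback order is explicit (names are illustrative):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os"
)

type Mode int

const (
	FullSensor Mode = iota // eBPF hooks attached, CO-RE relocated against BTF
	StatusOnly             // no usable BTF: report health and sync policies only
)

// pickBTF walks the fallback chain. hostBTF says whether
// /sys/kernel/btf/vmlinux exists; hasEmbedded stands in for a lookup into the
// bundled BTFhub archive, keyed by the kernel release string.
func pickBTF(kernelRelease string, hostBTF bool, hasEmbedded func(string) bool) (Mode, string) {
	if hostBTF {
		return FullSensor, "host"
	}
	if hasEmbedded(kernelRelease) {
		return FullSensor, "btfhub"
	}
	return StatusOnly, "none"
}

func main() {
	_, err := os.Stat("/sys/kernel/btf/vmlinux")
	mode, src := pickBTF("4.18.0-477.el8.x86_64", err == nil, func(rel string) bool { return false })
	fmt.Println(mode, src)
}
```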

&lt;p&gt;ARM64 adds another wrinkle. Those kernels don't have &lt;code&gt;dup2&lt;/code&gt; or &lt;code&gt;chmod&lt;/code&gt; as separate syscalls; they use &lt;code&gt;dup3&lt;/code&gt; and &lt;code&gt;fchmodat&lt;/code&gt; instead. We attach tracepoints on a best-effort basis: skip what's missing, log a warning, only bail out if literally nothing attaches.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the numbers look like
&lt;/h2&gt;

&lt;p&gt;50 pods on a node, all 40 built-in policies active in audit mode:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CPU (steady state)&lt;/td&gt;
&lt;td&gt;200-300 mCPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;500-800 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw events per second&lt;/td&gt;
&lt;td&gt;50-200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After coalescing&lt;/td&gt;
&lt;td&gt;5-20 per second&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network to API&lt;/td&gt;
&lt;td&gt;50-500 KB every 5s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time from syscall to ClickHouse&lt;/td&gt;
&lt;td&gt;5-11 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time from syscall to kill&lt;/td&gt;
&lt;td&gt;under 200ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Storage latency is deliberately higher than enforcement latency. Killing a process can't wait for batch compression. Writing it to a database can.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things we'd change
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scope enforcement from day one.&lt;/strong&gt; Global enforcement without namespace scoping cost us a staging outage and a scramble to patch. If you're building enforcement for anything, make scope a required field before you write your first kill call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Move coalescing earlier for audit-only events.&lt;/strong&gt; Right now every event hits the response actor, even if it's just going to be logged. For audit policies, we could coalesce first and skip the per-event response check entirely. That would cut CPU on nodes with chatty workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ship a heartbeat from the start.&lt;/strong&gt; For months we inferred agent health from when it last uploaded an SBOM or synced policies. If a node had no new images and runtime was off, the agent looked dead even though it was fine. A 60-second heartbeat ping would have saved us a lot of false alarms.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this is going
&lt;/h2&gt;

&lt;p&gt;We're looking at BPF LSM as an opt-in mode for clusters running kernel 5.7+. SIGKILL handles most cases well, but some compliance regimes want proof that the syscall was blocked, not just that the process was terminated afterward.&lt;/p&gt;

&lt;p&gt;We're also wiring up alert routing so enforcement events go straight to Slack and PagerDuty instead of sitting in a dashboard waiting to be noticed.&lt;/p&gt;

&lt;p&gt;Building runtime enforcement changed Juliet from a scanner into a platform. It also taught us more about production safety than anything else we've shipped. If you're curious, &lt;a href="https://juliet.sh" rel="noopener noreferrer"&gt;juliet.sh&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Questions about any of the runtime stuff? Reach us at &lt;a href="mailto:contact@juliet.sh"&gt;contact@juliet.sh&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>security</category>
      <category>ebpf</category>
      <category>devops</category>
    </item>
    <item>
      <title>Axios was compromised for 3 hours - how to find it in your running Kubernetes clusters</title>
      <dc:creator>TooFastTooCurious</dc:creator>
      <pubDate>Tue, 31 Mar 2026 18:32:15 +0000</pubDate>
      <link>https://dev.to/toofasttoocurious/axios-was-compromised-for-3-hours-how-to-find-it-in-your-running-kubernetes-clusters-dfj</link>
      <guid>https://dev.to/toofasttoocurious/axios-was-compromised-for-3-hours-how-to-find-it-in-your-running-kubernetes-clusters-dfj</guid>
      <description>&lt;p&gt;On March 31, 2026, a compromised maintainer account was used to publish two malicious versions of &lt;a href="https://github.com/axios/axios" rel="noopener noreferrer"&gt;axios&lt;/a&gt;, the most popular JavaScript HTTP client on npm with over 100 million weekly downloads. Versions 1.14.1 and 0.30.4 contained a hidden dependency that deployed a cross-platform remote access trojan (RAT) to any machine that ran &lt;code&gt;npm install&lt;/code&gt; during a three-hour window.&lt;/p&gt;

&lt;p&gt;The malicious versions were pulled from npm by 03:29 UTC. But npm lockfiles only protect your source repos. If a container image was built during that window, the compromised package is baked into the image and running in your cluster right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happened
&lt;/h2&gt;

&lt;p&gt;The attacker gained publishing access to the official axios npm package, likely through a compromised maintainer account. Instead of modifying axios source code directly, they added a malicious dependency — &lt;code&gt;plain-crypto-js@4.2.1&lt;/code&gt; — to the package.json. That package had a "clean" version published 18 hours earlier to establish a plausible history on the registry.&lt;/p&gt;

&lt;p&gt;On &lt;code&gt;npm install&lt;/code&gt;, the malicious package ran a &lt;code&gt;postinstall&lt;/code&gt; hook that executed a double-obfuscated dropper script. The dropper detected the host OS, downloaded a platform-specific RAT from a C2 server at &lt;code&gt;sfrclak[.]com:8000&lt;/code&gt;, and then deleted all traces of the postinstall script.&lt;/p&gt;

&lt;p&gt;The RAT capabilities include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;macOS:&lt;/strong&gt; Binary at &lt;code&gt;/Library/Caches/com.apple.act.mond&lt;/code&gt; disguised as an Apple daemon. Accepts commands for arbitrary binary injection, shell execution, and filesystem enumeration. Beacons every 60 seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows:&lt;/strong&gt; PowerShell RAT disguised as Windows Terminal at &lt;code&gt;%PROGRAMDATA%\wt.exe&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linux:&lt;/strong&gt; Python RAT at &lt;code&gt;/tmp/ld.py&lt;/code&gt; launched as an orphaned background process.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why your lockfile isn't enough
&lt;/h2&gt;

&lt;p&gt;Most of the incident response guidance focuses on checking lockfiles and running &lt;code&gt;snyk test&lt;/code&gt; against your source repository. That's necessary but incomplete.&lt;/p&gt;

&lt;p&gt;The gap: container images. If any image in your cluster was built between 00:21 and 03:29 UTC on March 31, the build may have pulled axios 1.14.1 or 0.30.4. That image is now running in your cluster with the RAT baked in, regardless of whether you've since fixed your lockfile.&lt;/p&gt;

&lt;p&gt;This matters because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD pipelines that build images overnight&lt;/strong&gt; in UTC-aligned schedules were squarely in the window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-stage Docker builds&lt;/strong&gt; that run &lt;code&gt;npm install&lt;/code&gt; without a committed lockfile (or with &lt;code&gt;npm install&lt;/code&gt; instead of &lt;code&gt;npm ci&lt;/code&gt;) would have pulled the latest malicious version&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Images already deployed don't get rescanned&lt;/strong&gt; unless you explicitly trigger it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Checking your source repo is step one. Checking what's actually running in your clusters is step two, and most organizations skip it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to check your Kubernetes clusters
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Find affected images
&lt;/h3&gt;

&lt;p&gt;If you generate SBOMs from your container images (via Syft, Trivy, or similar), query them for the compromised versions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Scan a running image for the compromised package&lt;/span&gt;
grype &amp;lt;image&amp;gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"axios.*1&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;14&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;1|axios.*0&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;30&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;4|plain-crypto-js"&lt;/span&gt;

&lt;span class="c"&gt;# Or generate an SBOM and search it&lt;/span&gt;
syft &amp;lt;image&amp;gt; &lt;span class="nt"&gt;-o&lt;/span&gt; json | jq &lt;span class="s1"&gt;'.artifacts[] | select(.name == "axios" and (.version == "1.14.1" or .version == "0.30.4"))'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
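
&lt;p&gt;If you'd rather script the check across many images, the same filter works in Go. The field names follow syft's JSON output (a top-level &lt;code&gt;artifacts&lt;/code&gt; array with &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;version&lt;/code&gt;):&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// artifact holds the two fields we need from syft's JSON SBOM output.
type artifact struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

type sbom struct {
	Artifacts []artifact `json:"artifacts"`
}

// isCompromised flags the known-bad versions. plain-crypto-js is flagged at
// any version, since the package only existed to stage this attack.
func isCompromised(name, version string) bool {
	switch name {
	case "axios":
		return version == "1.14.1" || version == "0.30.4"
	case "plain-crypto-js":
		return true
	}
	return false
}

// findCompromised scans one SBOM document for the bad packages.
func findCompromised(sbomJSON []byte) ([]artifact, error) {
	var s sbom
	if err := json.Unmarshal(sbomJSON, &s); err != nil {
		return nil, err
	}
	var hits []artifact
	for _, a := range s.Artifacts {
		if isCompromised(a.Name, a.Version) {
			hits = append(hits, a)
		}
	}
	return hits, nil
}

func main() {
	doc := []byte(`{"artifacts":[{"name":"axios","version":"1.14.1"},{"name":"left-pad","version":"1.3.0"}]}`)
	hits, _ := findCompromised(doc)
	fmt.Println(hits) // prints: [{axios 1.14.1}]
}
```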



&lt;h3&gt;
  
  
  2. Check for the RAT indicators on nodes
&lt;/h3&gt;

&lt;p&gt;If you have node-level access, check for the platform-specific IOCs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux nodes — check for the Python RAT&lt;/span&gt;
kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; name | xargs &lt;span class="nt"&gt;-I&lt;/span&gt;&lt;span class="o"&gt;{}&lt;/span&gt; kubectl debug &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;busybox &lt;span class="nt"&gt;--&lt;/span&gt; find /tmp &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"ld.py"&lt;/span&gt;

&lt;span class="c"&gt;# Check for outbound connections to the C2&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;--all-namespaces&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; name | xargs &lt;span class="nt"&gt;-I&lt;/span&gt;&lt;span class="o"&gt;{}&lt;/span&gt; kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"cat /proc/net/tcp 2&amp;gt;/dev/null | grep '&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s2"&gt;"%X"&lt;/span&gt; 142.11.206.73 | &lt;span class="nb"&gt;fold&lt;/span&gt; &lt;span class="nt"&gt;-w2&lt;/span&gt; | &lt;span class="nb"&gt;tac&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;'"&lt;/span&gt; 2&amp;gt;/dev/null
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Check network policies for C2 egress
&lt;/h3&gt;

&lt;p&gt;The RAT beacons to &lt;code&gt;142.11.206.73:8000&lt;/code&gt;. If you have network policy enforcement (Cilium, Calico), check whether any pod has made outbound connections to that IP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# If using Cilium with Hubble&lt;/span&gt;
hubble observe &lt;span class="nt"&gt;--to-ip&lt;/span&gt; 142.11.206.73 &lt;span class="nt"&gt;--verdict&lt;/span&gt; FORWARDED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Block the compromised package at admission
&lt;/h3&gt;

&lt;p&gt;If you run an admission controller with OPA policies, add a rule to reject images containing the compromised dependency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"Pod"&lt;/span&gt;
    &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# Flag images built during the compromise window&lt;/span&gt;
    &lt;span class="c1"&gt;# (requires SBOM-aware admission — see your scanner's docs)&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Container %s may contain compromised axios — verify image SBOM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Or skip the manual work
&lt;/h3&gt;

&lt;p&gt;The steps above work, but they're per-image and per-node. If you're running dozens of namespaces across multiple clusters, doing this manually doesn't scale.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://juliet.sh?ref=devto" rel="noopener noreferrer"&gt;Juliet&lt;/a&gt; continuously generates SBOMs from every container image running in your clusters and builds a graph of how vulnerabilities connect to workloads, RBAC permissions, network policies, and secrets. For an incident like this, you open Explorer and type what you're looking for in plain English:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwv0z559qmukqsj4cgijw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwv0z559qmukqsj4cgijw.png" alt="Juliet Explorer — natural language query for compromised axios versions" width="800" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Juliet converts the natural language query into structured filters across your entire cluster graph and returns every match:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w1cqc7604oyqoiec8go.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w1cqc7604oyqoiec8go.png" alt="Juliet Explorer — results showing compromised axios 0.30.4 found across container images" width="800" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every affected pod across every cluster, plus the blast radius: what service accounts those pods use, what secrets they can access, whether they have network egress to the C2 IP, and which other workloads they can reach. No grep, no per-image scanning, no guessing which namespaces to check.&lt;/p&gt;

&lt;p&gt;You can also set an admission control policy to block any new deployment containing &lt;code&gt;plain-crypto-js&lt;/code&gt; or the affected axios versions — so even if a team hasn't seen the advisory yet, the compromised image can't land in the cluster.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do right now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you find affected images:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don't just update the lockfile and redeploy.&lt;/strong&gt; The RAT may have already exfiltrated secrets from the container's environment. Rotate every secret that was mounted into or accessible from affected pods — service account tokens, API keys, database credentials, cloud provider credentials.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rebuild images from scratch.&lt;/strong&gt; Don't layer a fix on top of a potentially compromised image. Rebuild from the base image with a clean &lt;code&gt;npm ci&lt;/code&gt; against a verified lockfile.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Check for lateral movement.&lt;/strong&gt; If the RAT was active, the attacker had arbitrary code execution inside your cluster. Review RBAC permissions of affected pods — could they access other namespaces, secrets, or the Kubernetes API?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Block the C2 at the network level.&lt;/strong&gt; Add &lt;code&gt;142.11.206.73&lt;/code&gt; and &lt;code&gt;sfrclak[.]com&lt;/code&gt; to your network policy deny lists and DNS blocklists immediately.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
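&lt;p&gt;For step 4, a standard Kubernetes NetworkPolicy can do the IP-level block. One caveat: NetworkPolicy is allow-list based, so blocking a single IP means allowing all egress &lt;em&gt;except&lt;/em&gt; that address. A minimal sketch (policy name and namespace are placeholders; repeat per namespace):&lt;/p&gt;

```yaml
# Allow all egress except the known C2 address. podSelector {} matches
# every pod in the namespace; copy this policy into each namespace you run.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-c2-egress        # illustrative name
  namespace: production       # repeat per namespace
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 142.11.206.73/32
```

&lt;p&gt;Pair this with the DNS blocklist for &lt;code&gt;sfrclak[.]com&lt;/code&gt; — an IP-only block is brittle if the C2 moves.&lt;/p&gt;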

&lt;p&gt;&lt;strong&gt;If you don't find affected images:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Verify, don't assume.&lt;/strong&gt; The absence of evidence in a spot check isn't the same as a clean bill of health. Scan every image in every namespace, not just the ones you think might be affected.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add &lt;code&gt;plain-crypto-js&lt;/code&gt; to your package blocklist&lt;/strong&gt; in whatever registry proxy or admission policy you use.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enforce &lt;code&gt;npm ci&lt;/code&gt; in all Dockerfiles.&lt;/strong&gt; If any of your Dockerfiles use &lt;code&gt;npm install&lt;/code&gt; instead of &lt;code&gt;npm ci&lt;/code&gt;, the build can resolve fresh versions within your semver ranges and rewrite the lockfile to match; &lt;code&gt;npm ci&lt;/code&gt; installs exactly what the lockfile pins and fails on any mismatch. That's how a three-hour window becomes your problem.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
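&lt;p&gt;A quick audit for that last step — this sketch flags any &lt;code&gt;RUN&lt;/code&gt; line invoking &lt;code&gt;npm install&lt;/code&gt; in any Dockerfile under the current directory (assumes GNU grep):&lt;/p&gt;

```shell
# Flag Dockerfiles that use `npm install` (which can drift from the lockfile)
# instead of `npm ci` (which installs exactly what the lockfile pins).
# Matching on RUN lines avoids false positives from comments.
grep -rnE '^[[:space:]]*RUN .*npm install' --include='Dockerfile*' . \
  || echo "no lockfile-ignoring installs found"
```

&lt;p&gt;Wire it into CI so a stray &lt;code&gt;npm install&lt;/code&gt; fails the build rather than waiting for the next advisory.&lt;/p&gt;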

&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;This is the third major npm supply chain attack in 2026. The playbook is consistent: compromise a maintainer account, inject a malicious transitive dependency rather than modifying the source directly, use postinstall hooks for execution, and deploy platform-specific payloads that self-delete.&lt;/p&gt;

&lt;p&gt;The defenses that matter are also consistent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lockfile enforcement&lt;/strong&gt; (&lt;code&gt;npm ci&lt;/code&gt;, not &lt;code&gt;npm install&lt;/code&gt;) in every build&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SBOM generation&lt;/strong&gt; on built images, not just source repos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime visibility&lt;/strong&gt; into what's actually deployed in your clusters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Admission control&lt;/strong&gt; that can block known-bad packages before they run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network policy&lt;/strong&gt; that limits egress from workloads by default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your source repo being clean doesn't mean your cluster is clean. The question after every supply chain incident is: what's actually running right now, and can it reach anything it shouldn't?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://juliet.sh/blog/axios-npm-supply-chain-compromise-finding-it-in-your-kubernetes-clusters?ref=devto" rel="noopener noreferrer"&gt;juliet.sh&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>kubernetes</category>
      <category>npm</category>
      <category>supplychain</category>
    </item>
    <item>
      <title>Introducing the ABOM: Why Your CI/CD Pipelines Need a Bill of Materials</title>
      <dc:creator>TooFastTooCurious</dc:creator>
      <pubDate>Fri, 27 Mar 2026 00:26:52 +0000</pubDate>
      <link>https://dev.to/toofasttoocurious/introducing-the-abom-why-your-cicd-pipelines-need-a-bill-of-materials-5dj6</link>
      <guid>https://dev.to/toofasttoocurious/introducing-the-abom-why-your-cicd-pipelines-need-a-bill-of-materials-5dj6</guid>
      <description>&lt;p&gt;An &lt;strong&gt;ABOM (Actions Bill of Materials)&lt;/strong&gt; is a complete inventory of every GitHub Action your CI/CD pipelines depend on — including transitive dependencies buried inside composite actions, reusable workflows, and tool wrappers that your workflow files never mention directly.&lt;/p&gt;

&lt;p&gt;If you know what an SBOM is, you already get it. SBOMs catalog your application dependencies. ABOMs catalog your pipeline dependencies. And right now, most organizations have no idea what's actually running in their CI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Take this workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan for vulnerabilities&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;crazy-max/ghaction-container-scan@v3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No mention of Trivy anywhere. But &lt;code&gt;ghaction-container-scan&lt;/code&gt; downloads and runs Trivy internally. When &lt;a href="https://juliet.sh/blog/trivy-supply-chain-compromise-what-kubernetes-teams-need-to-know" rel="noopener noreferrer"&gt;76 of 77 Trivy release tags were poisoned with credential-stealing malware&lt;/a&gt; in March 2026, organizations that grepped their workflows for &lt;code&gt;trivy-action&lt;/code&gt; found nothing — and assumed they were safe.&lt;/p&gt;

&lt;p&gt;They weren't.&lt;/p&gt;

&lt;p&gt;This isn't a Trivy-specific problem. It's a structural one. GitHub Actions have a dependency tree just like application code does, but nobody's been tracking it.&lt;/p&gt;
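&lt;p&gt;You can see one level of that tree yourself: an action's &lt;code&gt;action.yml&lt;/code&gt; declares what it actually runs. A hedged sketch (the URL pattern assumes the action keeps &lt;code&gt;action.yml&lt;/code&gt; at the repo root, which most do):&lt;/p&gt;

```shell
# Fetch an action's metadata and list its nested `uses:` references —
# the dependencies that grepping your own workflow files will never show.
curl -fsSL https://raw.githubusercontent.com/crazy-max/ghaction-container-scan/v3/action.yml |
  grep -E '^[[:space:]]*(-[[:space:]]*)?uses:' || echo "no nested actions found"
```

&lt;p&gt;This only goes one level deep, and it won't surface tools the action downloads at runtime rather than declaring — which is exactly why the resolution needs to be recursive and metadata-aware.&lt;/p&gt;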

&lt;h2&gt;
  
  
  Why SBOMs don't cover this
&lt;/h2&gt;

&lt;p&gt;SBOMs document what goes &lt;em&gt;into&lt;/em&gt; your software — libraries, packages, container base images. That's the artifact side.&lt;/p&gt;

&lt;p&gt;But the pipeline that builds, tests, scans, and deploys that software has its own dependency tree. A compromised CI action can steal every secret in your pipeline, poison every artifact it touches, and propagate to every downstream system — and none of that shows up in an SBOM.&lt;/p&gt;

&lt;p&gt;After Trivy, this stopped being theoretical. The attack exfiltrated AWS credentials, Kubernetes tokens, Docker configs, and SSH keys from CI runners. It then used stolen npm credentials to publish a self-propagating worm into downstream packages. The pipeline was the entry point for all of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What an ABOM contains
&lt;/h2&gt;

&lt;p&gt;An ABOM maps every action in your workflows, resolved recursively:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct dependencies&lt;/strong&gt; — the actions your workflows reference explicitly. This is what grep finds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transitive dependencies&lt;/strong&gt; — actions called by composite actions or reusable workflows your workflows use. This is what grep misses. A single &lt;code&gt;uses:&lt;/code&gt; line in your workflow might resolve to a chain of five or six nested actions, any one of which could be compromised.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embedded tools&lt;/strong&gt; — actions that don't call other actions but silently download and execute external tools like Trivy, Grype, or Snyk. These don't show up as action dependencies at all — you have to analyze the action's metadata and inputs to detect them.&lt;/p&gt;

&lt;p&gt;For each action, the ABOM records the owner, repository, version reference, whether it's pinned to an immutable SHA or a mutable tag, and the full chain of how your workflow reaches it.&lt;/p&gt;
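&lt;p&gt;As a sketch, one plausible shape for a single record — field names here are illustrative, not a published schema:&lt;/p&gt;

```json
{
  "owner": "crazy-max",
  "repository": "ghaction-container-scan",
  "reference": "v3",
  "pinned": false,
  "pin_type": "mutable-tag",
  "reached_via": [
    ".github/workflows/ci.yml",
    "crazy-max/ghaction-container-scan@v3"
  ],
  "embedded_tools": ["trivy"]
}
```

&lt;p&gt;The &lt;code&gt;reached_via&lt;/code&gt; chain is the part grep can't give you: it tells you not just that you're exposed, but through which workflow and which intermediate action.&lt;/p&gt;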

&lt;h2&gt;
  
  
  What you do with it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Incident response.&lt;/strong&gt; When a GitHub Action gets compromised, you need to know in minutes whether you're affected — not after a manual audit of every composite action your workflows use. Query the ABOM for the affected action and get an immediate answer, including transitive and embedded exposure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI gate.&lt;/strong&gt; Generate an ABOM on every pull request and fail the build if it contains a known-compromised action. The same way you'd fail a build for a critical CVE in an application dependency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance.&lt;/strong&gt; If you're already generating SBOMs for regulatory or customer requirements, your CI/CD pipeline is a gap in that inventory. An ABOM closes it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Drift detection.&lt;/strong&gt; Compare ABOMs across builds to detect when a new transitive dependency appears or when a previously pinned action gets changed to a mutable tag.&lt;/p&gt;

&lt;h2&gt;
  
  
  Standard formats
&lt;/h2&gt;

&lt;p&gt;ABOMs shouldn't be a proprietary format. The dependency relationships in a CI pipeline map cleanly onto existing BOM standards:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CycloneDX 1.5&lt;/strong&gt; — actions become components, transitive relationships go in the dependency graph, compromised actions show up as vulnerabilities. Plugs directly into Dependency-Track, Grype, and other tooling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SPDX 2.3&lt;/strong&gt; — actions become packages with &lt;code&gt;DEPENDS_ON&lt;/code&gt; relationships. Works with existing license compliance and SBOM aggregation tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means you can manage your pipeline dependencies with the same tools you already use for application dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;We built &lt;a href="https://github.com/JulietSecurity/abom" rel="noopener noreferrer"&gt;abom&lt;/a&gt; to generate ABOMs from any GitHub repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/julietsecurity/abom@latest

&lt;span class="c"&gt;# Generate an ABOM for your repo&lt;/span&gt;
abom scan &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Check against known-compromised actions&lt;/span&gt;
abom scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--check&lt;/span&gt;

&lt;span class="c"&gt;# Export as CycloneDX&lt;/span&gt;
abom scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; cyclonedx-json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It resolves transitive dependencies by fetching action metadata from GitHub, caches results locally, and checks against a &lt;a href="https://github.com/JulietSecurity/abom-advisories" rel="noopener noreferrer"&gt;community-maintained advisory database&lt;/a&gt; of known-compromised actions. It's open source under Apache 2.0.&lt;/p&gt;

&lt;h2&gt;
  
  
  This will happen again
&lt;/h2&gt;

&lt;p&gt;The Trivy compromise was not a one-off. GitHub Actions are a high-value target: they run with access to cloud credentials, deployment keys, package registry tokens, and production infrastructure. Any widely used action is one misconfigured token away from becoming the next supply chain incident.&lt;/p&gt;

&lt;p&gt;The question is whether you'll find out you were affected from a tool, or from an incident report.&lt;/p&gt;




&lt;p&gt;Questions? Reach us at &lt;a href="mailto:contact@juliet.sh"&gt;contact@juliet.sh&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>github</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
