mckeane mcbrearty

Posted on Mar 3 • Originally published at westbayberry.com

How Dependency Guardian Would Have Caught Shai-Hulud

#cybersecurity #javascript #node #security

In September 2025, the npm ecosystem experienced its first self-replicating worm. The attacker published a single malicious package under a common typosquat name. Within 24 hours, 500+ packages were compromised. By the time the second wave subsided in November, 796 packages had been infected. The worm was called Shai-Hulud.

Every major CVE-based security tool — Dependabot, Snyk, npm audit — gave it a clean bill of health.

This post is a technical breakdown of exactly how Dependency Guardian's behavioral analysis engine would have blocked Shai-Hulud at the PR gate, before a single infected package entered your dependency tree.

What Shai-Hulud Did

The attack was elegant in its simplicity. The initial payload was a preinstall script that executed a bundled JavaScript file:

{
  "name": "colros",
  "version": "1.5.0",
  "scripts": {
    "preinstall": "node scripts/setup.js"
  }
}

That setup.js file did four things:

Read npm tokens from ~/.npmrc and environment variables (NPM_TOKEN, GITHUB_TOKEN)

Exfiltrated the tokens via the GitHub API — POST https://api.github.com/repos//shai-hulud/issues — disguised as normal GitHub traffic

Used Bun-specific APIs (Bun.spawn, Bun.file) to bypass Node.js security sandboxes that only monitor child_process and fs

Ran npm publish on every package the stolen token had write access to, injecting the same preinstall hook into each one

The self-replication loop was the breakthrough. Previous npm attacks were one-shot: compromise a package, steal data, done. Shai-Hulud was a worm. Every victim became a new attack vector.

Why Existing Tools Missed It

The answer is straightforward: there was no CVE to match against.

CVE-based tools work by maintaining a database of known-vulnerable package versions. When you run npm audit, it checks your lockfile against that database. If package@version appears in the database, you get an alert. If it doesn't, silence.
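To make the blind spot concrete, here is a minimal sketch of a database lookup of this kind. The advisory set, the package name example-lib, and the function auditLockfile are hypothetical; real tools match version ranges and carry richer metadata.

```javascript
// Hypothetical advisory database keyed by "name@version" (illustrative only).
const knownVulnerable = new Set(["example-lib@1.0.0"]);

// Returns the lockfile entries that appear in the advisory database.
function auditLockfile(lockfile, advisories) {
  return Object.entries(lockfile)
    .map(([name, version]) => `${name}@${version}`)
    .filter((key) => advisories.has(key));
}

// A freshly published malicious version has no advisory yet, so it is not reported.
const lockfile = { "example-lib": "1.0.0", colros: "1.5.0" };
console.log(auditLockfile(lockfile, knownVulnerable)); // only example-lib@1.0.0 is flagged
```

The lookup can only return what someone has already reported, which is why colros@1.5.0 passes silently in this sketch.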

Shai-Hulud was the supply-chain equivalent of a zero-day. No CVE existed. No advisory had been filed. The infected packages were otherwise legitimate libraries with new patch versions published using stolen credentials. From the perspective of npm audit, version 2.3.1 of a popular utility was just another routine update.

The attack also made deliberate choices to evade heuristic tools:

GitHub API for exfiltration: most network-based heuristics allowlist github.com as a known-good domain

Bun runtime APIs: security sandboxes that hook child_process.exec and fs.readFileSync saw nothing because the code used Bun.spawn and Bun.file instead

No obfuscation: the code was clean, readable JavaScript — no base64 encoding, no eval, no string concatenation tricks
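The allowlist problem in the first point can be illustrated with a small sketch of a naive destination check. The host list, the function name isSuspiciousDestination, and the example URLs are assumptions made for illustration, not taken from any particular tool.

```javascript
// Hypothetical allowlist of "known-good" hosts used by a naive network heuristic.
const ALLOWLISTED_HOSTS = new Set(["github.com", "api.github.com", "registry.npmjs.org"]);

// Flags a request only when its host is not on the allowlist.
function isSuspiciousDestination(url) {
  const { hostname } = new URL(url);
  return !ALLOWLISTED_HOSTS.has(hostname);
}

// Exfiltration through an allowlisted API looks the same as normal traffic.
console.log(isSuspiciousDestination("https://api.github.com/repos/example-owner/example-repo/issues")); // false
console.log(isSuspiciousDestination("https://attacker.example/collect")); // true
```

A check keyed on the destination alone cannot tell a legitimate GitHub call from one carrying stolen tokens, which is why the behavior of the calling code matters more than the domain.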

Which Dependency Guardian Detectors Would Fire

Dependency Guardian doesn't ask "is this package in a vulnerability database?" It asks "what does this code do?" Each of the 26 behavioral detectors examines a specific capability. Here's what would have triggered on Shai-Hulud's payload:

Detector	Severity	What It Caught
install_script	5	preinstall hook executing node scripts/setup.js — a non-benign script running arbitrary code at install time
child_process	4	Shell command spawning during the install lifecycle (4-pass detection: static imports, high-risk calls, inline requires, dynamic identifier-aware patterns)
network_exfil	5	Outbound HTTP calls to api.github.com — the 5-pass architecture catches builtin HTTP, third-party clients, WebSocket, DNS, and global APIs like fetch regardless of the destination domain
ci_secret_access	5	Reading NPM_TOKEN and GITHUB_TOKEN from process.env — the 17-pattern engine with PROC/ENV/D/BO building blocks catches dot notation, bracket notation, destructuring, aliasing, and optional chaining
token_theft	5	Reading ~/.npmrc via file system calls — 21 credential target patterns including .npmrc, .yarnrc, and path.join split-path evasions
bun_runtime_evasion	4	Bun.spawn, Bun.file API usage — detects dot notation, bracket notation (Bun['spawn']), .call/.apply/.bind, globalThis.Bun, and destructured aliases
worm_behavior	5	npm publish commands executed programmatically — catches exec('npm publish'), spawn('npm', ['publish']), Bun.spawn publish variants, and execa wrappers

Seven detectors. Seven independent signals that this package is doing things a color utility should never do.
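As a rough illustration of what a single behavioral detector might look like, the sketch below checks a parsed package.json for install-time hooks that launch an interpreter or downloader. The function detectInstallScript, the command regex, and the finding shape are hypothetical and far simpler than a production detector.

```javascript
// Lifecycle hooks that npm runs automatically during install.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Hypothetical heuristic: commands that start an interpreter, shell, or downloader.
const RISKY_COMMAND = /\b(node|bun|sh|bash|curl|wget)\b/;

// Returns one finding per install-time hook whose command looks risky.
function detectInstallScript(pkg) {
  const scripts = pkg.scripts || {};
  return INSTALL_HOOKS
    .filter((hook) => typeof scripts[hook] === "string")
    .filter((hook) => RISKY_COMMAND.test(scripts[hook]))
    .map((hook) => ({ detector: "install_script", severity: 5, hook, command: scripts[hook] }));
}

const colros = { name: "colros", version: "1.5.0", scripts: { preinstall: "node scripts/setup.js" } };
console.log(detectInstallScript(colros)); // one finding for the preinstall hook
```

Note that the check never consults a vulnerability database; it only looks at what the package asks the installer to run.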

A Closer Look at the Detection

Consider what the ci_secret_access detector sees when it scans setup.js:

// Shai-Hulud's token harvesting (simplified)
const token = process.env.NPM_TOKEN || process.env.npm_token;
const ghToken = process.env.GITHUB_TOKEN;

The detector's 2-pass per-file architecture first runs findAliases() to discover any process.env aliasing (destructuring, variable assignment, globalThis indirection). Then buildAliasPatterns() generates dynamic regexes specific to that file. Even if the attacker had written:

const e = process["env"];
const t = e["NPM" + "_TOKEN"];

the alias tracking would catch the process["env"] assignment, and the string-concatenation evasion pattern would flag the token construction at severity 4.
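The two-pass idea can be sketched roughly as follows. The function names findAliases and buildAliasPatterns come from the description above, but these bodies are a simplified guess: they handle only plain variable assignment, not destructuring or globalThis indirection, and the regexes are assumptions.

```javascript
// Pass 1: collect identifiers assigned directly from process.env or process["env"].
// The negative lookahead skips direct reads such as process.env.NPM_TOKEN.
function findAliases(source) {
  const aliasDecl =
    /(?:const|let|var)\s+([A-Za-z_]\w*)\s*=\s*process\s*(?:\.\s*env\b|\[\s*["']env["']\s*\])(?!\s*[.\[])/g;
  return [...source.matchAll(aliasDecl)].map((m) => m[1]);
}

// Pass 2: build per-file regexes that match property reads on each alias,
// using either dot notation (alias.X) or bracket notation (alias[...]).
function buildAliasPatterns(aliases) {
  return aliases.map((name) => new RegExp(`\\b${name}\\s*(?:\\.|\\[)`));
}

const sample = 'const e = process["env"];\nconst t = e["NPM" + "_TOKEN"];';
const aliases = findAliases(sample);
const hits = buildAliasPatterns(aliases).filter((re) => re.test(sample));
console.log(aliases, hits.length); // one alias ("e") and one matching pattern
```

Generating the second-pass patterns per file keeps the alias names scoped to the code that declared them, so a short variable name like e in one file does not produce matches in another.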

The worm_behavior detector is similarly thorough. It doesn't just grep for npm publish. It looks for the entire taxonomy of programmatic publishing:

// All of these trigger worm_behavior:
exec("npm publish --access public");
spawn("npm", ["publish"]);
Bun.spawn(["npm", "publish"]);
Bun["spawnSync"](["bun", "publish"]);   // bracket notation evasion
execa("npm", ["publish"]);
execaCommand("npm publish");

Each pattern carries a weight of 3. A single match is enough for a severity 5 finding.
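A simplified sketch of this kind of matching is shown below. The three regexes and the function detectWormBehavior are hypothetical stand-ins that cover only the six example calls above, and summing the per-pattern weight of 3 across matches is an assumption about how weights combine.

```javascript
// Hypothetical patterns covering the example calls listed above.
const PUBLISH_PATTERNS = [
  // exec("npm publish ..."), execaCommand("npm publish")
  /\b(?:exec|execSync|execaCommand)\s*\(\s*["'](?:npm|bun)\s+publish\b/,
  // spawn("npm", ["publish"]), execa("npm", ["publish"])
  /\b(?:spawn|spawnSync|execa)\s*\(\s*["'](?:npm|bun)["']\s*,\s*\[\s*["']publish["']/,
  // Bun.spawn(["npm", "publish"]) and Bun["spawnSync"](["bun", "publish"])
  /\bBun\s*(?:\.\s*spawn(?:Sync)?|\[\s*["']spawn(?:Sync)?["']\s*\])\s*\(\s*\[\s*["'](?:npm|bun)["']\s*,\s*["']publish["']/,
];

const PATTERN_WEIGHT = 3; // per-pattern weight stated in the post

// Returns a severity 5 finding if any publish pattern matches, otherwise null.
function detectWormBehavior(source) {
  const matched = PUBLISH_PATTERNS.filter((re) => re.test(source));
  if (matched.length === 0) return null;
  return { detector: "worm_behavior", severity: 5, weight: matched.length * PATTERN_WEIGHT };
}

console.log(detectWormBehavior('Bun["spawnSync"](["bun", "publish"]);')); // severity 5 finding
console.log(detectWormBehavior('console.log("npm publish");')); // null
```

Matching on the call shape rather than the bare string "npm publish" is what keeps a log message or README snippet from triggering the detector in this sketch.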

How the Correlator Amplifies Signals

Individual detectors produce findings. The correlator examines combinations of findings across detectors and generates amplifier findings when multiple signals co-occur in the same package. This is where Shai-Hulud's score goes from "suspicious" to "automatic block."

Four amplifiers would fire:

worm_propagation (severity 5, critical)

Fires when worm_behavior + (token_theft OR ci_secret_access) are both present. The correlator's evidence string is explicit:

"This matches the Shai-Hulud worm pattern: steal tokens, publish malicious versions"

This amplifier was added specifically because of Shai-Hulud. Confidence is capped at 0.95, and the critical: true flag means the score is floored at 100.

secret_theft (severity 5, critical)

Fires when ci_secret_access + network_exfil co-occur: the package reads secrets from the environment and also sends data over the network. Confidence is the weakest constituent finding's confidence plus a 0.15 boost, capped at 0.9.

install_download_exec (severity 5)

Fires when install_script + network_exfil co-occur. A lifecycle hook that makes network calls is the canonical dropper pattern. Confidence is capped at 0.85.

bun_evasion_attack (severity 5, critical)

Fires when bun_runtime_evasion + (network_exfil OR token_theft) co-occur. Using Bun-specific APIs alongside network exfiltration or credential theft indicates deliberate sandbox evasion.

All four of these amplifier IDs appear in the UNCONDITIONAL_BLOCK_IDS list in the reporting engine. Any single one of them triggers an automatic block regardless of the numeric score.
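The co-occurrence rules for these four amplifiers can be sketched as a small rule table. The data shape, the function correlate, and the helper secretTheftConfidence are hypothetical; only the detector combinations, the critical flags, and the secret_theft confidence formula are taken from the descriptions above.

```javascript
// Each rule fires when every detector in `all` is present and, if `any` is
// non-empty, at least one detector in `any` is present too.
const RULES = [
  { id: "worm_propagation", critical: true, all: ["worm_behavior"], any: ["token_theft", "ci_secret_access"] },
  { id: "secret_theft", critical: true, all: ["ci_secret_access", "network_exfil"], any: [] },
  { id: "install_download_exec", critical: false, all: ["install_script", "network_exfil"], any: [] },
  { id: "bun_evasion_attack", critical: true, all: ["bun_runtime_evasion"], any: ["network_exfil", "token_theft"] },
];

// Turns base findings into amplifier findings based on which detectors fired.
function correlate(findings) {
  const present = new Set(findings.map((f) => f.detector));
  return RULES
    .filter((r) => r.all.every((d) => present.has(d)))
    .filter((r) => r.any.length === 0 || r.any.some((d) => present.has(d)))
    .map((r) => ({ id: r.id, severity: 5, critical: r.critical }));
}

// secret_theft confidence as described: weakest constituent + 0.15, capped at 0.9.
function secretTheftConfidence(constituentConfidences) {
  return Math.min(0.9, Math.min(...constituentConfidences) + 0.15);
}

const base = ["install_script", "child_process", "network_exfil", "ci_secret_access",
  "token_theft", "bun_runtime_evasion", "worm_behavior"].map((detector) => ({ detector }));
console.log(correlate(base).map((a) => a.id)); // all four amplifier IDs
```

Keeping the rules as data makes it easy to see that no single detector triggers an amplifier on its own; each one needs at least two independent signals.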

What the Scan Output Would Look Like

When Dependency Guardian runs in CI (as a GitHub Action, GitLab CI step, or any PR gate), the scan would produce output like this:

Package: colros@1.5.0

Score: 100 / 100

Verdict: BLOCK

Findings (7 base + 4 amplifiers):

[sev 5] install_script — Suspicious preinstall script

[sev 5] ci_secret_access — Reads NPM_TOKEN, GITHUB_TOKEN from environment

[sev 5] token_theft — Reads ~/.npmrc credential file

[sev 5] network_exfil — HTTP POST to api.github.com

[sev 5] worm_behavior — Executes npm publish programmatically

[sev 4] child_process — Shell command execution during install

[sev 4] bun_runtime_evasion — Bun.spawn, Bun.file API usage

Amplifiers:

[sev 5] worm_propagation — Steals credentials AND self-publishes (CRITICAL)

[sev 5] secret_theft — CI secret access + network exfiltration (CRITICAL)

[sev 5] install_download_exec — Install script with network access

[sev 5] bun_evasion_attack — Bun APIs used for sandbox evasion (CRITICAL)

The PR comment would display: BLOCK: Critical supply chain threat detected.

The critical: true flag on worm_propagation alone is enough to floor the score at 100. Even if the base detector scores somehow summed to less than the block threshold, the critical flag overrides numeric scoring. The scoring engine enforces this:

const hasCritical = findings.some((f) => f.critical === true);
if (hasCritical) {
  raw = Math.max(raw, 100);
}

There is no configuration that overrides a critical finding. The PR is blocked.
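Putting the score floor together with the unconditional block list gives a verdict step along these lines. This is a sketch under assumptions: the function decideVerdict, the BLOCK_THRESHOLD value of 70, and the "PASS" verdict name are all hypothetical, and only the critical floor and the four block IDs come from the post.

```javascript
// Amplifier IDs that force a block regardless of the numeric score.
const UNCONDITIONAL_BLOCK_IDS = new Set([
  "worm_propagation", "secret_theft", "install_download_exec", "bun_evasion_attack",
]);

const BLOCK_THRESHOLD = 70; // hypothetical value, not taken from the product

// Combines the critical score floor with the unconditional block list.
function decideVerdict(rawScore, findings) {
  const hasCritical = findings.some((f) => f.critical === true);
  const score = Math.min(hasCritical ? Math.max(rawScore, 100) : rawScore, 100);
  const forced = findings.some((f) => UNCONDITIONAL_BLOCK_IDS.has(f.id));
  return { score, verdict: (forced || score >= BLOCK_THRESHOLD) ? "BLOCK" : "PASS" };
}

// A non-critical amplifier on the block list still blocks at a low score.
console.log(decideVerdict(10, [{ id: "install_download_exec", critical: false }]));
// A critical finding floors the score at 100.
console.log(decideVerdict(10, [{ id: "worm_propagation", critical: true }]));
```

The two mechanisms are independent in this sketch: the critical flag changes the reported score, while the block list changes the verdict even when the score stays low.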

The Key Insight

Shai-Hulud was not sophisticated. It didn't use novel exploitation techniques. It didn't chain zero-days in the Node.js runtime. It was a preinstall script that read tokens, made HTTP calls, and ran npm publish. These are ordinary operations — but they are ordinary operations that a color utility package should never perform.

CVE-based tools ask: "Is this package version in our vulnerability database?"

Behavioral analysis asks: "Why is a color utility reading ~/.npmrc and posting to the GitHub API during installation?"

That difference is why Dependency Guardian would have blocked Shai-Hulud on the first infected package, before the worm had a second host to spread to. No CVE required. No advisory needed. Just 26 detectors asking what the code actually does, and a correlator that recognizes when the answers form an attack pattern.

The npm ecosystem will see more worms. The question is whether your CI pipeline asks the right questions before npm install finishes running.

Dependency Guardian is free for up to 200 scans/month. Add it to your CI pipeline in 5 minutes: https://westbayberry.com/docs
