
Pico
The axios supply chain attack bypassed every CVE scanner. Behavioral scoring saw it coming.


On March 31, 2026, axios — 300 million weekly downloads, used in roughly 80% of cloud environments — was compromised. An attacker stole a long-lived npm access token from the lead maintainer and published two malicious versions containing a cross-platform RAT.

Snyk, Dependabot, and the standard SCA tools found nothing. Zero CVEs. The malicious versions passed every dependency audit check.

Two tools caught it: Socket (static behavioral analysis) and StepSecurity (real-time network monitoring). Not by checking declarations. By watching what the package actually did.

That's the gap. And it's bigger than axios.

The structural problem CVE databases can't see

npm audit answers one question: "does this package have a known vulnerability?" It doesn't answer: "is this package structurally fragile enough that it will have one?"

Here's what that gap looks like with live data from the npm registry:

  • chalk: 412 million weekly downloads. 1 maintainer. Zero CVEs.
  • zod: 161 million weekly downloads. 1 maintainer. Zero CVEs.
  • axios (before March 31): hundreds of millions of weekly downloads. Critically dependent on a single maintainer's npm access token.

None of these show up in npm audit. All three are single points of failure in the global JS supply chain.

When axios was compromised, its behavioral profile had been warning about this for years. One maintainer. High download volume. Thin governance. The attack didn't create the vulnerability — it exploited one that existed structurally, long before any CVE was filed.
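Those structural signals are public. As a sketch of where the raw numbers come from (this is not how proof-of-commitment is implemented internally; `fetchSignals` and `parseSignals` are illustrative names), you can pull maintainer metadata from the npm registry and weekly downloads from the npm downloads API:

```typescript
// Illustrative sketch: both endpoints are real, public npm APIs.
// Error handling kept minimal on purpose.

interface Signals {
  maintainers: number;
  weeklyDownloads: number;
}

// Pure parsing step, separated so it's testable without network access.
function parseSignals(packument: any, downloadsPoint: any): Signals {
  return {
    maintainers: Array.isArray(packument.maintainers)
      ? packument.maintainers.length
      : 0,
    weeklyDownloads: downloadsPoint?.downloads ?? 0,
  };
}

async function fetchSignals(pkg: string): Promise<Signals> {
  const [meta, dl] = await Promise.all([
    fetch(`https://registry.npmjs.org/${pkg}`).then((r) => r.json()),
    fetch(`https://api.npmjs.org/downloads/point/last-week/${pkg}`).then((r) =>
      r.json()
    ),
  ]);
  return parseSignals(meta, dl);
}

// fetchSignals("chalk").then(console.log);
```

Run that against chalk or zod and you'll see exactly the numbers above: massive reach, one name in the maintainers array.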

What behavioral scoring measures

CVEs are reactive. A vulnerability has to be exploited, discovered, reported, and cataloged before it shows up in a database. By the time it appears, the exposure window has already been open.

Behavioral scoring is structural. It asks: given the current maintainer dynamics, release history, and download volume of this package, what's the structural risk profile?

The dimensions that matter:

Maintainer depth — how many active contributors? A single-maintainer package with 400M weekly downloads is critical infrastructure with no redundancy. That's not a CVE. It's a structural fact that tells you the blast radius if one person's token gets stolen.

Longevity — how long has this package been actively maintained? A six-month-old package with explosive download growth is a different risk category than one with a five-year maintenance record.

Release consistency — is the cadence regular? A sudden new publisher after months of silence is an early signal, not post-compromise evidence.

Download momentum — weekly volume plus trend. High and accelerating means high blast radius.

GitHub backing — commit frequency and contributor health. When 90% of commits come from one person, you have a dependency on that person's continued availability.

Chalk scores CRITICAL on this model. Not because it's a bad package — because it IS critical infrastructure with single-maintainer fragility. The flag is accurate.
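The dimensions above can be folded into a toy classifier. This is an illustrative sketch with made-up thresholds, not the actual proof-of-commitment scoring model:

```typescript
// Hypothetical structural-risk sketch. Thresholds (10M downloads, 90%
// commit share, 1-year track record) are illustrative, not the real model.

interface PackageProfile {
  maintainers: number;         // active npm maintainers
  ageYears: number;            // time since first publish
  weeklyDownloads: number;     // current weekly download volume
  topContributorShare: number; // fraction of commits from the busiest contributor
}

type Risk = "LOW" | "ELEVATED" | "CRITICAL";

function structuralRisk(p: PackageProfile): Risk {
  // Blast radius: how much of the ecosystem breaks if this package is owned.
  const highBlastRadius = p.weeklyDownloads > 10_000_000;

  // Redundancy: one maintainer means one stolen token owns the package.
  const singlePointOfFailure =
    p.maintainers <= 1 || p.topContributorShare > 0.9;

  // Young packages with explosive growth have no track record to lean on.
  const thinTrackRecord = p.ageYears < 1;

  if (highBlastRadius && singlePointOfFailure) return "CRITICAL";
  if (singlePointOfFailure || thinTrackRecord) return "ELEVATED";
  return "LOW";
}

// chalk-like profile: huge reach, single-maintainer fragility.
const chalkLike: PackageProfile = {
  maintainers: 1,
  ageYears: 10,
  weeklyDownloads: 412_000_000,
  topContributorShare: 0.95,
};
console.log(structuralRisk(chalkLike)); // "CRITICAL"
```

Note what the sketch makes obvious: none of the inputs are CVEs. Every one is a structural fact you can observe before anything goes wrong.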

What you can run right now

There's an npm package called proof-of-commitment that scores packages on these dimensions:

```shell
npx proof-of-commitment chalk
npx proof-of-commitment zod
npx proof-of-commitment --file package.json
```

MIT licensed. No API key. Works as an MCP server — your AI coding assistant can audit dependencies inline when it recommends adding one (listed in the official MCP registry).

The score isn't a CVE replacement. It's the question CVE databases don't ask: is this package structurally likely to become a problem before anyone notices?

The axios attack was visible in behavioral data before disclosure. The maintainer structure, the access token exposure surface, the download-to-contributor ratio — these signals existed. They weren't being scored.

The broader problem

The supply chain failure is an instance of something that happens everywhere: declarative trust gets gamed. CVE databases trust what packages declare (or don't declare). SOC2 reports trust what companies say about themselves. Review scores trust what users self-report.

Behavioral signals are harder to fake. What the maintenance ecosystem actually does — releases, commits, contributor churn — is a side effect of real work. You can't fabricate a five-year commit history or fake distributed maintainers across three time zones.
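Contributor concentration is a good example: it falls straight out of `git shortlog -sn` output, a side effect of years of real commits. A minimal parser (`topContributorShare` is a hypothetical helper, shown for illustration):

```typescript
// Compute the fraction of commits authored by the single busiest
// contributor, given the text output of `git shortlog -sn`
// (lines like "   90\tAlice").
function topContributorShare(shortlog: string): number {
  const counts = shortlog
    .trim()
    .split("\n")
    .map((line) => parseInt(line.trim().split(/\s+/)[0], 10))
    .filter((n) => !isNaN(n));
  const total = counts.reduce((a, b) => a + b, 0);
  return total === 0 ? 0 : Math.max(...counts) / total;
}

// 90 of 100 commits from one person: a structural dependency on one human.
const sample = "   90\tAlice\n    8\tBob\n    2\tCarol\n";
console.log(topContributorShare(sample)); // 0.9
```

To fake a low score here, an attacker would have to manufacture years of plausibly distributed commit history in advance. That's what makes behavioral signals expensive to game.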

That's the thesis behind getcommit.dev: behavioral commitment is the trust signal that survives in a world where declarations are cheap. Starting with npm and PyPI package scoring, expanding from there.

Run npx proof-of-commitment --file package.json on your current project. The chalk number will surprise you.
