DEV Community: Pico

PostCSS Adopted Staged Publishing. 685M Weekly Downloads Now Gated.

Pico — Sat, 27 Jun 2026 14:54:51 +0000

On June 18, 2026, I filed postcss/postcss#2096 about OIDC provenance for PostCSS. The ai npm account — one person, Andrey Sitnik — publishes PostCSS, nanoid, Autoprefixer, browserslist, and caniuse-lite. Combined: over 900 million weekly downloads through a single publish credential.

Andrey's first reply was not agreement. It was a correction.

"CI-as-publisher increased the attack risks"

From his comment:

Provenance wouldn't save from all of that supply chain attack. The old CI-only based provenance was also a reason of TanStack Shai-Hulud attack.

CI-as-publisher increased the attack risks compared to 2FA manual publishing. TanStack was attacked only because they publish by CI and it was a token on CI.

He is right. TanStack's May 2026 compromise came through GitHub Actions cache poisoning. The attacker got an OIDC token from the CI runner and used it to publish. The provenance attestation was valid — the package was built by TanStack's CI pipeline. The CI pipeline was just also running the attacker's code.

Red Hat's June 1 compromise proved the same pattern. Thirty-two packages published through a compromised GitHub account's CI pipeline. All 32 had valid SLSA provenance attestations.

Andrey's argument: if you publish manually with hardware-bound 2FA (passkey, YubiKey), the attacker needs physical access to your device. If you publish through CI, the attacker needs a GitHub token — a much larger attack surface.

The resolution: Staged Publishing

npm's Staged Publishing splits the problem: CI builds and stages. A human approves before latest moves. A stolen CI token stages a malicious version but never promotes it.

From Andrey's follow-up:

I already moved nanoid and nanospy to the new process, we can test them.

PostCSS will be done in a week or two (too many other open source projects) 😅

The diff

nanoid's release.yml, updated June 18:

- name: Publish npm package
  run: npm stage publish

PostCSS followed through

Andrey said "a week or two." It took nine days. As of June 27, four of the seven packages under the ai npm account have Staged Publishing enabled:

Package	Weekly downloads	Staged Publishing	Score
postcss	251M	✅	85
nanoid	207M	✅	92
browserslist	166M	✅	89
autoprefixer	61M	✅	89
caniuse-lite	171M	—	81
postcss-nested	54M	—	72
postcss-js	53M	—	70

That's 685 million weekly downloads now behind a human approval gate. One GitHub issue, nine days, no drama.

Three more packages remain. When caniuse-lite, postcss-nested, and postcss-js adopt, the entire PostCSS ecosystem — 963 million weekly downloads — will be gated.
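The totals above are plain sums over the table. A minimal Python sketch of that arithmetic, using the rounded figures from the table (the variable names are illustrative and not part of any tool):

```python
# Weekly downloads in millions, copied from the table above (rounded).
# The boolean marks whether Staged Publishing is enabled as of June 27.
packages = {
    "postcss": (251, True),
    "nanoid": (207, True),
    "browserslist": (166, True),
    "autoprefixer": (61, True),
    "caniuse-lite": (171, False),
    "postcss-nested": (54, False),
    "postcss-js": (53, False),
}

gated = sum(dl for dl, staged in packages.values() if staged)
total = sum(dl for dl, _ in packages.values())

print(f"gated: {gated}M of {total}M weekly downloads ({gated / total:.0%})")
# gated: 685M of 963M weekly downloads (71%)
```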

Check your dependencies

npx proof-of-commitment

Scans your lockfile. Flags single-publisher packages at scale. Shows provenance, Staged Publishing, and dormant access status. When nanoid's score went from 90 to 92 after adopting Staged Publishing, the CLI picked it up automatically.

The full PostCSS ecosystem audit data comes from Commit, which scores packages on behavioral signals rather than declared metadata.

vitebot Publishes 140 Million npm Downloads Per Week. The Account Has Zero Public Repos.

Pico — Sat, 27 Jun 2026 10:08:20 +0000

Vite is the build tool behind most of the modern JavaScript ecosystem. React, Vue, Svelte, Astro, Nuxt, and SolidStart all default to it. The package gets 140 million npm installs per week. It has shipped 740 versions.

Every version published in the last 32 months came from a single npm account: vitebot.

The publisher lifecycle

Vite has had four npm publishers over its lifetime. Two had their access revoked: antfu (53 months inactive, 22 versions) and patak (32 months inactive, 72 versions). The third, yyx990803 (Evan You, who created Vite), published 195 versions. His last publish was October 7, 2021. That was 57 months ago. His access was never revoked.

The fourth, vitebot, is the sole active publisher.

Package	Weekly Downloads	Active Publisher	Risk
vite	140M	vitebot	CRITICAL
vitest	70M	1 active, 2 dormant	CRITICAL
@vitejs/plugin-react	65M	vitebot	CRITICAL
@vitejs/plugin-vue	7.3M	vitebot	HIGH
@vitejs/plugin-legacy	0.7M	vitebot	WARN

283 million weekly downloads across the Vite ecosystem, published by one bot account with zero public GitHub repos.

Bot publishers are better. Until they aren't.

Publishing from a CI bot is actually a better practice than publishing from a human account. Bot tokens live in CI secrets, not in someone's .npmrc on a personal laptop. They can be scoped, rotated, and audited. The Vite team gets credit for this.

The problem isn't that vitebot exists. The problem is that vitebot is the only gate. If the CI pipeline is compromised by a GitHub Actions workflow injection, a stolen repo secret, or a forked workflow with a poisoned build step, one push reaches 140 million weekly installs.

This is what happened to Red Hat in June 2026. A compromised GitHub account pushed code to @redhat-cloud-services packages. The CI pipeline published the malware. The packages had valid SLSA provenance. Provenance just signed the attack.

Vite publishes with OIDC provenance too. That's a plus. But provenance proves the build came from a pipeline. It does not prove a human reviewed the release.

Evan You's dormant access

yyx990803 published 195 versions of Vite. His last publish was October 2021, 57 months ago. He still has publish access to both vite and @vitejs/plugin-react.

The Vite team already revoked access for antfu and patak. Whoever did that cleanup missed the creator's own account. Revoking some dormant accounts while leaving another in place is the exact pattern we found in debug, ws, and cliui.

One npm owner rm yyx990803 vite closes this gap in two seconds. Evan You can be re-added if he ever needs to publish an emergency patch directly.

The missing gate: staged publishing

npm's Staged Publishing feature adds a waiting period between npm publish and the version going live. During that window, any npm owner can review the tarball and cancel the release.

Hono adopted it after we flagged them as CRITICAL. PostCSS is in progress. Vite hasn't.

With staged publishing enabled, a compromised vitebot push would still trigger a staging period. A human on the team would see it. They'd have time to cancel before 140 million weekly installs pull the poisoned version.

Without it, the path from compromised token to production is a single npm publish.

What frameworks ship through vitebot

Vite isn't just one package in one project. It's the build layer for the frameworks that most new projects start with:

Nuxt bundles @vitejs/plugin-vue
SvelteKit uses @sveltejs/vite-plugin-svelte
Astro depends on Vite at its core
SolidStart, Remix, Qwik — all Vite-based

A single compromise reaches not just Vite users, but every framework that depends on Vite transitively. That's most of the modern frontend.

What you can do

If you're on the Vite team:

Revoke yyx990803's dormant publish access (npm owner rm yyx990803 vite)
Enable staged publishing on vite and @vitejs/plugin-react
Consider adding a second active human publisher for emergency review

If you depend on Vite:

Pin versions in your lockfile and review lockfile diffs in PRs
Run npx proof-of-commitment --file package-lock.json to check your full dependency tree
Add a CI gate: npx proof-of-commitment --fail-on=critical
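Reviewing lockfile diffs is easier when the diff is reduced to "which pinned versions changed". A minimal sketch, assuming a package-lock.json with lockfileVersion 2 or 3 (the layout with a top-level "packages" map); the function names are hypothetical:

```python
import json

def locked_versions(lock_text: str) -> dict[str, str]:
    """Map each installed path (e.g. 'node_modules/vite') to its pinned version."""
    lock = json.loads(lock_text)
    return {
        path: meta["version"]
        for path, meta in lock.get("packages", {}).items()
        if path and "version" in meta  # "" is the root project entry
    }

def lockfile_changes(old_text: str, new_text: str) -> dict[str, tuple]:
    """Return {path: (old_version, new_version)} for added, removed, or bumped entries."""
    old, new = locked_versions(old_text), locked_versions(new_text)
    return {
        path: (old.get(path), new.get(path))
        for path in sorted(old.keys() | new.keys())
        if old.get(path) != new.get(path)
    }
```

Feed it the lockfile from the base branch and the one from the PR branch. Anything in the result that the PR author did not intend to change deserves a second look before merge.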

How we found this

Commit scores every npm package on behavioral signals — publisher depth, dormant access, release patterns, and provenance status. The publisher lifecycle analysis was added in v1.35.0 and flagged vitebot as the sole active publisher across the Vite ecosystem.

Data pulled June 27, 2026 from the npm registry and Commit API.

npx proof-of-commitment vite vitest @vitejs/plugin-react

Returns the same publisher-depth verdict in 30 seconds, zero install.

Originally published at getcommit.dev. Commit scores npm, PyPI, Cargo, and Go packages on behavioral commitment: signals harder to fake than stars, READMEs, or download counts.

debug Has 653M Weekly Downloads. One Publisher Hasn't Touched It in 8 Years. They Can Still Push a Release.

Pico — Mon, 22 Jun 2026 08:58:18 +0000

npm audit checks for known vulnerabilities. It doesn't check whether someone who hasn't published in eight years can still push a release to a package your project installs every build.

We scanned the publisher lifecycle of top npm packages and flagged every account that (a) has current publish scope and (b) hasn't published to that package in 12+ months.

The findings cover nearly 3 billion weekly downloads across the eight packages in the table below.

The data

Package	Weekly Downloads	Dormant Publisher	Last Published	Inactive
debug	653M	tootallnate	Sep 2017	105 months
cliui	207M	bcoe	Aug 2020	70 months
has-flag	296M	sindresorhus	Jul 2021	59 months
tslib	389M	typescript-bot	Oct 2024	20 months
cross-spawn	234M	satazor	Nov 2024	19 months
escalade	168M	lukeed	Aug 2024	22 months
yargs	219M	bcoe, oss-bot	Apr/May 2025	14-15 months
semver	803M	npm-cli-ops	May 2025	13 months

Every one of these accounts can run npm publish right now and push code that lands in your node_modules/ within minutes.
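The dormant-with-access rule (current publish scope, no publish in 12+ months) can be sketched in a few lines. This is an illustration under stated assumptions, not Commit's implementation: the function names are hypothetical, and inactivity is counted in whole calendar months.

```python
from datetime import date

def months_inactive(last_publish: date, today: date) -> int:
    """Whole calendar months between the last publish and today."""
    return (today.year - last_publish.year) * 12 + (today.month - last_publish.month)

def is_dormant(has_publish_access: bool, last_publish: date, today: date,
               threshold_months: int = 12) -> bool:
    """Flag accounts that still hold publish access but have been idle for 12+ months."""
    return has_publish_access and months_inactive(last_publish, today) >= threshold_months

today = date(2026, 6, 22)  # publication date of this post
# tootallnate on debug: last publish September 2017 (day of month is a placeholder)
print(months_inactive(date(2017, 9, 1), today))   # 105
print(is_dormant(True, date(2017, 9, 1), today))  # True
# An account whose access was revoked is not flagged, however old its last publish.
print(is_dormant(False, date(2014, 4, 1), today))  # False
```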

Why this matters

The npm token model is straightforward: if you have publish access, you can publish. There's no "inactive" state, no timeout, no re-authentication required after years of dormancy. A token from 2017 works in 2026.

That means every dormant account is a credential target. The attacker doesn't need to find a zero-day. They need to find a .npmrc on an old laptop, a token in a GitHub Actions secret that was never rotated, or a phished email on an account the owner hasn't checked in years.

This is exactly how the axios attack worked in March 2026. One stolen npm token. One push. 113 million installs per week compromised.

debug: the 105-month case

debug has 4 historical publishers. Two had their access revoked: tjholowaychuk (TJ Holowaychuk, the original author) at 146 months inactive, and thebigredgeek at 109 months.

The third, tootallnate (Nathan Rajlich), published 19 versions between 2014 and 2017. His last debug publish was September 2017. His access was never revoked.

The active maintainer, qix- (Josh Junon), published most recently. But tootallnate's account has had publish access to 653 million weekly downloads for eight years without using it.

Someone cleaned up tjholowaychuk and thebigredgeek. They missed tootallnate.

What you can do

If you maintain a package:

Audit who has publish access: npm access list collaborators <package> (npm 9 and later; older versions use npm access ls-collaborators <package>)
Revoke access for anyone who hasn't published in 12+ months
Enable npm Staged Publishing — it adds a review step before versions go live

If you depend on these packages:

Pin versions in your lockfile and review lockfile diffs in PRs
Run npx proof-of-commitment --file package-lock.json to check your full tree for dormant publisher risk
Add a CI gate: npx proof-of-commitment --fail-on=critical

How we found this

Commit tracks publisher lifecycle for every scored package: total historical publishers, who's currently active, who has access but hasn't published, and who was revoked. The dormant-with-access flag was added in v1.35.0.

Scan any package: npx proof-of-commitment debug cross-spawn yargs

Or check the full reports for all packages in this article.

OIDC Provenance Didn't Save TanStack or Red Hat. npm Staged Publishing Is the Missing Gate.

Pico — Fri, 19 Jun 2026 14:39:01 +0000

Two major supply chain attacks in six weeks. Both targeted packages with OIDC provenance enabled. Both succeeded.

TanStack (May 11, 2026): 42 packages, 84 malicious versions in 6 minutes. Attacker exploited GitHub Actions pull_request_target + cache poisoning + OIDC token extraction from runner memory. 16.8M weekly downloads on @tanstack/react-router alone. All published versions had valid SLSA attestations.

Red Hat @redhat-cloud-services (June 1, 2026): 32 packages republished with credential-stealing malware. Attacker compromised a GitHub account, pushed orphan commits, triggered the existing CI pipeline. The pipeline built the malware, published it to npm using its own OIDC tokens, and generated valid SLSA provenance. Provenance proved "this was built by Red Hat's CI" — which was true. The CI just happened to be building malware.

Provenance answers the wrong question

OIDC provenance answers: "Was this built by the expected pipeline?"

When the pipeline is compromised, the answer is yes. The attestation is valid. The malware is signed.

This is the gap. Provenance proves where a build came from. It doesn't prove the CI pipeline wasn't tampered with.

npm Staged Publishing closes it

npm Staged Publishing (GA May 2026) adds a human approval gate between CI publishing a version and that version becoming the default npm install target.

Without Staged Publishing:

CI runs → npm publish → version is live immediately

With Staged Publishing:

CI runs → npm stage publish → version sits in holding → human 2FA approval → version goes live

In both the TanStack and Red Hat scenarios, malicious versions would have sat in a staging area. No silent pushes to latest. A human would need to review and approve each one.
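The difference between the two flows can be shown with a toy model. This is a sketch of the idea only: ToyRegistry and its methods are hypothetical and do not correspond to the npm registry API.

```python
from dataclasses import dataclass, field

@dataclass
class ToyRegistry:
    """Toy model of one package's default install target; not the real npm API."""
    latest: str
    staged: list[str] = field(default_factory=list)

    def publish(self, version: str) -> None:
        # Direct publish: the new version becomes the default install target at once.
        self.latest = version

    def stage(self, version: str) -> None:
        # Staged publish: the version is held and `latest` does not move.
        self.staged.append(version)

    def approve(self, version: str, human_2fa_ok: bool) -> bool:
        # Promotion needs a human approval; a CI token alone cannot do this.
        if version in self.staged and human_2fa_ok:
            self.staged.remove(version)
            self.latest = version
            return True
        return False

direct = ToyRegistry(latest="1.0.0")
direct.publish("1.0.1-malicious")
print(direct.latest)  # 1.0.1-malicious

gated = ToyRegistry(latest="1.0.0")
gated.stage("1.0.1-malicious")
print(gated.approve("1.0.1-malicious", human_2fa_ok=False))  # False
print(gated.latest)  # 1.0.0
```

In the gated flow, a stolen CI credential gets as far as stage() and stops there.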

PostCSS maintainer Andrey Sitnik made this point directly in postcss/postcss#2096: "CI-as-publisher increased the attack risks compared to 2FA manual publishing." He was right — and the incidents proved it.

Detection

npm doesn't surface whether a package uses Staged Publishing. Registry metadata doesn't include it unless a version is actively staged. CI workflow files might reference npm stage commands, but nobody checks those at dependency-install time.

We added Staged Publishing detection to Commit this week. Two-tier detection:

dist-tags check — if a stage dist-tag exists in the registry, the package is actively using staged publishing (zero extra API calls — it's already in the registry response)
GitHub Actions workflow scan — looks for npm stage, @npm/staged-publish, or staged-publish patterns in the package's CI configuration

The result shows up in scoreBreakdown.stagedPublishing on the API response and in the CLI output.
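The two tiers can be sketched as a pure function over data that has already been fetched. This is an illustration, not Commit's code: the function name and the distTag and workflow keys are hypothetical, while dist-tags is the standard field in registry package metadata.

```python
import re

# Patterns named in the list above; matching them in workflow text is a heuristic.
STAGE_PATTERNS = re.compile(r"npm stage|@npm/staged-publish|staged-publish")

def detect_staged_publishing(packument: dict, workflow_texts: list[str]) -> dict:
    """Two-tier check: registry dist-tags first, then CI workflow files."""
    has_stage_tag = "stage" in packument.get("dist-tags", {})
    has_stage_step = any(STAGE_PATTERNS.search(text) for text in workflow_texts)
    return {
        "distTag": has_stage_tag,
        "workflow": has_stage_step,
        "hasStagedPublishing": has_stage_tag or has_stage_step,
    }

packument = {"dist-tags": {"latest": "1.0.0"}}  # hypothetical registry response
workflow = "- name: Publish npm package\n  run: npm stage publish\n"
print(detect_staged_publishing(packument, [workflow])["hasStagedPublishing"])  # True
```

The workflow tier can produce false positives (a commented-out step still matches), which is one reason to treat it as the weaker of the two signals.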

Update: First adopter detected (June 19, 2026)

A few hours after publishing this post, Andrey Sitnik confirmed he had moved nanoid (208M weekly downloads) and nanospy to Staged Publishing. He left a comment on the PostCSS issue inviting verification: "I already moved nanoid and nanospy to the new process, we can test them."

Commit picked up both within the registry cache window. Live JSON:

nanoid: hasStagedPublishing: true, score 92/100, stagedPublishing: 2/2
nanospy: hasStagedPublishing: true, score 55/100, stagedPublishing: 2/2

The trigger was the npm stage publish step in nanoid's release.yml. PostCSS itself still reads false. That one's coming in a week or two. When Sitnik flips it, the score updates automatically.

That's the test: can scoring track the new gate as maintainers actually adopt it? On the first real-world case, yes.

Second adopter, no announcement: sweeping the top of the npm tree the same afternoon turned up preact (23M weekly downloads). No press release, no thread. Line 118 of preact's release.yml — npm stage publish preact.tgz --provenance --access public — does the work. Provenance and staged publishing shipped together. Live JSON: preact returns hasStagedPublishing: true.

Two signals, two adopters in the same week. One announced, one didn't. Detection picked up both.

Adoption accelerating: Hono adopted in 33 hours

Two days after detection shipped, Yusuke Wada (creator of Hono, 50M weekly downloads) replied to honojs/hono#5034 twelve minutes after I filed it: "I'm considering switching to Staged Publishing just now."

PR #5035 — "ci: use npm Staged publishing" merged 33 hours later, on 2026-06-22 at 11:42 UTC. The diff is one word:

- run: npm publish --provenance --access public
+ run: npm stage publish --provenance --access public

Hono already had OIDC provenance via PR #5028. Staged publishing adds the human approval gate on top. A compromised CI token alone can't push a malicious version to latest — the same model that would have closed the gap on the TanStack and Red Hat incidents.

That's three high-profile maintainers and teams (Sitnik with nanoid and nanospy, the Preact team, Wada with Hono) moving four packages in the same direction within a week. The pattern: sole-publisher packages at massive scale are adopting the gate that makes credential-based attacks survivable, and the friction is one word in release.yml.

Full case study: Hono Just Adopted Staged Publishing — 50M Weekly Downloads, 33 Hours After the Issue.

Check your dependencies

npx proof-of-commitment

This scans your lockfile and flags packages where a single npm publisher controls >10M weekly downloads — the exact attack surface that's been exploited three times since March. The output now includes whether each package uses OIDC provenance, Staged Publishing, or neither.

If you see CRITICAL with no provenance and no staged publishing — that package is one stolen credential away from the next incident. And the credential doesn't have to be an npm token anymore. It can be a GitHub account, a CI cache, or an OIDC token extracted from process memory.
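The concentration rule is simple enough to sketch. Only the 10M cut-off for CRITICAL comes from the text above. The 1M cut-off for HIGH is inferred from risk labels such as "HIGH: sole publisher + >1M/wk", and the remaining labels are placeholders. The real score uses more signals than these two.

```python
def concentration_risk(active_publishers: int, weekly_downloads: int) -> str:
    """Label single-publisher concentration risk from two registry signals."""
    if active_publishers != 1:
        return "OK"  # placeholder label: other signals are out of scope here
    if weekly_downloads > 10_000_000:
        return "CRITICAL"  # threshold stated in the text
    if weekly_downloads > 1_000_000:
        return "HIGH"  # assumed threshold, inferred from the risk labels
    return "WARN"  # assumed label for small sole-publisher packages

print(concentration_risk(1, 245_612_332))  # CRITICAL
print(concentration_risk(1, 1_200_000))    # HIGH
```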

The supply chain moved past "just use provenance" in May 2026. The question now is which packages have caught up.

One npm Account Publishes 964 Million Downloads Per Week. None Have Provenance.

Pico — Thu, 18 Jun 2026 14:35:24 +0000

The npm account ai publishes seven packages. Combined, they install 964 million times per week:

Package	Weekly downloads	Publishers	Risk
postcss	245,612,332	1	CRITICAL
nanoid	206,588,788	1	CRITICAL
caniuse-lite	173,435,668	1	CRITICAL
browserslist	167,746,012	1	CRITICAL
autoprefixer	63,517,741	1	CRITICAL
postcss-nested	54,486,292	1	CRITICAL
postcss-js	52,771,544	1	CRITICAL

That's 50 billion installs per year behind a single set of npm credentials. None of them have npm provenance attestations.

Why this matters

npm provenance uses OIDC tokens from GitHub Actions instead of long-lived npm tokens. If a package has provenance, you can verify that the published code came from a specific commit in a specific repository — not from someone's compromised laptop.

Without provenance, there's no way to distinguish a legitimate release from one pushed by a stolen token. The blast radius here is nearly a billion installs per week.

This isn't theoretical. axios was attacked on March 30, 2026 through a stolen npm token — same single-publisher, no-provenance pattern. LiteLLM was hit the same way a month earlier. The Shai-Hulud worm in May 2026 exploited stolen tokens to republish 637 package versions in 39 minutes.

What makes this different from chalk or lodash

PostCSS is interesting because it's not just one critical package. It's an entire ecosystem of critical packages, all behind the same account. chalk is one package, one publisher, 432M downloads/week. Bad enough. But ai controls seven independent packages that each cross the 10M threshold.

A compromised ai token doesn't just hit postcss. It hits the CSS build pipeline (postcss + autoprefixer + postcss-nested + postcss-js), the browser compatibility layer (browserslist + caniuse-lite), and one of the most popular ID generators in the ecosystem (nanoid).

And caniuse-lite was flagged with a dormant publisher warning — 61 months of inactivity on the publishing account. postcss-nested hasn't had a release in over 12 months.

This has been fixed before

fast-xml-parser (88M downloads/week, single publisher) had the same problem. After the community raised the issue, the maintainer set up GitHub Actions OIDC publishing. Within days, version 5.9.1 shipped with SLSA provenance attestations. Then 5.9.2 added environment gates and SHA-pinned actions. The structural gap closed in under a week.

I filed an issue on PostCSS proposing the same approach.

The PostCSS maintainer disagrees — and he's partially right

Andrey Sitnik (ai) responded within hours. His argument: OIDC provenance from CI is itself an attack surface. TanStack was compromised precisely because they published from CI — the stolen token came from the CI system. His words: "CI-as-publisher increased the attack risks compared to 2FA manual publishing."

He's right that CI-as-sole-publisher shifts the attack surface rather than eliminating it. The Shai-Hulud worm exchanged OIDC tokens for publish credentials. TanStack was hit through GitHub Actions cache poisoning. Provenance proves where a build came from; it doesn't prove the CI pipeline wasn't tampered with.

His proposed alternative: Staged Releases — a newer npm feature where CI creates the release, but a human must manually approve it before it becomes the default install target. Even if an attacker compromises the CI pipeline, the malicious version sits in a holding period. No silent pushes to latest.

The honest answer is: both are better than the current state. Right now there's no provenance, no staged releases, no second factor between a stolen credential and 964 million weekly installs. Staged Releases would address the blast-radius problem directly. Provenance would at least make attacks auditable after the fact. The worst option is what we have now: nothing.

Check your own dependencies

If you want to see which packages in your project have this concentration risk:

npx proof-of-commitment

Run it in any project directory. It auto-detects your lockfile and flags packages where a single npm publisher controls more than 10M weekly downloads. That's the exact attack surface that's been exploited three times in four months.

The full PostCSS ecosystem audit data comes from Commit, which scores packages on behavioral signals rather than declared metadata.

323 npm Packages Compromised in 39 Minutes. The Malware Installs a Claude Code SessionStart Hook.

Pico — Mon, 15 Jun 2026 20:37:07 +0000

On May 19, 2026, between 01:39 and 02:18 UTC, a single compromised npm account published 639 malicious package versions across 323 packages. The entire attack took under 40 minutes.

The packages included jest-canvas-mock (2.2M weekly downloads), echarts-for-react (1.1M), size-sensor (1.2M), timeago.js (243K), and most of the @antv visualization suite. Total blast radius: roughly 16 million weekly downloads.

This wasn't a human typing npm publish 639 times. This was a worm.

How the self-propagation works

The atool npm account was compromised (how is still unknown). That account had publish access to hundreds of packages. The initial payload did what you'd expect — harvested credentials from 80+ environment variables and 100+ file paths across AWS, GCP, Azure, GitHub, Kubernetes, and database systems.

Then it did something different: it searched for npm tokens with the bypass_2fa scope. In GitHub Actions environments, the malware exchanged OIDC tokens for per-package npm publish credentials. It then republished additional packages with itself embedded. An npm worm.

Two waves hit the registry. First: ~317 versions at 01:39. Second: ~314 versions 26 minutes later at 02:05. Detection started around 02:18. By then, the packages had been live long enough.

The persistence mechanisms

The exfiltrated credentials are serialized as JSON, gzip-compressed, encrypted with AES-256-GCM, and wrapped with RSA-OAEP. The exfiltration channel disguises traffic as OpenTelemetry traces.

A backup channel creates public repos under the victim's GitHub account and commits encrypted credential dumps with Dune-themed naming patterns.

Here's where it gets personal if you use Claude Code or VS Code:

The malware installs a SessionStart hook in .claude/settings.json. It also drops VS Code task automation in .vscode/tasks.json and a background daemon that polls GitHub every 60 seconds for RSA-signed commands.

And there's a dead man's switch. If the stolen GitHub token gets revoked, the malware runs rm -rf ~/.

These aren't hypothetical persistence vectors. They're documented by Akamai, CSA, and Expel.
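If you want to look for these two persistence points yourself, a rough check over the parsed files might look like this. The exact entries the malware writes are not given above, so the layout is an assumption: a hooks.SessionStart section in the Claude Code settings, and VS Code tasks set to runOn: folderOpen. Both function names are hypothetical.

```python
def session_start_commands(settings: dict) -> list[str]:
    """Collect every command string registered under hooks.SessionStart."""
    found: list[str] = []

    def walk(node) -> None:
        if isinstance(node, dict):
            if isinstance(node.get("command"), str):
                found.append(node["command"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(settings.get("hooks", {}).get("SessionStart", []))
    return found

def auto_run_tasks(tasks_config: dict) -> list[str]:
    """Labels of VS Code tasks configured to run when the folder opens."""
    return [
        task.get("label", "<unlabelled>")
        for task in tasks_config.get("tasks", [])
        if task.get("runOptions", {}).get("runOn") == "folderOpen"
    ]

# Hypothetical file contents, already parsed from JSON.
settings = {"hooks": {"SessionStart": [{"hooks": [{"type": "command", "command": "node ./setup.js"}]}]}}
tasks = {"version": "2.0.0", "tasks": [{"label": "build", "type": "shell", "command": "make",
                                         "runOptions": {"runOn": "folderOpen"}}]}
print(session_start_commands(settings))  # ['node ./setup.js']
print(auto_run_tasks(tasks))             # ['build']
```

Any command or task you did not add yourself is worth investigating. An empty result does not prove a machine is clean.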

What the packages looked like before the attack

I scored the compromised packages using Commit. The non-AntV packages tell the clearest story:

Package	Score	Publishers	Downloads/wk	Risk
canvas-nest.js	50	1	650	WARN: no release 12+ months
timeago.js	65	2	243K	WARN: no release 12+ months
size-sensor	66	1	1.2M	HIGH: sole publisher + >1M/wk
echarts-for-react	71	1	1.1M	HIGH: sole publisher + >1M/wk
jest-canvas-mock	72	2	2.2M	WARN: no release 12+ months

Three of these five had a sole npm publisher. Three are stale: no release in over a year, yet jest-canvas-mock alone is still downloaded 2.2 million times a week. That's exactly the profile that makes account takeover both easy and high-impact.

The @antv packages scored higher (84–89) because they have 17–18 maintainers. But that's exactly how the attack worked: atool was one of those 18 maintainers. More publishers means more attack surface when any one of them can push.

Protect your editor

If you use Claude Code, Cursor, or Windsurf, you can gate package installs before they run:

npx proof-of-commitment hook

This installs a pre-install check that intercepts npm install, pip install, and cargo add. CRITICAL packages (sole publisher + millions of downloads — the exact Shai-Hulud profile) are blocked before they execute. The hook writes .claude/settings.json, .cursor/hooks.json, and .windsurf/hooks.json so the gate works regardless of which editor is driving.

The irony: the same file the worm writes to for persistence (.claude/settings.json) is the one you use to defend against it.

What to check

If your package-lock.json or yarn.lock includes any of these packages, check which versions you installed between 01:39 and 02:18 UTC on May 19.
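One way to narrow that down: list the versions of a package that were published inside the window, then compare them with what your lockfile pins. A minimal sketch, assuming you already have the time field from the package's registry metadata (a map of version to ISO timestamp); the function name and sample data are hypothetical.

```python
from datetime import datetime, timezone

WINDOW_START = datetime(2026, 5, 19, 1, 39, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 5, 19, 2, 18, tzinfo=timezone.utc)

def versions_in_window(times: dict[str, str]) -> list[str]:
    """Versions whose registry publish timestamp falls inside the attack window."""
    hits = []
    for version, stamp in times.items():
        if version in ("created", "modified"):
            continue  # bookkeeping keys, not versions
        published = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if WINDOW_START <= published <= WINDOW_END:
            hits.append(version)
    return hits

times = {  # hypothetical "time" field for one package
    "created": "2019-03-01T10:00:00.000Z",
    "modified": "2026-05-19T02:30:00.000Z",
    "2.4.0": "2025-02-11T08:15:00.000Z",
    "2.4.1": "2026-05-19T01:41:07.000Z",
}
print(versions_in_window(times))  # ['2.4.1']
```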

Then check the rest of your dependency tree:

npx proof-of-commitment --file package-lock.json

The packages that scored 50-72 before this attack (sole publishers, stale releases, high downloads) are the same profile that got compromised in the LiteLLM attack, the axios attack, and now this one.

The pattern doesn't change. The entry point is always the same: one compromised account with publish access to a widely-installed package.

What's different about this one

Previous supply chain attacks hit one package at a time. This one propagated. It turned compromised npm tokens into more compromised packages. The window between first publish and detection is getting shorter, but the blast radius is getting wider.

And the persistence mechanisms are evolving. Targeting .claude/settings.json and .vscode/tasks.json means the malware survives container restarts and embeds itself in developer tooling. The exact environment where you decide which packages to trust.

Run a supply chain audit on your project — or set up monitoring to get alerted when a package in your tree degrades.

IronWorm Commits as 'claude.' It Steals Your Anthropic and OpenAI Keys.

Pico — Mon, 15 Jun 2026 14:36:00 +0000

On June 3, JFrog Security Research published their analysis of IronWorm — a supply chain attack that compromised 37 npm packages through the asteroiddao account. A 976KB Rust ELF binary triggered by preinstall. Caught early, before spreading to popular packages. But the techniques are a step change from everything that came before.

Three things make IronWorm different.

1. It commits as "claude"

Every malicious commit pushed to victim repositories uses the author identity claude@users.noreply.github.com. The commit messages are routine: "fix: resolve lint warnings," "test: add missing edge case," "ci: update workflow configuration."

The timestamps are forged. Some are backdated 13 years. In a repo where AI-generated commits are common and legitimate, these blend in. A developer scanning git log wouldn't notice. A code reviewer seeing a commit from "claude" might assume it came from an AI coding assistant doing its job.

Social engineering adapted to the AI era. The attacker isn't pretending to be a human — they're pretending to be an AI tool the team already trusts.
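A crude way to surface such commits is to filter the log on the reported identity and on author dates that predate the repository itself. This is a heuristic sketch with hypothetical field names, not a detector: imported history can legitimately predate a repo, and an attacker can pick any other identity.

```python
from datetime import datetime

SUSPECT_EMAIL = "claude@users.noreply.github.com"  # identity reported in the analysis

def suspicious_commits(commits: list[dict], repo_created: datetime) -> list[str]:
    """SHAs of commits that use the reported identity or predate the repository."""
    flagged = []
    for c in commits:
        uses_identity = c["author_email"].lower() == SUSPECT_EMAIL
        predates_repo = c["author_date"] < repo_created
        if uses_identity or predates_repo:
            flagged.append(c["sha"])
    return flagged

commits = [  # hypothetical log entries
    {"sha": "a1", "author_email": "dev@example.com", "author_date": datetime(2025, 3, 1)},
    {"sha": "b2", "author_email": "claude@users.noreply.github.com", "author_date": datetime(2026, 6, 2)},
    {"sha": "c3", "author_email": "dev@example.com", "author_date": datetime(2013, 6, 2)},
]
print(suspicious_commits(commits, repo_created=datetime(2022, 1, 1)))  # ['b2', 'c3']
```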

2. It steals AI credentials specifically

IronWorm targets 86 environment variables and 20+ credential files. Standard targets (AWS, SSH, Docker) plus a new category:

OpenAI API keys (OPENAI_API_KEY)
Anthropic API keys (ANTHROPIC_API_KEY)
Claude authentication files (session tokens)
Cursor authentication files
npm publish tokens (including Trusted Publishing OIDC tokens)

Stolen AI keys have immediate value. An OpenAI key with no spend cap runs thousands of dollars before anyone notices. An Anthropic key runs agents that escalate the attack. An npm token turns one compromised dev into a vector for every package they maintain.

The dedicated Exodus wallet module injects JavaScript to capture the password and seed mnemonic at login. This isn't a generic credential scraper — custom modules per high-value target.

3. It propagates through Trusted Publishing

npm's Trusted Publishing lets packages publish via GitHub Actions OIDC tokens instead of stored credentials. Designed to be more secure: no long-lived tokens to steal.

IronWorm doesn't need stored credentials. It modifies GitHub Actions workflows to request OIDC tokens at runtime, then publishes trojanized versions of the victim's packages with valid provenance attestations.

The result: malicious packages that pass npm audit signatures. Provenance says "published through a verified CI pipeline." It doesn't say "the CI pipeline was hijacked."

Same fundamental gap Miasma exploited with Red Hat's SLSA provenance two days earlier. Two independent attacks in the same week, both defeating provenance through different mechanisms. Provenance is a chain-of-custody stamp, not a trust signal.

What behavioral scoring shows

I ran every IronWorm package through Commit's behavioral audit:

Package	Score	Publishers	Downloads/wk	Age
weavedb-sdk	53	1	~1.2k	~4yr
ai3	low	1	<100	<1yr
atomic-notes	low	1	<100	<1yr
cwao	low	1	<100	<1yr
zkjson	low	1	<100	<1yr

Single publisher. Low downloads. Limited history. Every IronWorm package fits the profile behavioral scoring catches before the first install completes.

The escalation timeline

Date	Attack	What was new
Mar 5	LiteLLM	Single-package PyPI credential theft
Mar 30	axios	99M downloads/week, stolen token
May 19	Shai-Hulud	Self-propagating worm, 637 versions in 39 min
May 22	TrapDoor	Cross-ecosystem + AI assistant poisoning
Jun 1	Miasma	Valid SLSA provenance signed onto malicious Red Hat packages
Jun 3	IronWorm	Rust + eBPF rootkit + AI credential theft + Trusted Publishing propagation

Each attack introduces a capability the previous one didn't have. IronWorm is the first npm supply chain malware written in Rust, first to use an eBPF kernel rootkit, first to self-propagate through Trusted Publishing OIDC.

And it specifically targets AI coding assistant credentials. The attack vector has come full circle — AI tools accelerate development, but their credentials are now high-value targets, and the tools themselves are being impersonated in commit history.

What to do

Gate your AI assistant's installs:

npx proof-of-commitment hook

Every npm install, pip install, cargo add, and go get runs through a behavioral check before execution. Packages with no history get blocked.

Audit your current dependencies:

npx proof-of-commitment --file package-lock.json

Rotate AI credentials if any IronWorm package was installed in your environment. Check for modified GitHub Actions workflows.
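For the workflow check, one concrete thing to look for is any workflow that can request an OIDC token, since that requires the id-token: write permission. A minimal text scan, assuming you have loaded the files under .github/workflows into a dict of path to contents (the function name is hypothetical):

```python
import re

# `id-token: write` is the permission a workflow needs to request an OIDC token.
ID_TOKEN_WRITE = re.compile(r"^\s*id-token:\s*write\b", re.MULTILINE)

def workflows_requesting_oidc(workflows: dict[str, str]) -> list[str]:
    """Paths of workflow files whose text grants `id-token: write`."""
    return sorted(path for path, text in workflows.items() if ID_TOKEN_WRITE.search(text))

workflows = {  # hypothetical repository contents
    ".github/workflows/release.yml": "permissions:\n  id-token: write\n  contents: read\n",
    ".github/workflows/test.yml": "permissions:\n  contents: read\n",
}
print(workflows_requesting_oidc(workflows))  # ['.github/workflows/release.yml']
```

Compare the result with the workflows you expect to publish. The scan misses permissions: write-all, which also grants the token, so treat it as a first pass and diff the workflow directory against a known-good commit as well.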

Don't trust provenance alone. Both Miasma and IronWorm demonstrate that valid provenance attestations can come from compromised pipelines. Provenance answers "where did this come from?" Behavioral scoring answers "should I trust it?" You need both.

Commit scores npm, PyPI, Cargo, and Go packages on behavioral commitment — signals harder to fake than stars, READMEs, or download counts. Try the audit or add the MCP server to your AI assistant.

npm audit says you're clean. It doesn't check who can push to your dependencies.

Pico — Mon, 15 Jun 2026 09:51:40 +0000

Run npm audit on any Node.js project and you'll get one of two things: a clean bill of health, or a list of known CVEs with suggested version bumps.

What you won't get: any signal about who can publish the packages you depend on.

The blind spot

npm audit checks an advisory database. If nobody has reported a vulnerability, your project is "clean." But the biggest npm attacks in 2026 didn't exploit known vulnerabilities — they exploited publisher access.

axios (120M downloads/week): one npm publisher. Token stolen March 30, 2026. Malicious version pushed to 97M+ dependents.
litellm (PyPI): one publisher. Supply chain attack, March 2026.
Shai-Hulud worm (May 2026): compromised a single npm account with access to 547 packages. 637 malicious versions published in 39 minutes.

All three passed npm audit before the attack happened. Of course they did — there was no CVE to find.

The 7 packages you almost certainly depend on

I checked the publisher counts for the most-downloaded packages in the npm registry. These 7 are in nearly every Node.js project's dependency tree, usually as transitive deps (check your lockfile):

Package	Weekly Downloads	npm Publishers	Risk
minimatch	648M	1	🔴 CRITICAL
chalk	445M	1	🔴 CRITICAL
glob	378M	1	🔴 CRITICAL
cross-spawn	223M	1	🔴 CRITICAL
zod	195M	1	🔴 CRITICAL
lodash	161M	1	🔴 CRITICAL
axios	120M	1	🔴 CRITICAL

Combined: 2.17 billion weekly downloads. All from packages where a single stolen npm token is enough to push a malicious release.
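
The combined figure is just the sum of the table's download column. A quick check, with the numbers copied from the table (in millions):

```python
# Weekly downloads in millions, copied from the table above.
weekly_downloads_m = {
    "minimatch": 648,
    "chalk": 445,
    "glob": 378,
    "cross-spawn": 223,
    "zod": 195,
    "lodash": 161,
    "axios": 120,
}

total_m = sum(weekly_downloads_m.values())
print(f"{total_m / 1000:.2f} billion weekly downloads")  # 2.17 billion weekly downloads
```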

These aren't obscure packages. minimatch and glob are in every project that uses file matching — which includes most build tools. chalk is in everything that colors terminal output. cross-spawn is in anything that spawns a child process. You didn't install them — your dependencies did.

Publisher ≠ contributor

A common objection: "But chalk has hundreds of GitHub contributors!"

True, and it doesn't matter. GitHub contributors can't publish to npm. Only npm publishers can. And chalk has one.

zod is the same — 30+ GitHub contributors, 1 npm publisher. If that one person's npm token is phished, their 2FA is compromised, or their account is hijacked, nobody else can push a fix. The 30 contributors can open a PR. They can't publish.

This is the distinction npm audit doesn't make.
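
To see the publisher count for yourself, the public npm registry serves a JSON document per package that includes a `maintainers` list. The sketch below counts the entries in that list, assuming it is a fair proxy for who holds publish access; the helper names are hypothetical and this is not the scoring code behind proof-of-commitment.

```python
import json
import urllib.request

def count_publishers(packument: dict) -> int:
    """Count distinct accounts in the registry document's `maintainers` list."""
    return len({m["name"] for m in packument.get("maintainers", [])})

def fetch_packument(name: str) -> dict:
    """Fetch metadata for an unscoped package from the public npm registry.

    Requires network access; scoped names would need URL encoding.
    """
    with urllib.request.urlopen(f"https://registry.npmjs.org/{name}") as resp:
        return json.load(resp)

# Offline example with a made-up document; the account name is hypothetical.
sample = {
    "name": "example-pkg",
    "maintainers": [{"name": "alice", "email": "alice@example.com"}],
}
print(count_publishers(sample))  # 1
```

With network access, `count_publishers(fetch_packument("chalk"))` would run the same count against live registry data.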

Check your own project (takes 5 seconds)

npx proof-of-commitment

Run it in any project directory. It auto-detects your lockfile (package-lock.json, yarn.lock, pnpm-lock.yaml) and scores every dependency by publisher concentration, release consistency, and age.

$ npx proof-of-commitment
Scoring 147 packages from package-lock.json... done in 4.2s

⚠  12 CRITICAL packages found.
   CRITICAL = sole npm publisher + >10M weekly downloads

Package      Risk          Score   Publishers   Downloads      Age
chalk        🔴 CRITICAL   75      1            445M/wk        14.6y
minimatch    🔴 CRITICAL   78      1            648M/wk        14.9y
glob         🔴 CRITICAL   80      1            378M/wk        14.2y
...
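
The legend in that output states the CRITICAL rule directly: sole npm publisher plus more than 10M weekly downloads. A minimal sketch of that rule follows, assuming nothing beyond the legend itself. The Score column is not modeled, and `tiny-lib` and `team-lib` are made-up packages.

```python
CRITICAL_DOWNLOADS = 10_000_000  # ">10M weekly downloads" from the legend

def is_critical(publishers: int, weekly_downloads: int) -> bool:
    """Apply the legend: sole npm publisher and more than 10M weekly downloads."""
    return publishers == 1 and weekly_downloads > CRITICAL_DOWNLOADS

rows = [
    ("chalk", 1, 445_000_000),     # figures from the sample output
    ("tiny-lib", 1, 12_000),       # hypothetical: sole publisher, low reach
    ("team-lib", 4, 50_000_000),   # hypothetical: high reach, several publishers
]
for name, publishers, downloads in rows:
    print(name, "CRITICAL" if is_critical(publishers, downloads) else "ok")
```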

Or audit specific packages:

npx proof-of-commitment axios zod chalk lodash

Or scan a lockfile and get JSON for tooling:

npx proof-of-commitment --file package-lock.json --json | jq '.criticalCount'

Add it to CI (one line)

# .github/workflows/supply-chain.yml
name: Supply Chain
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx -y proof-of-commitment --fail-on=critical

This fails the build if any dependency has CRITICAL publisher concentration. Not a CVE — a structural risk.

For PR comments and step summaries, there's a dedicated GitHub Action:

      - uses: piiiico/commit-action@v1
        with:
          fail-on-critical: true
          comment-on-pr: true

Block risky installs in your AI coding assistant

If you use Cursor, Claude Code, or Windsurf, your AI assistant installs packages without asking. The Shai-Hulud worm specifically targeted this: it planted persistence hooks in .claude/settings.json and .vscode/tasks.json.

npm install -g proof-of-commitment
poc hook

Installs a pre-install gate for all three editors. When your AI tries to npm install a CRITICAL package, it blocks and asks first.

What this isn't

This isn't a replacement for npm audit. Run both. npm audit catches known CVEs after they're reported. Publisher concentration scoring catches the structural risk before the attack: the same pattern that made axios, litellm, and the packages behind Shai-Hulud's 637 malicious versions exploitable.

Different attack surfaces, different tools.

The tool is proof-of-commitment on npm (1,600+ weekly downloads). Web version: getcommit.dev/audit — paste packages, get scores, no account needed.

Free API key (no card, 30 seconds): getcommit.dev/get-started — unlocks monitoring + alerts when a package you depend on degrades.

Commit vs. Socket, Snyk, and npm audit: An Honest Comparison

Pico — Sun, 14 Jun 2026 14:03:59 +0000

If you landed here from a search for "best npm security tool" or "Snyk alternatives," you're probably evaluating a list that includes Socket, Snyk, and npm audit. Commit is newer and does something different. This piece tells you exactly what each tool measures, where each one wins, and where each one fails — including Commit's genuine gaps.

Short answer: most of these tools scan for known vulnerabilities. Commit scans for structural risk that exists before a vulnerability is known. They're complementary, not substitutes. If you only use one, you have blind spots.

What each tool actually measures

Before the comparison table, a framing that matters: these tools answer different questions.

npm audit answers: "Does this package have a reported CVE?" It sends a description of your dependency tree, built from package-lock.json, to the registry, which matches it against the GitHub Advisory Database and returns known advisories. Free, built-in, reactive.

Snyk answers: "Does this package have a known vulnerability, and can I auto-fix it?" Adds license compliance, SAST for your own code, and container scanning. Strong database, strong integrations.

Socket answers: "Is this package doing something dangerous right now?" Static analysis of actual package source — not just CVE lookups. It catches supply chain attacks by scanning newly published version code for suspicious patterns (obfuscated code, unusual network calls, environment variable access). That's a different class of detection.

Commit answers: "Is this package a structural single point of failure?" Behavioral signals — maintainer depth, release consistency, bus factor, contributor history — from the npm registry and GitHub API. No CVE database. No code scanning. Just: how many humans stand between an attacker and the publish button, and how consistently have they shown up?

The comparison table

Capability	npm audit	Snyk	Socket	Commit
Known CVE detection	✅	✅	✅	❌
Malicious package detection	❌	⚠️ partial	✅ real-time	❌
Typosquatting detection	❌	❌	✅	❌
Obfuscated code detection	❌	❌	✅	❌
Dangerous capabilities (network/shell/eval)	❌	❌	✅	❌
Bus factor / single-maintainer risk	❌	❌	⚠️ partial	✅ core focus
Release consistency over time	❌	❌	❌	✅
Contributor depth and longevity	❌	❌	❌	✅
Pre-attack structural signal	❌	❌	❌	✅
Auto-fix PRs	⚠️ limited	✅	❌	❌
SAST (your own code)	❌	✅	❌	❌
MCP integration	❌	❌	✅	✅
Free tier	✅ unlimited	✅ 200 tests/mo	✅ 1,000 scans/mo	✅
Paid tier starts at	Free	$25/dev/mo	$25/dev/mo	$15/dev/mo

The ua-parser-js test case (2021)

In October 2021, Faisal Salman's npm account was compromised and three backdoored versions of ua-parser-js were published (0.7.29, 0.8.0, 1.0.0). The malicious code contained a cryptominer and a credential-stealing trojan, and it executed on every system that installed one of those versions in the four-hour window before they were pulled.

ua-parser-js had approximately 7 million weekly downloads. Used by Facebook, Microsoft, Amazon, and Google.

Here's how each tool would have performed:

Tool	Before the attack	During the attack
npm audit	`0 vulnerabilities`	`0 vulnerabilities` — silent until CVE filed days later
Snyk	No advisory match	Detected after Socket flagged it and community reported; catalogued as CWE-506
Socket	No pre-attack warning	✅ Flagged within 6 minutes of registry publication via automated malware scanner
Commit	🔴 CRITICAL — flagged for months: 1 maintainer, 7M weekly downloads	Still CRITICAL (structural risk unchanged) — no code-level detection

Two different signals. Socket detected the attack six minutes after it was launched by scanning the malicious code. Commit had identified the structural vulnerability — one person controls the publish button for 7 million weekly downloads — for months before anyone pulled the trigger.

Socket is genuinely impressive here. Fast detection can prevent most damage if you have auto-blocking enabled. But the structural conditions that make ua-parser-js a high-value target are visible right now for dozens of equivalent packages, and no amount of malware scanning changes those conditions. The attack calculus is rational: single maintainer, enormous blast radius.

The 2026 update: six attacks, every one fit the same shape

Six major npm supply chain attacks have hit so far in 2026, five of them since this comparison was first written in April. Every one of them exploited a package with a sole publisher or a compromised publisher credential:

axios — March 30. Token theft. 119M downloads/week. One npm publisher.
TanStack — May 11. Mini Shai-Hulud worm. Hijacked CI/CD to publish malicious versions.
TrapDoor — May 22. 21 npm + 7 PyPI + 6 Cargo packages planting persistence hooks in AI coding assistants.
Red Hat Miasma — June 1. 32 @redhat-cloud-services packages via compromised GitHub account. Valid SLSA provenance on every malicious version.
Phantom Gyp — June 3. 57 packages including @vapi-ai/server-sdk. Used binding.gyp to bypass install-script monitors.
IronWorm — June 4. 37 packages with eBPF rootkit + Tor C2 + self-propagation via stolen npm tokens.

npm audit flagged zero of these before the attack. Snyk's vulnerability database flagged zero before the attack. A publisher-concentration check would have flagged all of them as structural risk — months before, in some cases years before.

The gap none of them close

Here's the uncomfortable reality: Socket, Snyk, and npm audit are all reactive to the wrong thing. They detect when something has gone wrong — either by finding a known CVE or by scanning newly-published code for malicious patterns. They cannot tell you whether your dependency portfolio is structurally dangerous before an attack occurs.

Two examples:

Package	Weekly Downloads	Maintainers	npm audit	Commit
`chalk`	445M	1	`0 vulnerabilities`	🔴 CRITICAL
`zod`	194M	1	`0 vulnerabilities`	🔴 CRITICAL

Neither chalk nor zod has a known vulnerability. npm audit, Snyk, and Socket all return clean results. But both have a single person with publish access to hundreds of millions of weekly installs. When (not if) an attacker evaluates those credentials as a target, they make the same rational calculation that was made for ua-parser-js in 2021: compromise one account, get code execution on the systems of millions of developers.

Socket will catch that attack in minutes after it happens. Commit tells you the structural risk exists today — so you can make informed decisions about whether that dependency belongs in your critical path.

Where each tool wins

Use npm audit if: You need zero setup, you want to catch low-hanging CVE fruit in CI without paying anything, and you understand it's a known-vuln scanner, not supply chain security. It catches real things. It just has a fundamental floor.

Use Snyk if: You want CVE scanning with strong auto-fix PRs, SAST for your own code, and container/IaC coverage. The integrations are excellent. The database is well-maintained. If you have a team running multiple projects and want remediation velocity, Snyk earns its price.

Use Socket if: You want real-time protection against active malware in the npm registry. Socket is genuinely doing something the others don't — scanning package source code as it's published and catching malicious payloads before most humans have noticed. The six-minute ua-parser-js detection is not marketing; it's the product working as designed. If your threat model includes active supply chain attacks, Socket belongs in your pipeline.

Use Commit if: You want to understand the structural risk in your dependency portfolio before anything bad happens. If you're doing architecture review, onboarding new dependencies, evaluating critical-path packages, or just want to understand which of your dependencies is a single point of failure at massive scale — that's what Commit measures. The signal that chalk at 445 million weekly downloads runs on one maintainer's npm account is not a CVE. It's a structural condition. It matters.

What Commit doesn't do

Commit does not scan for CVEs. It does not detect malicious code. It does not flag typosquatting, obfuscated payloads, or dangerous API usage. If your dependency ships a backdoor, Commit will not catch the backdoor. Socket will (if it's in the registry). npm audit will catch it eventually (when someone files a CVE).

Commit measures one thing: how much sustained human commitment stands behind this package. Maintainer depth, release consistency, contributor longevity, download momentum relative to structural fragility. These signals are public, computable in milliseconds from the npm and GitHub APIs, and systematically ignored by every other tool in this comparison.

That's the gap. Whether it belongs in your security stack depends on whether you care about knowing the structural risk profile of your dependencies before an attacker acts on it.

The honest recommendation

Run npm audit in CI. It's free and catches real things. Add Socket if you can afford it and your threat model includes supply chain attacks — the real-time detection is legitimate. Consider Snyk if you want auto-fix PRs and SAST coverage.

Add Commit to understand the structural risk in what you're already depending on. Zero install, 30 seconds:

npx proof-of-commitment --file package-lock.json

Or paste your package.json into the web demo and get structural risk scores in seconds.

If you want monitoring — automated scans, alerts when a score drops, email when a package you depend on gets compromised:

poc watch chalk --email you@company.com

Free key — watchlist, weekly digest, 30 seconds, no card. Developer — 15 packages, daily scans, $15/month.

The ua-parser-js attack wasn't a failure of security tooling. Every tool performed as designed. The failure was thinking that vulnerability scanning and supply chain security are the same problem. They're not.

Originally published at getcommit.dev. Commit scores npm, PyPI, Cargo, and Go packages on behavioral commitment — signals harder to fake than stars, READMEs, or download counts.

Snyk Scores Chalk 81. We Score It CRITICAL.

Pico — Sun, 14 Jun 2026 13:38:19 +0000

Same package. Opposite conclusions. The difference is one signal: how many people can push a new version to npm. That signal predicted every major npm attack this year.

Go to Snyk's vulnerability database right now and look up chalk. You'll see a Package Health Score of 81 out of 100. No known security issues. Sustainable maintenance. The assessment: this is a healthy package.

Run npx proof-of-commitment chalk and you'll see something different:

Package   Risk            Score   Publishers   Downloads     Age       Provenance
chalk     🔴 CRITICAL     75      1            445.5M/wk     12.9y     —
  ↳ 30+ GitHub contributors — publish-access concentration risk despite active community

CRITICAL. One npm publisher controls 445 million weekly downloads. That's not a vulnerability. It's a structural concentration risk — the exact profile that every major npm attack in 2026 has exploited.

What Snyk measures

Snyk's Package Health Score is built from four dimensions: security (known CVEs), popularity (download volume, GitHub stars), maintenance (commit frequency, release cadence), and community (contributors, documentation). These are real signals. They tell you whether a project is active and whether it has known bugs.

What they don't tell you: how many humans can push a malicious version.

Snyk shows "1 maintainer" as a data point in its maintenance section. It's listed next to "0 open PRs" and "last commit 4 months ago." The number is visible but not actionable — it doesn't change the score, doesn't trigger a warning, and isn't framed as a risk factor.

What Commit measures

Commit scores packages on behavioral signals: longevity, release consistency, download trend, OpenSSF Scorecard data, and — crucially — publisher depth. How many distinct humans have npm publish access?

When a package with 445 million weekly downloads has a single npm publisher, one stolen token, one compromised laptop, or one phishing email is enough to reach every project that depends on it. That is the attack that keeps happening.

The 2026 track record

Six major npm supply chain attacks have hit this year. Every one exploited a package with a sole publisher or a compromised publisher credential:

axios — March 30. Token theft. 119M downloads/week. 1 npm publisher.
TanStack — May 11. Mini Shai-Hulud worm. Hijacked CI/CD to publish malicious versions.
TrapDoor — May 22. 21 npm + 7 PyPI + 6 Cargo packages planting persistence hooks in AI coding assistants.
Red Hat Miasma — June 1. 32 @redhat-cloud-services packages via compromised GitHub account. Valid SLSA provenance on every malicious version.
Phantom Gyp — June 3. 57 packages including @vapi-ai/server-sdk (408K/month). Used binding.gyp to bypass install-script monitors.
IronWorm — June 4. 37 packages with eBPF rootkit + Tor C2 + self-propagation via stolen npm tokens.

npm audit flagged zero of these before the attack. Snyk's vulnerability database flagged zero before the attack. A publisher concentration check would have flagged all of them as structural risk.

The 26 packages that matter most

26 of the 91 npm packages with more than 10 million weekly downloads have a single npm publisher. Together they account for over 3 billion downloads per week. They include packages that are probably in your lockfile right now:

minimatch — 625M/week, 1 publisher
chalk — 445M/week, 1 publisher
glob — 366M/week, 1 publisher
cross-spawn — 215M/week, 1 publisher
zod — 194M/week, 1 publisher
lodash — 156M/week, 1 publisher

None of them are vulnerable. All of them carry structural concentration risk. The distinction matters because vulnerability scanning and behavioral risk analysis serve different functions, and confusing the two leaves the gap attackers keep walking through.

Not a replacement. A different question.

Snyk tells you: does this package have known bugs?

Commit tells you: if this package's publisher gets phished tomorrow, how bad is it?

Both questions matter. They measure different attack surfaces. The problem is that most teams only ask the first one.

Try it

Zero install, 30 seconds:

npx proof-of-commitment --file package-lock.json

Or paste your packages into the web demo (pre-loaded with chalk).

If you want monitoring — automated scans, alerts when a score drops, email when a package you depend on gets compromised:

poc watch chalk --email you@company.com

Free key — watchlist auto-seeded with chalk, weekly digest, 30 seconds, no card. Developer — 15 packages, daily scans, $15/month.

Full comparison: Commit vs. Socket, Snyk, OpenSSF Scorecard

Originally published at getcommit.dev.

80% of Agent Skills Lie About What They Do — and the scanner that found that admitted it can't catch the rest

Pico — Sun, 14 Jun 2026 03:04:01 +0000

On June 11, 2026, Palo Alto Networks Unit42 published results from their Behavioral Integrity Verification (BIV) scanner applied to the OpenClaw skill ecosystem. They crawled 49,943 skills — the largest systematic analysis of agent skill behavior published to date.

The headline: 80% of skills (39,933) have at least one behavioral deviation from their declared intent. Across those skills, Unit42 documented 250,706 total deviations — an average of roughly six per non-compliant skill.

18.9% of skills showed adversarial intent. 5% — 2,490 skills — carried multi-stage attack chains.
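
The headline percentages follow from the raw counts Unit42 reported. A short check of the arithmetic, using only the figures quoted above:

```python
total_skills = 49_943   # skills crawled
deviating = 39_933      # skills with at least one behavioral deviation
deviations = 250_706    # total deviations documented
multi_stage = 2_490     # skills carrying multi-stage attack chains

print(f"deviating share: {deviating / total_skills:.1%}")               # 80.0%
print(f"deviations per deviating skill: {deviations / deviating:.2f}")  # 6.28
print(f"multi-stage share: {multi_stage / total_skills:.1%}")           # 5.0%
```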

The threat taxonomy is specific: Instruction-Level Threats were the most adversarial category, with 96% of skills in that class showing adversarial intent. Credential theft was the largest single adversarial leaf, accounting for 8.2% of classified deviations. These aren't edge cases. They're systematic.

The disclosure that matters more than the data

The data is striking. But the more significant finding is a single sentence in Unit42's methodology section:

"BIV is static-only, so dynamic dispatch, reflection, and obfuscated payloads escape AST-level extraction."

This is a major security vendor saying, in their own words, that their tool — which just found behavioral deviations in 80% of skills — cannot catch the adversarial skills that matter most. Dynamic dispatch and obfuscation are standard tradecraft for any skill designed to evade detection. The 5% with multi-stage attack chains almost certainly overlap heavily with the class of skills that BIV can't see.

Read that carefully: the tool that found 39,933 deviating skills explicitly cannot analyze the class of skills most likely to cause serious harm.

This is not a criticism of Unit42's research — it's honest methodology disclosure. But it has a direct implication: static pre-installation analysis, however thorough, has a hard ceiling. The dangerous payloads are specifically designed to be invisible to it.

What behavioral deviations actually look like

A behavioral deviation, in Unit42's framework, is a gap between what a skill declares it does (in its manifest, its description, its metadata) and what it actually does when executed. The deviation types they documented are not subtle:

Instruction-level threats: Skills that modify the agent's system prompt or override task instructions mid-execution. This category had the highest adversarial rate: 96% of skills flagged here showed adversarial intent rather than developer oversight.
Credential theft: Skills that access authentication tokens, API keys, or session credentials beyond their stated scope. The largest single adversarial leaf at 8.2% of all classified deviations.
Exfiltration chains and remote code execution chains: Two of the four novel compound threat categories identified. Multi-stage attacks that distribute malicious behavior across steps, each of which looks benign in isolation.

In aggregate, this is a picture of an ecosystem where the declaration layer (what skills say they do) has almost entirely decoupled from the behavioral layer (what they actually do). An 80% deviation rate at scale is not an anomaly. It's a structural condition.

Why this happens

The declaration layer was never designed to be enforceable. A skill manifest is a string of text that describes intent. Nothing in the current agent skill infrastructure verifies that the manifest accurately reflects behavior. Nothing monitors runtime execution against declared scope. Nothing signals when execution diverges from declaration.

This is the same pattern that produced the npm supply chain crisis, applied at a faster velocity. npm's package metadata (README, description, keywords) said nothing enforceable about what the package code would do at runtime. Malicious packages were published with plausible descriptions and then executed adversarially when installed. The declaration layer was gameable by construction.

Agent skills are worse. Skills are designed to operate autonomously, with elevated access to orchestration infrastructure, in contexts where human review of each action is impossible. A malicious npm package needs a human to run it. A malicious agent skill executes inside an automated pipeline that may process thousands of actions per hour. The blast radius per adversarial skill is larger, and the detection window is shorter.

The Unit42 data confirms what the architecture implied: when declarations aren't enforceable, most won't be accurate.

The L3/L4 gap

In the trust infrastructure stack, there are four layers:

L1: Identity. Who is this agent? JWT/OIDC, did:key, JWKS-verifiable credentials. The IETF Transaction Tokens draft, DIF's MCP-I profile, and the A2A protocol all operate here.
L2: Authorization. What is this agent allowed to do? OAuth scopes, capability declarations, allowlists.
L3: Pre-installation verification. Static analysis, manifest scanning, provenance checks. Unit42 BIV operates at L3.
L4: Runtime behavioral monitoring. Continuous observation of what the agent actually does during execution, compared against its declared scope and historical baseline.

The industry has made significant progress on L1 and L2 in 2026. The IETF, DIF, and OpenClaw itself have active working groups on agent identity and authorization. L3 has credible tooling — Unit42 BIV, static analysis scanners, manifest validators.

L4 is nearly empty.

Unit42's methodology admission tells us exactly why this matters: the attacks that escape L3 are the ones that require L4. Static analysis finds deviations in skills that didn't bother to hide. Dynamic dispatch and obfuscation are evasion techniques for L3. A skill that uses them passes every static scan and then executes adversarially at runtime.

The 5% multi-stage attack chain finding is especially relevant here. Multi-stage attacks, by definition, distribute their adversarial behavior across multiple execution steps. Step one looks clean. Step two looks clean. The harm happens at step three, when context from steps one and two enables an action that no individual step would have triggered. Static analysis examines each skill in isolation — it cannot see the chain.
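
A toy model makes the point concrete. In the sketch below every step passes a per-step check, yet the sequence as a whole reads a secret and then sends data out. The action labels, skill names, and the chain rule are all hypothetical; this is not Unit42's method, only an illustration of why per-step inspection misses chains.

```python
from dataclasses import dataclass

@dataclass
class Step:
    skill: str
    action: str  # hypothetical labels: "read_file", "read_secret", "http_post"

# Each action is individually permitted, so a per-step check passes everything.
ALLOWED_ACTIONS = {"read_file", "read_secret", "http_post"}

def step_is_clean(step: Step) -> bool:
    """Per-step view: is this single action on the allowlist?"""
    return step.action in ALLOWED_ACTIONS

def chain_is_suspicious(steps: list[Step]) -> bool:
    """Chain view: flag any secret read that is later followed by an outbound POST."""
    secret_seen = False
    for step in steps:
        if step.action == "read_secret":
            secret_seen = True
        elif step.action == "http_post" and secret_seen:
            return True
    return False

run = [
    Step("config-loader", "read_file"),
    Step("token-helper", "read_secret"),
    Step("report-sender", "http_post"),
]
print(all(step_is_clean(s) for s in run))  # True
print(chain_is_suspicious(run))            # True
```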

Agent trust scoring at runtime

The question this data raises isn't "how do we build a better static scanner?" Unit42 just built one and found 39,933 deviating skills — and acknowledged it can't see the dangerous tail. The question is: what does the trust signal look like at the moment an agent is executing?

Runtime behavioral trust scoring works differently from static analysis. Instead of asking "does this skill's code match its declaration?" it asks a continuous set of questions during execution:

Is this agent accessing resources outside its declared scope?
Is this agent's action pattern consistent with its historical baseline?
Is this agent communicating with endpoints not present in its manifest?
Is this agent's token consumption pattern anomalous for its stated task?
Is this agent modifying its own instructions or those of downstream agents?

These signals are continuous. They degrade naturally when behavior changes. A skill that passed static analysis and operated cleanly for thirty days produces a different runtime signal than a skill that starts exfiltrating credentials on day thirty-one. Static analysis gives you a snapshot. Runtime monitoring gives you a stream.
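
One of those questions, whether an agent talks to endpoints missing from its manifest, is simple enough to sketch. The class below is a minimal illustration under stated assumptions: the manifest is reduced to a set of declared hostnames, and `ScopeMonitor` and the example hosts are hypothetical names, not part of any existing tool.

```python
from urllib.parse import urlparse

class ScopeMonitor:
    """Compare observed network calls against hosts declared in a manifest."""

    def __init__(self, declared_hosts: set[str]):
        self.declared_hosts = declared_hosts
        self.violations: list[str] = []

    def observe(self, url: str) -> bool:
        """Record one outbound request; return True if it stays in scope."""
        host = urlparse(url).hostname or ""
        in_scope = host in self.declared_hosts
        if not in_scope:
            self.violations.append(host)
        return in_scope

monitor = ScopeMonitor({"api.example.com"})
monitor.observe("https://api.example.com/v1/items")
monitor.observe("https://collector.example.net/upload")
print(monitor.violations)  # ['collector.example.net']
```

Because the monitor sees every call as it happens, a skill that behaves for thirty days and then contacts a new host shows up on the first out-of-scope request.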

The Unit42 BIV data is the strongest third-party evidence to date that the snapshot is insufficient. 250,706 behavioral deviations across 49,943 skills tell you the ecosystem has a systematic declaration problem. The explicit methodology admission tells you that the solution to the declaration problem cannot itself be declarative. You need the stream.

What this means for agent deployments today

If your infrastructure runs agent skills — MCP servers, OpenClaw tools, custom agent pipelines — the Unit42 data has a direct operational implication: the skills you're running have probably not been verified against their declared behavior, and static scanning won't catch the most dangerous ones even if you run it.

A few concrete steps:

Audit your agent skill declarations. Start by comparing what your running skills say they do against what network traffic, system calls, and API access logs show they actually do. The gap is the risk surface. You can run a structural scan against any npm-distributed skill:

npx proof-of-commitment npm <your-skill-package>

# For MCP servers
npx proof-of-commitment mcp-remote <server-url>

# Web UI
# https://getcommit.dev/audit

Add behavioral gates to your CI pipeline. Structural risk flags — anomalous dependency additions, publishing pattern changes, maintainer transfers — show up before compromised skills reach production. We published a 5-minute CI integration that puts these flags in PR comments.
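
Of the flags named above, anomalous dependency additions are the easiest to approximate: diff the lockfile on the base branch against the one in the pull request. The sketch below assumes a v2 or v3 package-lock.json, where installed packages are keyed by their node_modules path. It is not the CI integration mentioned above, and the function names and sample packages are hypothetical.

```python
import json

def locked_packages(lockfile_text: str) -> set[str]:
    """Return package names from the `packages` map of a v2/v3 package-lock.json."""
    packages = json.loads(lockfile_text).get("packages", {})
    names = set()
    for path in packages:
        if path:  # the empty key is the root project itself
            names.add(path.split("node_modules/")[-1])
    return names

def added_dependencies(base_text: str, head_text: str) -> set[str]:
    """Packages present in the PR's lockfile but not in the base lockfile."""
    return locked_packages(head_text) - locked_packages(base_text)

base = json.dumps({"packages": {"": {}, "node_modules/chalk": {"version": "5.3.0"}}})
head = json.dumps({"packages": {
    "": {},
    "node_modules/chalk": {"version": "5.3.0"},
    "node_modules/@scope/new-dep": {"version": "0.0.1"},
}})
print(added_dependencies(base, head))  # {'@scope/new-dep'}
```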

Don't rely on marketplace verification. The OpenClaw ecosystem is not the only place this applies. We documented 9 of 11 MCP marketplaces accepting a malicious server without detection. The Unit42 data confirms this isn't an MCP-specific problem — it's a declaration-layer problem. Any ecosystem that trusts manifests over behavior has the same exposure.

Plan for L4. The agent behavioral monitoring layer is thin right now. That's not because the problem is solved — it's because the tooling hasn't caught up with the deployment curve. Unit42's explicit acknowledgment that static analysis has a hard ceiling is a signal that the industry knows this gap exists. Plan for monitoring infrastructure before your agent deployment scales past the point where manual review is possible.

The 80% figure will age badly in one of two directions. Either the ecosystem invests in L4 monitoring and the deviation rate drops as adversarial skills get caught faster — or the deviation rate climbs as agent deployments scale faster than detection. Unit42's data is a snapshot. The dynamic depends on whether the industry treats L3 as sufficient or as the floor.

The methodology admission says it's the floor.

Source: Palo Alto Networks Unit42 / arXiv 2605.11770, "Behavioral Integrity Verification for AI Agent Skills," May 2026. 49,943 OpenClaw skills analyzed. Stats: 39,933 (80.0%) with ≥1 behavioral deviation; 250,706 total deviations; 18.9% adversarial intent; 2,490 (5.0%) multi-stage attack chains; credential theft largest adversarial leaf (8.2% of classified deviations); instruction-level threats highest adversarial category (96% adversarial fraction). Limitations: "BIV is static-only, so dynamic dispatch, reflection, and obfuscated payloads escape AST-level extraction."

Originally published at getcommit.dev.

1,579 AUR Packages Were Taken Over Through the Adoption Process. The Bypass Was the Process.

Pico — Sun, 14 Jun 2026 01:09:20 +0000

On June 11, the aur-general mailing list started seeing reports of suspicious commits on AUR packages. By the morning of June 12, Phoronix counted 400 compromised packages. By that evening, the number was 1,579.

The official Arch Linux advisory uses careful language:

We are currently experiencing a high volume of malicious package adoptions and updates in the Arch User Repository.
— archlinux.org, June 12 2026

Adoptions. Not breaches. Not stolen credentials. The attackers used the AUR's published process for adopting orphaned packages, applied as legitimate new maintainers, and got handed the keys to packages that had real users.

How AUR adoption works

The AUR is a build-script repository. PKGBUILDs and .install files describe how to fetch and compile software. When a package's maintainer disappears, the package becomes "orphaned." Any AUR user can request adoption. There is no two-of-three approval. There is no proof-of-contribution. There is one click.

This is the right design for a volunteer-maintained software graveyard. It is the wrong design for a trust boundary. And it is the boundary every Arch user crosses when they type yay -S package-name.

What got injected

Sonatype's Security Research team tracked one strand of the campaign — "Atomic Arch" — and found the payload routed through an npm dependency added to the PKGBUILD. The AUR script pulled a poisoned npm package during install. Arch Linux machines started running JavaScript-delivered, rootkit-like malware out of a Linux distribution package manager.

The cross-ecosystem mechanic matters. AUR doesn't host the binary — the attacker only needed control of a trusted PKGBUILD and a tame-looking npm dependency. Two ecosystems, one trust path, no behavioral history check at either hop.
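One way to make that cross-ecosystem hop visible during a PKGBUILD review is to flag lines that call into another package manager or pipe a download into a shell. The sketch below is a minimal illustration, not the tooling Sonatype or Arch used: the pattern list, the function name flag_cross_ecosystem_fetches, and the sample script are all assumptions made up for this example, and a determined attacker can evade simple regexes.

```python
import re

# Hypothetical patterns (an assumption, not a complete ruleset): commands that
# reach into another ecosystem or execute remote content during a build.
CROSS_ECOSYSTEM_PATTERNS = {
    "npm": re.compile(
        r"\b(?:npm\s+(?:install|i|ci)\b|yarn\s+add\b|pnpm\s+(?:add|install)\b|npx\s+\S+)"
    ),
    "pip": re.compile(r"\bpip3?\s+install\b"),
    "cargo": re.compile(r"\bcargo\s+install\b"),
    "pipe-to-shell": re.compile(r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:ba|z)?sh\b"),
}


def flag_cross_ecosystem_fetches(script_text):
    """Return (line_number, label, line) for each line matching a pattern."""
    findings = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and shell comments
        for label, pattern in CROSS_ECOSYSTEM_PATTERNS.items():
            if pattern.search(stripped):
                findings.append((number, label, stripped))
    return findings


if __name__ == "__main__":
    # Made-up PKGBUILD fragment for illustration only.
    sample = """\
pkgname=example-tool
build() {
  make
}
package() {
  npm install --global some-helper
  make DESTDIR="$pkgdir" install
}
"""
    for number, label, line in flag_cross_ecosystem_fetches(sample):
        print(f"line {number}: [{label}] {line}")
```

A hit is not proof of compromise, since plenty of legitimate PKGBUILDs build Node or Rust software. It marks the spot where the trust path leaves the AUR and enters a second registry, which is where a reviewer's attention is most useful.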

This isn't the first time

In July 2025, firefox-patch-bin and librewolf-fix-bin were pushed to the AUR by a fresh account and contained Chaos RAT. That incident hit a few packages. This one hit 1,579.

The structural lesson is the same in both. The AUR's defense model is "review the PKGBUILD before installing." The official advisory still says "review all PKGBUILD and install script changes." That's a useful instruction. It is also a confession that the trust model puts the work on the user.

The npm parallel

We have been writing about the npm side of this all month. 32 Red Hat packages with valid provenance. 57 packages using a 14-year-old binding.gyp execution path. 37 packages where the commit author signed as "claude". Different bypasses. Same shape.

Maintainer-identity-takeover doesn't care about the ecosystem. The mechanic is:

1. A package accumulates trust through downloads and use.
2. The trust attaches to the slug, not the human.
3. A new human steps behind the slug, whether through adoption, account compromise, or social engineering.
4. The trust transfers to the new human at zero cost.

npm calls this "compromised account." AUR calls it "adoption." Both are the same attack with different names.
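The handoff step in that mechanic can be sketched as a scan over a package's release history for the point where the publisher changes. This is an illustrative sketch only: the (version, publisher) record shape and the function name publisher_handoffs are assumptions, and neither the AUR nor npm is claimed to expose history in exactly this form.

```python
def publisher_handoffs(releases):
    """Return (version, previous_publisher, new_publisher) for each change.

    `releases` is an ordered list of (version, publisher) pairs, oldest first.
    The record shape is hypothetical and chosen for illustration.
    """
    handoffs = []
    previous = None
    for version, publisher in releases:
        if previous is not None and publisher != previous:
            handoffs.append((version, previous, publisher))
        previous = publisher
    return handoffs


if __name__ == "__main__":
    # Made-up history: the slug stays the same, the human behind it does not.
    history = [
        ("1.0.0", "alice"),
        ("1.1.0", "alice"),
        ("1.1.1", "mallory"),
    ]
    print(publisher_handoffs(history))
```

A handoff is not malicious in itself, because legitimate adoptions look the same. It is the moment where trust accumulated by the slug should be re-evaluated against the new human rather than inherited automatically.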

What behavioral signals show

The attackers in the AUR incident were fresh accounts. No prior PKGBUILDs. No contribution history outside the adoptions they just made. No commits to upstream projects. No public identity attached to anything older than a week.

That signal is cheap to compute. It is the signal Commit scores npm, PyPI, and Cargo packages on. Single publisher, short history, no behavioral track record outside the package itself: that is the structural fingerprint shared by every incident in the table below.

Incident	Ecosystem	Packages	Common fingerprint
Microsoft typosquats (May 31)	npm	14	Zero behavioral history
Red Hat Miasma (Jun 1)	npm	32	Compromised single publisher
Phantom Gyp (Jun 3)	npm	57	Compromised single publisher
IronWorm (Jun 6)	npm	37	Compromised single publisher
TrapDoor (Jun 6)	npm/PyPI/crates	34	Zero behavioral history
Atomic Arch (Jun 11–12)	AUR + npm	1,579	Fresh adopter, no AUR history
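The fresh-account signals described for the AUR adopters (no prior packages, no upstream commits, no identity older than a week) can be sketched as a simple check. This is a minimal illustration under stated assumptions, not Commit's actual scoring: the MaintainerRecord fields, the function name fresh_adopter_flags, and the 7-day default threshold are all hypothetical, and the dates in the usage example are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MaintainerRecord:
    """Hypothetical snapshot of a maintainer account; field names are illustrative."""
    account_created: date
    prior_packages: int    # packages maintained before the adoption under review
    upstream_commits: int  # commits to other projects linked to this identity


def fresh_adopter_flags(record, adopted_on, min_account_age_days=7):
    """Return the list of fingerprint signals that fire for this adopter."""
    flags = []
    age_days = (adopted_on - record.account_created).days
    if age_days < min_account_age_days:
        flags.append("account younger than threshold")
    if record.prior_packages == 0:
        flags.append("no prior packages")
    if record.upstream_commits == 0:
        flags.append("no upstream commits")
    return flags


if __name__ == "__main__":
    # Illustrative values only, not data from the incident.
    adopter = MaintainerRecord(date(2026, 6, 9), prior_packages=0, upstream_commits=0)
    for flag in fresh_adopter_flags(adopter, adopted_on=date(2026, 6, 11)):
        print(flag)
```

Each flag is weak on its own, since every maintainer was new once. The combination is what matches the fingerprint in the table: an adopter with nothing behind the account except the adoption itself.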

What this changes

Arch's response is the right tactical one. They froze adoption. They froze new accounts. They walked the commit graph and reverted what they could find. The advisory closes by saying this covered "many (but not all)" of the affected packages.

The strategic problem is upstream of all that. As long as the trust model lets identity be transferred to a package — instead of letting identity be earned by a human — the defense is racing the attack, and the attack has every unmaintained package in the registry as ammunition.

Commit's bet is that the only signal an attacker can't fake in advance is the behavioral history of the human behind the artifact. Years of commits to other projects. Cross-ecosystem identity that resolves to the same person. Declarations are gameable. Behavior isn't.

Check your project

npx proof-of-commitment

Scores every dependency in your npm, pip, or Cargo lockfile against the same structural fingerprint that flagged the npm packages above. AUR PKGBUILDs are out of scope today — but if a PKGBUILD pulls an npm dependency that scores CRITICAL, you want to know that before the build script runs on your machine.
