I Scored 12 Python AI Packages on Behavioral Commitment. The LiteLLM Attack Data Makes Sense Now.

#ai #security #python #opensource

In March 2026, LiteLLM got hit with a supply chain attack. Stolen PyPI token. Malicious packages published. 97 million downloads per month exposed.

I built an MCP tool that scores Python packages on behavioral commitment — not stars, not README quality, but actual behavioral signals: longevity, release consistency, download momentum, maintainer depth. When I ran it on LiteLLM, the output made the attack feel inevitable.

LiteLLM: 74/100

Package: litellm@1.83.3
Age: 2 years
Versions: 1288 | Last: released this week
Downloads: 25,698,680 downloads/week (stable)

Commitment Score: 74/100
  Longevity:            14/25 (2 years old)
  Download momentum:    22/25
  Release consistency:  20/20 (1288 versions)
  Maintainer depth:     5/15 (~1 maintainer)
  GitHub backing:       13/15

1,288 releases in 2 years. That is about 1.8 releases per day. One maintainer. Each release is a new attack window. At that velocity, a stolen token can ship a malicious version before anyone notices.

Compare that to numpy:

numpy: 86/100

Package: numpy@2.4.4
Age: 19 years
Versions: 130 | Last: released 7 days ago
Downloads: 199,479,002 downloads/week (stable)

Commitment Score: 86/100
  Longevity:            25/25 (19 years old)
  Download momentum:    22/25
  Release consistency:  20/20 (130 versions)
  Maintainer depth:     5/15
  GitHub backing:       14/15

19 years. 130 versions. numpy releases slowly and deliberately. A malicious version would stand out — the release cadence has a known rhythm. LiteLLM releases so often that one more release today is indistinguishable from noise.
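The cadence gap is plain arithmetic on the figures above. Here is a minimal sketch, assuming the rounded ages from the tool output (2 and 19 years) and a 365-day year. The function name is mine for illustration, not part of the tool:

```python
def release_cadence(versions: int, age_years: float) -> tuple[float, float]:
    """Return (releases per day, average days between releases)."""
    days = age_years * 365
    return versions / days, days / versions

# Version counts and ages taken from the tool output quoted above.
litellm_per_day, litellm_gap = release_cadence(1288, 2)
numpy_per_day, numpy_gap = release_cadence(130, 19)

print(f"litellm: {litellm_per_day:.2f} releases/day, one every {litellm_gap:.1f} days")
print(f"numpy:   {numpy_per_day:.3f} releases/day, one every {numpy_gap:.1f} days")
```

With these inputs LiteLLM averages roughly 1.76 releases per day, while numpy averages about one release every 53 days. Because the ages are rounded to whole years, treat the results as estimates.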

Full Rankings: 12 Python AI Packages

Scored using public data from PyPI and pypistats.org. No auth required.

Package	Score	Age	Versions	Weekly Downloads
boto3	87/100	11yr	2,008	431M
numpy	86/100	19yr	130	199M
requests	86/100	15yr	156	314M
openai	86/100	6yr	382	55M
transformers	86/100	9yr	215	28M
torch	82/100	7yr	46	19M
langchain	75/100	3yr	486	53M
anthropic	75/100	3yr	163	21M
litellm	74/100	2yr	1,288	25M
crewai	69/100	2yr	246	1.3M
google-generativeai	53/100	2yr	28	3.6M
autogen-agentchat	42/100	1yr	54	342K
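The raw inputs behind a table like this are available without credentials: PyPI serves package metadata at `https://pypi.org/pypi/<name>/json`, and pypistats.org serves recent download counts at `https://pypistats.org/api/packages/<name>/recent`. The sketch below shows one way to derive version count, age, and weekly downloads from those two endpoints. It is not the tool's actual implementation: the function names are hypothetical, and measuring age from the earliest upload is my assumption.

```python
import json
import urllib.request
from datetime import datetime, timezone

def fetch_json(url: str) -> dict:
    """GET a public JSON endpoint; no authentication is needed."""
    req = urllib.request.Request(url, headers={"User-Agent": "commitment-sketch/0.1"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

def summarize_releases(pypi_doc: dict, now: datetime) -> dict:
    """Derive version count, age, and recency from a PyPI JSON document."""
    upload_times = []
    for files in pypi_doc["releases"].values():
        for f in files:  # a version with no uploaded files contributes nothing
            ts = f["upload_time_iso_8601"].replace("Z", "+00:00")
            upload_times.append(datetime.fromisoformat(ts))
    if not upload_times:
        raise ValueError("package has no uploaded files")
    first, last = min(upload_times), max(upload_times)
    return {
        "versions": len(pypi_doc["releases"]),
        "age_years": (now - first).days / 365,  # assumption: age = time since first upload
        "days_since_last": (now - last).days,
    }

def lookup(name: str) -> dict:
    """Combine PyPI metadata with pypistats.org weekly downloads."""
    pypi_doc = fetch_json(f"https://pypi.org/pypi/{name}/json")
    stats = fetch_json(f"https://pypistats.org/api/packages/{name}/recent")
    summary = summarize_releases(pypi_doc, datetime.now(timezone.utc))
    summary["downloads_last_week"] = stats["data"]["last_week"]
    return summary
```

Calling `lookup("litellm")` performs two unauthenticated HTTP requests and returns a small dictionary. How those numbers map onto the 100-point score is not shown here, so this covers data collection only.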

What the data reveals

The AI-native packages cluster at the bottom. LiteLLM, CrewAI, Google Generative AI, AutoGen: all low longevity, and all but google-generativeai with high release velocity relative to age. Stars reflect hype. Behavioral commitment reflects time.

google-generativeai is an interesting outlier. 2 years old, 28 versions, 3.6M weekly downloads. Low release consistency for its usage volume. A package that grew via distribution (Google pushing it) rather than organic behavioral commitment.

autogen-agentchat: 42/100. 1 year old. 54 versions. 342K weekly downloads. If you are building production systems on autogen today, your dependency is more fragile than its star count suggests.

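The established winners are boring. numpy (19yr), requests (15yr), boto3 (11yr). High longevity. Deliberate cadence. Even boto3, at 2,008 versions in 11 years (roughly one release every two days), has had more than a decade to establish its rhythm. An attacker exploiting these packages would need to match that rhythm or be immediately obvious.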

Why release velocity matters for supply chain risk

The LiteLLM attack pattern: stolen PyPI token → malicious package published → command-and-control (C2) infrastructure pre-staged the day before (the same pre-staging pattern that hit the Axios npm package earlier).

When a package releases 1.8 times per day:

Each release gets less scrutiny
Users with automatic updates expect frequent bumps
One stolen token = one release = millions of machines

When numpy releases every few months, one extra release is anomalous by construction.
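One way to make "anomalous by construction" concrete is to compare the gap before a new release with the package's typical gap. The sketch below does that with synthetic release dates. The function name and the 10% threshold are illustrative assumptions, not something the tool is known to implement:

```python
from datetime import datetime, timedelta
from statistics import median

def is_cadence_anomaly(release_times: list[datetime], new_release: datetime,
                       ratio: float = 0.1) -> bool:
    """Flag a release that arrives much sooner than the package's usual gap.

    `ratio` is an illustrative threshold: the release is flagged when the gap
    since the previous release is under `ratio` times the median past gap.
    """
    times = sorted(release_times)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    if not gaps:
        return False  # not enough history to judge
    new_gap = (new_release - times[-1]).total_seconds()
    return new_gap < ratio * median(gaps)

start = datetime(2025, 1, 1)
# Synthetic histories: one release every 60 days vs. one every 12 hours.
slow = [start + timedelta(days=60 * i) for i in range(10)]
fast = [start + timedelta(hours=12 * i) for i in range(10)]

# The same event in both cases: an extra release 6 hours after the latest one.
slow_flag = is_cadence_anomaly(slow, slow[-1] + timedelta(hours=6))
fast_flag = is_cadence_anomaly(fast, fast[-1] + timedelta(hours=6))
print(slow_flag, fast_flag)  # True False
```

The identical 6-hour gap is flagged for the slow package and passes unnoticed for the fast one, which is the asymmetry described above. A real detector would need actual upload timestamps and a threshold tuned per package.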

How to query any package yourself

Use the lookup_pypi_package tool in the Proof of Commitment MCP server. There is nothing to install; add the server to your MCP client configuration:

{
  "mcpServers": {
    "proof-of-commitment": {
      "type": "streamable-http",
      "url": "https://poc-backend.amdal-dev.workers.dev/mcp"
    }
  }
}

Ask your AI: "Score my Python dependencies for supply chain risk" and paste your requirements.txt.
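If you would rather feed packages in one at a time, you first need the bare names from requirements.txt. A minimal sketch, assuming simple name-based lines: it skips pip options such as `-r` and direct URL lines, and the helper name is hypothetical. Names are normalized the way PyPI does (lowercase, with runs of `-`, `_`, and `.` collapsed to `-`):

```python
import re

# A name at the start of the line, followed by end-of-line or a specifier character.
NAME_RE = re.compile(r"([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:$|[\[<>=!~;@ ])")

def package_names(requirements_text: str) -> list[str]:
    """Extract normalized package names from requirements.txt content."""
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop comments
        if not line or line.startswith("-"):   # skip blanks and pip options
            continue
        match = NAME_RE.match(line)
        if match:                              # URL-only lines do not match
            names.append(re.sub(r"[-_.]+", "-", match.group(1)).lower())
    return names

sample = """\
# core deps
requests>=2.31
NumPy==2.4.4  # pinned
-r other.txt
uvicorn[standard]
google_generativeai
git+https://example.com/repo.git
"""
names = package_names(sample)
print(names)
```

For the sample above this yields `requests`, `numpy`, `uvicorn`, and `google-generativeai`. The included file and the Git URL are skipped and would need separate handling.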

The tool also covers npm (lookup_npm_package) and GitHub repos (lookup_github_repo). The npm analysis found zod has 974M weekly downloads and one maintainer.

Proof of Commitment — behavioral trust from public data.