Pico

I Scored 12 Python AI Packages on Behavioral Commitment. The LiteLLM Attack Data Makes Sense Now.

In March 2026, LiteLLM got hit with a supply chain attack. Stolen PyPI token. Malicious packages published. 97 million downloads per month exposed.

I built an MCP tool that scores Python packages on behavioral commitment — not stars, not README quality, but actual behavioral signals: longevity, release consistency, download momentum, maintainer depth. When I ran it on LiteLLM, the output made the attack feel inevitable.

LiteLLM: 74/100

Package: litellm@1.83.3
Age: 2 years
Versions: 1288 | Last: released this week
Downloads: 25,698,680 downloads/week (stable)

Commitment Score: 74/100
  Longevity:            14/25 (2 years old)
  Download momentum:    22/25
  Release consistency:  20/20 (1288 versions)
  Maintainer depth:     5/15 (~1 maintainer)
  GitHub backing:       13/15

1,288 releases in 2 years. That is 1.8 releases per day. One maintainer. Each release is a new attack window. A stolen token with that release velocity can ship a malicious version before anyone notices.
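
The arithmetic behind that figure, as a minimal sketch. The function name is my own, and the age is the rounded "2 years" from the output above, so the result is approximate.

```python
def release_velocity(versions: int, age_years: float) -> tuple[float, float]:
    """Return (releases per day, average days between releases)."""
    days = age_years * 365
    return versions / days, days / versions


# Figures taken from the LiteLLM output above.
per_day, gap_days = release_velocity(versions=1288, age_years=2)
print(f"{per_day:.1f} releases/day, roughly one every {gap_days * 24:.0f} hours")
```

The same function applied to numpy's 130 versions over 19 years gives an average gap of several weeks rather than hours.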

Compare that to numpy:

numpy: 86/100

Package: numpy@2.4.4
Age: 19 years
Versions: 130 | Last: released 7 days ago
Downloads: 199,479,002 downloads/week (stable)

Commitment Score: 86/100
  Longevity:            25/25 (19 years old)
  Download momentum:    22/25
  Release consistency:  20/20 (130 versions)
  Maintainer depth:     5/15
  GitHub backing:       14/15

19 years. 130 versions. numpy releases slowly and deliberately. A malicious version would stand out — the release cadence has a known rhythm. LiteLLM releases so often that one more release today is indistinguishable from noise.
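
Lining the two breakdowns up shows where the 12-point gap comes from. The sketch below only re-adds the component scores printed above; the dictionary keys are my own labels, not the tool's field names.

```python
# Maximum points per component, read off the "x/25", "x/20", "x/15" denominators above.
MAX_POINTS = {
    "longevity": 25,
    "download_momentum": 25,
    "release_consistency": 20,
    "maintainer_depth": 15,
    "github_backing": 15,
}

# Component scores copied from the two tool outputs above.
litellm_scores = {
    "longevity": 14,
    "download_momentum": 22,
    "release_consistency": 20,
    "maintainer_depth": 5,
    "github_backing": 13,
}
numpy_scores = {
    "longevity": 25,
    "download_momentum": 22,
    "release_consistency": 20,
    "maintainer_depth": 5,
    "github_backing": 14,
}


def total(scores: dict[str, int]) -> int:
    """Sum the component scores into the overall commitment score."""
    return sum(scores.values())


def gap_by_component(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Return only the components where a and b differ, as a - b."""
    return {key: a[key] - b[key] for key in a if a[key] != b[key]}


gap = gap_by_component(numpy_scores, litellm_scores)
print(total(litellm_scores), total(numpy_scores), gap)
```

On these numbers the gap is longevity (11 points) plus GitHub backing (1 point). Download momentum, release consistency, and maintainer depth are identical for the two packages, so the score itself does not penalize LiteLLM's release rate; that risk shows up in the raw version count.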

Full Rankings: 12 Python AI Packages

Scored using public data from PyPI and pypistats.org. No auth required.
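
For reference, here is a hedged sketch of pulling the same kind of raw data yourself. It is not the tool's implementation. It assumes the public PyPI JSON API (`https://pypi.org/pypi/<name>/json`) and the pypistats.org recent-downloads endpoint, and the helper names are my own. Either service may rate-limit or reject requests, so treat the fetch step as illustrative.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

PYPI_URL = "https://pypi.org/pypi/{name}/json"
PYPISTATS_URL = "https://pypistats.org/api/packages/{name}/recent"


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body. No authentication is sent."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_pypi(payload: dict, now: datetime) -> dict:
    """Reduce a PyPI JSON API response to version count, age, and recency."""
    releases = payload["releases"]  # maps version string -> list of uploaded files
    upload_times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in releases.values()
        for f in files
    ]
    first, last = min(upload_times), max(upload_times)
    return {
        "name": payload["info"]["name"],
        # Counts every listed version, including ones with no files;
        # the tool above may count versions differently.
        "versions": len(releases),
        "age_years": (now - first).days / 365,
        "days_since_last_release": (now - last).days,
    }


def weekly_downloads(payload: dict) -> int:
    """Extract last-week downloads from a pypistats 'recent' response."""
    return payload["data"]["last_week"]
```

Usage would look like `summarize_pypi(fetch_json(PYPI_URL.format(name="litellm")), datetime.now(timezone.utc))`, with `weekly_downloads(fetch_json(PYPISTATS_URL.format(name="litellm")))` for the download figure.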

Package               Score    Age    Versions   Weekly downloads
boto3                 87/100   11yr   2,008      431M
numpy                 86/100   19yr   130        199M
requests              86/100   15yr   156        314M
openai                86/100   6yr    382        55M
transformers          86/100   9yr    215        28M
torch                 82/100   7yr    46         19M
langchain             75/100   3yr    486        53M
anthropic             75/100   3yr    163        21M
litellm               74/100   2yr    1,288      25M
crewai                69/100   2yr    246        1.3M
google-generativeai   53/100   2yr    28         3.6M
autogen-agentchat     42/100   1yr    54         342K

What the data reveals

The AI-native packages cluster at the bottom. LiteLLM, CrewAI, Google Generative AI, AutoGen: all low longevity, and three of the four ship releases fast for their age (google-generativeai, with 28 versions, is the exception). Stars reflect hype. Behavioral commitment reflects time.

google-generativeai is an interesting outlier. 2 years old, 28 versions, 3.6M weekly downloads. Low release consistency for its usage volume. A package that grew via distribution (Google pushing it) rather than organic behavioral commitment.

autogen-agentchat: 42/100. 1 year old. 54 versions. 342K weekly downloads. If you are building production systems on autogen today, your dependency is more fragile than its star count suggests.

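The established winners are boring. numpy (19yr), requests (15yr), boto3 (11yr). High longevity. Established cadence: numpy and requests release slowly, while boto3, at 2,008 versions, releases often but has done so for 11 years. An attacker exploiting these packages would need to match that rhythm or be immediately obvious.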

Why release velocity matters for supply chain risk

The LiteLLM attack pattern: stolen PyPI token → malicious package published, with the command-and-control (C2) infrastructure pre-staged the day before (the same pre-staging pattern that hit the Axios npm package earlier).

When a package releases 1.8 times per day:

  • Each release gets less scrutiny
  • Users with automatic updates expect frequent bumps
  • One stolen token = one release = millions of machines

When numpy releases every few months, one extra release is anomalous by construction.
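
One way to make "anomalous by construction" concrete is to compare the gap before a new release with the package's typical gap. This is a minimal sketch, not part of the scoring tool: the function names, the 0.25 threshold, and the release dates are all hypothetical.

```python
from datetime import date
from statistics import median


def gaps_in_days(release_dates: list[date]) -> list[int]:
    """Days between consecutive releases, oldest first."""
    ordered = sorted(release_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def is_unusually_early(history: list[date], candidate: date, ratio: float = 0.25) -> bool:
    """True if `candidate` arrives much sooner after the last release than usual.

    `ratio` is an arbitrary illustrative threshold: flag anything that lands
    in under a quarter of the median gap.
    """
    typical_gap = median(gaps_in_days(history))
    latest_gap = (candidate - max(history)).days
    return latest_gap < ratio * typical_gap


# Hypothetical slow-cadence package: a release roughly every two months.
slow = [date(2025, 1, 10), date(2025, 3, 12), date(2025, 5, 15), date(2025, 7, 14)]
# Hypothetical fast-cadence package: a release every day.
fast = [date(2025, 7, 10), date(2025, 7, 11), date(2025, 7, 12), date(2025, 7, 13)]

print(is_unusually_early(slow, date(2025, 7, 17)))  # extra release 3 days later
print(is_unusually_early(fast, date(2025, 7, 14)))  # extra release 1 day later
```

With the slow history, a release three days after the previous one is flagged. With the daily history, an extra release the next day is not, because no gap can be short enough to stand out.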

How to query any package yourself

Use the lookup_pypi_package tool in the Proof of Commitment MCP server. There is nothing to install; add the server to your MCP client configuration:

{
  "mcpServers": {
    "proof-of-commitment": {
      "type": "streamable-http",
      "url": "https://poc-backend.amdal-dev.workers.dev/mcp"
    }
  }
}

Ask your AI: "Score my Python dependencies for supply chain risk" and paste your requirements.txt.

The tool also covers npm (lookup_npm_package) and GitHub repos (lookup_github_repo). The npm analysis found zod has 974M weekly downloads and one maintainer.


Proof of Commitment — behavioral trust from public data.