Aditi Bhatnagar

The LiteLLM Supply Chain Attack Just Changed Everything - Here's How to Protect Your AI Stack

TL;DR

On March 24, 2026, the widely-used LiteLLM Python package was compromised in a supply chain attack. Malicious versions stole credentials from tens of thousands of developers. This post breaks down what happened, why AI tooling is uniquely vulnerable, and how MCP-based security tools like kira-lite-mcp can catch these threats before they hit production.


What Happened: The LiteLLM Compromise

Two days ago, the AI development community got a brutal wake-up call.

LiteLLM - the open-source Python library that provides a unified interface across 100+ LLMs (OpenAI, Anthropic, VertexAI, and more) - was hit by a supply chain attack. Versions 1.82.7 and 1.82.8, published directly to PyPI, contained a multi-stage credential stealer targeting SSH keys, cloud provider tokens, Kubernetes secrets, cryptocurrency wallets, and .env files.

The attack was carried out by TeamPCP, a threat group that had been chaining compromises across the open-source ecosystem since late 2025. They gained access to a LiteLLM maintainer's PyPI credentials through a prior compromise of Aqua Security's Trivy scanner - meaning one breach cascaded into another.

Here's a timeline of what went down:

  • ~08:30 UTC - Malicious versions 1.82.7 and 1.82.8 published to PyPI
  • 10:39-16:00 UTC - Compromised packages available for download
  • ~11:25 UTC - PyPI quarantined the packages
  • By end of day - Versions yanked, credentials rotated, external IR engaged

The payload was embedded in proxy_server.py and, in version 1.82.8, a .pth file (litellm_init.pth) that executed automatically on every Python process startup. Not when you ran your app. Not when you imported litellm. When Python itself started.

If you ran pip install litellm without a pinned version during that window, the malware was already running before your code did.
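Before anything else, it's worth checking what's actually installed. A minimal standard-library sketch that compares the local litellm version against the bad-version list from this advisory:

```python
# Quick local check: is a known-bad litellm build installed?
# The version set comes from the advisory described above.
from importlib import metadata

COMPROMISED = {"1.82.7", "1.82.8"}

def litellm_status() -> str:
    """Return 'not installed', 'compromised', or the installed version string."""
    try:
        version = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return "not installed"
    return "compromised" if version in COMPROMISED else version

if __name__ == "__main__":
    print(litellm_status())
```

This only tells you what pip sees in the current environment - it says nothing about other virtualenvs, containers, or CI runners that installed during the window.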


Why This Matters More Than a Typical Supply Chain Attack

LiteLLM isn't just another package. It sits at the center of the AI stack, acting as a gateway between applications and multiple LLM providers. That means it often has access to:

  • API keys for OpenAI, Anthropic, Azure, GCP, and other providers
  • Environment variables with database passwords and service credentials
  • Kubernetes cluster secrets when deployed as a proxy
  • CI/CD pipeline tokens when used in automated workflows

According to Wiz, LiteLLM is present in 36% of all cloud environments. That's an astonishing blast radius for a three-hour window of exposure.

The stolen data was encrypted and exfiltrated to a C2 domain designed to look legitimate: models.litellm[.]cloud. The payload also attempted to deploy privileged Kubernetes pods to every node in a cluster, giving the attackers persistent access even after the initial malware was removed.

This wasn't an isolated incident. TeamPCP's campaign has now spanned five ecosystems - GitHub Actions, Docker Hub, npm, Open VSX, and PyPI - with each compromise yielding credentials that unlock the next target.


The Real Problem: We Don't Scan What We Should

Here's the uncomfortable truth: most development teams have zero automated security checks between their AI coding assistant generating code and that code hitting production.

Think about the typical AI-assisted development workflow:

  1. Developer asks an AI assistant to write a function
  2. AI generates code, often pulling in dependencies
  3. Developer reviews it (maybe), copies it, and ships it

At no point in that flow did anyone check whether those dependencies have known vulnerabilities. Nobody verified that the packages the AI suggested actually exist - package hallucination, where attackers pre-register the plausible-sounding names LLMs tend to invent, is a real attack vector. And nobody scanned the generated code itself for injection vulnerabilities, hardcoded secrets, or insecure patterns.
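One lightweight guard against hallucinated packages is to refuse any import that isn't already on an approved list drawn from your lockfile. A rough sketch - the inlined allowlist and the `torch_helpers_pro` name are made up for illustration:

```python
# Sketch: reject top-level imports in AI-generated code that aren't allowlisted.
# In practice APPROVED would be derived from your lockfile; it is inlined here.
import re

APPROVED = {"requests", "numpy", "litellm"}

def unapproved_packages(generated_code: str) -> set[str]:
    """Return top-level module names imported by the code but not allowlisted."""
    found = set()
    for line in generated_code.splitlines():
        m = re.match(r"\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)", line)
        if m:
            found.add(m.group(1))
    return found - APPROVED

snippet = "import requests\nimport torch_helpers_pro  # plausible-sounding, but does it exist?"
print(unapproved_packages(snippet))  # flags torch_helpers_pro
```

A flagged name isn't necessarily malicious - but it's exactly the kind of thing a human should look at before `pip install` runs.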

The LiteLLM incident makes this painfully clear. If your project had an unpinned litellm dependency and your CI pipeline ran pip install during that three-hour window, you were compromised. Period.


Enter kira-lite-mcp: Security Scanning Inside the AI Loop

This is where tools like @offgridsec/kira-lite-mcp become critical.

kira-lite-mcp is an MCP (Model Context Protocol) server that brings real-time security scanning directly into your AI coding workflow. Instead of scanning after code is written and committed, it scans code as it's being generated - inside the same conversation where your AI assistant is writing it.

What Makes It Different

It runs entirely on your machine. Kira-lite ships with Kira-Core, a compiled Go binary bundled for macOS, Linux, and Windows. Your code never leaves your laptop. For proprietary codebases or regulated industries, this is non-negotiable.

376 security rules out of the box. It covers OWASP Top 10:2025, OWASP API Security Top 10, and - critically - OWASP LLM Top 10:2025. That last category catches vulnerabilities that traditional scanners completely miss: LLM output passed directly to eval() or exec(), prompt injection patterns, and user input concatenated into prompt templates.

Five MCP tools that your AI assistant can invoke contextually:

  • scan_code - scans a snippet before it's written to disk. The AI literally checks its own work.
  • scan_file - scans an existing file and auto-triggers dependency scanning on lockfiles.
  • scan_diff - compares original vs. modified code and flags only new vulnerabilities.
  • scan_dependency - checks your lockfiles against CVE databases across 13 formats (npm, PyPI, Go, Maven, crates.io, RubyGems, and more).
  • scan_project - full project-level scan with configurable severity thresholds.

How This Would Have Helped With LiteLLM

Let's map kira-lite's capabilities directly to the LiteLLM attack scenario:

Dependency scanning catches known-bad versions. If your project's requirements.txt or lockfile pulled in litellm 1.82.7 or 1.82.8 after the CVE was published, kira-lite's dependency scanner would flag it. It checks against live CVE databases, not just a static list - so as advisories are published, your scans pick them up.
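You can also reproduce a version lookup by hand against a public advisory database. A sketch against OSV.dev's documented `POST /v1/query` endpoint - whether an advisory for these specific versions appears there is an assumption:

```python
# Sketch: ask the OSV.dev advisory database about a specific package version.
# The request body follows OSV's documented POST /v1/query shape.
import json
import urllib.request

def osv_query_payload(name: str, version: str, ecosystem: str = "PyPI") -> dict:
    """Build the JSON body OSV.dev expects for a version lookup."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}

def known_vulnerabilities(name: str, version: str) -> list:
    """Return the advisories OSV.dev reports for this version (network call)."""
    body = json.dumps(osv_query_payload(name, version)).encode()
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp).get("vulns", [])

if __name__ == "__main__":
    for advisory in known_vulnerabilities("litellm", "1.82.7"):
        print(advisory["id"])
```

The point isn't to replace a scanner - it's that the check is cheap enough to wire into any script or pre-commit hook you already have.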

Code scanning catches suspicious patterns. The LiteLLM payload used os.system() calls, base64-encoded strings, and network exfiltration patterns. These are exactly the kinds of patterns that SAST rules detect. If an AI assistant generated code that imports from or interacts with a compromised dependency in a suspicious way, scan_code would catch it before it's saved.

Project-level scanning catches transitive risks. Many developers weren't even using LiteLLM directly - it was pulled in as a transitive dependency through AI agent frameworks, MCP servers, or LLM orchestration tools. A full project scan would surface these hidden dependencies.


Setting It Up (It Takes 30 Seconds)

npx @offgridsec/kira-lite-mcp

That's it. Zero config. Works with Claude Code, Claude Desktop, Cursor, VS Code, and any MCP-compatible client.

For Claude Desktop, add to your config:

{
  "mcpServers": {
    "kira-lite": {
      "command": "npx",
      "args": ["-y", "@offgridsec/kira-lite-mcp"]
    }
  }
}

Once configured, your AI assistant can call the security scanning tools automatically - catching vulnerabilities in the same conversation where code is being written.


Broader Lessons From the LiteLLM Incident

The LiteLLM compromise isn't the end of this campaign. As Endor Labs put it: "TeamPCP has demonstrated a consistent pattern: each compromised environment yields credentials that unlock the next target."

Here's what every team building with AI should take away:

1. Pin Your Dependencies

If your requirements.txt says litellm instead of litellm==1.82.6, you were exposed. Pin versions. Use lockfiles. Verify hashes. This is basic supply chain hygiene that too many AI projects skip because they move fast.

2. Treat AI-Generated Code Like Untrusted Code

Your AI assistant doesn't know about zero-day supply chain attacks. It doesn't verify that the packages it suggests are uncompromised. Treat its output the same way you'd treat code from a contractor who doesn't know your security policies.

3. Scan In the Loop, Not After the Fact

The traditional model - write code, commit, CI scans, fix later - doesn't scale when AI is generating code at 10x the speed of manual development. Tools like kira-lite-mcp shift security scanning left of left: into the generation step itself.

4. Watch Your Transitive Dependencies

LiteLLM wasn't just installed directly. It was pulled in by MCP servers, agent frameworks, and orchestration tools. Know your dependency tree. Audit it regularly. Use scan_dependency on every lockfile in your project.
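A quick way to see where a package could enter your environment transitively is to ask which installed distributions declare it as a requirement. A standard-library sketch:

```python
# Sketch: find which installed distributions declare a package as a dependency,
# i.e. where it could enter your environment transitively.
import re
from importlib import metadata

def dependents_of(target: str) -> list[str]:
    """List installed distributions whose declared requirements name `target`."""
    hits = set()
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # Requirement strings look like "litellm>=1.0; extra == 'proxy'",
            # so split off the bare package name before comparing.
            name = re.split(r"[<>=!~;\[\s(]", req, maxsplit=1)[0]
            if name.lower() == target.lower():
                hits.add(dist.metadata["Name"])
                break
    return sorted(hits)

if __name__ == "__main__":
    print(dependents_of("litellm") or "nothing installed depends on litellm")
```

This only covers declared requirements in the current environment - a proper scanner that walks lockfiles across the whole project will catch more.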

5. Assume Breach, Rotate Fast

If you installed litellm 1.82.7 or 1.82.8, assume every credential on that machine is compromised. SSH keys, cloud tokens, API keys, database passwords, crypto wallets - rotate all of them. Check for persistence mechanisms (~/.config/sysmon/sysmon.py, ~/.config/systemd/user/sysmon.service). In Kubernetes environments, audit kube-system for pods matching node-setup-*.
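The persistence checks above are easy to script. A minimal sketch that looks for the indicator paths listed in this section - an empty result is encouraging, not proof of a clean machine:

```python
# Sketch: check for the persistence artifacts reported for this campaign.
# Paths are the indicators listed above.
from pathlib import Path

INDICATORS = [
    "~/.config/sysmon/sysmon.py",
    "~/.config/systemd/user/sysmon.service",
]

def found_indicators() -> list[str]:
    """Return any indicator paths that exist on this machine."""
    return [p for p in INDICATORS if Path(p).expanduser().exists()]

if __name__ == "__main__":
    hits = found_indicators()
    print(hits if hits else "no known persistence artifacts found")
```

For the Kubernetes side, the equivalent check is auditing kube-system for the pod name pattern mentioned above, which is better done with your cluster tooling than a local script.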


The Bottom Line

The AI supply chain is now a high-value target. LiteLLM's 3.4 million daily downloads made it an incredibly efficient attack vector - and the attackers knew it.

We can't prevent every supply chain compromise. But we can build security into the development workflow itself, catching vulnerabilities before they become incidents. Tools like kira-lite-mcp represent a fundamental shift: from security as a gate at the end of the pipeline to security as a collaborator sitting next to you while you code.

The LiteLLM incident cost organizations time, trust, and potentially sensitive data. The next supply chain attack - and there will be a next one - doesn't have to.


If you found this useful, give it a like and follow for more on AI security and developer tooling. Have questions about securing your AI development workflow? Drop them in the comments.


Tags: #security #ai #programming #devops


FAQ: LiteLLM Supply Chain Attack and AI Security

What was the LiteLLM supply chain attack?
On March 24, 2026, the LiteLLM Python package on PyPI was compromised by a threat group called TeamPCP. Versions 1.82.7 and 1.82.8 contained credential-stealing malware that targeted SSH keys, cloud tokens, API keys, and Kubernetes secrets. The malicious packages were available for approximately three hours before being removed.

How do I know if I was affected by the LiteLLM attack?
You were likely affected if you ran pip install litellm without a pinned version between 10:39 UTC and 16:00 UTC on March 24, 2026, and received version 1.82.7 or 1.82.8. Run pip show litellm to check your installed version. Users of the official LiteLLM Proxy Docker image were not impacted.

What is kira-lite-mcp?
kira-lite-mcp (@offgridsec/kira-lite-mcp) is an MCP-based security scanner that integrates directly into AI coding assistants like Claude Code, Cursor, and VS Code. It scans code for vulnerabilities, checks dependencies against CVE databases, and detects insecure patterns - all in real-time during the development process, entirely on your local machine.

How does MCP help with security scanning?
MCP (Model Context Protocol) allows AI assistants to invoke external tools contextually. With an MCP security scanner, the AI can proactively check its own generated code for vulnerabilities, scan dependencies for known CVEs, and flag insecure patterns before the code is ever saved to disk.

What should I do if I installed a compromised version of LiteLLM?
Immediately remove the package, purge your pip cache, rotate all credentials that were accessible on the affected machine (SSH keys, cloud tokens, API keys, database passwords), check for persistence mechanisms, and if running in Kubernetes, audit for unauthorized pods. Consider rebuilding affected systems from a known clean state.
