
Debby McKinney


Your LLM Gateway is a Python Package. Here's Why That Should Worry You.

Two days ago, LiteLLM got backdoored. Two malicious versions published to PyPI. Credentials stolen. Kubernetes clusters compromised. 3.4 million daily downloads exposed.

But this post is not just about LiteLLM. LiteLLM was the target this time. Next time it could be any Python package sitting in your AI infrastructure's critical path.

If you're routing LLM requests through a Python-based gateway, here's what you need to understand about the risk you're carrying and what your options look like.


What Your LLM Gateway Actually Has Access To

Think about what your LLM gateway touches. If you're using LiteLLM, Portkey's open-source proxy, or any similar Python-based routing layer, it typically has:

  • API keys for every LLM provider you route through (OpenAI, Anthropic, Google, AWS Bedrock, Azure, Mistral, Cohere)
  • Environment variables loaded at startup
  • Network access to your LLM providers and potentially your internal services
  • Kubernetes service account tokens if you're running in K8s
  • CI/CD secrets if the package is installed during build pipelines

Your LLM gateway is one of the most privileged components in your AI stack. It is the single point that touches every provider credential you have.

When attackers compromised LiteLLM, they did not just get access to one API key. They got access to everything on the machine. SSH keys, cloud credentials, database passwords, crypto wallets, CI/CD tokens. The full list is extensive.
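To make that blast radius concrete, here is a rough sketch of the check an attacker effectively runs for you: probing the default locations of the credentials listed above. The paths are the usual defaults, not an exhaustive list.

```shell
# Quick check of common credential locations reachable by whatever user
# runs your gateway. Anything printed here is in scope for an attacker
# who compromises that process.
scan_out=$(for f in ~/.ssh/id_rsa ~/.aws/credentials ~/.kube/config \
               ~/.docker/config.json ~/.netrc; do
  [ -e "$f" ] && echo "present: $f"
done; echo "scan complete")
printf '%s\n' "$scan_out"
```

If this prints more than you expected, the gateway's user account holds more than it needs.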


The Python-Specific Attack Vectors

The LiteLLM attack used two delivery mechanisms. Both are specific to how Python's packaging ecosystem works.

The .pth File Problem

Version 1.82.8 included a file called litellm_init.pth in site-packages/. Python processes all .pth files on every interpreter startup. Not when you import litellm. Every time Python starts.

That means the malware executed when you:

  • Ran pip install for an unrelated package
  • Started your IDE (VS Code, PyCharm both start Python language servers)
  • Ran a test suite
  • Executed any Python script on the system

The .pth mechanism is documented Python behavior (MITRE ATT&CK T1546.018). It is not a bug. It is a feature that was weaponized.
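You can see the mechanism for yourself without touching real site-packages. `site.py` executes any line in a `.pth` file that begins with `import`; the sketch below triggers that code path explicitly via `site.addsitedir()` against a throwaway directory.

```shell
# Safe demonstration of .pth execution: site.py exec()s any line that
# starts with "import" in a .pth file. We use addsitedir() on a temp
# directory rather than installing anything into real site-packages.
demo_dir=$(mktemp -d)
echo 'import sys; sys.stdout.write("pth code ran\n")' > "$demo_dir/demo.pth"
pth_out=$(python3 -c "import site; site.addsitedir('$demo_dir')")
echo "$pth_out"
rm -rf "$demo_dir"
```

In a real compromise the `.pth` file sits in site-packages, so the same execution happens silently on every interpreter startup, with no `addsitedir()` call needed.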

The Transitive Dependency Problem

LiteLLM is a direct dependency for DSPy, CrewAI, OpenHands, MLflow, and Arize Phoenix. All of these projects pulled the compromised version automatically through their dependency chains.

If you run pip install crewai, you get litellm. You didn't choose it. You might not even know it's there.

# Check if litellm is in your dependency tree
pip show litellm 2>/dev/null && echo "litellm is installed" || echo "not found"

The Legitimate Credentials Problem

The packages were published using stolen but legitimate PyPI credentials. Hash verification passed. Package signing passed. The package name was correct. There was no typosquatting to detect.

Standard security tooling did not flag this because, from PyPI's perspective, a trusted maintainer published a new version.


What a Compiled Gateway Looks Like (And Why It Matters)

LLM gateways written in compiled languages like Go or Rust have a fundamentally different supply chain profile:

| Property | Python-Based Gateway (LiteLLM) | Compiled Gateway (Go/Rust) |
| --- | --- | --- |
| Distribution | PyPI package, installed via pip | Single compiled binary or Docker image |
| Runtime dependencies | Dozens of transitive Python packages | None at runtime |
| .pth execution risk | Yes; any package can install .pth hooks | Does not exist |
| site-packages injection | Yes | Does not exist |
| Dependency chain at install | pip resolves and installs the full tree | No install-time resolution for a binary |
| Supply chain attack surface | Every transitive dependency | Build-time only; vendored and auditable |

This does not mean compiled languages are immune to supply chain attacks. They are not. Build-time dependencies in Go modules or Rust crates can be compromised. But the runtime attack surface is categorically smaller. There is no mechanism for a Go binary to execute arbitrary code from a hidden file on every startup.
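The binary distribution model also changes what verification looks like. Instead of trusting a package index's resolution, you verify one artifact against a published checksum before it ever runs. A sketch of that workflow, using a stand-in file in place of a real release binary:

```shell
# Binary supply chain hygiene: verify a release artifact against its
# published checksum before executing it. "gateway-binary" is a
# stand-in; real projects ship a checksums file with each release.
printf 'stand-in gateway binary\n' > gateway-binary
sha256sum gateway-binary > checksums.txt   # normally downloaded from the release page
check_out=$(sha256sum -c checksums.txt)    # prints "gateway-binary: OK" on a match
echo "$check_out"
```

One hash, one file, one decision point, versus a resolver pulling dozens of artifacts whose contents you never see.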


Your Options Right Now

Option 1: Stay on LiteLLM, But Lock It Down

If you need to stay on LiteLLM for now, do these things immediately:

Pin your version:

litellm==1.82.6

Do not use >= or ~= for infrastructure packages. Aider survived this attack because it pinned to litellm==1.82.3.
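One step past version pinning is pip's hash-checking mode, which rejects any artifact whose hash differs from what you recorded, even if an attacker republishes the same version number with stolen credentials. The hash below is a placeholder; generate the real one with `pip hash` against a wheel you have downloaded and audited.

```shell
# requirements.txt for pip's hash-checking mode. The sha256 value is a
# placeholder -- replace it with the output of `pip hash <wheel>`.
cat > requirements.txt <<'EOF'
litellm==1.82.6 \
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000
EOF
# With hashes present for every requirement, install in strict mode:
# pip install --require-hashes -r requirements.txt
```

Note that `--require-hashes` requires hashes for the full dependency tree, which is exactly the point: nothing installs that you have not explicitly audited.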

Scan for .pth files regularly:

find $(python -c "import site; print(site.getsitepackages()[0])") -name "*.pth" -exec grep -l "subprocess\|base64\|exec" {} \;

Pin CI/CD actions by commit SHA:

# Bad
uses: aquasecurity/trivy-action@latest

# Good
uses: aquasecurity/trivy-action@<full-commit-sha>

Run in an isolated environment:
Use a dedicated virtual environment or container. Don't install litellm in the same environment as other packages. This limits the blast radius of .pth execution.
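A minimal isolation sketch, assuming a dedicated venv whose site-packages (and any `.pth` files installed into it) are only ever loaded by the gateway process. The path is illustrative.

```shell
# Create a venv used exclusively by the gateway. Other Python tooling
# on the machine never loads this environment's site-packages, so a
# malicious .pth here cannot hook your IDE, tests, or build scripts.
python3 -m venv /tmp/litellm-env
# /tmp/litellm-env/bin/pip install litellm==1.82.6
/tmp/litellm-env/bin/python -c "import sys; print(sys.prefix)"
```

Run the gateway only via `/tmp/litellm-env/bin/python`, and keep the rest of your tooling on a different interpreter.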

Monitor LiteLLM's security advisories:
Check LiteLLM's security update page and the Snyk vulnerability database.

Option 2: Switch to a Compiled LLM Gateway

If you're evaluating alternatives, here's what's available:

Bifrost is an open-source LLM gateway written in Go. It supports 20+ providers through a single OpenAI-compatible API. In benchmarks at 5,000 RPS, it adds 11 microseconds of overhead per request. It ships as a single binary or Docker image with zero runtime Python dependencies. It also supports semantic caching, automatic failover, weighted routing, virtual keys for budget control, and MCP gateway capabilities.

TensorZero is a Rust-based LLM gateway with sub-millisecond overhead. Similar compiled-binary supply chain benefits.

Cloudflare AI Gateway is a managed service. You don't self-host anything. Zero supply chain risk from your side, but you're dependent on Cloudflare's infrastructure.

Option 3: Drop the Gateway Entirely

If you're only using one or two providers, you might not need a gateway. The official OpenAI Python SDK and Anthropic Python SDK are maintained by the provider teams. Smaller codebase, fewer dependencies, smaller attack surface.

You lose multi-provider routing, failover, and caching. But if you don't need those features, a direct SDK is the simplest and most secure option.


How to Audit Your Current Setup

Here's a quick checklist. Run through it this week.

1. Find your LLM gateway dependency:

pip list | grep -i "litellm\|portkey\|helicone\|langsmith"

2. Check if it's a transitive dependency you didn't choose:

pip show litellm | grep "Required-by"

3. Scan for suspicious .pth files:

for dir in $(python -c "import site; print('\n'.join(site.getsitepackages()))"); do
  echo "Checking $dir"
  find "$dir" -name "*.pth" -exec grep -l "subprocess\|base64\|exec\|os.system" {} \;
done

4. Check your CI/CD for unpinned actions:

grep -r "uses:.*@latest\|uses:.*@main\|uses:.*@master" .github/workflows/

5. List all credentials your gateway environment has access to:

If the answer is "more than just the LLM provider keys it needs," your blast radius is bigger than it should be.
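A rough way to take that inventory is to list every credential-shaped environment variable visible to the gateway's environment. The pattern below is a heuristic, and `EXAMPLE_API_KEY` is set only so the demo has something to find.

```shell
# Inventory of credential-shaped environment variable NAMES (values are
# deliberately not printed). Run this in the environment the gateway
# actually uses. EXAMPLE_API_KEY is set purely for demonstration.
export EXAMPLE_API_KEY="sk-demo"
cred_vars=$(env | grep -iE '_(KEY|TOKEN|SECRET|PASSWORD)=|CREDENTIALS' \
              | cut -d= -f1 | sort)
printf '%s\n' "$cred_vars"
```

Every name on that list that is not an LLM provider key your gateway needs is unnecessary blast radius.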


The Takeaway

The LiteLLM supply chain attack was sophisticated, but the underlying vulnerability is structural. Any Python package that holds high-value credentials and gets installed via pip carries this risk. The .pth mechanism, transitive dependencies, and credential-based publishing are not bugs. They are how Python's ecosystem works.

You have three choices: lock down your Python setup aggressively, switch to a compiled alternative, or simplify your architecture to remove the gateway entirely. All three are valid. Doing nothing is not.


Tags: litellm, security, supply-chain-attack, llm-gateway, python, ai-infrastructure, bifrost, devops
