DEV Community

dohko
7 Lessons from the LiteLLM Supply Chain Attack Every AI Developer Must Learn (With Defense Code)

On March 24, 2026, the litellm package on PyPI was compromised. A malicious version exfiltrated environment variables — API keys, database credentials, cloud tokens — to an attacker-controlled endpoint. With 97M+ cumulative downloads, this is one of the largest AI supply chain attacks ever.

If you're building with LLMs, you were probably in the blast radius. Here are 7 defenses with code you can implement right now.


1. Pin Dependencies by Hash, Not Just Version

Version pinning (litellm==1.34.0) isn't enough — if PyPI serves a tampered artifact for that version, you still get owned.

Hash pinning ensures you install the exact artifact you audited.

# Generate hash-pinned requirements
pip install pip-tools
pip-compile --generate-hashes requirements.in -o requirements.txt

Your requirements.txt now looks like:

litellm==1.34.0 \
    --hash=sha256:a1b2c3d4e5f6... \
    --hash=sha256:f6e5d4c3b2a1...

Install with hash verification:

pip install --require-hashes -r requirements.txt

If the hash doesn't match, pip refuses to install. Period.
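Under the hood, the hash pip checks is just the SHA-256 of the downloaded artifact. If you want to audit a wheel by hand before pinning it, you can compute the digest yourself — a minimal sketch (the wheel filename below is a placeholder):

```python
import hashlib


def sha256_of(path: str) -> str:
    """Stream the file in chunks so large wheels don't load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare the result against the --hash=sha256:... entry in requirements.txt
# sha256_of("litellm-1.34.0-py3-none-any.whl")
```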

For Poetry users:

# pyproject.toml — Poetry generates hashes in poetry.lock automatically
[tool.poetry.dependencies]
litellm = "1.34.0"
poetry lock
poetry install  # installs from poetry.lock with hash verification

2. Verify Package Integrity Before Every Deploy

Add a CI step that checks package checksums against a known-good baseline:

#!/usr/bin/env python3
"""verify_deps.py — Compare installed package hashes against baseline."""

import hashlib
import importlib.metadata
import json
import sys
from pathlib import Path


def get_package_hash(package_name: str) -> str:
    """SHA-256 over every file of an installed package, in sorted order."""
    dist = importlib.metadata.distribution(package_name)
    hasher = hashlib.sha256()
    for f in sorted(dist.files or []):
        full_path = Path(dist.locate_file(f))  # public API, not the private dist._path
        if full_path.exists():
            hasher.update(full_path.read_bytes())
    return hasher.hexdigest()


def verify(baseline_path: str, packages: list[str]) -> bool:
    baseline = json.loads(Path(baseline_path).read_text())
    ok = True
    for pkg in packages:
        current = get_package_hash(pkg)
        expected = baseline.get(pkg)
        if current != expected:
            print(f"❌ MISMATCH: {pkg} expected={expected} got={current}")
            ok = False
        else:
            print(f"✅ {pkg}")
    return ok


if __name__ == "__main__":
    packages = ["litellm", "openai", "anthropic", "langchain-core"]
    if not verify("dep_hashes.json", packages):
        sys.exit(1)

Generate the baseline after auditing:

python3 -c "
import json
from verify_deps import get_package_hash
pkgs = ['litellm', 'openai', 'anthropic', 'langchain-core']
print(json.dumps({p: get_package_hash(p) for p in pkgs}, indent=2))
" > dep_hashes.json

Add to CI:

# .github/workflows/ci.yml
- name: Verify dependency integrity
  run: python verify_deps.py

3. Runtime Import Monitoring

The LiteLLM payload activated on import. Detect unexpected network calls at import time:

"""import_monitor.py — Detect suspicious activity during package imports."""

import socket
import threading
import time
from unittest.mock import patch
from collections import defaultdict

_connection_log: dict[str, list[tuple[str, int]]] = defaultdict(list)
_original_connect = socket.socket.connect


def _monitored_connect(self, address):
    """Intercept socket connections, log them, and block known-bad hosts."""
    if isinstance(address, tuple) and len(address) == 2:
        host, port = address
        _connection_log[threading.current_thread().name].append((host, port))
        # Block known exfil patterns ("*." entries match any subdomain)
        BLOCKED_HOSTS = {"evil.example.com", "*.ngrok.io"}
        if any(host.endswith(b.replace("*", "")) for b in BLOCKED_HOSTS):
            raise ConnectionRefusedError(f"Blocked connection to {host}")
    return _original_connect(self, address)


def safe_import(module_name: str, allowed_hosts: set[str] | None = None):
    """Import a module while monitoring for unexpected network activity."""
    _connection_log.clear()
    with patch.object(socket.socket, "connect", _monitored_connect):
        start = time.monotonic()
        mod = __import__(module_name)
        elapsed = time.monotonic() - start

    connections = []
    for thread_conns in _connection_log.values():
        connections.extend(thread_conns)

    if connections:
        allowed = allowed_hosts or set()
        suspicious = [(h, p) for h, p in connections if h not in allowed]
        if suspicious:
            raise RuntimeError(
                f"🚨 {module_name} made unexpected connections on import: {suspicious}"
            )

    if elapsed > 5.0:
        print(f"⚠️  {module_name} took {elapsed:.1f}s to import — investigate")

    return mod


# Usage
litellm = safe_import("litellm", allowed_hosts={"api.litellm.ai"})

4. Isolate Secrets from Package Code

The attack exfiltrated os.environ. The fix: never put secrets in environment variables that application code can read directly.

"""secret_vault.py — Secrets via Unix domain socket, not env vars."""

import json
import os
import socket
from pathlib import Path


class SecretVault:
    """
    Serves secrets over a Unix domain socket.
    The AI/ML code runs in a process that has NO secret env vars —
    it requests them through this socket.
    """

    def __init__(self, socket_path: str = "/tmp/vault.sock"):
        self.socket_path = socket_path
        self._secrets: dict[str, str] = {}
        self._acl: dict[str, set[str]] = {}  # key -> allowed callers (enforcement left as an exercise)

    def load_from_env(self, keys: list[str]):
        """Load secrets from env, then REMOVE them from env."""
        for key in keys:
            val = os.environ.pop(key, None)
            if val:
                self._secrets[key] = val

    def serve(self):
        Path(self.socket_path).unlink(missing_ok=True)
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.bind(self.socket_path)
        os.chmod(self.socket_path, 0o600)  # owner-only access
        sock.listen(5)
        print(f"Vault listening on {self.socket_path}")
        while True:
            conn, _ = sock.accept()
            req = json.loads(conn.recv(1024).decode())
            value = self._secrets.get(req.get("key", ""), "")
            conn.sendall(json.dumps({"value": value}).encode())
            conn.close()


class SecretClient:
    """Client side — used by your AI application code."""

    def __init__(self, socket_path: str | None = None):
        # Honors VAULT_SOCKET so the same code works in the Docker setup below
        self.socket_path = socket_path or os.environ.get("VAULT_SOCKET", "/tmp/vault.sock")
        self._cache: dict[str, str] = {}  # plain dict; lru_cache on a method keeps self alive

    def get(self, key: str) -> str:
        if key not in self._cache:
            sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            sock.connect(self.socket_path)
            sock.sendall(json.dumps({"key": key}).encode())
            data = sock.recv(4096).decode()
            sock.close()
            self._cache[key] = json.loads(data)["value"]
        return self._cache[key]


# In your AI app — no env vars, no exfiltration surface
client = SecretClient()
api_key = client.get("OPENAI_API_KEY")

Docker Compose setup to separate vault from app:

# docker-compose.yml
services:
  vault:
    build: ./vault
    volumes:
      - vault_sock:/sockets
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}

  app:
    build: ./app
    volumes:
      - vault_sock:/sockets:ro
    # NO secret env vars here
    environment:
      - VAULT_SOCKET=/sockets/vault.sock

volumes:
  vault_sock:

5. Deploy Canary Tokens in Your Environment

Plant fake credentials that trigger alerts when used:

"""canary_tokens.py — Generate and monitor decoy API keys."""

import hashlib
import time
import json
from datetime import datetime, timezone
from pathlib import Path


def generate_canary_key(provider: str, label: str) -> dict:
    """Generate a realistic-looking fake API key."""
    seed = f"{provider}:{label}:{time.time()}"
    h = hashlib.sha256(seed.encode()).hexdigest()

    # Mimic real key formats — an obvious "canary" substring would tip off attackers
    formats = {
        "openai": f"sk-proj-{h[:48]}",
        "anthropic": f"sk-ant-api03-{h[:40]}",
        "aws": f"AKIA{h[:16].upper()}",
    }

    key = formats.get(provider, h[:32])
    return {"provider": provider, "label": label, "key": key, "created": datetime.now(timezone.utc).isoformat()}


def deploy_canaries(output_path: str = "canaries.json"):
    """Generate canary tokens and save the manifest."""
    canaries = [
        generate_canary_key("openai", "prod-backup"),
        generate_canary_key("anthropic", "staging"),
        generate_canary_key("aws", "ml-pipeline"),
    ]
    Path(output_path).write_text(json.dumps(canaries, indent=2))
    print(f"Deployed {len(canaries)} canary tokens")
    return canaries


def inject_canaries_to_env(canaries: list[dict]):
    """
    Set canary keys as env vars alongside real ones.
    Any package that exfiltrates env will grab these too.
    Monitor your canary dashboard for usage attempts.
    """
    import os
    mapping = {
        "openai": "OPENAI_API_KEY_BACKUP",
        "anthropic": "ANTHROPIC_BACKUP_KEY",
        "aws": "AWS_SECRET_ACCESS_KEY_OLD",
    }
    for c in canaries:
        env_var = mapping.get(c["provider"], f"CANARY_{c['provider'].upper()}")
        os.environ[env_var] = c["key"]
        print(f"  Set {env_var} (canary)")


if __name__ == "__main__":
    canaries = deploy_canaries()
    inject_canaries_to_env(canaries)
    print("\n⚡ Canaries deployed. Monitor your provider dashboards for auth attempts.")

When the attacker tries to use the exfiltrated keys → you get an alert from the provider's auth logs.
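Provider-side alerts are the main signal, but if you already log outbound request headers (for instance at the egress proxy from defense #6), you can also detect canary use locally. A minimal sketch — the manifest format matches canaries.json above; the log format is an assumption:

```python
import json
from pathlib import Path


def find_canary_hits(manifest_path: str, log_lines: list[str]) -> list[dict]:
    """Flag any log line containing a canary key from the manifest."""
    canaries = json.loads(Path(manifest_path).read_text())
    hits = []
    for c in canaries:
        for line in log_lines:
            if c["key"] in line:
                hits.append({"label": c["label"], "provider": c["provider"], "line": line})
    return hits


# Any hit means something read your env vars and tried to use them — page someone.
```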


6. Network Egress Control for AI Workloads

Lock down what your AI containers can talk to:

# docker-compose.yml with network segmentation
services:
  llm-app:
    build: .
    networks:
      - ai_internal
      - ai_egress
    deploy:
      resources:
        limits:
          memory: 4G

  # Egress proxy — only allowed destinations
  egress-proxy:
    image: envoyproxy/envoy:v1.30-latest
    networks:
      - ai_egress
      - internet
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml:ro

networks:
  ai_internal:
    internal: true  # No internet access
  ai_egress:
    internal: true
  internet:
    driver: bridge

Envoy config to whitelist only LLM API endpoints:

# envoy.yaml
static_resources:
  listeners:
    - name: egress
      address:
        socket_address: { address: 0.0.0.0, port_value: 8443 }
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                route_config:
                  virtual_hosts:
                    - name: allowed_apis
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route:
                            cluster: allowed_upstream
                          request_headers_to_add:
                            - header: { key: "X-Egress-Audit", value: "%REQ(:authority)%" }
  clusters:
    - name: allowed_upstream
      type: STRICT_DNS
      load_assignment:
        cluster_name: allowed_upstream
        endpoints:
          - lb_endpoints:
              - endpoint: { address: { socket_address: { address: api.openai.com, port_value: 443 }}}
              - endpoint: { address: { socket_address: { address: api.anthropic.com, port_value: 443 }}}
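To confirm the lockdown actually holds, run a quick connectivity probe from inside the app container — in the compose setup above, the proxy should be reachable while arbitrary internet hosts are not (hostnames below are placeholders):

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Expected from inside llm-app:
#   can_connect("egress-proxy", 8443)     -> True   (allowed path)
#   can_connect("evil.example.com", 443)  -> False  (no route to the internet)
```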

7. Automated Dependency Audit Pipeline

Tie it all together with a CI pipeline that runs on every PR and on a schedule:

# .github/workflows/dep-audit.yml
name: AI Dependency Audit

on:
  pull_request:
    paths: ["requirements*.txt", "pyproject.toml", "poetry.lock"]
  schedule:
    - cron: "0 6 * * *"  # Daily at 6 AM UTC

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install audit tools
        run: |
          pip install pip-audit safety packj

      - name: pip-audit (known vulnerabilities)
        run: pip-audit -r requirements.txt --strict --desc

      - name: safety check
        run: safety check -r requirements.txt --full-report

      - name: Verify hashes unchanged
        run: |
          # pip exits nonzero on any hash mismatch, failing this step directly
          pip install --require-hashes -r requirements.txt --dry-run

      - name: packj — behavioral analysis
        run: |
          # Check for suspicious behaviors in AI packages
          for pkg in litellm openai anthropic langchain-core; do
            echo "=== Analyzing $pkg ==="
            packj audit pypi "$pkg" || true
          done

      - name: Check for typosquatting
        run: |
          python3 -c "
          import re
          from pathlib import Path

          KNOWN_GOOD = {
              'litellm', 'openai', 'anthropic', 'langchain-core',
              'transformers', 'torch', 'numpy', 'pandas'
          }

          req = Path('requirements.txt').read_text()
          pkgs = set(re.findall(r'^([a-zA-Z0-9_-]+)', req, re.MULTILINE))
          unknown = pkgs - KNOWN_GOOD
          if unknown:
              print(f'⚠️  Packages not in allowlist: {unknown}')
              print('Review these manually before merging.')
          "
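The allowlist check above only catches names that aren't in the list; it won't tell you that a package is one character away from litellm. A fuzzy-matching pass with difflib closes that gap — the allowlist and the 0.85 cutoff are assumptions, tune them to your stack:

```python
import difflib

KNOWN_GOOD = {
    "litellm", "openai", "anthropic", "langchain-core",
    "transformers", "torch", "numpy", "pandas",
}


def near_misses(pkg: str, cutoff: float = 0.85) -> list[str]:
    """Allowlisted names suspiciously similar to (but not equal to) pkg."""
    if pkg in KNOWN_GOOD:
        return []
    return difflib.get_close_matches(pkg, KNOWN_GOOD, n=3, cutoff=cutoff)


# near_misses("litelm") flags "litellm" — a likely typosquat
```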

TL;DR — Your Defense Checklist

| # | Defense | Effort | Impact |
|---|---------|--------|--------|
| 1 | Hash-pinned dependencies | 10 min | 🛡️🛡️🛡️ |
| 2 | Integrity verification in CI | 30 min | 🛡️🛡️🛡️ |
| 3 | Runtime import monitoring | 1 hour | 🛡️🛡️ |
| 4 | Secret isolation (vault pattern) | 2 hours | 🛡️🛡️🛡️🛡️ |
| 5 | Canary tokens | 30 min | 🛡️🛡️ |
| 6 | Network egress control | 1 hour | 🛡️🛡️🛡️🛡️ |
| 7 | Automated audit pipeline | 1 hour | 🛡️🛡️🛡️ |

The LiteLLM incident is a wake-up call. AI dependencies have massive install bases and direct access to your most valuable secrets. Treat them like the attack surface they are.

Start with #1 and #4 today. They take 2 hours combined and block the exact attack vector used on March 24.


Found this useful? Follow for more AI engineering security content. Next up: building an air-gapped LLM inference stack.
