DEV Community

Mila Kowalski

You Don't Know What Model Is Reading Your Code Right Now

Two things happened in the last two weeks that should make every developer uncomfortable.

First, a developer named Fynn set up a debug proxy, intercepted Cursor's API traffic, and found this model ID in plain sight: accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast. That's Kimi K2.5, a 1-trillion-parameter open-source model from Moonshot AI, a Beijing-based company backed by Alibaba and Tencent. Cursor, valued at $29.3 billion, had launched Composer 2 as "frontier-level coding intelligence" without mentioning it was built on a Chinese foundation model. The disclosure only came because a random developer intercepted an API call.

Second, Anthropic accidentally shipped Claude Code's entire source code as unminified npm source maps. The full TypeScript codebase, out in the open. They quickly rewrote and re-published in Python, but the original was already mirrored across GitHub.

One company hid what was inside. The other accidentally showed everything.

Both stories point to the same uncomfortable truth: your AI coding tools are a supply chain you're not auditing. And for a profession that spent the last decade learning to lock down dependency chains after left-pad, Log4Shell, and xz-utils, we're being remarkably trusting about the tools that read, analyze, and rewrite our entire codebases.


Your Code Editor Is Now a Supply Chain Dependency

Let's be precise about what happens when you use an AI coding agent.

Your entire codebase, or large chunks of it, gets sent to a remote model. That model processes your proprietary logic, your authentication flows, your database schemas, your business rules. The response comes back and gets applied to your files, sometimes automatically.

You are trusting:

  1. The vendor to route your code to the model they say they're using
  2. The model provider to not retain, train on, or leak your code
  3. The transport layer to be encrypted and not intercepted
  4. The inference provider (which might be different from the vendor) to handle your data correctly
  5. Every dependency in the tool itself to not be compromised

Before the Cursor/Kimi story, how many developers had thought about point #1? You pick "Claude Sonnet" or "GPT-4o" in the dropdown and assume that's what's running. But Cursor demonstrated that the model behind the curtain can be something entirely different from what's advertised. And it happened twice: Composer 1 also quietly used DeepSeek's tokenizer without disclosure.

Cursor co-founder Aman Sanger called it "a miss." VentureBeat called it something more significant: proof that Chinese open-source models are becoming the invisible foundation of the global AI stack. DeepSeek, Kimi, Qwen, and GLM are powering products that market themselves as Western-built AI.

I don't have a problem with using open-source models from any country. I do have a problem with not knowing about it.


The Trust Model Is Completely Backwards

In traditional software supply chains, we've built an entire discipline around knowing what's inside our stack:

  • SBOMs (Software Bill of Materials) tell you every dependency in your deployed artifact.
  • Container scanning tells you every vulnerability in your base image.
  • License compliance tools flag GPL contamination before it hits production.
  • Dependency pinning ensures you control exactly which version of which package runs in your system.

Now look at your AI coding tools:

Which model version ran on your last prompt? You don't know. Models get swapped, updated, and A/B tested without notification.

Where did your code go? You know it went to an API endpoint. You don't know which inference provider processed it. Cursor routes through Fireworks AI for Kimi-based requests. Did you know that? Did you audit Fireworks' data retention policies?

What gets retained? Every AI vendor has different data policies, and they change. GitHub just announced that starting April 24, Copilot Free, Pro, and Pro+ user data will be used to train models unless you opt out. Did you catch that buried in a blog post?

What model is actually running? As Cursor proved, the model ID in your dropdown might not match the model processing your code. When Fynn intercepted the API call, Composer 2 didn't even try to hide it: kimi-k2p5-rl-0317-s515-fast was right there in the response.

The PANews analysis coined a term I think we should adopt: AI-BOM (AI Bill of Materials). Just like an SBOM lists every software component in your artifact, an AI-BOM would list every model, every inference provider, every data pipeline, and every retention policy involved when your AI tool processes your code.

No AI coding tool provides this today. Not one.


"But I'm Using Claude/GPT Directly, So I'm Fine"

Maybe. But consider:

Claude Code's source leak showed the full system prompt and tool architecture. Anyone who grabbed those source maps now knows exactly how Claude Code works: what tools it has, how it makes decisions, what its system prompt contains, how it handles permissions. That's a roadmap for prompt injection attacks against Claude Code users.

Model routing is becoming standard. Even tools that use "name brand" models increasingly route between them. Cursor picks different models for different tasks. Windsurf swaps between models. GitHub Copilot uses multiple models behind a single interface. The model you think you're using might only handle part of your request.

Inference providers add another layer. Even if you know the model, do you know who's hosting it? A vendor might use Anthropic's model but route through a third-party inference provider for cost or latency reasons. Your code passes through an additional set of servers, with an additional set of data policies, that you never agreed to.

Fine-tuning creates derivative models. Cursor's Composer 2 was Kimi K2.5 plus reinforcement learning. Is that Kimi? Is it Cursor's model? The licensing says one thing, the marketing says another. When your code is processed by a derivative model, whose data policies apply?


What an Actual AI Tool Audit Looks Like

I'm a DevOps engineer. I audit things for a living. Here's the checklist I now run for every AI coding tool before it touches our codebase.

1. Network Traffic Analysis

Before you trust any AI tool, proxy its traffic and see where your code actually goes.

# Set up mitmproxy to intercept AI tool traffic
# This is how Fynn caught Cursor

mitmproxy --mode regular --listen-port 8080

# Configure your AI tool to use the proxy
# (usually via HTTP_PROXY / HTTPS_PROXY env vars)
export HTTP_PROXY=http://localhost:8080
export HTTPS_PROXY=http://localhost:8080

# HTTPS interception also requires trusting mitmproxy's CA
# certificate, generated in ~/.mitmproxy/ on first run

# Now use the tool normally and watch what endpoints
# it calls, what payloads it sends, and what model IDs
# appear in the responses

What you're looking for:

  • Which endpoints receive your code
  • What model IDs appear in responses (like Fynn's kimi-k2p5-rl-0317-s515-fast)
  • Whether requests go to the vendor directly or through a third-party inference provider
  • How much of your codebase is included in each request
  • Whether telemetry or analytics calls send code snippets
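If you want this check to be repeatable instead of eyeballing the mitmproxy UI, a small addon can log every model ID that passes through. This is a sketch against mitmproxy's standard addon API; the "model" field name is an assumption that holds for OpenAI-compatible responses (and matched what Fynn saw) but may differ per vendor.

```python
# mitmproxy addon: log hosts and model IDs seen in AI tool traffic.
# Run with: mitmproxy -s log_models.py
import re

MODEL_FIELD = re.compile(r'"model"\s*:\s*"([^"]+)"')

def extract_model_ids(body: str) -> list[str]:
    """Pull every "model": "..." value out of a response body.

    Uses a regex rather than json.loads so it also works on
    streaming (SSE) responses, which are not one JSON document.
    """
    return MODEL_FIELD.findall(body)

def response(flow):
    # mitmproxy calls this hook for every completed response
    ids = extract_model_ids(flow.response.get_text(strict=False))
    if ids:
        print(f"{flow.request.host} -> model IDs: {sorted(set(ids))}")
```

Pipe the output to a file for a week and you have a small provenance log: every host your tool talked to and every model ID it admitted to.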

2. Data Policy Mapping

For every AI tool your team uses, document:

Tool: [name]
Vendor: [company]
Model(s) used: [list, if disclosed]
Inference provider: [if different from vendor]
Data retention: [policy, with date checked]
Training opt-out: [yes/no, default state]
SOC 2 / ISO 27001: [status]
Data residency: [where is code processed geographically]
Last policy change: [date]

Check these quarterly. Policies change. GitHub's training opt-out change was announced in a blog post, not an email to affected users.
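The template above is easy to keep in a JSON or YAML file and check mechanically. Here's a minimal sketch of a staleness check, assuming each entry records the date its policies were last verified (the field name is my own convention):

```python
# Flag AI tool inventory entries whose policies haven't been
# re-checked recently. The 90-day window matches a quarterly cadence.
from datetime import date

REVIEW_WINDOW_DAYS = 90

def stale_entries(inventory: list[dict], today: date) -> list[str]:
    """Return tool names last checked more than 90 days ago."""
    stale = []
    for entry in inventory:
        checked = date.fromisoformat(entry["last_checked"])
        if (today - checked).days > REVIEW_WINDOW_DAYS:
            stale.append(entry["tool"])
    return stale
```

Run it in CI on a schedule and the quarterly review stops depending on someone remembering.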

3. Code Exposure Assessment

Not all code is equal. Map your risk:

# Simple framework for classifying code sensitivity
# Decide what your AI tool should and shouldn't see

SENSITIVITY_LEVELS = {
    "public": {
        "description": "Open-source or public-facing code",
        "ai_allowed": True,
        "examples": ["docs/", "public/", "examples/"]
    },
    "internal": {
        "description": "Business logic, non-sensitive internals",
        "ai_allowed": True,
        "requires_review": False,
        "examples": ["src/components/", "src/utils/"]
    },
    "sensitive": {
        "description": "Auth, payments, PII handling, crypto",
        "ai_allowed": "with_approval",
        "requires_review": True,
        "examples": ["src/auth/", "src/payments/", "src/crypto/"]
    },
    "restricted": {
        "description": "Secrets, keys, proprietary algorithms",
        "ai_allowed": False,
        "examples": [".env", "src/core/pricing-engine/"]
    }
}

Most teams send everything to their AI tool indiscriminately. A .gitignore keeps secrets out of your repo. What's the equivalent for keeping sensitive code out of your AI tool's context?
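One answer is to enforce the classification mechanically: a pre-send check that maps each file path to a sensitivity level and refuses anything restricted. A sketch, flattening the levels above into prefix rules (prefix matching is a simplification; a real implementation would use gitignore-style globs):

```python
# Pre-send check: may the AI tool read this file?
# Rules flatten the sensitivity levels above; longest prefix wins.
RULES = [
    (".env", False),                       # restricted
    ("src/core/pricing-engine/", False),   # restricted
    ("src/auth/", "with_approval"),        # sensitive
    ("src/payments/", "with_approval"),    # sensitive
    ("docs/", True),                       # public
    ("src/", True),                        # internal
]

def ai_allowed(path: str):
    """Return True, False, or 'with_approval' for a repo-relative path."""
    matches = [(prefix, allowed) for prefix, allowed in RULES
               if path.startswith(prefix)]
    if not matches:
        return False  # default-deny anything unclassified
    # the most specific (longest) prefix wins
    return max(matches, key=lambda m: len(m[0]))[1]
```

Note the default-deny: an unclassified path is treated as restricted, which is the safe failure mode when someone adds a directory and forgets the rules.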

4. Model Provenance Verification

After the Cursor incident, I now verify what model is actually running:

# If your tool uses an OpenAI-compatible API, you can often
# inspect the model field in responses

# For tools with debug/verbose modes (flag names vary by tool):
CURSOR_DEBUG=1 cursor .  # Check if model IDs leak in debug output

# For Claude Code, check the --verbose flag
claude --verbose  # Watch which model and version is invoked

# For any tool, watch its DNS queries to see which inference
# endpoints it contacts (port 53, not 443 -- TLS payloads on
# 443 are encrypted, so grepping them is useless)
sudo tcpdump -i any port 53 -nn | grep -iE "api|inference|model"

If a tool won't tell you what model is processing your code, that's a red flag. Not a deal-breaker (maybe they have competitive reasons), but a factor in your risk assessment.
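For API-based workflows, the cheapest provenance check is comparing the model you requested against the model the response says served it. OpenAI-compatible chat-completion responses echo a model field; a sketch:

```python
# Compare the requested model ID against the one echoed in an
# OpenAI-compatible response. A mismatch doesn't prove deception
# (aliases resolve to dated snapshots), but it's worth logging.
def check_model_provenance(requested: str, response: dict) -> tuple[bool, str]:
    served = response.get("model", "<missing>")
    # An alias like "gpt-4o" may legitimately resolve to a dated
    # snapshot like "gpt-4o-2024-08-06", so accept suffix extensions.
    ok = served == requested or served.startswith(requested + "-")
    return ok, served
```

Had Cursor's Composer 2 been wired through a check like this, requesting "composer-2" and getting back kimi-k2p5-rl-0317-s515-fast would have flagged on the very first call.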


What the Industry Should Build (But Hasn't)

AI-BOM Standard

Every AI tool that processes code should publish a machine-readable bill of materials:

{
  "tool": "cursor",
  "version": "2.4.1",
  "models": [
    {
      "name": "composer-2",
      "base_model": "kimi-k2.5",
      "base_model_provider": "moonshot-ai",
      "fine_tuning": "reinforcement-learning",
      "inference_provider": "fireworks-ai",
      "data_residency": "us-west-2"
    }
  ],
  "data_retention": "none",
  "training_opt_out": true,
  "last_updated": "2026-03-19"
}

This doesn't exist yet. But after the Cursor incident, the PANews analysis and several security researchers are calling for exactly this. Given that SBOMs took a decade to become standard, I'm not holding my breath, but the demand is building.
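If the format ever materializes, validating it is trivial. A sketch of a checker for the fields the example above uses (the schema itself is hypothetical, so treat the field names as assumptions):

```python
# Validate a hypothetical AI-BOM document against the fields used
# in the example above. This schema is an assumption, not a standard.
REQUIRED_TOP = {"tool", "version", "models", "data_retention", "training_opt_out"}
REQUIRED_MODEL = {"name", "base_model", "base_model_provider", "inference_provider"}

def validate_ai_bom(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the doc passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_TOP - doc.keys())]
    for i, model in enumerate(doc.get("models", [])):
        problems += [f"models[{i}] missing: {f}"
                     for f in sorted(REQUIRED_MODEL - model.keys())]
    return problems
```

The point isn't the twelve lines of code; it's that a vendor publishing this file could be held to it automatically, the way license scanners hold packages to their declared licenses.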

Model Transparency Policies

Cursor's Aman Sanger said they'll "fix that for the next model." But the fix shouldn't be a voluntary disclosure. It should be a standard expectation:

  • Disclose the base model and its provenance
  • Disclose the inference provider
  • Publish data retention and training policies in a standardized, machine-readable format
  • Notify users when models change (not just when someone intercepts an API call)

Boundary Enforcement in AI Tools

Your AI tool should let you configure:

  • Which directories it can read and send to the model
  • Which files are excluded from AI context (a .aiignore, equivalent to .gitignore)
  • Whether sensitive patterns (API keys, connection strings, PII) are redacted before sending
  • Maximum context window size (to control how much code leaves your machine per request)

Some tools are starting to do parts of this. Claude Code has permission modes. Cursor has .cursorignore. But the implementations are inconsistent, incomplete, and often opt-in rather than opt-out.
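The redaction point is the easiest to prototype yourself today: scrub obvious secret patterns before anything leaves the machine. A sketch with two illustrative patterns only; real secret scanners like gitleaks ship hundreds of rules plus entropy checks, so treat this as a shape, not a solution:

```python
# Redact obvious secret patterns before code is sent to a model.
# Two illustrative regexes; real scanners use hundreds of rules.
import re

PATTERNS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # Generic "name = value" assignments for common secret names
    re.compile(r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[=:]\s*\S+"),
]

def redact(text: str) -> str:
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Wired in as a pre-send hook, this runs on every payload, so the protection doesn't depend on a developer remembering which files are sensitive.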


My Setup Now

After the Cursor and Claude Code incidents, here's what changed in my workflow:

I proxy AI tool traffic weekly. A 30-minute session with mitmproxy, checking endpoints, model IDs, and payload sizes. It's the same discipline as reviewing your cloud spend: you look at it regularly because surprises are expensive.

I maintain an AI tool inventory. Every tool, every model, every policy, checked quarterly. Treat it like your dependency audit.

Sensitive code is excluded by default. Auth modules, payment logic, and cryptographic implementations have a .aiignore rule. If I need AI help in those areas, I copy sanitized snippets manually.

I pin model versions when possible. For API-based workflows (not IDE plugins), I specify exact model versions in my config. When the vendor updates, I test before upgrading, just like any other dependency.

I read the policy updates. GitHub's April 24 training data change. Anthropic's data retention updates. Cursor's model swap. These get buried in blog posts. I have RSS feeds for every vendor's changelog.

Is this paranoid? Maybe. Is it more paranoid than running npm audit on your dependencies? No. It's the same discipline, applied to a new category of supply chain risk.


The Takeaway

A $29 billion company shipped a model built on a Chinese open-source foundation and forgot to mention it. Another company accidentally published their tool's entire source code via npm. GitHub is quietly changing its training data policy. And every day, millions of developers send their proprietary codebases to AI models without asking basic questions about where that code goes, what processes it, and who keeps it.

We learned the hard way with Log4Shell that software supply chains need active management. We learned with xz-utils that even trusted open-source maintainers can be compromised. The AI tool supply chain is the next version of this lesson, and we're still in the "trusting everything by default" phase.

Your AI coding tool is the newest, most powerful, most trusted, and least audited dependency in your entire stack.

Maybe start auditing it.
