Hermetic Dev

Posted on • Originally published at hermeticsys.com

GitHub Copilot Will Train on Your Code Context. Here's What That Means for Your API Keys.

April 2026 · The Hermetic Project


On March 25, GitHub announced that starting April 24, Copilot Free, Pro, and Pro+ users' interaction data will be used to train AI models. The data includes inputs sent to Copilot, code snippets shown to the model, code context surrounding your cursor position, file names, repository structure, and navigation patterns. Users can opt out. Business and Enterprise tiers are excluded.

This is a reasonable decision by GitHub. Real-world interaction data produces better models. The opt-out exists. The post is transparent about what's collected.

But it has a consequence that the announcement doesn't address: if your credentials are in your code context, they're in the training pipeline.

The problem isn't GitHub. The problem is where credentials live.

Most developers store API keys in one of four places: .env files in the project root, IDE configuration files (claude_desktop_config.json, .cursor/mcp.json), environment variables visible in the terminal, or hardcoded in source files during development.

All four are in the blast radius of "code context surrounding your cursor position."

When Copilot is active, it reads files in your workspace to provide relevant suggestions. If your working directory contains a .env file with STRIPE_SECRET_KEY=sk_live_..., that string is part of the context window. If your claude_desktop_config.json has a cleartext GitHub token in the env block, Copilot sees it when you're editing MCP server configurations. If you echo $API_KEY in your terminal, Copilot's terminal integration captures that context.
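To see how wide that blast radius is, here is a minimal sketch of a workspace scan for credential-shaped strings. The pattern names and regexes are illustrative assumptions (real secret scanners ship hundreds of rules); the point is that anything this ten-line script can find, an agent reading your workspace can also find.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only; production scanners
# (gitleaks, trufflehog, etc.) use far larger rule sets.
SECRET_PATTERNS = {
    "stripe_live_key": re.compile(r"sk_live_[A-Za-z0-9]{8,}"),
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{8,}"),
}

def scan_workspace(root: str) -> list[tuple[str, str]]:
    """Return (file, pattern name) pairs for every readable file in the
    workspace that contains a credential-shaped string."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                hits.append((str(path), name))
    return hits
```

Run this against your own project root before enabling any agent; every hit is a string that can end up in a context window.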

GitHub's post states: "We use the phrase 'at rest' deliberately because Copilot does process code from private repositories when you are actively using Copilot. This interaction data is required to run the service and could be used for model training unless you opt out."

The credentials aren't being targeted. They're collateral. They exist in files that Copilot legitimately needs to read to do its job. The training pipeline inherits whatever is in those files.

Opt-out doesn't fix the architecture

You can opt out of model training in GitHub settings. Your interaction data won't be used for training. Problem solved?

Not quite. The opt-out controls whether your data is used for training. It doesn't change what Copilot processes during active use. The context window still contains your credentials. They still flow to GitHub's servers for inference. The opt-out prevents them from entering the training pipeline, but they've already left your machine.

This isn't a GitHub-specific concern. Every AI coding agent — Claude Code, Cursor, Windsurf, Cline, Copilot — processes the files in your workspace. Every one of them sends code context to a remote inference endpoint. If your credentials are in that context, they travel with it.

The question isn't whether to opt out of training. The question is whether your credentials should be in the agent's context at all.

The architectural solution: credentials that never enter context

There's a different approach. Instead of storing credentials in files that agents read, store them in an encrypted vault that the agent cannot access. When the agent needs to make an authenticated API call, it sends the request to a local daemon with an opaque reference — not the credential itself. The daemon injects the real credential, makes the HTTPS call, and returns only the response. The agent never sees, holds, or transmits the key.
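The split can be sketched in a few lines. Everything here is a simplified assumption for illustration: the in-memory `VAULT` dict, the function names, and the example token are all hypothetical (Hermetic's real daemon keeps credentials encrypted at rest and runs as a separate process). What matters is the boundary: the agent-side payload carries only an opaque reference, and the real token exists only inside the daemon function.

```python
import urllib.request

# Hypothetical in-memory vault, for illustration only. The real daemon
# stores credentials encrypted and never exposes this mapping.
VAULT = {"cred://github": "ghp_example_not_a_real_token"}

def agent_payload(method: str, url: str, cred_ref: str) -> dict:
    """What the agent sends to the local daemon: an opaque reference,
    never the credential itself. This dict is all the agent ever holds."""
    return {"method": method, "url": url, "credential_ref": cred_ref}

def daemon_inject(payload: dict) -> urllib.request.Request:
    """Inside the daemon: resolve the reference and attach the real token.
    The daemon sends this request and returns only the response body."""
    token = VAULT[payload["credential_ref"]]
    req = urllib.request.Request(payload["url"], method=payload["method"])
    req.add_header("Authorization", f"Bearer {token}")
    return req
```

Because `agent_payload` is the only thing serialized into the agent's context, there is no token for an inference server, training pipeline, or prompt injection to extract.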

The credential doesn't appear in .env files. It doesn't appear in IDE configs. It doesn't appear in environment variables. It doesn't appear in the terminal. It doesn't appear anywhere that an AI agent's context window can reach.

No credential in context means nothing for Copilot to process. Nothing to send to inference servers. Nothing to enter a training pipeline. The opt-out becomes irrelevant because there's nothing to opt out of.

This isn't a hypothetical architecture. We built it.

What we built

Hermetic is a local daemon that brokers credentials for AI agents. The cryptographic core is open source (AGPL-3.0). It runs on your machine. Zero cloud. Zero telemetry.

Instead of this in your IDE config:

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "ghp_REAL_TOKEN_IN_CLEARTEXT" }
    }
  }
}

You have this:

{
  "mcpServers": {
    "github": {
      "command": "hermetic",
      "args": ["proxy", "--server", "github", "--",
               "npx", "-y", "@modelcontextprotocol/server-github"]
    }
  }
}

No env block. No cleartext token on disk. No credential in any file that any agent can read.

The daemon handles authenticated API calls, MCP server credential injection, and CLI tool authentication — three tiers covering every way an AI agent needs credentials. The agent works normally. It just never touches the keys.

The broader point

GitHub's policy change is a symptom, not the disease. The disease is that credentials live in files that AI agents read. As long as that's true, every AI agent is a credential exfiltration surface — whether through training pipelines, prompt injection, supply chain attacks, or simple log aggregation.

The Axios npm supply chain attack in March 2026 harvested credentials from developer machines by reading environment variables and .env files. It didn't need AI. It just read the files. AI agents make the same files accessible to a larger attack surface: remote inference servers, training pipelines, and any prompt injection that can redirect agent behavior.
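To make that concrete, here is a sketch of how little code that class of attack needs. This is an illustrative reconstruction, not the actual Axios payload; the filter keywords and function name are assumptions. Anything that runs on your machine with your permissions can do this, and credentials brokered out of files and environment variables simply are not there to harvest.

```python
import os
from pathlib import Path

def harvest(workspace: str) -> dict:
    """Illustrative sketch of what a malicious install script can do:
    no AI involved, just reading the environment and .env files."""
    # Grab environment variables whose names look credential-shaped.
    loot = {k: v for k, v in os.environ.items()
            if "KEY" in k or "TOKEN" in k or "SECRET" in k}
    # Grab every .env file reachable from the workspace root.
    for env_file in Path(workspace).rglob(".env"):
        loot[str(env_file)] = env_file.read_text(errors="ignore")
    return loot
```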

The fix isn't better opt-out controls. The fix is removing credentials from the agent's reach entirely.

Hermetic is open source. The code, the threat model, and the security research behind it are at github.com/hermetic-sys/hermetic.


The Hermetic Project · hermeticsys.com · AGPL-3.0-or-later
