Maria Khan

Posted on May 31

Secret scanning for the agent era: verify the leak, then fix it

#security #rust #ai #devtool

The pull request looked clean. The tests were green, the diff was small, and the commit message was tidy. Three lines deep in a config helper there was a real AWS access key that the coding agent had pasted in to make an example run. Nobody read those three lines, because nobody reads every line an agent writes anymore.

When a person writes a secret into a file, it usually passes under someone's eyes before it reaches a commit. When an agent writes one, it often does not. The reviewer is either the same agent that wrote the code or a human skimming a 600-line diff. A regex scanner can match the shape of a key, but it cannot tell you whether that key is live, and it cannot fix it for you. So the leak ships, someone force-pushes over it, it sits in the history, and you find out when the provider sends you an email.

leakferret is a secret scanner written for that situation. It is a single Rust binary that runs as a CLI in CI and as an MCP server that an agent can call on its own work before it commits. It checks findings against the real provider, it keeps the raw secret on your machine, and it can apply the fix by replacing the hardcoded value with an environment variable lookup. This post explains how each of those parts works and how to wire it into both a CI pipeline and an agent loop. It is open source under MIT and free to use.

1. The new threat model: who writes secrets now

Secret scanning was built around a person at the keyboard. A developer copies a key into a file, a pre-commit hook might warn them, and a CI job might flag it later. The whole pipeline assumes that a human made a mistake and that the tool needs to catch it on the way out.

That assumption is starting to break. A large and growing share of the code that lands in your repository is now written by an agent such as Claude Code, Cursor, Copilot, or an automated loop running in CI. Agents hardcode secrets for the same plain reasons people do. They want a snippet to run, they want to unblock a test, or the key was already sitting in the context window. The difference is that the review step is often gone. The program that wrote the code is frequently the same program that reviews it, and the human in the loop is skimming.

This is why shape detection was only ever half the job. A regex match tells you that a string looks like an AWS key. It does not tell you the two things you need to know before you decide whether to worry. The first is whether the key is live, because a dead key, a rotated key, or a documentation example is not an incident, while a live key is a real problem. The second is whether you can fix it, because flagging a leak and stopping there leaves a person to do the slow and error-prone part by hand: pulling the literal out, wiring up an environment variable, updating .env.example, and seeding the secret manager.

leakferret is built to close that gap. It tells live keys apart from dead ones and examples, and then it applies the fix.

2. The five-stage pipeline

A scan runs through five stages. Each stage only sees what it needs, and the raw secret never moves past the file on disk.

Scan

The first stage is a fast regex pre-filter over your files, which is the same kind of detection that gitleaks does. It respects .gitignore and reads dotfiles such as .env. With the --git flag it walks commit history, and it does that as a diff, so it reports the commit that introduced a secret rather than every later commit that happens to touch the same line.

leakferret scan .

This stage is cheap and a little noisy on purpose. Its job is to surface candidates, and the later stages decide which ones matter.
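To make the pre-filter idea concrete, here is a minimal Python sketch of shape-based candidate detection. It is not leakferret's rule set. The two patterns are the publicly known shapes of an AWS access key ID and a classic GitHub personal access token, and the function name find_candidates is my own for illustration.

```python
import re

# Illustrative patterns only; a real rule set is far larger.
CANDIDATE_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def find_candidates(text):
    """Return (line_number, rule_name, matched_text) for every regex hit."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in CANDIDATE_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((line_number, rule_name, match.group(0)))
    return hits
```

A matcher like this knows nothing about whether a hit is live or documented, which is why the later stages exist.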

Catalog

Before any candidate gets escalated, it is checked against a signed database of known public example credentials. That database includes Stripe test keys, AKIAIOSFODNN7EXAMPLE, jwt.io sample tokens, and similar values. Anything that matches gets a fixed FIXTURE verdict.

This step is what separates a tool you keep from a tool you turn off. AKIAIOSFODNN7EXAMPLE shows up in countless READMEs and in AWS's own documentation, so paging someone about it only teaches them to ignore the scanner. The catalog is signed with Ed25519 and ships with the binary, and it can be refreshed, so documented examples do not raise a false alarm.
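The catalog lookup itself can be pictured as a plain dictionary check. The sketch below assumes an in-memory table and skips the Ed25519 signature check entirely, because Python's standard library has no Ed25519 support. The names KNOWN_EXAMPLES and catalog_verdict are hypothetical.

```python
# Hypothetical in-memory catalog; the real one is a signed database.
KNOWN_EXAMPLES = {
    "AKIAIOSFODNN7EXAMPLE": "AWS documentation example access key ID",
}


def catalog_verdict(candidate):
    """Return ("FIXTURE", reason) for a known example, else None."""
    reason = KNOWN_EXAMPLES.get(candidate)
    if reason is None:
        return None  # not a known example, so later stages decide
    return ("FIXTURE", reason)
```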

Classify

Every candidate that survives gets one of three verdicts. REAL means it looks like a genuine credential that is in use. FIXTURE means it is a known or documented example. UNKNOWN means the tool cannot tell while it is offline.

Classification runs offline by default, using path heuristics and dummy value markers. Inside an editor or an agent it can instead ask the model you already have, which means there is no extra API key and no added cost, and it only ever sends the redacted preview rather than the secret itself.

The rule of thumb here is that a false positive is better than a false negative. When the tool is not sure, it reports the candidate as UNKNOWN instead of dropping it. You can always filter results, but you cannot un-leak a key you never saw.
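As a rough illustration of offline classification, the sketch below applies a path heuristic and a dummy-value check, and falls back to UNKNOWN when neither fires. The directory names and marker strings are my assumptions, not leakferret's actual lists, and this simplified version never returns REAL on its own.

```python
from pathlib import PurePosixPath

# Hypothetical markers; the real heuristics are not listed in this post.
FIXTURE_DIRS = {"test", "tests", "fixtures", "examples", "docs"}
DUMMY_MARKERS = ("example", "dummy", "changeme", "xxxx", "placeholder")


def classify_offline(path, value):
    """Return FIXTURE when evidence points to a sample, else UNKNOWN."""
    parts = {part.lower() for part in PurePosixPath(path).parts}
    if parts & FIXTURE_DIRS:
        return "FIXTURE"
    if any(marker in value.lower() for marker in DUMMY_MARKERS):
        return "FIXTURE"
    # No offline evidence either way: keep the candidate visible.
    return "UNKNOWN"
```

The last line is the rule of thumb in code form: when nothing is known, the candidate is reported rather than dropped.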

Verify

This is the stage that regex scanners do not have. leakferret makes one harmless, read-only API call to the actual provider to confirm that a key is live.

The read-only part matters. Each verifier uses an identity or read endpoint and never a call that changes anything. For AWS it is a SigV4-signed STS GetCallerIdentity request, which returns who the key belongs to and changes nothing. For GitHub it reads the authenticated user. The other providers work the same way.

About 15 providers are covered directly, including AWS, GitHub, GitLab, Stripe, OpenAI, Anthropic, Slack, Twilio, SendGrid, Mailgun, Datadog, Heroku, npm, PyPI, and DigitalOcean. Anything outside that list falls back to trufflehog. The request goes straight from your machine to the provider, because leakferret has no servers of its own.

leakferret verify .

src/aws.py:14   aws_secret_access_key   REAL · LIVE
  wJal…EKEY   verified against AWS STS (HTTP 200)

.env:3          stripe_secret_key       FIXTURE
  sk_test…p7dc   documented Stripe test key (catalog)

README.md:88    github_pat              UNKNOWN
  ghp_…s1Az   could not verify (offline)

3 findings · 1 live · 1 fixture · 1 unknown
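For a sense of what a read-only check looks like, here is a minimal Python sketch for the GitHub case, using GitHub's documented GET /user endpoint. It is not leakferret's verifier. The function names, the status labels, and the User-Agent string are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def build_github_check(token):
    """Build a GET request for GitHub's authenticated-user endpoint."""
    return urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "User-Agent": "secret-verify-sketch",
        },
        method="GET",
    )


def interpret_status(status):
    """Map an HTTP status to a liveness label (labels are assumptions)."""
    if status == 200:
        return "LIVE"
    if status == 401:
        return "NOT LIVE"
    return "UNKNOWN"


def check_github_token(token, timeout=10):
    """Send the read-only request and return a liveness label."""
    try:
        request = build_github_check(token)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:
        return interpret_status(err.code)
    except urllib.error.URLError:
        return "UNKNOWN"  # offline or unreachable
```

The request only reads the identity behind the token, and it goes directly from the caller's machine to the provider.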

Rewrite

Finding a leak is only half of the work. The rewrite stage does the other half. It replaces the hardcoded value with an environment variable lookup in the right form for the language, which is os.environ in Python, process.env in JavaScript and TypeScript, and ENV.fetch in Ruby. It then adds a line to .env.example and prints seed commands for your secret manager.

leakferret rewrite src/aws.py --dry-run-diff   # preview, change nothing
leakferret rewrite src/aws.py --apply          # write the change in place

src/aws.py:14
- AWS_SECRET_ACCESS_KEY = "wJal…EKEY"
+ AWS_SECRET_ACCESS_KEY = os.environ["AWS_SECRET_ACCESS_KEY"]

+ .env.example   AWS_SECRET_ACCESS_KEY=
✓ rewrote 1 literal · added 1 .env.example entry

The --backend option accepts env, which is the default, along with vault, doppler, aws-secrets-manager, and infisical, and it prints the matching seed command. By default it only rewrites findings that were confirmed live. If you add --include-unknown, it will also fix candidates that were not confirmed.
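The Python case of the rewrite can be approximated with a single regex substitution. This is a simplified sketch and not how leakferret does it: it only handles a one-line NAME = "literal" assignment, and the function name rewrite_python_literal is hypothetical.

```python
import re


def rewrite_python_literal(source, name):
    """Replace `NAME = "literal"` with an os.environ lookup for NAME."""
    pattern = re.compile(
        rf"^([ \t]*){re.escape(name)}[ \t]*=[ \t]*(['\"]).*?\2[ \t]*$",
        flags=re.MULTILINE,
    )
    replacement = rf'\g<1>{name} = os.environ["{name}"]'
    new_source, count = pattern.subn(replacement, source)
    # Add the import the lookup needs if the file does not have it yet.
    if count and not re.search(r"^import os\b", new_source, flags=re.MULTILINE):
        new_source = "import os\n" + new_source
    return new_source, count
```

Returning the substitution count lets a caller print a dry-run summary before writing anything to disk.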

3. The privacy invariant

A tool that reads all of your secrets and calls external APIs is asking for a lot of trust. leakferret answers that with a single rule that you can test. The full secret value is only ever read from disk. It is never written into a report, a log, a network message, or a model prompt.

Only a redacted preview of the first four and last four characters, such as AKIA...4XYZ, ever leaves the file. The classification prompt sees that preview. The JSON, SARIF, and terminal output see that preview. The MCP responses see that preview.

The one time the full value is used is during verification, and that request goes from your machine straight to the provider. Your computer signs a request to AWS or GitHub directly. There is no leakferret server, no proxy, no telemetry, and no account, so nothing is collected and there is nowhere for it to go.

Because a promise is not the same as a guarantee, there is a dedicated test that fails the build if a raw secret ever reaches an output. The rule is enforced in CI rather than only described in the README.
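The invariant is simple enough to sketch in a few lines of Python. The code below is an illustration of the idea rather than leakferret's actual test: redact builds the first-four and last-four preview, and check_no_raw_secret is a hypothetical guard that a test suite could run over every output it produces. Masking short values completely is my own assumption.

```python
def redact(secret, keep=4):
    """Return a preview with only the first and last `keep` characters."""
    if len(secret) <= 2 * keep:
        # Too short to preview safely (an assumption of this sketch).
        return "*" * len(secret)
    return f"{secret[:keep]}...{secret[-keep:]}"


def check_no_raw_secret(output, secrets):
    """Raise ValueError if any full secret value appears in output."""
    for secret in secrets:
        if secret in output:
            raise ValueError("raw secret leaked into output")
```

Running a guard like this against reports, logs, and prompts in CI is what turns the privacy rule from a promise into something that can fail a build.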

4. Wiring it into CI

leakferret is a single binary with clear exit codes, where 0 means clean and 1 means findings, so it fits into any pipeline. On GitHub, the official Action installs the binary, runs the scan, and uploads SARIF to Code Scanning so that findings appear inline on the pull request.

# .github/workflows/leakferret.yml
name: leakferret
on: [pull_request, push]

permissions:
  contents: read
  security-events: write   # required for SARIF upload

jobs:
  secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: leakferrethq/leakferret-action@v1
        with:
          verify-mode: only-verified
          fail-on: any

On CircleCI, GitLab CI, Argo, or Jenkins the steps are the same. You install the binary and run leakferret verify on the repository.

Two settings help keep CI quiet.

The first is the baseline. Running leakferret baseline init stores one-way HMAC fingerprints of the current findings, never the raw secret, and the per-repository salt is added to .gitignore automatically. Once you commit .leakferret-baseline.json, CI only fails on new leaks, so you do not have to clear the entire backlog before you turn the check on.

The second is the verification mode. With --only-verified, the build fails only on keys that were confirmed live. With ever-verified and a baseline, it fails on anything that ever verified live, which catches a key that was rotated but is still present in the history.
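The baseline idea can be sketched with Python's standard hmac module. The choice of HMAC-SHA256 and the function names below are assumptions, since the baseline file format is not described here.

```python
import hashlib
import hmac


def fingerprint(secret, salt):
    """One-way HMAC-SHA256 fingerprint of a secret under a per-repo salt."""
    return hmac.new(salt, secret.encode("utf-8"), hashlib.sha256).hexdigest()


def new_leaks(current_secrets, baseline_fingerprints, salt):
    """Return secrets whose fingerprint is absent from the baseline."""
    return [
        secret
        for secret in current_secrets
        if fingerprint(secret, salt) not in baseline_fingerprints
    ]
```

Because the salt stays out of the repository, the committed fingerprints cannot be checked against guessed secrets by someone who only has the baseline file.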

For history specifically, scan --git walks commits as a diff, so you can review what was introduced without wading through every file that later touched the same line.
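The difference between scanning diffs and scanning snapshots is easiest to see in code. In this hypothetical sketch, history is an oldest-first list of commits paired with only the lines each commit added, so a secret is attributed to the commit that introduced it and not to later commits that leave it in place.

```python
def introducing_commits(history, secret):
    """Return commits whose added lines contain the secret.

    `history` is an oldest-first list of (commit_id, added_lines) pairs,
    where added_lines holds only the "+" lines from that commit's diff.
    """
    return [
        commit_id
        for commit_id, added_lines in history
        if any(secret in line for line in added_lines)
    ]
```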

5. Wiring it into an agent with MCP

This is the part that the rest of the project was built for. The same binary is also an MCP server, so a coding agent can run the scanner on its own work before it commits.

npx @leakferret/mcp

Add it to your agent's MCP configuration, which is .mcp.json in Claude Code, the MCP settings in Cursor, or claude_desktop_config.json in Claude Desktop.

{
  "mcpServers": {
    "leakferret": {
      "command": "npx",
      "args": ["@leakferret/mcp"]
    }
  }
}

It exposes five tools.

Tool                  What it does
scan_repository       Walk a path and return regex pre-filter candidates
classify_candidates   Apply REAL, FIXTURE, or UNKNOWN verdicts
verify_finding        Run a live, read-only check against the provider
propose_rewrite       Propose an environment variable replacement for a real finding
baseline_diff         Diff a scan against the repository baseline

The loop you want is simple to describe. The agent scans its own diff, verifies which findings are live, rewrites the leak, and only then commits. Agents will not do this on their own, so you give them a rule. In Cursor, a single workspace rule is enough.

Before you run git commit, call the leakferret scan_repository tool.
If any finding is REAL or UNKNOWN, stop and fix it with propose_rewrite
before committing.

With that rule in place, the program that writes your code checks its own output before it lands, which puts the review step back where it used to be.
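The decision the rule asks the agent to make reduces to a small gate. The sketch below assumes each finding is a dictionary with a verdict key, which is a hypothetical shape and not the documented MCP response format.

```python
BLOCKING_VERDICTS = {"REAL", "UNKNOWN"}


def blocking_findings(findings):
    """Return the findings that should stop a commit under the rule."""
    return [f for f in findings if f["verdict"] in BLOCKING_VERDICTS]


def may_commit(findings):
    """A commit may proceed only when nothing is REAL or UNKNOWN."""
    return not blocking_findings(findings)
```

Treating UNKNOWN as blocking follows the same bias as the classifier: a false positive costs a moment of attention, while a false negative costs a leaked key.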

One thing to expect is that running leakferret mcp in a normal terminal looks like it has frozen. That is correct behavior. It is a stdio JSON-RPC server waiting for an editor to connect, and it is not a command you run by hand.

6. Honest limits at v0.1

This is an early release, so I will be upfront about the rough edges.

There are about 15 native verifiers. The major providers are covered directly, and everything else falls back to trufflehog. Adding more native verifiers is the main item on the roadmap.

There are five prebuilt targets, which are x86_64 Linux, x86_64 and arm64 macOS, and x86_64 and arm64 Windows. Linux ARM64 is not published yet, so on that platform you build from source and point LEAKFERRET_BIN at the binary.

There is no hosted product. There are no servers by design, and the CLI, the engine, the MCP server, and every language wrapper are MIT licensed and free. The fixture catalog data is licensed under CC-BY-SA-4.0.

If you find a provider that should verify but does not, that is the most useful issue you can file.

7. Try it

cargo install leakferret-cli
npm i -g @leakferret/cli
gem install leakferret
go install github.com/leakferrethq/leakferret-go/cmd/leakferret@latest

leakferret scan .
leakferret verify .
leakferret rewrite . --dry-run-diff

It is also available as a GitHub Action and as a VS Code and Cursor extension.

The repository is at https://github.com/leakferrethq/leakferret and the site is at https://leakferret.com.

If you run it on a real repository and it catches something, or misses something, I would like to hear about it, because that is what will shape the next version. You can file an issue or tell me which provider you want verified next.

DEV Community