DEV Community

Natnael Getenew

Stop Your AI Agent from Leaking API Keys, Private Keys, and PII

Your AI agent generates text. That text sometimes contains secrets.

Maybe the LLM hallucinated an AWS key from its training data. Maybe a tool returned database credentials in its output. Maybe the agent is summarizing a document that contains a user's SSN, email, or crypto wallet private key.

If that output reaches the end user — or worse, gets logged to a third-party service — you have a data breach.

This post covers how to automatically strip sensitive data from any text before it leaves your system, using the redact() function from the Agntor SDK. It ships with 17 built-in patterns covering PII, cloud secrets, and blockchain-specific keys.

Install

npm install @agntor/sdk

Basic Usage

import { redact } from "@agntor/sdk";

const input = `
  Here are the credentials:
  AWS Key: AKIA1234567890ABCDEF
  Email: admin@internal-corp.com
  Server: 192.168.1.100
`;

const { redacted, findings } = redact(input, {});

console.log(redacted);
// Here are the credentials:
//   AWS Key: [AWS_KEY]
//   Email: [EMAIL]
//   Server: [IP_ADDRESS]

console.log(findings);
// [
//   { type: "aws_access_key", span: [42, 62] },
//   { type: "email", span: [72, 95] },
//   { type: "ipv4", span: [106, 119] }
// ]

Zero configuration. The empty policy {} uses all 17 built-in patterns.

What Gets Caught

Standard PII

| Type | Example | Replaced With |
|---|---|---|
| Email | user@example.com | [EMAIL] |
| Phone (US) | +1 (555) 123-4567 | [PHONE] |
| SSN | 123-45-6789 | [SSN] |
| Credit card | 4111 1111 1111 1111 | [CREDIT_CARD] |
| Street address | 123 Main Street | [ADDRESS] |
| IPv4 | 192.168.1.1 | [IP_ADDRESS] |

Cloud Secrets

| Type | Example | Replaced With |
|---|---|---|
| AWS access key | AKIA1234567890ABCDEF | [AWS_KEY] |
| Bearer token | Bearer eyJhbGciOiJI... | Bearer [REDACTED] |
| API key/secret assignments | api_key: "sk-abc123..." | api_key: [REDACTED] |

The API key pattern matches the names api_key, secret, password, and token when they are followed by : or = and a value of at least 20 characters. The key name is preserved in the output, so you can tell which secret was redacted.
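
To make that behavior concrete, here is a minimal sketch of an assignment-style pattern. The regex and the helper name redactAssignments are assumptions written for illustration, not the SDK's actual implementation, so the real pattern may differ in what it accepts.

```typescript
// Hypothetical stand-in for the SDK's assignment pattern (an assumption, not its real regex).
// Group 1 captures the key name plus the ":" or "=" separator so it can be kept in the output.
// The value is an optional quote followed by 20 or more characters that are not whitespace or quotes.
const ASSIGNMENT =
  /\b((?:api_key|secret|password|token)\s*[:=]\s*)["']?[^\s"']{20,}["']?/gi;

function redactAssignments(text: string): string {
  // "$1" re-inserts the captured key name and separator; only the value is replaced.
  return text.replace(ASSIGNMENT, "$1[REDACTED]");
}

const line = redactAssignments('api_key: "sk-abc123def456ghi789jkl"');
// line === "api_key: [REDACTED]"
```

Values shorter than 20 characters, such as password=hunter2, pass through this sketch untouched, which is the price of keeping false positives down.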

Blockchain / Crypto Keys

This is where Agntor's redaction stands out. If your agents operate in the crypto space, these patterns are critical:

| Type | Example | Replaced With |
|---|---|---|
| EVM private key | 0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 | [PRIVATE_KEY] |
| Solana private key | 87-88 char base58 string | [SOLANA_PRIVATE_KEY] |
| Bitcoin WIF key | Starts with 5, K, or L + 50-51 base58 chars | [BTC_PRIVATE_KEY] |
| BIP-39 mnemonic (12 words) | abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about | [MNEMONIC_12] |
| BIP-39 mnemonic (24 words) | 24-word seed phrase | [MNEMONIC_24] |
| Keystore JSON ciphertext | "ciphertext": "a1b2c3..." | "ciphertext": "[REDACTED_KEYSTORE]" |
| HD derivation path | m/44'/60'/0'/0/0 | [HD_PATH] |

Real Example: Crypto Agent Output

import { redact } from "@agntor/sdk";

const agentOutput = `
  I've set up your wallet. Here are the details:
  Address: 0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18
  Private Key: 0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80
  Recovery Phrase: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about
  Derivation Path: m/44'/60'/0'/0/0
`;

const { redacted } = redact(agentOutput, {});

console.log(redacted);
// I've set up your wallet. Here are the details:
//   Address: 0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18
//   Private Key: [PRIVATE_KEY]
//   Recovery Phrase: [MNEMONIC_12]
//   Derivation Path: [HD_PATH]

Notice that the public wallet address (0x plus 40 hex digits, 42 characters in total) is not redacted. Only the private key (0x plus 64 hex digits) is, because the regex specifically matches 64 hex digits, which is the length of an EVM private key.
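
The length distinction is easy to check in isolation. The sketch below uses an assumed stand-in regex, not the SDK's own, and it requires the 0x prefix, which may be stricter than what the SDK does. The names EVM_PRIVATE_KEY and redactEvmKeys are hypothetical.

```typescript
// Assumed stand-in for an EVM private key pattern: "0x" plus exactly 64 hex digits.
// The trailing lookahead stops the pattern from matching the first 64 digits of a longer hex run.
const EVM_PRIVATE_KEY = /\b0x[0-9a-fA-F]{64}(?![0-9a-fA-F])/g;

function redactEvmKeys(text: string): string {
  return text.replace(EVM_PRIVATE_KEY, "[PRIVATE_KEY]");
}

// 40 hex digits after "0x": too short to match, so it is left alone.
const address = "0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18";
// 64 hex digits after "0x": matches and is replaced.
const privateKey = "0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80";

const keptAddress = redactEvmKeys(`Address: ${address}`); // unchanged
const strippedKey = redactEvmKeys(`Private Key: ${privateKey}`); // "Private Key: [PRIVATE_KEY]"
```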

Custom Patterns

Add your own patterns for domain-specific secrets:

import { redact } from "@agntor/sdk";

// agentOutput is the string from the previous example
const { redacted } = redact(agentOutput, {
  redactionPatterns: [
    {
      type: "internal_endpoint",
      regex: /https?:\/\/internal\.[a-z]+\.corp\/[^\s]*/gi,
      replacement: "[INTERNAL_URL]",
    },
    {
      type: "jwt_token",
      regex: /eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
      replacement: "[JWT]",
    },
  ],
});

Custom patterns are merged with the defaults. You keep all 17 built-in patterns plus your additions.

How Overlapping Matches Are Handled

What happens when two patterns match overlapping text? For example, a hex string that could be both a private key and part of an API key assignment.

The algorithm:

  1. Runs all patterns via matchAll() to collect every match with position
  2. Sorts by start position, then by length (longest first)
  3. Scans left-to-right: if a match overlaps with an already-accepted match, it's skipped

This means the leftmost match wins, and when two matches start at the same position, the longer one wins. In practice, this produces the most useful output: you see [PRIVATE_KEY] rather than a partially-redacted string.
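
The three steps above can be sketched as follows. This is an illustrative reimplementation based only on the description in this post, not the SDK's source. The Pattern shape mirrors the custom pattern objects shown earlier, while Match and redactWithPatterns are hypothetical names.

```typescript
interface Pattern {
  type: string;
  regex: RegExp; // must carry the "g" flag, which matchAll() requires
  replacement: string;
}

interface Match {
  type: string;
  start: number;
  end: number; // exclusive
  replacement: string;
}

function redactWithPatterns(
  text: string,
  patterns: Pattern[]
): { redacted: string; findings: Match[] } {
  // 1. Collect every match from every pattern, with its position.
  const all: Match[] = [];
  for (const p of patterns) {
    for (const m of text.matchAll(p.regex)) {
      if (m[0].length === 0) continue; // ignore empty matches
      const start = m.index ?? 0;
      all.push({ type: p.type, start, end: start + m[0].length, replacement: p.replacement });
    }
  }

  // 2. Sort by start position, then longest first.
  all.sort((a, b) => a.start - b.start || (b.end - b.start) - (a.end - a.start));

  // 3. Scan left to right, skipping any match that overlaps an accepted one.
  const findings: Match[] = [];
  let redacted = "";
  let cursor = 0; // end of the last accepted match
  for (const m of all) {
    if (m.start < cursor) continue;
    redacted += text.slice(cursor, m.start) + m.replacement;
    cursor = m.end;
    findings.push(m);
  }
  redacted += text.slice(cursor);
  return { redacted, findings };
}
```

One consequence worth knowing: an earlier, shorter match beats a later, longer one. With the patterns /ab/g and /bcdef/g on the input "abcdef", this sketch accepts the first match and skips the second, because the second overlaps it.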

Express Middleware Example

Here's a practical middleware that redacts all JSON responses:

import express from "express";
import { redact } from "@agntor/sdk";

const app = express();

// Parse JSON request bodies so req.body is populated in the route below
app.use(express.json());

// Redaction middleware — intercepts JSON responses
app.use((req, res, next) => {
  const originalJson = res.json.bind(res);

  res.json = (body: unknown) => {
    const bodyStr = JSON.stringify(body);
    const { redacted, findings } = redact(bodyStr, {});

    if (findings.length > 0) {
      console.warn(
        `Redacted ${findings.length} sensitive items:`,
        findings.map((f) => f.type)
      );
    }

    return originalJson(JSON.parse(redacted));
  };

  next();
});

app.post("/api/agent", async (req, res) => {
  const llmOutput = await callYourLLM(req.body.prompt);
  // Even if the LLM leaks secrets, they get stripped here
  res.json({ result: llmOutput });
});
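
One caveat with the middleware above: it redacts the serialized JSON string and then re-parses it, which assumes every replacement leaves the string valid JSON. A more defensive variant, sketched here, walks the response body and redacts only string values. redactDeep and the Redactor type are hypothetical names introduced for this example, not part of the SDK, and the sketch assumes the body is already plain JSON data (no Date or class instances).

```typescript
// Any function that maps a string to its redacted form. In the middleware above this
// could be (s) => redact(s, {}).redacted, assuming redact() behaves as described in this post.
type Redactor = (text: string) => string;

function redactDeep(value: unknown, redactText: Redactor): unknown {
  if (typeof value === "string") return redactText(value);
  if (Array.isArray(value)) return value.map((item) => redactDeep(item, redactText));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redactDeep(child, redactText); // keys are kept, only values are rewritten
    }
    return out;
  }
  return value; // numbers, booleans, null, and undefined pass through unchanged
}
```

The tradeoff is that patterns relying on a key name sitting next to its value, such as the keystore ciphertext pattern, no longer see the key when only string values are scanned.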

Combining with Input Guard

Redaction handles the output side. For the input side, combine it with guard():

import { guard, redact } from "@agntor/sdk";

async function processAgentRequest(userInput: string) {
  // 1. Guard the input
  const guardResult = await guard(userInput, {});
  if (guardResult.classification === "block") {
    throw new Error("Input rejected: " + guardResult.violation_types.join(", "));
  }

  // 2. Process with your LLM
  const output = await callYourLLM(userInput);

  // 3. Redact the output
  const { redacted } = redact(output, {});

  return redacted;
}

Or use wrapAgentTool(), which runs guard, redact, and an SSRF check in one call:

import { wrapAgentTool } from "@agntor/sdk";

const safeTool = wrapAgentTool(myTool, {
  policy: {},
});

// Inputs are redacted and guarded, then the tool executes
const result = await safeTool("https://api.example.com/data");

Performance

Redaction runs entirely in-process with regex. There are no network calls, no LLM inference, no external dependencies (beyond the SDK itself).

On typical agent output (500-2000 characters), redact() completes in under 1ms. Even on large documents (100KB+), it stays under 10ms. You can safely call it on every response without measurable latency impact.
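
If you want to check these numbers against your own payloads, a small timing harness is enough. The sketch below is a generic helper, not something the SDK provides. averageMs is a hypothetical name, and the workload is a stand-in email regex rather than redact() itself.

```typescript
// Average wall-clock milliseconds per call, measured over many iterations so that
// Date.now()'s millisecond resolution is good enough.
function averageMs(fn: () => void, iterations: number): number {
  fn(); // warm-up call so one-time setup costs are not counted
  const start = Date.now();
  for (let i = 0; i < iterations; i++) fn();
  return (Date.now() - start) / iterations;
}

// Stand-in workload (an assumption): one email-like regex over roughly 2,000 characters.
// Swap the body for () => { redact(sample, {}); } to measure the SDK itself.
const sample = "Contact admin@internal-corp.com for access. ".repeat(45);
const perCall = averageMs(() => {
  sample.replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[EMAIL]");
}, 1000);

console.log(`avg ${perCall.toFixed(4)} ms per call`);
```

Run it with text that resembles your real agent output, since regex cost depends on the input as much as on its length.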

Limitations

  • False positives on hex strings: A 64-character hex hash (like a SHA-256 digest) will match the private key pattern. If your agent output frequently contains non-secret hex hashes, you may want to adjust the pattern.
  • Mnemonic detection is greedy: Any sequence of 12 or 24 lowercase words of 3-8 characters will match. This could flag legitimate English text in rare cases.
  • No semantic understanding: The redaction is purely pattern-based. It can't distinguish between a real AWS key and a string that looks like one. This is the right tradeoff — false positives are safer than false negatives when it comes to secret leakage.
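
The mnemonic limitation is easy to reproduce, and one possible mitigation is to confirm each word against the BIP-39 English wordlist before treating a match as a seed phrase. The sketch below assumes a greedy pattern shaped like the one described above. The regex, the two-word placeholder wordlist, and the function names are all illustrative and are not taken from the SDK.

```typescript
// Assumed shape of a greedy 12-word pattern: 12 lowercase words of 3 to 8 letters.
const TWELVE_WORDS = /\b(?:[a-z]{3,8}\s+){11}[a-z]{3,8}\b/g;

// Placeholder wordlist. A real check would load all 2048 BIP-39 English words.
const WORDLIST = new Set(["abandon", "about"]);

// True only if every word in the candidate appears in the wordlist.
function looksLikeMnemonic(candidate: string, wordlist: Set<string>): boolean {
  return candidate.split(/\s+/).every((word) => wordlist.has(word));
}

// Returns only the pattern matches that also pass the wordlist check.
function findMnemonics(text: string, wordlist: Set<string>): string[] {
  return (text.match(TWELVE_WORDS) ?? []).filter((m) => looksLikeMnemonic(m, wordlist));
}

// Ordinary English that the greedy pattern alone would flag:
const sentence = "the quick brown fox jumps over the lazy dog and runs away";
// A well-known BIP-39 test phrase:
const phrase = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
```

Here the pattern alone matches both strings, while findMnemonics keeps only the second. Whether that extra check is worth the added false negatives depends on how often your agent output contains long runs of lowercase words.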

Source Code

Everything is open source (MIT):

If you're building agents that generate text, especially agents that interact with APIs, databases, or blockchains, add output redaction. It's a one-line change that guards against an entire class of data leaks.


Agntor is an open-source trust and payment rail for AI agents. Star the repo if this was useful.
