How I built a pure client-side sanitizer to stop leaking Stripe tokens to ChatGPT.

Like most developers, I often debug by dumping massive Nginx configs or React error traces straight into AI tools like Claude and ChatGPT. A few weeks ago, I made a classic mistake: I accidentally included a production DB URI and a Stripe token in my prompt. It made me realize how dangerously easy it is to leak credentials when you're moving fast and frustrated by a bug.

I searched around for a lightweight client-side scrubber to sanitize my text, but came up empty. The tools I found either choked on multi-line secrets (like RSA private keys) or completely failed on greedy regex traps (like the @ symbol in database credentials).
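
To make those failure modes concrete, here's roughly what handling them takes. These patterns are simplified illustrations, not the exact ones in the repo:

```typescript
// 1. Multi-line secrets: `.` doesn't cross newlines, so PEM blocks need
//    [\s\S] plus a lazy quantifier to avoid swallowing everything between
//    two unrelated key blocks.
const PEM_BLOCK =
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g;

// 2. Credentials in URIs: a naive greedy `.+@` can run all the way to the
//    *last* @ in the text; anchoring on non-whitespace keeps the match local.
const DB_URI = /[a-z][\w+.-]*:\/\/[^\s:@]+:[^\s@]+@\S+/gi;
```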

So I spent the last couple of weekends building GhostSanitizer, a pure TypeScript/regex engine that runs entirely in the browser. It intercepts your text, swaps any high-entropy secrets (AWS keys, JWTs, connection URIs) for dummy tokens (e.g., STACK_SEC_1), and sends only the safe structure to the LLM. When the LLM replies with the refactored code, the browser locally maps the tokens back to your real secrets so you can copy the working result in one click.
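
In sketch form, the round-trip looks something like this (the STACK_SEC_ prefix is the real one; the rest is a simplified illustration of the approach, not the repo's actual code):

```typescript
// Replace each matched secret with a dummy token and remember the mapping.
// The vault lives in browser memory only; it is never sent anywhere.
function sanitize(
  input: string,
  patterns: RegExp[],
): { safe: string; vault: Map<string, string> } {
  const vault = new Map<string, string>();
  let counter = 0;
  let safe = input;
  for (const pattern of patterns) {
    safe = safe.replace(pattern, (secret) => {
      const token = `STACK_SEC_${++counter}`;
      vault.set(token, secret);
      return token;
    });
  }
  return { safe, vault };
}

// After the LLM replies, swap the dummy tokens back, locally.
function restore(reply: string, vault: Map<string, string>): string {
  return reply.replace(/STACK_SEC_\d+/g, (token) => vault.get(token) ?? token);
}

// Usage: const { safe, vault } = sanitize(prompt, [PEM_BLOCK, DB_URI]);
```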

I open-sourced the core sanitizer logic here if anyone wants to use it for their own AI wrappers:

https://github.com/abests/ghost-sanitizer-js

The Live Sandbox:

To make it actually useful for myself, I wrapped it in a UI and used a Python script to generate pre-loaded prompts for about 500 annoying WebDev/DevOps error scenarios (mostly React hydration errors, Nginx 502s, and CORS issues). Yes, it's a generated matrix, but it forces the LLM to stay highly context-specific instead of giving generic advice.
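
The generator itself is nothing fancy. The real script is Python, but the shape of the idea fits in a few lines of TypeScript (the stack/error lists here are made up for illustration):

```typescript
// Cross a list of stacks with a list of error types to get one
// context-pinned system prompt per scenario.
const stacks = ["React 18", "Next.js", "Nginx"];
const errors = ["hydration mismatch", "502 Bad Gateway", "CORS preflight failure"];

const prompts = stacks.flatMap((stack) =>
  errors.map((error) => ({
    slug: `${stack} ${error}`.toLowerCase().replace(/[^a-z0-9]+/g, "-"),
    system:
      `You are debugging a "${error}" in a ${stack} setup. ` +
      `Answer only for this exact stack; no generic advice.`,
  })),
);
```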

You can test the live decryption theater (press F12 to watch it redact before the network request) here: stackengine.dev

(Try a specific issue like the PostgreSQL Deadlock ShareLock page if you want to see the system prompts in action.)

Full transparency on the model:

I threw in a tiny free tier (3 uses) routed through my own Claude 3.5 Haiku key just so people can test the decryption UI. After that, you have to drop in your own OpenAI/Anthropic key (BYOK). When you use your own key, the requests go straight from your browser to the LLM provider. I have zero backend database and literally store nothing.
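
For the curious, the BYOK path is just a direct fetch from the browser. Roughly (illustrative only, shown against OpenAI's public chat endpoint; sanitize/restore are the helpers sketched earlier):

```typescript
async function askWithByok(
  userKey: string,
  rawPrompt: string,
  patterns: RegExp[],
): Promise<string> {
  // Mask secrets before anything touches the network.
  const { safe, vault } = sanitize(rawPrompt, patterns);

  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${userKey}`, // the user's own key, never stored
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: safe }], // only masked text leaves
    }),
  });

  const data = await res.json();
  // Map the dummy tokens in the reply back to real values, locally.
  return restore(data.choices[0].message.content, vault);
}
```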

Let me know if you manage to break the local regex mask. I'm actively trying to find edge cases to patch in the repo. Cheers!
