Apoorva Sharma
How I Built Real-Time PII Detection Inside ChatGPT's Hostile Text Editor (Without Breaking It)

If you've ever tried to build a Chrome extension that modifies text inside ChatGPT, Gemini, or Claude, you know it's not like injecting into a normal <textarea>.

These apps use rich text editors — ProseMirror (ChatGPT), Draft.js-style frameworks — that actively fight external DOM manipulation. They reconcile state internally, swallow events, and will silently undo your changes on the next keystroke.

I spent months figuring out how to detect and highlight sensitive data (API keys, passwords, PII) inside these editors without breaking them. This is the technical story of what I built, what failed, and the architecture I landed on.

The project is called Prompt Armour — a Chrome extension that intercepts your input on AI chatbot platforms and catches sensitive data before it's sent to the LLM. It runs 100% client-side. No servers, no data collection.

But this article isn't a product pitch. It's about the engineering.


The Problem: You Can't Just Modify the DOM

My first instinct was simple. Watch the input, find sensitive strings with regex, wrap them in a <span> with a red background.

// The naive approach — DO NOT do this inside ProseMirror
const match = editor.innerHTML.match(regex);
if (match) {
  editor.innerHTML = editor.innerHTML.replace(
    match[0],
    `<span class="highlight">${match[0]}</span>`
  );
}

This works for about 200 milliseconds. Then ProseMirror's internal state reconciliation runs, sees DOM nodes it didn't create, and either:

  1. Strips them out silently
  2. Duplicates content as it tries to reconcile
  3. Breaks the cursor position so the user is suddenly typing in the wrong place

ChatGPT's editor isn't a <textarea>. It's a contenteditable div managed by ProseMirror, which maintains its own document model. Any DOM mutation you make outside ProseMirror's transaction system gets treated as corruption.

Gemini and Claude have similar architectures. These are hostile environments for extension developers.


The Solution: CSS Custom Highlight API

The breakthrough was the CSS Custom Highlight API — a relatively new browser API that lets you apply visual highlights to arbitrary text ranges without modifying the DOM at all.

Here's the key insight: instead of wrapping text in <span> elements, you create Range objects pointing at the text nodes, group them into a Highlight object, and register it with CSS.highlights. The browser renders the visual highlight at the paint level. ProseMirror never knows anything happened.

// Create a range pointing at the sensitive text
const range = new Range();
range.setStart(textNode, matchStart);
range.setEnd(textNode, matchEnd);

// Register it as a CSS highlight
const highlight = new Highlight(range);
CSS.highlights.set("prompt-armour-pii", highlight);
::highlight(prompt-armour-pii) {
  background-color: rgba(239, 68, 68, 0.3);
  color: inherit;
}

This is non-destructive highlighting. The DOM stays exactly as ProseMirror expects. No nodes added, no attributes changed, no state corruption. The user sees red highlights on their API keys and SSNs, but the editor has zero awareness that anything happened.
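To make the offsets concrete: the scanning step can be reduced to a pure function that maps a text node's content to match offsets, and each offset pair then becomes a Range via range.setStart(textNode, start) and range.setEnd(textNode, end). This is a sketch under my own naming (findMatches and Match are illustrative, not Prompt Armour's actual code):

```typescript
// Illustrative sketch: find match offsets in a text node's content.
// Each {start, end} pair feeds range.setStart / range.setEnd as shown above.
interface Match {
  start: number; // offset into the text node
  end: number;
  label: string; // which detector fired, e.g. "aws-access-key"
}

function findMatches(text: string, patterns: Map<string, RegExp>): Match[] {
  const matches: Match[] = [];
  for (const [label, pattern] of patterns) {
    // matchAll requires the global flag, so re-create the regex if needed
    const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
    const re = new RegExp(pattern.source, flags);
    for (const m of text.matchAll(re)) {
      matches.push({ start: m.index!, end: m.index! + m[0].length, label });
    }
  }
  return matches;
}
```

Because the scan only reads text content and the highlight only paints over it, ProseMirror's reconciliation never fires.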

Why This Is Hard to Find

Most extension developers don't know this API exists. If you Google "highlight text in contenteditable Chrome extension," you'll get 50 Stack Overflow answers telling you to wrap text in <mark> tags or use the deprecated document.execCommand. Both approaches break inside ProseMirror.

The CSS Custom Highlight API has been stable in Chromium since Chrome 105, but adoption is still low because most use cases don't involve fighting a hostile rich text framework. For Prompt Armour, it was the only viable path.


The Detection Engine: Regex + Shannon Entropy

Highlighting is just the display layer. The actual detection runs a multi-pass scanning engine on every input change.

Pass 1: Pattern Matching

A library of regex patterns catches known formats:

Emails              → standard RFC-ish pattern
Phone numbers       → standard + fuzzy (no dashes, spaces, etc.)
Credit cards        → Luhn-validated 13-19 digit sequences
SSNs                → XXX-XX-XXXX with area number validation
AWS Access Keys     → AKIA[0-9A-Z]{16}
AWS Secret Keys     → 40-char base64 following known prefixes
AWS ARNs            → arn:aws:*
EC2 Instance IDs    → i-[0-9a-f]{8,17}
MongoDB URIs        → mongodb(+srv)?://...
Postgres/MySQL URIs → postgres(ql)?://... , mysql://...
Redis URIs          → redis://...
Bearer Tokens       → Bearer [A-Za-z0-9\-._~+/]+=*
Basic Auth          → Basic [A-Za-z0-9+/]+=*
Session Cookies     → session patterns with base64/hex values
IPv4/IPv6           → standard patterns with private range flagging
MAC Addresses       → colon and dash separated hex octets
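A few of these detectors, sketched in simplified form (illustrative patterns, not the production library), along with the Luhn checksum that separates real card numbers from ordinary digit runs:

```typescript
// Simplified versions of a few pattern-pass detectors (illustrative only).
const PATTERNS: Record<string, RegExp> = {
  awsAccessKey: /AKIA[0-9A-Z]{16}/g,
  ec2Instance: /\bi-[0-9a-f]{8,17}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g, // area-number validation happens after the match
  cardCandidate: /\b(?:\d[ -]?){13,19}\b/g, // must also pass the Luhn check below
};

// Luhn checksum: double every second digit from the right, sum the digits,
// and require the total to be divisible by 10.
function passesLuhn(candidate: string): boolean {
  const digits = candidate.replace(/[ -]/g, "");
  if (digits.length < 13 || digits.length > 19) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}
```

Running candidates through Luhn before flagging keeps phone numbers and order IDs from lighting up as credit cards.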

Pass 2: Shannon Entropy Scanner

This is where it gets interesting. Not every secret matches a known pattern. Random passwords, custom API keys, encrypted tokens — these just look like high-entropy gibberish.

Shannon entropy measures the randomness of a string. English text averages around 3.5-4.0 bits per character. A random 32-character API key hits 5.5-6.0+.

function shannonEntropy(str: string): number {
  const freq: Record<string, number> = {};
  for (const char of str) {
    freq[char] = (freq[char] || 0) + 1;
  }
  let entropy = 0;
  for (const char in freq) {
    const p = freq[char] / str.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

Any token-like string (no spaces, sufficient length, mix of character classes) that crosses the entropy threshold gets flagged as a potential secret. This catches things regex can't — custom tokens, generated passwords, encrypted blobs pasted into prompts.
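That gate can be sketched as a pure predicate. The length, character-class, and 4.5-bit thresholds here are illustrative, not Prompt Armour's tuned values, and the entropy helper repeats the computation shown above:

```typescript
// Entropy helper (same computation as shannonEntropy above).
function entropyOf(str: string): number {
  const freq: Record<string, number> = {};
  for (const ch of str) freq[ch] = (freq[ch] || 0) + 1;
  let h = 0;
  for (const ch in freq) {
    const p = freq[ch] / str.length;
    h -= p * Math.log2(p);
  }
  return h;
}

// Gate: token-like shape first, entropy threshold second.
// Thresholds (16 chars, 3 classes, 4.5 bits) are illustrative values.
function looksLikeSecret(token: string): boolean {
  if (token.length < 16 || /\s/.test(token)) return false;
  const classes =
    Number(/[a-z]/.test(token)) +
    Number(/[A-Z]/.test(token)) +
    Number(/\d/.test(token)) +
    Number(/[^A-Za-z0-9]/.test(token));
  if (classes < 3) return false; // needs a mix of character classes
  return entropyOf(token) > 4.5;
}
```

The shape check runs first because it is cheap; entropy is only computed for strings that already look token-like.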

Pass 3: NLP Name and Location Detection

Pass 3 uses the compromise library for lightweight client-side NLP to identify person names and location references that regex alone would miss. It isn't perfect — NLP in the browser never is — but it catches "Sarah Johnson" and "deployed to us-east-1", which pure regex would skip.


Twin-Write Architecture: Solving the Storage Race Condition

Chrome extension storage (chrome.storage.sync) is asynchronous. When a user toggles a setting — say, switching redaction style from [REDACTED] to masked (****) — there's a real delay before the new value is readable from storage.

If the user changes a setting and immediately types something that triggers a redaction, the detection engine might read the old setting because the storage write hasn't completed.

My solution is what I call Twin-Write Architecture:

User changes setting
  → Write 1: Local in-memory object (instant, synchronous)
  → Write 2: chrome.storage.sync (persistent, async)

Detection engine reads setting
  → Read from local memory first (always current)
  → Falls back to chrome.storage only on cold start
// Simplified twin-write pattern
const localState: Settings = { /* defaults */ };

async function updateSetting<K extends keyof Settings>(key: K, value: Settings[K]) {
  // Write 1: instant (synchronous)
  localState[key] = value;

  // Write 2: persistent (async)
  await chrome.storage.sync.set({ [key]: value });
}

function readSetting<K extends keyof Settings>(key: K): Settings[K] {
  // Always reads from local memory — no async, no race condition
  return localState[key];
}

On extension startup, local memory is hydrated from chrome.storage.sync once. After that, all reads hit local memory. All writes go to both targets simultaneously.

This eliminates:

  • UI flicker (settings apply instantly)
  • Race conditions (detection engine never reads stale data)
  • Cold start lag (only one async read on init)
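Wrapped up as a small class, the whole lifecycle looks roughly like this. This is a sketch: StorageLike is my stand-in for chrome.storage.sync (which only exists in an extension context), and the names are illustrative, not Prompt Armour's actual code:

```typescript
// Any object with promise-based get/set can back the store; in the
// extension this would be a thin wrapper over chrome.storage.sync.
interface StorageLike {
  get(): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

class TwinWriteStore<T extends Record<string, unknown>> {
  private local: T;

  constructor(private defaults: T, private backing: StorageLike) {
    this.local = { ...defaults };
  }

  // One async read on cold start; after this, all reads are synchronous.
  async hydrate(): Promise<void> {
    const persisted = await this.backing.get();
    this.local = { ...this.defaults, ...persisted } as T;
  }

  // Never stale, never async.
  read<K extends keyof T>(key: K): T[K] {
    return this.local[key];
  }

  async write<K extends keyof T>(key: K, value: T[K]): Promise<void> {
    this.local[key] = value; // Write 1: instant, synchronous
    await this.backing.set({ [key]: value } as Record<string, unknown>); // Write 2: persistent
  }
}
```

Note that write() mutates local memory before its first await, so a read() issued immediately after calling write() already sees the new value, even though persistence is still in flight.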

Shadow DOM Isolation: Don't Pollute, Don't Get Polluted

Prompt Armour injects UI components (toast notifications, redaction tooltips) into the host page. Without isolation, two things go wrong:

  1. Your CSS leaks out and breaks ChatGPT's styling
  2. Their CSS leaks in and breaks your components

The solution is Shadow DOM. Every injected UI component lives inside a shadow root:

const host = document.createElement("div");
const shadow = host.attachShadow({ mode: "closed" });

// Styles are scoped — they can't leak out
const style = document.createElement("style");
style.textContent = `/* component styles */`;
shadow.appendChild(style);

// Component renders inside the shadow boundary
shadow.appendChild(renderComponent());
document.body.appendChild(host);

Plasmo (the framework I'm using) handles some of this automatically for content script UIs, but the toast notification system and redaction tooltips needed manual shadow DOM management to work correctly across ChatGPT, Gemini, and Claude. Each platform ships different CSS resets and global styles that would otherwise corrupt injected components.


Architecture Overview

┌──────────────────────────────────────────────┐
│                  Host Page                   │
│         (ChatGPT / Gemini / Claude)          │
│                                              │
│  ┌───────────────┐   ┌────────────────────┐  │
│  │  ProseMirror  │   │  Shadow DOM UI     │  │
│  │  Editor       │   │  (Toast, Tooltip)  │  │
│  │  [untouched]  │   │  [isolated]        │  │
│  └───────┬───────┘   └────────────────────┘  │
│          │                                   │
│  ┌───────▼──────────────────────────────┐    │
│  │  Content Script (protector.ts)       │    │
│  │                                      │    │
│  │  MutationObserver → watches input    │    │
│  │  Detection Engine → regex + entropy  │    │
│  │  CSS Highlight API → visual overlay  │    │
│  │  Twin-Write → instant settings       │    │
│  │  Redaction Handler → modifies text   │    │
│  └──────────────────────────────────────┘    │
│                                              │
└──────────────────────────────────────────────┘
         │
         │ chrome.storage.sync
         ▼
┌──────────────────┐
│  Background SW   │
│  (badge count)   │
└──────────────────┘
         │
         ▼
┌──────────────────┐
│  Popup UI        │
│  (React + TW)    │
│  Settings/Stats  │
└──────────────────┘

What I'd Do Differently

Use a proper parser instead of pure regex. Regex handles 90% of cases but gets fragile with edge cases — especially multi-line connection strings and tokens with unusual delimiters. A tokenizer-based approach would be more maintainable. It's on my roadmap.

Start with the Highlight API from day one. I wasted two weeks trying DOM manipulation approaches before discovering CSS Custom Highlight API. If you're building any Chrome extension that needs to visually annotate text inside a rich editor — start here.

Don't underestimate SPA navigation. ChatGPT, Gemini, and Claude all use client-side routing. Your content script's MutationObserver needs to handle the editor being destroyed and recreated without a page load. Cleanup and re-initialization logic is critical and boring and absolutely necessary.


Try It / Contribute

Prompt Armour is live on the Chrome Web Store.

It's free. Core PII and API key detection, all redaction styles, all supported platforms — no paywall. A Pro tier is planned later for custom regex patterns and team features, but the core protection stays free.

If you've dealt with hostile SPA injection, ProseMirror wrangling, or built detection engines in the browser, I'd love to hear how you approached it. I'm a solo developer and genuinely learning as I ship.

Marketing site | Chrome Web Store
