Kanchan Ghosh

Posted on Oct 29

Building Client-Side PII Protection for LLMs Using Chrome's Built-in AI

#ai

TL;DR: I built PII Shield, a Chrome extension that automatically detects and masks sensitive information before you send it to ChatGPT, Claude, or any LLM. Everything runs locally using Chrome's Prompt API and Gemini Nano. Zero server costs, complete privacy, works offline.
Built for: Google Chrome Built-in AI Challenge 2025

Demo Video: https://youtu.be/QvCY2sPC4YU

The Problem Nobody Talks About
You're at work. You need to draft an email using ChatGPT. You type:

"Draft an email to john.smith@acme.com about employee ID EMP-12345's performance review. CC sarah.jones@acme.com"

You just sent three pieces of PII to an external AI service. Your compliance team would not be happy.
This happens thousands of times daily in companies worldwide. Employees use LLMs for legitimate work, but accidentally expose:

Employee names and IDs
Customer emails and phone numbers
Company confidential data
Protected health information
Financial account numbers

Traditional Data Loss Prevention (DLP) solutions cost $50,000+ annually, require server infrastructure, and often break web applications. Small companies can't afford them. Larger companies struggle to enforce them across all AI tools.
There had to be a better way.

The Solution: Edge AI for Privacy
What if we could scan text for PII before it leaves your device? No servers, no costs, no data transmission risks.
Chrome's new built-in AI capabilities make this possible.
Architecture
User types message
↓
Content script intercepts
↓
Prompt API analyzes (LOCAL - never sent anywhere)
↓
PII detected and masked with ######
↓
Safe message sent to LLM
The key insight: Chrome's Prompt API with Gemini Nano runs entirely on your device. This means:

Zero API costs - No per-request fees or quotas
Complete privacy - Your data never leaves your computer
Works offline - No internet required after setup
Real-time performance - Local processing is fast

How It Works

Detection Pipeline

Primary Method: AI-Powered
I use Chrome's Prompt API to configure Gemini Nano with a system prompt:

```javascript
// Note: the Prompt API's entry point and option names have changed between
// Chrome releases, so adjust this call to the version you target.
const session = await ai.languageModel.create({
  systemPrompt: "You are a PII detector. Analyze text and identify emails, names, phone numbers, IDs, and addresses. Return JSON with exact positions."
});

const result = await session.prompt(userText);
```
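The model's reply is free-form text, so it is worth validating before any masking happens. Here is a minimal sketch of that step, assuming the model returns a JSON array of `{ type, value }` objects. That schema and the helper name `parseDetections` are hypothetical, not taken from the extension. Instead of trusting offsets reported by the model, the sketch re-locates each value in the original text and drops anything that does not actually appear there.

```javascript
// Hypothetical helper: turn a raw model reply into validated PII spans.
// Assumes the reply contains a JSON array like [{"type":"email","value":"a@b.com"}].
function parseDetections(rawReply, text) {
  // Models often wrap JSON in prose or code fences, so keep only the outermost [...].
  const open = rawReply.indexOf('[');
  const close = rawReply.lastIndexOf(']');
  if (open === -1 || close <= open) return [];

  let parsed;
  try {
    parsed = JSON.parse(rawReply.slice(open, close + 1));
  } catch {
    return []; // Unparseable reply: the caller can fall back to regex.
  }
  if (!Array.isArray(parsed)) return [];

  const items = [];
  for (const entry of parsed) {
    if (!entry || typeof entry.value !== 'string' || entry.value === '') continue;
    // Re-locate every occurrence in the source text instead of trusting model offsets.
    let from = 0;
    let idx;
    while ((idx = text.indexOf(entry.value, from)) !== -1) {
      items.push({
        type: typeof entry.type === 'string' ? entry.type : 'unknown',
        value: entry.value,
        start: idx,
        end: idx + entry.value.length,
      });
      from = idx + entry.value.length;
    }
  }
  return items;
}

// The second entry does not occur in the text, so it is discarded.
const reply = 'Here you go:\n[{"type":"email","value":"john.smith@acme.com"},{"type":"id","value":"EMP-99999"}]';
const found = parseDetections(reply, 'Mail john.smith@acme.com about EMP-12345');
console.log(found);
```

Discarding values that are not present in the input is a cheap guard against hallucinated detections.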
Fallback Method: Regex Patterns
For reliability, I maintain regex patterns for:

Email addresses (RFC 5322 compliant)
Phone numbers (international formats)
Social Security Numbers
Credit card numbers
Employee/Account IDs
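A rough sketch of what a regex fallback of this kind can look like. The patterns below are simplified illustrations and assumptions on my part, not the extension's actual patterns: the email pattern is a pragmatic approximation rather than a full RFC 5322 grammar, the `EMP-` ID format is borrowed from the example earlier in this post, and phone numbers are left out to keep the sketch short. Credit card candidates are filtered with a Luhn checksum to cut down on false positives from long order numbers.

```javascript
// Simplified, illustrative patterns (assumptions, not the extension's real ones).
const PII_PATTERNS = [
  { type: 'email', regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { type: 'ssn', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: 'credit_card', regex: /\b\d(?:[ -]?\d){12,18}\b/g },
  { type: 'employee_id', regex: /\bEMP-\d+\b/g },
];

// Luhn checksum: doubles every second digit from the right and sums the digits.
function passesLuhn(digits) {
  let sum = 0;
  let double = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

// Returns detected items sorted by position, each with type, value, start, end.
function detectWithRegex(text) {
  const items = [];
  for (const { type, regex } of PII_PATTERNS) {
    for (const match of text.matchAll(regex)) {
      const value = match[0];
      // Skip digit runs that look like cards but fail the checksum.
      if (type === 'credit_card' && !passesLuhn(value.replace(/\D/g, ''))) continue;
      items.push({ type, value, start: match.index, end: match.index + value.length });
    }
  }
  return items.sort((a, b) => a.start - b.start);
}

const sample = 'Email john.smith@acme.com about EMP-12345, card 4111 1111 1111 1111, order 1234567890123456';
console.log(detectWithRegex(sample));
```

In this sample the 16-digit order number matches the card pattern but fails the Luhn check, so it is not reported.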

Masking Strategy
Detected PII is replaced with hash characters, one per original character:

Original: "Contact john.smith@company.com"
Masked: "Contact ######################"

This preserves:

Sentence structure
Message context
Readability
Clear indication that data was removed
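One way to implement length-preserving masking is to work from character positions rather than string replacement, so repeated or overlapping detections are handled predictably. This is a sketch under the assumption that each detected item carries `start` and `end` offsets; those field names and the function name `maskSpans` are hypothetical.

```javascript
// Replace each [start, end) span with '#' characters, keeping the text length unchanged.
function maskSpans(text, items) {
  const spans = items
    .map(({ start, end }) => ({ start, end }))
    // Ignore spans that fall outside the text or are empty.
    .filter(s => s.start >= 0 && s.end <= text.length && s.start < s.end)
    .sort((a, b) => a.start - b.start);

  let out = '';
  let cursor = 0;
  for (const { start, end } of spans) {
    if (end <= cursor) continue;          // Fully covered by an earlier span.
    const from = Math.max(start, cursor); // Partially overlapping span.
    out += text.slice(cursor, from) + '#'.repeat(end - from);
    cursor = end;
  }
  return out + text.slice(cursor);
}

const original = 'Contact john.smith@company.com';
const masked = maskSpans(original, [{ start: 8, end: 30 }]);
console.log(masked);
console.log(masked.length === original.length);
```

Because the output has the same length as the input, the surrounding sentence stays intact and the reader can still see where something was removed.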

Universal Integration
The extension works on any LLM interface:

ChatGPT
Claude
Gemini
Microsoft Copilot
Any web-based AI chat

It intercepts:

Paste events (bulk text)
Keyboard events (Enter key)
Button clicks (Send buttons)
Text input changes
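For the keyboard path, not every Enter press submits a message. The sketch below shows one way to decide when to step in. It assumes the target chat UI sends on a plain Enter and inserts a newline on Shift+Enter, which is an assumption about those sites rather than something guaranteed, and the function name `shouldInterceptKey` is hypothetical.

```javascript
// Decide whether a keydown event is likely to submit the message.
// Works on any object with the standard KeyboardEvent fields.
function shouldInterceptKey(event) {
  if (event.key !== 'Enter') return false;
  // Modifier combinations usually mean "new line" or a shortcut, not "send".
  if (event.shiftKey || event.ctrlKey || event.altKey || event.metaKey) return false;
  // Enter during IME composition confirms a candidate; it does not send.
  if (event.isComposing) return false;
  return true;
}

console.log(shouldInterceptKey({ key: 'Enter' }));
console.log(shouldInterceptKey({ key: 'Enter', shiftKey: true }));
```

In a content script, a check like this would sit at the top of a capture-phase `keydown` listener so the scan runs before the page's own send handler.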

Technical Implementation
Chrome Extension Architecture
Manifest V3 Structure:
```json
{
  "manifest_version": 3,
  "name": "PII Shield",
  "permissions": [
    "storage",
    "activeTab",
    "aiLanguageModelOriginTrial"
  ],
  "content_scripts": [{
    "matches": ["https://chatgpt.com/*", "https://claude.ai/*"],
    "js": ["content.js"]
  }],
  "background": {
    "service_worker": "background.js"
  }
}
```
Content Script (Simplified):
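```javascript
// Detect PII using the Prompt API when available, otherwise regex.
// detectWithAI, detectWithRegex, insertMaskedText, and showNotification
// are defined elsewhere in the extension.
async function detectPII(text) {
  // `self.ai` avoids a ReferenceError when the Prompt API is not exposed
  if (self.ai?.languageModel) {
    try {
      return await detectWithAI(text);
    } catch (err) {
      // Fall through to regex if the model is unavailable or fails
    }
  }
  // Fallback to regex
  return detectWithRegex(text);
}

// Mask detected PII; each item carries the matched string in `value`
function maskText(text, piiItems) {
  let masked = text;
  piiItems.forEach(item => {
    const mask = '#'.repeat(item.value.length);
    // split/join replaces every occurrence, not just the first
    masked = masked.split(item.value).join(mask);
  });
  return masked;
}

// Monitor pasted text
document.addEventListener('paste', async (e) => {
  const text = e.clipboardData.getData('text');
  // Cancel the default paste right away: preventDefault() has no effect after an await
  e.preventDefault();
  const pii = await detectPII(text);

  if (pii.length > 0) {
    insertMaskedText(maskText(text, pii));
    showNotification(`Masked ${pii.length} PII items`);
  } else {
    insertMaskedText(text); // Nothing detected: insert the original text
  }
});
```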

Why Chrome Built-in AI Changes Everything
Traditional Cloud-Based Approach
❌ User → Network → Cloud API → Processing → Response

Costs money per request
Data exposed during transmission
Requires internet
Privacy concerns
API rate limits

Chrome Built-in AI Approach
✅ User → Local Processing → Response

Zero cost
Data never leaves device
Works offline
Complete privacy
No quotas

This is a paradigm shift for privacy-critical applications.

Current Limitations (Being Honest)
What Works:
✅ Typed text detection
✅ Pasted text masking
✅ Real-time processing
✅ Multi-platform support
What Doesn't Work Yet:
❌ File uploads - Can't scan PDF/DOCX files users upload
❌ Images - No OCR for text in screenshots
❌ Audio - No speech-to-text PII scanning
❌ Context awareness - May flag legitimate business terms
These are future enhancements. For now, it's a proof-of-concept that shows what's possible.

Proposed Use Cases
Healthcare
Could protect HIPAA-regulated patient data when using AI for medical research or documentation.
Legal Services
Could ensure client confidentiality during AI-assisted legal research.
Financial Services
Could prevent exposure of customer account numbers and financial data.
Enterprise
Could maintain corporate data policies across all employee AI usage without blocking access.
Note: These are potential applications. Production deployment would require thorough testing and compliance validation.

Performance Characteristics
Latency:

AI detection: Local processing with Gemini Nano
Regex fallback: Near-instantaneous
User-perceived delay: Minimal (asynchronous)

Resource Usage:

Lightweight Chrome extension
Minimal CPU (burst processing only)
Small storage footprint

Performance depends on device capabilities and whether Gemini Nano model is downloaded.

Comparison with Existing Solutions
| Feature | Enterprise DLP | Cloud Scanner | PII Shield |
| --- | --- | --- | --- |
| Cost | $50,000+/year | $10,000+/year | $0 |
| Privacy | Proxy access | Cloud processing | On-device only |
| Setup | 3-6 months | 1-2 months | Under 5 minutes |
| Latency | Network-dependent | Network-dependent | Minimal |
| Offline | No | No | Yes |
| Quotas | License-based | API limits | None |

What I Learned Building This

Chrome's Prompt API is Powerful
Being able to run Gemini Nano locally with custom system prompts is game-changing. You can build sophisticated AI features without any cloud dependency.

Edge AI Enables New Applications
Applications that were impossible due to privacy or cost constraints are now feasible. Think:

Medical AI processing health data locally
Financial AI analyzing sensitive transactions on-device
Personal AI assistants with true privacy

User Experience Matters
The extension had to be completely transparent. Users shouldn't need to change their workflow. It just works silently in the background.

Fallback Systems are Critical
Not everyone has Gemini Nano downloaded. Having regex fallback ensures the extension provides value immediately while AI capabilities enhance it.
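The fallback idea can be sketched as a small wrapper that tries the AI detector first and drops to regex if the model is missing, throws, or takes too long. This is an illustration rather than the extension's code: the name `detectWithFallback`, the detector arguments, and the 1500 ms default timeout are all assumptions.

```javascript
// Hypothetical wrapper: prefer the AI detector, fall back to regex on error or timeout.
async function detectWithFallback(text, aiDetector, regexDetector, timeoutMs = 1500) {
  // No AI detector at all (for example, the model is not downloaded): use regex directly.
  if (typeof aiDetector !== 'function') return regexDetector(text);

  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('AI detection timed out')), timeoutMs);
  });
  try {
    // Whichever settles first wins: the AI result or the timeout rejection.
    return await Promise.race([aiDetector(text), timeout]);
  } catch {
    return regexDetector(text);
  } finally {
    clearTimeout(timer); // Avoid leaving a pending timer behind.
  }
}

// Stand-in detectors for trying the wrapper outside the browser.
const regexOnly = (t) => [{ source: 'regex', value: t }];
const brokenAI = async () => { throw new Error('model not downloaded'); };

detectWithFallback('john.smith@acme.com', brokenAI, regexOnly)
  .then(result => console.log(result));
```

Putting a time limit on the AI path keeps the paste or send action responsive even when the model is slow to start.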

Technical Challenges Solved
Problem 1: Gemini's React Components
Gemini uses complex React components, not standard textareas. Standard document.execCommand didn't work.
Solution: Multi-method text insertion with fallbacks:
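```javascript
// Try multiple insertion methods (target is the element that received the paste)
if (target.tagName === 'TEXTAREA') {
  target.value = maskedText;
  // Tell the page's framework that the value changed
  target.dispatchEvent(new Event('input', { bubbles: true }));
} else if (target.isContentEditable) {
  document.execCommand('insertText', false, maskedText);
} else {
  // Fall back to the last editable element on the page, if there is one
  const editables = document.querySelectorAll('[contenteditable="true"]');
  const last = editables[editables.length - 1];
  if (last) {
    last.textContent = maskedText;
  }
}
```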
Problem 2: Paste Event Timing
Needed to intercept paste events before React processes them.
Solution: Capture phase with proper event handling:
```javascript
function handler(e) {
  e.preventDefault();
  e.stopPropagation();
}
document.addEventListener('paste', handler, true); // Capture phase
```
Problem 3: Stats Persistence
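localStorage is a poor fit for extensions: a content script sees the host page's localStorage rather than the extension's own, and the background service worker has no localStorage at all.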
Solution: Chrome Storage API with daily reset logic:
```javascript
const result = await chrome.storage.local.get(['maskedToday', 'lastReset']);
const today = new Date().toDateString();

if (result.lastReset !== today) {
  await chrome.storage.local.set({ maskedToday: 0, lastReset: today });
}
```
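The reset logic is easier to test if the date rollover is kept in a pure function and storage is passed in. The sketch below is one possible arrangement, not the extension's actual code: `nextStats`, `recordMasked`, and `createMemoryStorage` are hypothetical names, and the in-memory object only imitates the promise-based `get`/`set` shape of `chrome.storage.local`.

```javascript
// Pure rollover logic: returns the state to store after masking `count` items.
function nextStats(stored, count, now = new Date()) {
  const today = now.toDateString(); // Local calendar date, same as the snippet above
  const base = stored.lastReset === today ? (stored.maskedToday || 0) : 0;
  return { maskedToday: base + count, lastReset: today };
}

// `storage` is any object with promise-based get(keys) and set(items),
// such as chrome.storage.local in an extension.
async function recordMasked(storage, count, now = new Date()) {
  const stored = await storage.get(['maskedToday', 'lastReset']);
  const next = nextStats(stored, count, now);
  await storage.set(next);
  return next.maskedToday;
}

// In-memory stand-in for trying the logic outside the browser.
function createMemoryStorage() {
  const data = {};
  return {
    async get(keys) {
      const out = {};
      for (const k of keys) if (k in data) out[k] = data[k];
      return out;
    },
    async set(items) { Object.assign(data, items); },
  };
}

(async () => {
  const storage = createMemoryStorage();
  await recordMasked(storage, 2, new Date(2025, 9, 29, 10));
  await recordMasked(storage, 3, new Date(2025, 9, 29, 23));
  // A new local day starts the count again from zero.
  console.log(await recordMasked(storage, 1, new Date(2025, 9, 30, 0, 5)));
})();
```

Because the date comes from `toDateString()`, the counter resets at local midnight rather than at UTC midnight.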

Development Context
This project was built for the Google Chrome Built-in AI Challenge 2025, a hackathon showcasing innovative applications of Chrome's built-in AI capabilities.
The challenge provided access to:

Prompt API (what I used)
Summarizer API
Writer API
Rewriter API
Translator API
Proofreader API

My background includes building CareCircle++, a zero-knowledge healthcare coordination platform that attracted NHS Digital interest. That experience with privacy-first architecture directly informed PII Shield's design.
In healthcare, data protection isn't optional - it's legally mandated and ethically critical. Those same principles apply to enterprise AI usage.

Future Roadmap
If I continue developing this:
Phase 1: Core Improvements

File upload scanning (PDF, DOCX parsing)
Whitelist management for approved terms
Custom regex patterns for company-specific IDs
Improved context awareness

Phase 2: Enterprise Features

Centralized policy management
Compliance audit reports
Team deployment tools
Integration with existing DLP systems

Phase 3: Advanced AI

Multimodal support (images, audio)
Hybrid strategy with Firebase AI Logic
Machine learning for better detection
Industry-specific patterns (medical, legal, financial)

Installation & Testing
Requirements:

Chrome 127 or later
Enable Chrome Built-in AI flags:

chrome://flags/#optimization-guide-on-device-model → Enabled
chrome://flags/#prompt-api-for-gemini-nano → Enabled
Restart Chrome

Installation:

Download from GitHub (link will be added)
Go to chrome://extensions/
Enable "Developer mode"
Click "Load unpacked"
Select the folder

Testing:
Open the included test.html file with sample PII data, or try it on ChatGPT/Claude.

Key Takeaways
For Developers:

Chrome's built-in AI is capable enough to power real client-side features, though the Prompt API still sits behind flags or an origin trial
Edge computing enables new privacy paradigms previously impossible
Zero-cost AI inference opens doors for indie developers and startups
Hybrid approaches work - AI primary, regex fallback

For Companies:

DLP doesn't have to be expensive - client-side solutions can work
Privacy and productivity aren't opposed - you can have both
Browser extensions can solve enterprise problems without complex infrastructure
AI safety requires new thinking - traditional approaches don't scale

For Privacy Advocates:

On-device AI is the future of privacy-preserving applications
Users can be protected transparently without friction
Open source matters - privacy tools should be auditable
Edge computing wins when privacy is critical

Open Questions
I'm still figuring out:

File uploads: How to efficiently parse and scan uploaded documents client-side?
Context sensitivity: How to reduce false positives for legitimate business terms?
Enterprise adoption: What features would compliance teams need?
Monetization: Should this be free and open source, or commercial with enterprise features?

What do you think? Drop your thoughts in the comments.

Conclusion
PII Shield demonstrates that powerful privacy protection can run entirely on-device using Chrome's built-in AI. No servers, no costs, no compromises.
While this is a proof-of-concept with limitations (especially file uploads), it shows what's possible when AI moves to the edge.
As LLMs become ubiquitous in workplaces, privacy-preserving architectures like this will be essential. The technology exists. Now we need to build it.
The Google Chrome Built-in AI Challenge 2025 showed me that client-side AI isn't just a nice-to-have - it's a fundamental shift in what's possible for privacy-critical applications.
Try it out:

GitHub: [Link will be added after submission]
Demo Video: https://youtu.be/QvCY2sPC4YU
Feedback: kanchan@ikanchan.com

About Me
I'm Kanchan, an Independent AI Developer with 16+ years of software development experience. I've built 6+ production AI applications across healthcare, education, real estate, and gaming.
My previous work includes CareCircle++, a privacy-first healthcare platform that attracted NHS Digital interest. I hold an MBA and Google certifications in Gemini and Vertex AI.
Connect with me:

Website: www.ikanchan.com
Email: kanchan@ikanchan.com
GitHub: [Your profile]

Discussion Questions

Would you trust a client-side PII scanner in your organization?
What other privacy-critical applications could benefit from edge AI?
Should this be open source or commercial?
What detection features would make this production-ready for your use case?

Drop your thoughts below!

Tags: #chrome #ai #privacy #security #webdev #machinelearning #chromeextension #llm #gdpr #compliance

This article is based on my submission to the Google Chrome Built-in AI Challenge 2025. Full technical white paper available on my website.