Why I Built a Browser-Based Image Tool That Never Uploads Your Photos
Every time you upload an image to an online tool, you're trusting that server with your data. Your photos contain GPS coordinates, camera serial numbers, timestamps, and sometimes even facial recognition markers — all embedded as EXIF metadata that most people don't even know exists.
I wanted to build something different. A tool that processes images entirely in the browser, so your photos never leave your device.
Here's what I learned building PixelFresh, and the technical decisions behind it.
The Hidden Data in Your Photos
Before diving into the technical side, let's look at what's actually in your photos. A typical JPEG from a smartphone contains:
| EXIF Field | Privacy Risk |
|---|---|
| GPS Coordinates | Exact location where photo was taken |
| DateTime | When you were at that location |
| Camera Make/Model | Device fingerprinting |
| Serial Number | Unique device identifier |
| Thumbnail | May contain original crop before editing |
| Software | What apps you use |
A single photo can reveal where you live, work, and travel — and most image optimization tools upload these to their servers before processing.
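You can verify this yourself without any library: JPEG metadata lives in an APP1 segment (marker `0xFFE1`) tagged with the ASCII string "Exif". A minimal detector over the raw bytes — `hasExif` is my sketch, not part of any tool — looks like this:

```javascript
// Scan a JPEG's segment list for an APP1 "Exif" block.
// `bytes` is a Uint8Array holding the raw file contents.
function hasExif(bytes) {
  if (bytes[0] !== 0xFF || bytes[1] !== 0xD8) return false; // no SOI marker: not a JPEG
  let i = 2;
  while (i + 4 <= bytes.length && bytes[i] === 0xFF) {
    const marker = bytes[i + 1];
    if (marker === 0xDA) break; // start-of-scan: metadata segments are behind us
    const len = (bytes[i + 2] << 8) | bytes[i + 3]; // segment length, includes these 2 bytes
    if (marker === 0xE1 &&
        String.fromCharCode(...bytes.slice(i + 4, i + 8)) === 'Exif') {
      return true; // APP1 segment tagged "Exif"
    }
    i += 2 + len;
  }
  return false;
}

const sample = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE1, 0x00, 0x08,
                               0x45, 0x78, 0x69, 0x66, 0x00, 0x00]); // SOI + APP1 "Exif\0\0"
console.log(hasExif(sample)); // → true
```

Run a photo straight off your phone through this and it almost certainly returns `true`.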
The Canvas API Approach
The key insight is that the HTML5 Canvas API naturally strips EXIF metadata when you draw an image onto it. Here's the simplest version:
```javascript
function stripExif(file) {
  return new Promise((resolve, reject) => {
    const img = new Image();
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');
    const url = URL.createObjectURL(file);
    img.onload = () => {
      canvas.width = img.width;
      canvas.height = img.height;
      ctx.drawImage(img, 0, 0);
      URL.revokeObjectURL(url); // free the source blob once it's drawn
      canvas.toBlob((blob) => {
        resolve(blob); // clean image — no EXIF data
      }, 'image/jpeg', 0.92);
    };
    img.onerror = () => {
      URL.revokeObjectURL(url);
      reject(new Error('Could not decode image'));
    };
    img.src = url;
  });
}
```
When you call canvas.toBlob(), the output contains only pixel data. No EXIF, no GPS, no camera info. It's the simplest and most reliable way to strip metadata without parsing binary EXIF structures.
Beyond Metadata: Creating Unique Images
Stripping EXIF is step one. But what if you need each processed image to have a unique file hash? This matters for:
- E-commerce sellers uploading to multiple marketplaces
- Content creators distributing across platforms
- Anyone who needs distinct files from the same source
Simply re-encoding a JPEG at the same quality setting typically produces a byte-identical file, so the hash doesn't change. To create genuinely unique files, I implemented what I call "AI pixel reconstruction":
```javascript
function pixelReconstruct(ctx, width, height) {
  const imageData = ctx.getImageData(0, 0, width, height);
  const data = imageData.data; // Uint8ClampedArray, RGBA order

  // 1. Subtle gamma adjustment (random per invocation)
  const gamma = 0.98 + Math.random() * 0.04; // 0.98–1.02

  // 2. Micro brightness offset, at most ±1, shared by all channels this invocation
  const mix = (Math.random() - 0.5) * 2;

  for (let i = 0; i < data.length; i += 4) {
    // Apply gamma to R, G, B (alpha untouched)
    data[i]     = Math.pow(data[i]     / 255, gamma) * 255; // R
    data[i + 1] = Math.pow(data[i + 1] / 255, gamma) * 255; // G
    data[i + 2] = Math.pow(data[i + 2] / 255, gamma) * 255; // B

    // Add the micro offset (imperceptible)
    data[i]     = Math.max(0, Math.min(255, data[i]     + mix));
    data[i + 1] = Math.max(0, Math.min(255, data[i + 1] + mix));
    data[i + 2] = Math.max(0, Math.min(255, data[i + 2] + mix));
  }

  ctx.putImageData(imageData, 0, 0);
}
```
Each invocation uses different random seeds, producing images that:
- Look identical to the human eye
- Have completely different file hashes (MD5/SHA)
- Have different perceptual hashes (pHash/dHash)
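The file-hash claim is easy to sanity-check without a crypto library: any avalanche-style hash changes completely when even one byte changes by 1, which is exactly what the micro noise guarantees. A toy demonstration with 32-bit FNV-1a (`hashBytes` is my helper, not part of the tool):

```javascript
// 32-bit FNV-1a over a byte array — enough to show that a single
// pixel value shifted by 1 yields a completely different digest.
function hashBytes(bytes) {
  let h = 0x811c9dc5;
  for (const b of bytes) {
    h ^= b;
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

const original = new Uint8Array([120, 64, 200, 255]); // one RGBA pixel
const nudged   = new Uint8Array([121, 64, 200, 255]); // red channel +1
console.log(hashBytes(original) !== hashBytes(nudged)); // → true
```

Real tools hash the encoded JPEG bytes rather than raw pixels, but the principle is the same: the encoder propagates the pixel change into the compressed stream, and the hash flips.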
Video Frame Extraction with Scene Detection
Another feature I built was extracting key frames from video. Instead of capturing every frame (which would give you thousands of near-identical images), I implemented scene change detection:
```javascript
async function detectScenes(video, threshold = 30) {
  const scenes = [];
  const canvas = document.createElement('canvas');
  const ctx = canvas.getContext('2d');

  // Downsample for performance (160×90)
  canvas.width = 160;
  canvas.height = 90;

  let prevFrame = null;
  const duration = video.duration;
  const interval = 0.5; // check every 0.5 seconds

  for (let sec = 0; sec < duration; sec += interval) {
    // Attach the handler before seeking so a fast seek can't fire before we listen
    await new Promise((resolve) => {
      video.onseeked = resolve;
      video.currentTime = sec;
    });
    ctx.drawImage(video, 0, 0, 160, 90);
    const frame = ctx.getImageData(0, 0, 160, 90).data;

    if (prevFrame) {
      let diff = 0;
      for (let i = 0; i < frame.length; i += 4) {
        diff += Math.abs(frame[i]     - prevFrame[i]);     // R
        diff += Math.abs(frame[i + 1] - prevFrame[i + 1]); // G
        diff += Math.abs(frame[i + 2] - prevFrame[i + 2]); // B
      }
      const avgDiff = diff / (160 * 90 * 3);
      if (avgDiff > threshold) {
        scenes.push(sec); // scene change detected
      }
    }
    prevFrame = new Uint8Array(frame);
  }
  return scenes;
}
```
The algorithm compares consecutive downscaled frames pixel-by-pixel. When the average difference exceeds a threshold, it marks a scene change. Then we capture at full resolution (1080p/4K) at those timestamps.
Performance Considerations
Processing images client-side has its challenges:
Memory management — Large images (4000×3000) consume significant memory. I process one image at a time and explicitly release ObjectURLs:
```javascript
URL.revokeObjectURL(objectUrl); // Free memory after use
```
Web Workers — For batch processing, offloading pixel manipulation to a Web Worker prevents UI freezing. The main thread handles only the Canvas API calls (which must run on the main thread due to DOM access).
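The split that makes this work is keeping the pixel loop free of DOM access, so the same function can run inside `onmessage`. A sketch of that factoring (`transformPixels` is illustrative, not the exact production code):

```javascript
// Worker-safe pixel transform: operates on a raw RGBA array plus the
// precomputed random parameters, and touches no DOM APIs.
function transformPixels(data, gamma, mix) {
  for (let i = 0; i < data.length; i += 4) {
    for (let c = 0; c < 3; c++) { // R, G, B; leave alpha alone
      const adjusted = Math.pow(data[i + c] / 255, gamma) * 255;
      data[i + c] = Math.max(0, Math.min(255, adjusted + mix));
    }
  }
  return data;
}

// Inside the worker script, something like:
// self.onmessage = (e) => {
//   const { buffer, gamma, mix } = e.data;
//   const out = transformPixels(new Uint8ClampedArray(buffer), gamma, mix);
//   self.postMessage(out.buffer, [out.buffer]); // transfer the buffer, don't copy it
// };
```

The main thread does `getImageData`, posts the underlying `ArrayBuffer` to the worker, and calls `putImageData` on the result; transferring (rather than structured-cloning) the buffer keeps large images cheap to pass around.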
JPEG quality — I settled on 92% quality as the sweet spot. Below 90%, compression artifacts become noticeable. Above 95%, file sizes balloon with no perceptible improvement.
The Architecture
The entire app is a single HTML file with inline JavaScript — no build framework, no npm dependencies (at runtime), no backend:
```text
index.html (~1500 lines)
├── HTML structure
├── Tailwind CSS (CDN)
├── i18n system (4 languages)
├── Image processing pipeline
├── Video scene detection
└── ZIP download (JSZip CDN)
```
This "zero-dependency" approach means:
- No server costs — Hosted as static files on Vercel
- Instant loading — No framework hydration delay
- Complete privacy — Impossible to leak data since there's no server to leak to
- Offline capable — Works without internet after initial load
Lessons Learned
Canvas toBlob() is your friend for metadata stripping. Don't try to parse and remove EXIF fields manually — just redraw the image.
Random seeds matter for unique file generation. Using `Math.random()` for gamma and noise values ensures every output is genuinely different.
Downsample for analysis, full-res for output. Scene detection at 160×90 is fast enough for real-time processing, but final captures should use the original video resolution.
Single-file architecture works for tools up to ~2000 lines. Beyond that, consider splitting — but don't split prematurely.
Privacy as a feature resonates strongly with users. "Your photos never leave your device" is a concrete, verifiable claim that builds trust.
Try It
PixelFresh is free, requires no sign-up, and works in any modern browser. The source is a single HTML file — you can literally view-source to verify that no data is sent anywhere.
If you're building browser-based tools, I'd love to hear about your approach to client-side processing. Drop a comment below!
This article was originally published on the PixelFresh blog.