Your PDFs contain sensitive data. Tax forms, contracts, medical records. Yet most online tools want you to upload them to mysterious servers who-knows-where.
I built Kreotar's PDF compressor to solve this exact paranoia. Everything happens in your browser. Here's exactly how you can implement the same architecture.
Step 1: The Architecture Decision
We use PDF-lib (client-side JS) combined with custom WASM modules for image compression. The key is handling everything in a Web Worker so the UI stays responsive during heavy processing.
// pdf-processor.worker.js
import { PDFDocument } from 'pdf-lib';
import { createImageCompressionWasm } from './wasm-image-compress';

self.onmessage = async (event) => {
  const { fileBuffer, quality = 0.7 } = event.data;
  try {
    const pdfDoc = await PDFDocument.load(fileBuffer);
    const pages = pdfDoc.getPages();
    let totalSaved = 0;

    // Process each page
    for (let i = 0; i < pages.length; i++) {
      const page = pages[i];

      // Extract images from the page (custom helper; pdf-lib has no
      // built-in image extraction)
      const images = await extractImagesFromPage(page);

      for (const image of images) {
        const originalSize = image.data.length;

        // Compress using WASM (mozjpeg compiled to WASM)
        const compressed = await createImageCompressionWasm({
          data: image.data,
          quality: quality * 100,
          format: 'jpeg'
        });

        totalSaved += (originalSize - compressed.length);

        // Replace the image in the PDF (custom helper)
        await replaceImageInPage(page, image.ref, compressed);
      }

      // Report progress after each page
      self.postMessage({
        type: 'progress',
        current: i + 1,
        total: pages.length
      });
    }

    const pdfBytes = await pdfDoc.save();
    self.postMessage({
      type: 'complete',
      result: pdfBytes,
      // Fraction of the original byte size that was saved
      // (fileBuffer is an ArrayBuffer, so use byteLength, not length)
      compressionRatio: totalSaved / fileBuffer.byteLength
    });
  } catch (error) {
    self.postMessage({ type: 'error', message: error.message });
  }
};
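The worker above leans on an `extractImagesFromPage` helper that isn't part of pdf-lib. As a rough sketch of what such a helper can do underneath: pdf-lib exposes the document's indirect objects, and image data lives in stream objects whose dictionary marks them with an `Image` subtype. The `enumerateObjects` parameter below is a hypothetical stand-in for that accessor, so the filtering logic is easy to follow in isolation:

```javascript
// Walk a PDF's indirect objects and keep the streams whose dictionary
// identifies them as images. `enumerateObjects` is a hypothetical
// accessor returning [ref, object] pairs; in a real pdf-lib setup you
// would adapt this to the library's low-level object model.
const collectImageStreams = (enumerateObjects) => {
  const images = [];
  for (const [ref, obj] of enumerateObjects()) {
    // Image streams carry Subtype = Image and raw stream contents
    if (obj.dict?.Subtype === 'Image' && obj.contents) {
      images.push({ ref, data: obj.contents });
    }
  }
  return images;
};
```

Keeping the filter as a pure function over an enumerator makes it trivial to unit-test without loading a real PDF.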
Step 2: The React Integration
Here's how to wire it up in your frontend:
import { useState, useRef, useCallback } from 'react';
import PdfWorker from './pdf-processor.worker?worker';

const PdfCompressor = () => {
  const [status, setStatus] = useState('idle');
  const [progress, setProgress] = useState(0);
  const [compressionStats, setCompressionStats] = useState(null);
  const workerRef = useRef(null);

  const processPdf = useCallback(async (file) => {
    setStatus('processing');
    setProgress(0);

    // Initialize worker
    const worker = new PdfWorker();
    workerRef.current = worker;

    // Read file as ArrayBuffer
    const arrayBuffer = await file.arrayBuffer();

    return new Promise((resolve, reject) => {
      worker.onmessage = (e) => {
        const { type, current, total, result, compressionRatio, message } = e.data;
        switch (type) {
          case 'progress':
            setProgress((current / total) * 100);
            break;
          case 'complete':
            setStatus('complete');
            setCompressionStats({
              originalSize: file.size,
              newSize: result.length,
              // compressionRatio is already the fraction saved
              ratio: compressionRatio * 100
            });
            // Create download blob
            const blob = new Blob([result], { type: 'application/pdf' });
            const url = URL.createObjectURL(blob);
            // Auto-download
            const a = document.createElement('a');
            a.href = url;
            a.download = `compressed-${file.name}`;
            a.click();
            URL.revokeObjectURL(url);
            resolve(result);
            worker.terminate();
            break;
          case 'error':
            setStatus('error');
            reject(new Error(message));
            worker.terminate();
            break;
        }
      };

      // Start processing
      worker.postMessage({
        fileBuffer: arrayBuffer,
        quality: 0.7 // Compression quality
      }, [arrayBuffer]); // Transfer ownership for performance
    });
  }, []);

  return (
    <div className="pdf-compressor">
      <input
        type="file"
        accept=".pdf"
        onChange={(e) => e.target.files?.[0] && processPdf(e.target.files[0])}
        disabled={status === 'processing'}
      />
      {status === 'processing' && (
        <div className="progress-bar">
          <div style={{ width: `${progress}%` }} />
          <span>{Math.round(progress)}% processed</span>
        </div>
      )}
      {compressionStats && (
        <div className="stats">
          <p>Original: {(compressionStats.originalSize / 1024).toFixed(2)} KB</p>
          <p>Saved: {compressionStats.ratio.toFixed(1)}%</p>
        </div>
      )}
    </div>
  );
};
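The stats panel above always reports kilobytes, which gets awkward for multi-megabyte PDFs. A small formatter (a hypothetical helper, not part of the component above) keeps the display readable at any size:

```javascript
// Format a byte count with an appropriate binary unit (B, KB, MB, GB).
const formatBytes = (bytes) => {
  const units = ['B', 'KB', 'MB', 'GB'];
  let value = bytes;
  let i = 0;
  // Step up one unit for every factor of 1024
  while (value >= 1024 && i < units.length - 1) {
    value /= 1024;
    i += 1;
  }
  // Whole numbers for bytes, two decimals for larger units
  return `${value.toFixed(i === 0 ? 0 : 2)} ${units[i]}`;
};
```

Swap it into the stats markup, e.g. `<p>Original: {formatBytes(compressionStats.originalSize)}</p>`.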
Step 3: Handling the Gotchas
Memory limits: A browser tab will crash once it exhausts its memory budget, often somewhere around 2 GB depending on the browser and platform. For large PDFs, I process pages in chunks:
// Handle large PDFs in chunks to avoid memory crashes
const processInChunks = async (pages, chunkSize = 5) => {
  const results = [];
  for (let i = 0; i < pages.length; i += chunkSize) {
    const chunk = pages.slice(i, i + chunkSize);

    // Yield between chunks. You can't force garbage collection from JS,
    // but pausing gives the engine a window to reclaim memory.
    if (i > 0) {
      await new Promise(resolve => setTimeout(resolve, 100));
    }

    const processed = await Promise.all(chunk.map(processPage));
    results.push(...processed);
  }
  return results;
};
Cross-origin isolation: If your WASM build relies on SharedArrayBuffer (e.g. for threading), the page must be cross-origin isolated, which means serving your own documents with:

Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp

And note that once COEP is set to require-corp, any WASM or script loaded from a CDN must itself be served with CORS headers or Cross-Origin-Resource-Policy: cross-origin, or the browser will refuse to load it.
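How you set these headers depends on your server or bundler. As one minimal sketch, assuming a Vite setup (adjust for your own tooling), the dev server can send them like this:

```javascript
// vite.config.js — serve the app cross-origin isolated so
// SharedArrayBuffer-backed WASM is allowed to run.
export default {
  server: {
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'require-corp',
    },
  },
};
```

In production you'd set the same two headers on whatever serves your HTML.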
The Privacy Win
Your file never leaves your device. Open the Network tab in DevTools while compressing: zero upload requests. That's the magic of client-side processing.
I made the mistake early on of trying to use serverless functions for this. The latency killed the UX. Plus, who wants to upload their tax documents to a random Lambda function?
Try it yourself: Kreotar PDF Compressor
What other PDF operations are you trying to run client-side? I might already have a tool for it on Kreotar.