In this article, we'll explore how to implement a pure client-side PDF page removal tool that runs entirely in the browser. No server required, no file uploads, complete privacy protection.
Why Browser-Based PDF Processing?
Traditional PDF processing typically requires:
- Uploading files to a server
- Processing on the backend
- Downloading the result
This approach has significant drawbacks:
- Privacy concerns - Your sensitive documents are sent to third-party servers
- Network dependency - Requires a stable internet connection
- Latency - Upload and download times for large files
- Server costs - Backend infrastructure required
Browser-based processing solves all these issues:
- ✅ Files never leave your computer
- ✅ Works offline after initial load
- ✅ Processing starts immediately, with no upload or download wait
- ✅ Zero server costs for PDF operations
Architecture Overview
Our solution combines three powerful web technologies:
- WebAssembly (WASM) - Running QPDF (a powerful PDF manipulation library) compiled to WASM
- Web Workers - Offloading heavy PDF operations to a background thread
- Comlink - Making worker communication as simple as async function calls
Core Data Structures
PageRange Type
The fundamental data structure for specifying which pages to remove:
// types/pdfdata.ts
export type PageRange = [number, number];
Each PageRange is a tuple where:
- Index 0: Start page number (inclusive)
- Index 1: End page number (inclusive)
- Single pages are represented as [n, n]
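To make the tuple convention concrete, here is a minimal sketch of two helpers around PageRange. The names singlePage and isValidPageRange are hypothetical and do not come from the project; they only illustrate the rules listed above, plus a guard against the NaN values that parseInt can produce.

```typescript
type PageRange = [number, number];

// Hypothetical helper: a single page n is stored as the range [n, n].
const singlePage = (n: number): PageRange => [n, n];

// Hypothetical helper: both ends must be integers, 1-based, and ordered.
// NaN fails Number.isInteger, so unparsed input is rejected here too.
function isValidPageRange([start, end]: PageRange): boolean {
  return (
    Number.isInteger(start) &&
    Number.isInteger(end) &&
    start >= 1 &&
    start <= end
  );
}
```

Checking ranges before they reach the worker keeps malformed input from silently turning into an empty loop later on.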
WorkerFunctions Interface
The contract between main thread and worker:
// hooks/useqpdf.ts
interface WorkerFunctions {
  init: () => Promise<void>;
  remove: (file: File, ...range: PageRange[]) => Promise<ArrayBuffer | null>;
  // ... other operations
}
Implementation Deep Dive
1. User Interface Layer
The UI component handles user input and triggers the removal process:
// app/[locale]/_components/qpdf/remove.tsx
export const Organize = () => {
  const [files, setFiles] = useState<File[]>([]);
  const { value: pages, onChange: onChangePages } =
    useInputValue<string>("1-z");
  const { remove } = useQpdf();

  const mergeInMain = async () => {
    // Parse user input: "1-3,5,10-z" → [[1,3], [5,5], [10,10000]]
    // parseInt("z") is NaN, so "z" is mapped to the 10000 sentinel explicitly.
    const toPage = (s: string) =>
      s.trim() === "z" ? 10000 : parseInt(s, 10);
    const removePages = pages
      .replaceAll("，", ",") // Support Chinese comma
      .split(",")
      .map((e) => {
        if (e.includes("-")) {
          const t = e.split("-");
          return [toPage(t[0]!), toPage(t[1]!)] as PageRange;
        }
        return [toPage(e), toPage(e)] as PageRange;
      });
    const outputFile = await remove(files[0]!, ...removePages);
    if (outputFile) {
      autoDownloadBlob(new Blob([outputFile]), "organize.pdf");
    }
  };
  // ...
};
Key features of the input format:
- Comma-separated ranges: 1-3,5,10-z
- Single pages: 5 becomes [5, 5]
- Ranges: 1-3 becomes [1, 3]
- Special character z represents the last page
- Supports the Chinese comma (，) for localization
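The input rules above can be sketched as a standalone parser that also rejects malformed input instead of passing NaN along. This is an illustrative version, not the project's code: parsePageRanges, parsePageToken, and LAST_PAGE_SENTINEL are hypothetical names, and the value 10000 is assumed from the upper bound the article uses for "z".

```typescript
type PageRange = [number, number];

// Assumed sentinel for "z" (last page); 10000 mirrors the article's upper bound.
const LAST_PAGE_SENTINEL = 10000;

// Parse one token such as "5" or "z" into a page number, or throw.
function parsePageToken(token: string): number {
  const t = token.trim().toLowerCase();
  if (t === "z") return LAST_PAGE_SENTINEL;
  if (!/^\d+$/.test(t) || Number(t) < 1) {
    throw new Error(`Invalid page number: "${token}"`);
  }
  return Number(t);
}

// Parse a spec such as "1-3,5,10-z" into PageRange tuples.
function parsePageRanges(input: string): PageRange[] {
  return input
    .split(/[,，]/) // accept both the ASCII and the full-width comma
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part): PageRange => {
      const pieces = part.split("-");
      if (pieces.length > 2) throw new Error(`Invalid range: "${part}"`);
      const start = parsePageToken(pieces[0] ?? "");
      // A single page "5" has no second piece, so it becomes [5, 5].
      const end = parsePageToken(pieces[1] ?? pieces[0] ?? "");
      if (start > end) throw new Error(`Range start exceeds end: "${part}"`);
      return [start, end];
    });
}
```

Throwing on bad tokens lets the UI show a validation message before any PDF work starts.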
2. Worker Management with Comlink
The useQpdf hook manages the Web Worker lifecycle:
// hooks/useqpdf.ts
export const useQpdf = () => {
  const workerRef = useRef<Comlink.Remote<WorkerFunctions> | null>(null);

  useEffect(() => {
    if (workerRef.current) return;
    const worker = new PdfWorker();
    worker.onerror = (error) => {
      console.error("Worker error:", error);
    };
    const remote = Comlink.wrap<WorkerFunctions>(worker);
    workerRef.current = remote;
    remote.init().catch((error) => {
      console.error("QPDF init failed:", error);
    });
    // The cleanup must be returned from the effect itself; a function
    // returned from an inner async helper is never seen by React.
    return () => {
      worker.terminate();
      workerRef.current = null;
    };
  }, []);

  const remove = async (
    file: File,
    ...range: PageRange[]
  ): Promise<ArrayBuffer | null> => {
    if (!workerRef.current) return null;
    return workerRef.current.remove(file, ...range);
  };

  return { remove };
};
Why Comlink?
- Eliminates manual postMessage boilerplate
- Provides type-safe function calls
- Handles serialization automatically
- Makes worker code look like regular async functions
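For comparison, here is a rough sketch of the request/response bookkeeping you would otherwise write by hand on top of postMessage. It is not Comlink's implementation. PortLike, RpcRequest, RpcResponse, and createRpcClient are hypothetical names, and PortLike is only a simplified stand-in modeled on the postMessage/onmessage pair of a worker.

```typescript
// Hypothetical, simplified stand-in for the messaging surface of a worker.
interface PortLike {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: unknown }) => void) | null;
}

interface RpcRequest { id: number; method: string; args: unknown[] }
interface RpcResponse { id: number; result?: unknown; error?: string }

// Match each response to its pending call by id.
function createRpcClient(port: PortLike) {
  let nextId = 0;
  const pending = new Map<
    number,
    { resolve: (value: unknown) => void; reject: (error: Error) => void }
  >();

  port.onmessage = (event) => {
    const { id, result, error } = event.data as RpcResponse;
    const entry = pending.get(id);
    if (!entry) return; // unknown or already settled call
    pending.delete(id);
    if (error !== undefined) entry.reject(new Error(error));
    else entry.resolve(result);
  };

  return {
    // Send a request and return a promise settled by the matching response.
    call(method: string, ...args: unknown[]): Promise<unknown> {
      const id = nextId++;
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        const request: RpcRequest = { id, method, args };
        port.postMessage(request);
      });
    },
    pendingCount: (): number => pending.size,
  };
}
```

Note that every call here returns Promise<unknown> and method names are plain strings, which is exactly the loss of type safety that a typed wrapper avoids.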
3. The Range Inversion Algorithm
QPDF's --pages flag specifies which pages to keep, not which to remove. So we need to invert the user's "remove" ranges into "keep" ranges:
// hooks/pdf.worker.js
function removeRanges(mainRange, ...excludeRanges) {
  const [start, end] = mainRange;
  const excludeSet = new Set();
  // Collect all pages to exclude
  excludeRanges.forEach(([s, e]) => {
    for (let i = s; i <= e; i++) {
      excludeSet.add(i);
    }
  });
  // Collect remaining pages
  const remaining = [];
  for (let i = start; i <= end; i++) {
    if (!excludeSet.has(i)) {
      remaining.push(i);
    }
  }
  // Convert consecutive numbers to compact ranges
  const result = [];
  if (remaining.length === 0) return result;
  let currentStart = remaining[0];
  let currentEnd = remaining[0];
  for (let i = 1; i < remaining.length; i++) {
    if (remaining[i] === currentEnd + 1) {
      currentEnd = remaining[i];
    } else {
      result.push(
        currentStart === currentEnd
          ? [currentStart]
          : [currentStart, currentEnd]
      );
      currentStart = remaining[i];
      currentEnd = remaining[i];
    }
  }
  result.push(
    currentStart === currentEnd ? [currentStart] : [currentStart, currentEnd]
  );
  return result;
}
Example transformation:
- Input: Remove [1,3], [5,5], [10,10000] from a document with 100 pages
- Process: Exclude pages 1-3, 5, 10-100 → Remaining: 4, 6-9
- Output: [[4], [6,9]] → formatted as "4,6-9"
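The same inversion can also be done with a sort-and-sweep over the ranges, without first expanding every page into a set. The sketch below is an alternative technique, not the worker's removeRanges. The name invertRanges is hypothetical, it takes the last page number directly, and it returns single pages as [n, n] rather than [n].

```typescript
type PageRange = [number, number];

// Invert "remove" ranges into "keep" ranges over pages 1..lastPage.
function invertRanges(lastPage: number, remove: PageRange[]): PageRange[] {
  // Sort by start page so one left-to-right sweep is enough.
  const sorted = [...remove].sort((a, b) => a[0] - b[0]);
  const keep: PageRange[] = [];
  let next = 1; // lowest page not yet covered by a removed range
  for (const [start, end] of sorted) {
    if (start > lastPage) break; // remaining ranges lie past the document
    if (start > next) keep.push([next, start - 1]); // gap before this range
    next = Math.max(next, end + 1); // max() handles overlapping ranges
  }
  if (next <= lastPage) keep.push([next, lastPage]); // tail after last removal
  return keep;
}

// Reproduces the example above for a 100-page document.
const kept = invertRanges(100, [[1, 3], [5, 5], [10, 10000]]);
console.log(JSON.stringify(kept)); // [[4,4],[6,9]]
```

The work here grows with the number of ranges rather than the number of pages, which matters little at a 10000-page bound but keeps the logic independent of that constant.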
4. QPDF WASM Execution
The core PDF processing happens in the Web Worker using QPDF compiled to WebAssembly:
// hooks/pdf.worker.js
async remove(file, ...range) {
  const LAST_PAGE = 10000; // sentinel upper bound standing in for the last page
  // Convert File to ArrayBuffer
  const arrayBuffer = await file.arrayBuffer();
  const uint8Array = new Uint8Array(arrayBuffer);
  // Write to QPDF's virtual filesystem
  qpdf.FS.writeFile("/input.pdf", uint8Array);
  // Calculate pages to KEEP (inverse of pages to remove)
  const result = removeRanges([1, LAST_PAGE], ...range);
  // Every page was removed: there is nothing left to keep
  if (result.length === 0) return null;
  // Use 'z' for the last page, but only when the final kept range really
  // reaches the sentinel. Otherwise removed trailing pages would come back.
  const last = result[result.length - 1];
  if (last[last.length - 1] === LAST_PAGE) {
    last[last.length - 1] = "z";
  }
  const resultstr = result.map((e) => {
    if (e.length === 1) return String(e[0]);
    return e[0] + "-" + e[1];
  });
  // Build QPDF command
  const params = [
    "/input.pdf",
    "--pages",
    "/input.pdf",
    resultstr.join(","), // Pages to KEEP
    "--",
    "/output.pdf",
  ];
  // Execute QPDF
  qpdf.callMain(params);
  // Read output from virtual filesystem
  const outputFile = qpdf.FS.readFile("/output.pdf");
  return outputFile;
}
QPDF Command Example:
# To remove pages 1-3 and 5 from a 100-page document:
# We need to keep pages 4, 6-100
qpdf input.pdf --pages input.pdf 4,6-z -- output.pdf
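The step from "keep" ranges to that command line can be sketched as two small pure functions. formatKeepSpec and buildRemoveArgs are hypothetical names, not part of the project, and the sketch assumes the caller knows which page number stands for the last page (either the real page count or a sentinel).

```typescript
type PageRange = [number, number];

// Render keep-ranges as a qpdf page spec, e.g. [[4,4],[6,100]] -> "4,6-z".
// Any page at or beyond lastPage is written as "z" (qpdf's last-page marker).
function formatKeepSpec(keep: PageRange[], lastPage: number): string {
  const pageText = (n: number): string => (n >= lastPage ? "z" : String(n));
  return keep
    .map(([start, end]) =>
      start === end ? pageText(start) : `${pageText(start)}-${pageText(end)}`
    )
    .join(",");
}

// Build the argument list in the same order as the command shown above.
function buildRemoveArgs(keepSpec: string): string[] {
  // Design choice in this sketch: refuse to run when no pages would remain.
  if (keepSpec === "") throw new Error("No pages left to keep");
  return ["/input.pdf", "--pages", "/input.pdf", keepSpec, "--", "/output.pdf"];
}

console.log(buildRemoveArgs(formatKeepSpec([[4, 4], [6, 100]], 100)).join(" "));
// /input.pdf --pages /input.pdf 4,6-z -- /output.pdf
```

Keeping the string building separate from the WASM call makes this part easy to unit test without loading QPDF.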
5. WASM Initialization
The QPDF WASM module is initialized with Emscripten's virtual filesystem:
// lib/qpdfwasm.js
import createModule from "@neslinesli93/qpdf-wasm";

const f = async () => {
  const qpdf = await createModule({
    locateFile: () => "/qpdf.wasm",
    noInitialRun: true, // Don't run main() immediately
    preRun: [(module) => {
      if (module.FS) {
        // Filesystem is ready
      }
    }],
  });
  return qpdf;
};
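Complete Processing Flow
- The user selects a PDF and enters the pages to remove, for example 1-3,5,10-z
- The UI parses the input into PageRange tuples and calls remove() from the useQpdf hook
- Comlink forwards the call to the Web Worker
- The worker writes the file to the virtual filesystem and inverts the "remove" ranges into "keep" ranges
- QPDF runs with --pages and writes /output.pdf
- The worker returns the output bytes and the main thread triggers the download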
Key Technical Decisions
1. Why QPDF?
QPDF is a powerful command-line tool for PDF manipulation. By compiling it to WASM:
- We get battle-tested PDF processing logic
- Supports complex operations (merge, split, rotate, encrypt)
- Handles edge cases and malformed PDFs well
2. Why Web Workers?
PDF processing can be CPU-intensive:
- Parsing large PDFs
- Rebuilding document structure
- Writing output files
Running in a Web Worker:
- Prevents UI freezing
- Keeps the main thread free to render and handle input during processing
- Provides true parallelism on multi-core systems
3. Virtual File System
Emscripten provides an in-memory filesystem:
- No actual disk access needed
- Fast read/write operations
- Automatic cleanup when worker terminates
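Within a long-lived worker, files written to the in-memory filesystem stay there until they are removed, so it can help to delete them after each run. The sketch below assumes the module's FS object offers writeFile, readFile, and unlink, as Emscripten's FS API does. VirtualFS and runWithTempFiles are hypothetical names used only for illustration.

```typescript
// Assumed minimal subset of the Emscripten FS API used by this sketch.
interface VirtualFS {
  writeFile(path: string, data: Uint8Array): void;
  readFile(path: string): Uint8Array;
  unlink(path: string): void;
}

// Write the input, run the operation, read the output, then always clean up.
function runWithTempFiles(
  fs: VirtualFS,
  input: Uint8Array,
  run: () => void
): Uint8Array {
  fs.writeFile("/input.pdf", input);
  try {
    run(); // e.g. () => qpdf.callMain(params) in the worker
    return fs.readFile("/output.pdf");
  } finally {
    for (const path of ["/input.pdf", "/output.pdf"]) {
      try {
        fs.unlink(path);
      } catch {
        // The output file may not exist if run() failed; ignore that case.
      }
    }
  }
}
```

The finally block runs after the return value has been read, so the caller still receives the output bytes while the virtual files are gone before the next request.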
File Download Utility
After processing, we trigger the browser download:
// utils/pdf.ts
export function autoDownloadBlob(blob: Blob, filename: string) {
const blobUrl = URL.createObjectURL(blob);
const downloadLink = document.createElement("a");
downloadLink.href = blobUrl;
downloadLink.download = filename;
downloadLink.style.display = "none";
document.body.appendChild(downloadLink);
downloadLink.click();
document.body.removeChild(downloadLink);
URL.revokeObjectURL(blobUrl);
}
Benefits of This Architecture
- Privacy First: Files never leave the browser
- Performance: Near-native speed with WASM
- Responsive UI: Web Workers prevent blocking
- Type Safety: TypeScript + Comlink = type-safe worker communication
- Maintainability: Clean separation of concerns
Try It Yourself
Want to remove pages from your PDF without uploading anything to a server? Try our free browser-based PDF tool:
All processing happens locally in your browser - your files never leave your computer!
Conclusion
Building a browser-based PDF processing tool demonstrates the power of modern web technologies. By combining WebAssembly, Web Workers, and Comlink, we can perform complex PDF operations entirely client-side while maintaining a responsive user interface.
This approach is ideal for:
- Privacy-sensitive documents
- Offline-capable applications
- Reducing server costs
- Improving user experience by removing upload and download waits
The complete source code demonstrates production-ready patterns for WASM integration in React applications.

