How I built browser-only PDF tools with Astro, pdf-lib, and Web Workers

Sandeep Kottapalli — Wed, 01 Jul 2026 18:08:36 +0000

Most of the "online PDF tools" I could find had the same architecture: upload
your file, wait, download the result. Their privacy policies promised
they'd delete it after an hour. That's a promise. I wanted something
where the promise wasn't necessary — where the tool couldn't leak your
file even if the operator wanted it to.

So I built mergepdfs.in — four PDF tools
(merge, split, compress, reorder) that run entirely in the browser. No backend.
No uploads. Verifiable by turning off your wifi after loading the website.

Here's how it works.

## The core insight

Modern browsers can do everything a server would do for PDF manipulation.
File.arrayBuffer() reads the file into memory. pdf-lib merges/splits/
edits the PDF in that memory. A Blob URL downloads the result. At no
point does the file need to leave the tab.
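As a rough illustration of the last step, the sketch below wraps finished PDF bytes in a Blob and returns an object URL for them. `pdfDownloadUrl` is a hypothetical helper, not code from the site; it relies only on the standard `Blob` and `URL.createObjectURL` APIs.

```typescript
// Hypothetical helper: turn finished PDF bytes into a URL the browser can download.
function pdfDownloadUrl(pdfBytes: ArrayBuffer): string {
  // The MIME type tells the browser to treat the bytes as a PDF.
  const blob = new Blob([pdfBytes], { type: 'application/pdf' });
  // The returned "blob:" URL points at in-memory data; nothing is sent over the network.
  return URL.createObjectURL(blob);
}
```

In a page, that URL would typically be set as the `href` of a link with a `download` attribute, then released with `URL.revokeObjectURL` once the download has started so the bytes are not kept alive longer than needed.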

The reason most online PDF tools upload to servers isn't technical —
it's historical. When these tools were built, browsers couldn't handle
this. Now they can.

## The stack

- **Astro 4**: static generation, zero-JS pages by default
- **React**: only inside interactive tool "islands"
- **pdf-lib**: pure JavaScript, runs in the browser, no native deps
- **@dnd-kit**: drag-and-drop reordering that works on touch
- **Web Workers**: keep the UI responsive during heavy processing
- **Tailwind**: because life is short
- **Cloudflare Pages**: free static hosting, edge-cached

Astro is the interesting choice here. Most people would reach for
Next.js. But Next.js ships a JS runtime on every page by default,
including the /privacy and /about pages that don't need any JS at all.
Astro flips that: nothing ships JS unless you explicitly opt in. My
homepage weighs about 8KB gzipped. The tool pages ship JS only for the
tool itself.

## The merge flow (the interesting part)

Here's the simplified version of the merge logic:

```typescript
// src/lib/pdf/merge.ts
import { PDFDocument } from 'pdf-lib';

export async function mergePdfs(
  buffers: ArrayBuffer[],
  onProgress?: (done: number, total: number) => void
): Promise<Uint8Array> {
  // Start from an empty document and append every input's pages in order.
  const merged = await PDFDocument.create();

  for (let i = 0; i < buffers.length; i++) {
    const pdf = await PDFDocument.load(buffers[i]);
    // copyPages clones the pages into `merged`; addPage then appends them.
    const pages = await merged.copyPages(pdf, pdf.getPageIndices());
    pages.forEach((p) => merged.addPage(p));
    onProgress?.(i + 1, buffers.length);
  }

  // Serialize the merged document to bytes.
  return await merged.save();
}
```



That's it. No servers. No uploads. Pure browser memory in, merged bytes 
out.

## Why it runs in a Web Worker

The above works on the main thread for small files, but it breaks the UI 
for large ones: the tab freezes while pdf-lib parses a 100MB PDF, 
because that work blocks the one thread that also handles rendering and input.

The fix is a Web Worker:



```typescript
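// src/components/tools/merge/merge.worker.ts
// Three levels up: merge/ -> tools/ -> components/ -> src/
import { mergePdfs } from '../../../lib/pdf/merge';

self.onmessage = async (e: MessageEvent<{ buffers: ArrayBuffer[] }>) => {
  const { buffers } = e.data;
  try {
    // Forward per-file progress so the UI can update while the merge runs.
    const result = await mergePdfs(buffers, (done, total) => {
      self.postMessage({ type: 'progress', done, total });
    });
    self.postMessage({ type: 'result', bytes: result });
  } catch (err) {
    // A thrown value is not guaranteed to be an Error instance.
    const message = err instanceof Error ? err.message : String(err);
    self.postMessage({ type: 'error', message });
  }
};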
```



The main thread stays responsive. Progress updates come through as 
messages. Errors surface cleanly instead of crashing.
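On the receiving side, the three message shapes the worker posts can be modeled as a discriminated union. The sketch below is one way to do that, not the site's actual code: `WorkerMessage`, `MergeCallbacks`, and `handleWorkerMessage` are hypothetical names, while the `done`, `total`, `bytes`, and `message` fields are taken from the worker above.

```typescript
// Hypothetical types mirroring the messages posted by merge.worker.ts.
type WorkerMessage =
  | { type: 'progress'; done: number; total: number }
  | { type: 'result'; bytes: Uint8Array }
  | { type: 'error'; message: string };

// Hypothetical callbacks the UI layer would supply.
interface MergeCallbacks {
  onProgress(fraction: number): void;
  onResult(bytes: Uint8Array): void;
  onError(message: string): void;
}

// Route one worker message to the matching callback.
function handleWorkerMessage(msg: WorkerMessage, cb: MergeCallbacks): void {
  switch (msg.type) {
    case 'progress':
      // Guard against total === 0 so the UI never receives NaN.
      cb.onProgress(msg.total > 0 ? msg.done / msg.total : 0);
      break;
    case 'result':
      cb.onResult(msg.bytes);
      break;
    case 'error':
      cb.onError(msg.message);
      break;
  }
}
```

In the browser this could be wired up with `worker.onmessage = (e) => handleWorkerMessage(e.data, callbacks)`. Passing the input buffers as the transfer list, as in `worker.postMessage({ buffers }, buffers)`, moves them to the worker instead of copying them, at the cost of detaching them on the main thread.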

## The size problem

Browser memory has real limits, and they vary by device. My iPhone can 
handle ~150MB before it kills the tab. Desktop Chrome tolerates 500MB+. 
There's no reliable API to detect these limits before you hit them.

My solution: tiered warnings by total input size.



```typescript
// src/lib/limits.ts
export const SIZE_THRESHOLDS = {
  WARN_YELLOW: 50 * 1024 * 1024,    // 50 MB
  WARN_RED: 150 * 1024 * 1024,      // 150 MB
  BLOCK: 300 * 1024 * 1024,         // 300 MB
};
```



Users get a yellow notice above 50MB, a red confirmation above 150MB, 
and a hard block above 300MB with a suggestion to split the batch. 
This isn't perfect but it prevents the worst UX outcome (silent crash 
mid-processing).
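To make the tiers concrete, here is a minimal sketch of how a batch might be classified against those thresholds. `classifyBatch` and the `SizeTier` names are assumptions for illustration; only the three byte limits come from `limits.ts` above, and they are repeated here so the snippet stands alone.

```typescript
// Repeated from src/lib/limits.ts so this sketch is self-contained.
const SIZE_THRESHOLDS = {
  WARN_YELLOW: 50 * 1024 * 1024, // 50 MB
  WARN_RED: 150 * 1024 * 1024,   // 150 MB
  BLOCK: 300 * 1024 * 1024,      // 300 MB
};

// Hypothetical tier names; the real UI states may be called something else.
type SizeTier = 'ok' | 'yellow' | 'red' | 'block';

// Anything with a byte size works here, including browser File objects.
function classifyBatch(files: { size: number }[]): SizeTier {
  const total = files.reduce((sum, f) => sum + f.size, 0);
  // Tiers apply strictly above each limit, so exactly 50 MB is still 'ok'.
  if (total > SIZE_THRESHOLDS.BLOCK) return 'block';
  if (total > SIZE_THRESHOLDS.WARN_RED) return 'red';
  if (total > SIZE_THRESHOLDS.WARN_YELLOW) return 'yellow';
  return 'ok';
}
```

Checking the largest limit first keeps the function a simple chain of comparisons, and summing `size` up front means the check can run before any file is read into memory.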

## The verification trick

The most interesting part isn't the code. It's that users can verify 
the privacy claim themselves in 10 seconds:

1. Open mergepdfs.in and wait for the page to fully load
2. Turn off your wifi
3. Drop your PDFs in
4. Watch it merge

If it works offline, no network calls carry your file. This is 
verifiable in a way "we delete your files after 60 minutes" never can be.

You can also open DevTools → Network tab during a merge. It stays 
empty — the merge path uses only pdf-lib, which is bundled at build 
time, so there's nothing to fetch at runtime.

Small caveat for honesty: the split tool renders page thumbnails using 
PDF.js, which may fetch supporting resources (character maps, fonts) 
the first time you use it. Your file itself is never uploaded — but 
"zero network activity" is only strictly true for merge. This is one 
of those places where being honest about architecture matters more 
than making a cleaner marketing claim.

## What I got wrong the first time

- **I tried to render PDF previews with PDF.js on the main thread.** 
  Killed the UI for 20-page documents. Moved to a Worker.
- **I under-estimated memory pressure on mobile.** iOS Safari kills 
  tabs at ~200MB combined heap. Had to lower size limits significantly.
- **I initially reused ArrayBuffer references across merges.** Turns 
  out pdf-lib holds onto them internally, so my "clear on completion" 
  logic didn't actually free memory. Fix: `buffer = null` after each 
  file is copied, then yield to the event loop so the engine gets a 
  chance to collect it. (JavaScript can't force a GC run.)
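
The last fix in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the site's code: `drainBuffers` and `yieldToEventLoop` are hypothetical names, and `handle` stands in for whatever copies a file's pages into the merged document. Clearing the slot only removes this array's reference; the memory becomes reclaimable once nothing else still points at the buffer, and when it is actually reclaimed is up to the engine.

```typescript
// Resolve on a later task so pending UI work and garbage collection can run in between.
function yieldToEventLoop(): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Process buffers one at a time, dropping each reference once it has been handled.
async function drainBuffers(
  buffers: (ArrayBuffer | null)[],
  handle: (buffer: ArrayBuffer, index: number) => Promise<void>
): Promise<void> {
  for (let i = 0; i < buffers.length; i++) {
    if (buffers[i] === null) continue; // slot already released
    // Read the slot directly so no extra local variable holds the buffer.
    await handle(buffers[i] as ArrayBuffer, i);
    buffers[i] = null; // drop the array's reference to this file's bytes
    await yieldToEventLoop();
  }
}
```

The caller has to avoid keeping its own copies of the buffers elsewhere, otherwise clearing the array slots changes nothing.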

## What's next

Three tools live (merge, split, compress). Coming: rotate, PDF↔image 
conversion, page delete/reorder, watermark. Same architecture — 
everything runs client-side.

I might open source it. Haven't decided. In the meantime, happy to 
answer any questions about the architecture or the specific 
implementation choices below.

Try it: [mergepdfs.in](https://mergepdfs.in/)

DEV Community: Sandeep Kottapalli
