DEV Community

sunshey

How to Split PDF Files in the Browser (No Server Required)

In the last post, I covered merging PDFs in the browser using pdf-lib. Today we're doing the opposite: splitting them.

Splitting is surprisingly common:

  • A 200-page report where you only need chapters 3 and 7
  • A scanned contract where the signature page needs to go to legal
  • A PDF that's too large to email, so you split it into two attachments
  • Extracting specific pages from a bulk download

Just like merging, we'll do this entirely in the browser — no server uploads, no privacy trade-offs.


Why Browser-Side Splitting Matters

The same privacy argument applies. When you upload a PDF to an online splitter, your file leaves your device. For documents containing personal data, financial records, or client contracts, that's a risk you don't need to take.

Browser-side splitting keeps everything local. The PDF loads into your browser tab, pdf-lib manipulates it in memory, and you download the result. The server never sees the content.

The trade-off: browser memory. A 300MB scanned document will strain a tab. For typical office documents (up to roughly 50-100MB), splitting takes a few seconds at most.
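
One way to act on that trade-off is to check the file size before loading anything. This is a minimal sketch: the function name, the verdict strings, and both thresholds are assumptions for illustration, with the numbers borrowed loosely from the figures above. Real limits depend on the device and browser.

```javascript
const MB = 1024 * 1024;

// Hypothetical helper: classify a File (or any object with a numeric `size`)
// before handing it to pdf-lib. Thresholds are illustrative defaults only.
function assessFileSize(file, { warnAt = 100 * MB, refuseAt = 300 * MB } = {}) {
  if (file.size >= refuseAt) return 'too-large'; // likely to strain the tab
  if (file.size >= warnAt) return 'warn';        // may work, but warn the user first
  return 'ok';
}
```

A UI could show a warning for 'warn' and suggest a desktop or server-side tool for 'too-large' instead of starting a load that may crash the tab.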


Basic Split: Extract a Single Page Range

The simplest use case: "Give me pages 5 through 10."

import { PDFDocument } from 'pdf-lib';

async function extractPages(file, startPage, endPage) {
  // Load the source PDF
  const arrayBuffer = await file.arrayBuffer();
  const sourcePdf = await PDFDocument.load(arrayBuffer);

  // Create a new empty PDF
  const newPdf = await PDFDocument.create();

  // Convert to 0-based indices
  const pageIndices = [];
  for (let i = startPage - 1; i < endPage; i++) {
    pageIndices.push(i);
  }

  // Copy pages from source to new document
  const copiedPages = await newPdf.copyPages(sourcePdf, pageIndices);
  copiedPages.forEach(page => newPdf.addPage(page));

  // Save and return
  const pdfBytes = await newPdf.save();
  return new Blob([pdfBytes], { type: 'application/pdf' });
}

// Usage: extract pages 5-10
const blob = await extractPages(file, 5, 10);
downloadBlob(blob, 'pages-5-to-10.pdf');

What's happening:

  1. Load the full source PDF into memory
  2. Create a blank destination PDF
  3. Copy only the requested pages using copyPages()
  4. Save the new, smaller document

The source PDF isn't modified — we're creating a new document containing only the pages we want.
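
extractPages() trusts its caller: it assumes startPage and endPage are integers, 1-based, in order, and inside the document. With user input that is rarely guaranteed. The sketch below is one way to validate the range before building the index list. The helper name pageRangeToIndices and its error messages are made up for this example; pageCount would come from sourcePdf.getPageCount().

```javascript
// Hypothetical helper: turn a 1-based inclusive page range into 0-based indices,
// rejecting ranges that do not fit the document.
function pageRangeToIndices(startPage, endPage, pageCount) {
  if (!Number.isInteger(startPage) || !Number.isInteger(endPage)) {
    throw new TypeError('Page numbers must be integers');
  }
  if (startPage < 1 || endPage > pageCount || startPage > endPage) {
    throw new RangeError(
      `Invalid range ${startPage}-${endPage} for a ${pageCount}-page document`
    );
  }
  // Pages 5..10 become indices 4..9
  return Array.from({ length: endPage - startPage + 1 }, (_, k) => startPage - 1 + k);
}
```

Failing early with a clear message is easier to surface in a UI than whatever error appears later, deep inside the copy step.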


Split by Page Ranges: One Input, Multiple Outputs

Real-world scenario: a user uploads a 50-page document and wants three separate files — pages 1-10, 11-30, and 31-50.

async function splitByRanges(file, ranges) {
  // ranges = [{ name: 'intro', start: 1, end: 10 }, ...]
  const arrayBuffer = await file.arrayBuffer();
  const sourcePdf = await PDFDocument.load(arrayBuffer);

  const results = [];

  for (const range of ranges) {
    const newPdf = await PDFDocument.create();

    const pageIndices = [];
    for (let i = range.start - 1; i < range.end; i++) {
      pageIndices.push(i);
    }

    const copiedPages = await newPdf.copyPages(sourcePdf, pageIndices);
    copiedPages.forEach(page => newPdf.addPage(page));

    const pdfBytes = await newPdf.save();
    results.push({
      name: range.name,
      blob: new Blob([pdfBytes], { type: 'application/pdf' })
    });
  }

  return results;
}

// Usage
const ranges = [
  { name: 'chapter-1', start: 1, end: 10 },
  { name: 'chapter-2', start: 11, end: 30 },
  { name: 'chapter-3', start: 31, end: 50 }
];

const files = await splitByRanges(pdfFile, ranges);
files.forEach(f => downloadBlob(f.blob, `${f.name}.pdf`));

Memory note: We load the source PDF once and create each output from it. The source stays in memory for the whole run, and every output adds its own copy of the pages it holds, so peak usage is roughly the input plus the outputs produced so far, not the outputs alone.
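
The usage snippets call downloadBlob(), which this post never defines. Here is a minimal sketch of what such a helper could look like in a standard browser environment. The one-second revoke delay is a guess, not a documented requirement. Revoking the object URL lets the browser release the blob, which matters when several outputs are downloaded in a row.

```javascript
// Assumption: one second is enough for the browser to start the download.
const REVOKE_DELAY_MS = 1000;

// Sketch of the downloadBlob helper used in the examples above.
function downloadBlob(blob, filename) {
  const url = URL.createObjectURL(blob);

  // A temporary <a download> element triggers the browser's save dialog
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();

  // Release the blob URL once the download has had time to start
  setTimeout(() => URL.revokeObjectURL(url), REVOKE_DELAY_MS);
}
```

Some browsers ask for permission when a page triggers several downloads in quick succession, so for many outputs it may be friendlier to offer one link per file.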


Advanced: Split by File Size (Email-Friendly Chunks)

Here's a trickier problem: "Split this PDF into chunks under 20MB each so I can email it."

Unlike page-range splitting, we don't know in advance how many pages fit in each chunk. We need to accumulate pages until adding one more would exceed the limit, then start a new chunk.

async function splitBySize(file, maxSizeBytes = 20 * 1024 * 1024) {
  const arrayBuffer = await file.arrayBuffer();
  const sourcePdf = await PDFDocument.load(arrayBuffer);
  const totalPages = sourcePdf.getPageCount();

  const chunks = [];
  let currentStartPage = 1; // 1-based page number where the current chunk begins

  for (let i = 0; i < totalPages; i++) {
    // Try adding this page to the current chunk
    const testChunk = await PDFDocument.create();

    // Copy all pages currently in the chunk plus the new one
    const pagesToCopy = [];
    for (let j = currentStartPage - 1; j <= i; j++) {
      pagesToCopy.push(j);
    }

    const copiedPages = await testChunk.copyPages(sourcePdf, pagesToCopy);
    copiedPages.forEach(page => testChunk.addPage(page));

    const testBytes = await testChunk.save();

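    // currentStartPage is 1-based and i is 0-based, so `currentStartPage <= i`
    // means the chunk already holds at least one page before page i
    if (testBytes.length > maxSizeBytes && currentStartPage <= i) {
      // Page i would push us over the limit. Finalize the chunk without it.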
      const finalChunk = await PDFDocument.create();
      const finalPages = [];
      for (let j = currentStartPage - 1; j < i; j++) {
        finalPages.push(j);
      }
      const finalCopied = await finalChunk.copyPages(sourcePdf, finalPages);
      finalCopied.forEach(page => finalChunk.addPage(page));

      const finalBytes = await finalChunk.save();
      chunks.push(new Blob([finalBytes], { type: 'application/pdf' }));

      // Start new chunk with current page
      currentStartPage = i + 1;
    }
  }

  // Don't forget the last chunk
  if (currentStartPage <= totalPages) {
    const finalChunk = await PDFDocument.create();
    const finalPages = [];
    for (let j = currentStartPage - 1; j < totalPages; j++) {
      finalPages.push(j);
    }
    const finalCopied = await finalChunk.copyPages(sourcePdf, finalPages);
    finalCopied.forEach(page => finalChunk.addPage(page));

    const finalBytes = await finalChunk.save();
    chunks.push(new Blob([finalBytes], { type: 'application/pdf' }));
  }

  return chunks;
}

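Why this is expensive: We build and save a test document for every page to check the size. For a 100-page document, that's at least 100 save operations, and each test re-copies every page already in the chunk, so the work grows faster than the page count. In practice, I optimize this by: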

  1. Estimating page size from the first few pages
  2. Using binary search instead of linear checking
  3. Adding a small buffer (aim for 18MB instead of exactly 20MB)
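
The binary-search idea from the list can be sketched without touching pdf-lib directly. In the code below, measure(startIndex, endIndex) is a hypothetical async callback that builds a PDF from the 0-based pages startIndex up to (but not including) endIndex and resolves to its saved size in bytes. Both function names are invented for this example. The search assumes that adding pages never makes the file smaller.

```javascript
// Find the largest exclusive end index such that pages [startIndex, end) fit in maxBytes.
async function findChunkEnd(startIndex, totalPages, maxBytes, measure) {
  let lo = startIndex + 1;   // smallest candidate: a single page
  let hi = totalPages;       // largest candidate: every remaining page
  let best = startIndex + 1; // always take at least one page, even an oversized one

  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const size = await measure(startIndex, mid);
    if (size <= maxBytes) {
      best = mid;   // fits, so try to take more pages
      lo = mid + 1;
    } else {
      hi = mid - 1; // too big, so try fewer pages
    }
  }
  return best;
}

// Plan every chunk as a 1-based inclusive range, the shape splitByRanges() expects.
async function planChunks(totalPages, maxBytes, measure) {
  const ranges = [];
  let start = 0;
  while (start < totalPages) {
    const end = await findChunkEnd(start, totalPages, maxBytes, measure);
    ranges.push({ name: `part-${ranges.length + 1}`, start: start + 1, end });
    start = end;
  }
  return ranges;
}
```

A real measure() would copy the pages into a fresh document and return the length of the saved bytes. Each chunk then costs on the order of log2(remaining pages) saves instead of one save per page, and passing 18MB as maxBytes covers the buffer from item 3.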

For production, a simpler heuristic works well: if the original is 45MB and 50 pages, each page averages roughly 0.9MB. Splitting every 20 pages gives chunks of about 18MB, which leaves headroom under the 20MB limit when some pages are heavier than average.
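
That average-size heuristic comes down to a few lines of arithmetic. In this sketch the function name and the part-N naming are assumptions; targetBytes is the buffered goal (for example 18MB for a 20MB limit), and the output uses the same range shape as splitByRanges().

```javascript
// Plan fixed-size chunks from the average page size. No test saves are needed,
// but chunks can overshoot if pages vary a lot in size.
function planChunksByAverage(totalBytes, totalPages, targetBytes) {
  // pages per chunk = target / (totalBytes / totalPages), rearranged to
  // multiply first so whole-number inputs divide cleanly
  const pagesPerChunk = Math.max(
    1,
    Math.floor((targetBytes * totalPages) / totalBytes)
  );

  const ranges = [];
  for (let start = 1; start <= totalPages; start += pagesPerChunk) {
    ranges.push({
      name: `part-${ranges.length + 1}`,
      start,
      end: Math.min(start + pagesPerChunk - 1, totalPages),
    });
  }
  return ranges;
}

// A 45MB, 50-page file with an 18MB target
const MB = 1024 * 1024;
const plan = planChunksByAverage(45 * MB, 50, 18 * MB);
// plan covers pages 1-20, 21-40, and 41-50
```

Because scanned pages can differ widely in size, it is worth checking each saved chunk's byte length afterwards and re-splitting any that still exceed the limit.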


Extracting Every Nth Page

Another common pattern: "I only need the odd pages" or "Extract every 5th page for a summary."

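async function extractEveryNthPage(file, n, offset = 0) {
  // offset is a 0-based page index; n must be a positive integer,
  // otherwise the loop below would never advance
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError('n must be a positive integer');
  }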
  const arrayBuffer = await file.arrayBuffer();
  const sourcePdf = await PDFDocument.load(arrayBuffer);
  const totalPages = sourcePdf.getPageCount();

  const newPdf = await PDFDocument.create();
  const pageIndices = [];

  for (let i = offset; i < totalPages; i += n) {
    pageIndices.push(i);
  }

  const copiedPages = await newPdf.copyPages(sourcePdf, pageIndices);
  copiedPages.forEach(page => newPdf.addPage(page));

  const pdfBytes = await newPdf.save();
  return new Blob([pdfBytes], { type: 'application/pdf' });
}

// Extract odd pages only
const oddPages = await extractEveryNthPage(file, 2, 0);

// Extract even pages only
const evenPages = await extractEveryNthPage(file, 2, 1);

Handling Page Rotation and Metadata

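When you split a PDF, copyPages() carries each page's rotation over with the page, so rotation usually needs no extra work. Document metadata is different: a new PDFDocument does not inherit the source's title or author, so you might want to preserve or modify it: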

async function splitWithMetadata(file, startPage, endPage) {
  const arrayBuffer = await file.arrayBuffer();
  const sourcePdf = await PDFDocument.load(arrayBuffer);
  const newPdf = await PDFDocument.create();

  // Copy pages
  const pageIndices = [];
  for (let i = startPage - 1; i < endPage; i++) {
    pageIndices.push(i);
  }
  const copiedPages = await newPdf.copyPages(sourcePdf, pageIndices);
  copiedPages.forEach(page => newPdf.addPage(page));

  // Preserve the author; the other fields are set fresh for the extract
  const author = sourcePdf.getAuthor();

  newPdf.setTitle(`Extracted Pages ${startPage}-${endPage}`);
  newPdf.setAuthor(author || 'Unknown');
  newPdf.setCreator('sotool PDF Splitter');
  newPdf.setProducer('pdf-lib');
  newPdf.setCreationDate(new Date());

  const pdfBytes = await newPdf.save();
  return new Blob([pdfBytes], { type: 'application/pdf' });
}

This is useful when the recipient needs to know where the extract came from.
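
Rotation can be adjusted on the copied pages before saving, for example to straighten sideways scans. The sketch below is hedged: rotatePages and normalizeAngle are names invented here, and the degrees argument is expected to be pdf-lib's degrees helper, passed in so the function stays easy to test. It relies on pdf-lib pages exposing getRotation() (with an angle in degrees) and setRotation().

```javascript
// Keep an angle within 0-359 degrees, including for negative input
function normalizeAngle(angle) {
  return ((angle % 360) + 360) % 360;
}

// Rotate each page by `delta` degrees relative to its current rotation.
// `degrees` is assumed to be the helper exported by pdf-lib.
function rotatePages(pages, delta, degrees) {
  if (!Number.isInteger(delta) || delta % 90 !== 0) {
    throw new RangeError('Rotation must be a multiple of 90 degrees');
  }
  for (const page of pages) {
    const current = page.getRotation().angle;
    page.setRotation(degrees(normalizeAngle(current + delta)));
  }
}
```

Inside splitWithMetadata(), this could be called as rotatePages(copiedPages, 90, degrees) after importing degrees from 'pdf-lib' alongside PDFDocument.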


Performance: The 200-Page Test

I tested the page-range splitter on a 200-page, 85MB scanned document:

| Operation | Time | Memory Peak |
| --- | --- | --- |
| Load PDF | 2.1s | 180MB |
| Extract pages 50-100 | 1.8s | 95MB |
| Split into 4 chunks | 4.2s | 210MB |
| Extract every 10th page | 0.9s | 45MB |

The key insight: copyPages() is fast because it copies only the requested pages and the objects they reference (fonts, images, content streams). You're not deep-cloning the entire document for each output.

Optimization tip: If you're splitting a PDF into many small chunks, load the source once and reuse the PDFDocument instance. Don't reload it for each chunk.
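
One way to enforce that tip is a small cache keyed by the File object. This is a sketch rather than part of the tool described here: createDocumentCache is a hypothetical name, and loadDocument is whatever async loader you already use.

```javascript
// Wrap a loader so each File is parsed at most once.
function createDocumentCache(loadDocument) {
  // WeakMap: an entry disappears once its File is garbage-collected
  const cache = new WeakMap();

  return function getDocument(file) {
    if (!cache.has(file)) {
      // Cache the promise itself so concurrent callers share one load
      cache.set(file, loadDocument(file));
    }
    return cache.get(file);
  };
}

// Usage with pdf-lib (assumes PDFDocument is imported as in the examples above):
// const getDocument = createDocumentCache(
//   async (file) => PDFDocument.load(await file.arrayBuffer())
// );
// const sourcePdf = await getDocument(file);
```

One caveat of this design: a failed load stays cached as a rejected promise, so a retry would need a fresh File object or an explicit eviction step.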


Limitations (Honest Assessment)

| Limitation | Why | Workaround |
| --- | --- | --- |
| Bookmarks lost | Splitting creates new document trees | Acceptable for most extracts |
| Internal links break | Page references change | Rarely an issue for extracts |
| Large scans choke | Browser memory limits | Split server-side for 500MB+ files |
| Form data complexity | Some forms have cross-page dependencies | Test thoroughly with fillable PDFs |

For most use cases, such as extracting a chapter, splitting a contract, or creating email-friendly chunks, these limitations don't matter.


Try the Live Tool

If you want to test browser-based PDF splitting without writing code:

👉 en.sotool.top/split

Features:

  • Extract specific page ranges
  • Split by fixed page count (every N pages)
  • Split by file size (email-friendly chunks)
  • Visual page thumbnails for selection
  • Pure browser-side processing
  • Free, no signup

The tool is open source if you want to see how the Vue 3 + pdf-lib integration works.


What's Next?

If you enjoyed this post, check out the companion piece on merging PDFs in the browser.

Next up in the series:

  • Compressing PDFs by downsampling embedded images
  • Adding watermarks with text and image overlays
  • Encrypting and password-protecting PDFs client-side

Have you built browser-based document tools? What's your approach to handling large files or complex page operations? Let me know in the comments.
