DEV Community

monkeymore studio

Building a Browser-Based PDF Splitting Tool with pdf-lib and JSZip

In this article, we'll explore how to implement a pure client-side PDF splitting tool that runs entirely in the browser. The tool can split a PDF by file size or cut each page in half vertically or horizontally, which makes it useful for managing large documents and creating printer-friendly layouts.

Why Browser-Based PDF Splitting?

Traditional PDF splitting typically requires:

  • Uploading large files to a server
  • Backend processing with storage limitations
  • Downloading multiple split files

Browser-based processing solves all these issues:

  • ✅ Files never leave your computer - complete privacy
  • ✅ No file size upload limits
  • ✅ No upload or download round-trips, so processing starts immediately
  • ✅ Zero server costs
  • ✅ Automatic ZIP packaging for multiple output files

The Challenge: Multiple Splitting Strategies

This tool supports three different splitting approaches:

  1. By File Size: Split large PDFs into smaller chunks (e.g., 20MB each)
  2. Vertical Split: Split each page into a top half and a bottom half
  3. Horizontal Split: Split each page into a left half and a right half

Each strategy requires different PDF manipulation techniques.

Architecture Overview

Key Data Structures

Split Type Options

// Split strategy types
const splitTypes = [
  { name: "pages", title: "By Pages" },        // One page per file (runs the size splitter with a limit of 0)
  { name: "horizontal", title: "Horizontal" }, // Split each page into left and right halves
  { name: "vertical", title: "Vertical" },     // Split each page into top and bottom halves
  { name: "size", title: "By Size" },          // Split by maximum file size
] as const;

// Size units
const sizeUnits = ["MB", "KB"] as const;

WorkerFunctions Interface

// hooks/usepdflib.ts
interface WorkerFunctions {
  split: (file: File, maxSizeKb: number) => Promise<ArrayBuffer | null>;
  splitPagesVertically: (file: File) => Promise<ArrayBuffer | null>;
  splitPagesHorizontally: (file: File) => Promise<ArrayBuffer | null>;
  // ... other functions
}

Split Result Structure

// Internal structure for split PDFs
interface SplitPdfInfo {
  name: string;        // Filename with page range
  bytes: Uint8Array;   // PDF bytes
}

// Example output:
// { name: "part1_pages1-5.pdf", bytes: Uint8Array }
// { name: "part2_pages6-10.pdf", bytes: Uint8Array }
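
As a small illustration of the naming scheme above, here is a minimal sketch of a filename builder. The helper name chunkFileName is hypothetical and does not appear in the original code, which builds the name inline; the sketch assumes zero-based, inclusive page indices as input.

```typescript
// Shape of one split result, as described in the article
interface SplitPdfInfo {
  name: string;
  bytes: Uint8Array;
}

// Hypothetical helper: builds a chunk filename from zero-based, inclusive page indices
function chunkFileName(partNum: number, firstPageIdx: number, lastPageIdx: number): string {
  // Page numbers in the filename are 1-based
  return `part${partNum}_pages${firstPageIdx + 1}-${lastPageIdx + 1}.pdf`;
}

// Pages with indices 5..9 become pages 6-10 in the filename
const example: SplitPdfInfo = { name: chunkFileName(2, 5, 9), bytes: new Uint8Array(0) };
console.log(example.name); // part2_pages6-10.pdf
```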

Implementation Deep Dive

1. User Interface Component

The split component provides multiple splitting options:

// app/[locale]/_components/qpdf/split.tsx
export const Merge = () => {
  const [files, setFiles] = useState<File[]>([]);
  const [splitType, setSplitType] = useState("pages");
  const t = useTranslations("Split");

  const {
    value: splitSize,
    onChange: handleSplitSizeChange,
    setValue: setSplitSize,
  } = useInputValue<number>(0);
  const [splitSizeUnit, setSplitSizeUnit] = useState("MB");

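  const { split, splitPagesVertically, splitPagesHorizontally } = usePdflib();

  // Assumed handler (referenced below but not shown in the original excerpt):
  // stores the file picked by the user
  const onPdfFiles = (selected: File[]) => setFiles(selected);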

  const mergeInMain = async () => {
    console.log("Processing PDF split:", files[0]?.name);

    let outputFile: ArrayBuffer | null = null;

    if (splitType == "vertical") {
      // Split each page vertically into two pages
      outputFile = await splitPagesVertically(files[0]!);
      if (outputFile) {
        autoDownloadBlob(new Blob([outputFile]), "split.pdf");
      }
    } else if (splitType == "pages" || splitType == "size") {
      // Split by file size
      outputFile = await split(
        files[0]!,
        splitSizeUnit == "MB" ? splitSize * 1024 : splitSize,
      );
      if (outputFile) {
        autoDownloadBlob(new Blob([outputFile]), "split.zip");
      }
    } else if (splitType == "horizontal") {
      // Split each page horizontally into two pages
      outputFile = await splitPagesHorizontally(files[0]!);
      if (outputFile) {
        autoDownloadBlob(new Blob([outputFile]), "split.pdf");
      }
    }
  };

  const changeUnit = () => {
    if (splitSizeUnit == "MB") {
      setSplitSizeUnit("KB");
    } else {
      setSplitSizeUnit("MB");
    }
  };

  return (
    <PdfPage
      process={mergeInMain}
      onFiles={onPdfFiles}
      multiple={false}
      title={t("title")}
      desp={t("desp")}
    >
      <div className="p-5">
        <Radio
          defaultValue="pages"
          values={[
            { name: "pages", title: t("pages") },
            { name: "horizontal", title: t("horizontal") },
            { name: "vertical", title: t("vertical") },
            { name: "size", title: t("size") },
          ]}
          onValueChange={(e) => {
            setSplitType(e);
            if (e === "pages") {
              setSplitSize(0);
            }
          }}
        />

        {splitType == "size" && (
          <>
            <label className="label">{t("size_desp")}</label>
            <div className="flex">
              <input
                type="number"
                className="input validator"
                required
                placeholder="Type a number"
                onChange={handleSplitSizeChange}
                min="1"
                max="1000"
              />
              <button
                className="btn btn-primary join-item"
                onClick={changeUnit}
              >
                {splitSizeUnit}
              </button>
            </div>
          </>
        )}

        <p className="validator-hint">{t("size_value")}</p>
      </div>
    </PdfPage>
  );
};

Key features:

  • Radio button selection for split type
  • Size input with MB/KB unit toggle
  • Different output formats (PDF for page splits, ZIP for size splits)
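
The unit handling can be isolated into two small functions. This is a sketch under assumptions: the names toKilobytes and chunkLimitBytes are hypothetical, and the 0.95 factor mirrors the 5% buffer used by the size-based splitter in the next section.

```typescript
type SizeUnit = "MB" | "KB";

// Hypothetical helper: normalizes the user's input to kilobytes,
// the unit the worker's split function expects
function toKilobytes(value: number, unit: SizeUnit): number {
  return unit === "MB" ? value * 1024 : value;
}

// Hypothetical helper: converts kilobytes to a byte budget,
// reserving a margin for PDF overhead (5% by default)
function chunkLimitBytes(maxSizeKb: number, safetyFactor = 0.95): number {
  return maxSizeKb * 1024 * safetyFactor;
}

const limitKb = toKilobytes(20, "MB"); // 20 MB expressed in KB
console.log(limitKb, chunkLimitBytes(limitKb));
```

Keeping the conversion in one place makes it harder for the UI and the worker to disagree about which unit a number is in.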

2. Size-Based Splitting Algorithm

Splits a PDF into multiple files based on maximum size:

// hooks/pdflib.worker.js
async function splitPdf(inputFile, maxSizeKb) {
  // Convert KB to bytes with 5% overhead buffer
  const MAX_SIZE_BYTES = maxSizeKb * 1024 * 0.95;

  // Read the input PDF
  const pdfBytes = await inputFile.arrayBuffer();
  const originalPdf = await PDFDocument.load(pdfBytes);
  const totalPages = originalPdf.getPageCount();
  const splitPdfs = [];

  // Pre-calculate size of each individual page
  const pageSizes = [];
  for (let i = 0; i < totalPages; i++) {
    const tempPdf = await PDFDocument.create();
    const [copiedPage] = await tempPdf.copyPages(originalPdf, [i]);
    tempPdf.addPage(copiedPage);
    pageSizes.push((await tempPdf.save()).length);
  }

  // Split logic: Group pages into chunks that fit the size limit
  let currentPdf = await PDFDocument.create();
  let currentTotalSize = 0;
  let partNum = 1;
  let startPageIdx = 0;

  for (let i = 0; i < totalPages; i++) {
    const currentPageSize = pageSizes[i];

    // Check if adding this page would exceed the limit
    if (
      currentTotalSize + currentPageSize > MAX_SIZE_BYTES &&
      currentPdf.getPageCount() > 0
    ) {
      // Save current chunk
      const pdfBytes = await currentPdf.save();
      splitPdfs.push({
        name: `part${partNum}_pages${startPageIdx + 1}-${i}.pdf`,
        bytes: pdfBytes,
      });

      // Start new chunk
      currentPdf = await PDFDocument.create();
      currentTotalSize = 0;
      partNum++;
      startPageIdx = i;
    }

    // Add current page to chunk
    const [copiedPage] = await currentPdf.copyPages(originalPdf, [i]);
    currentPdf.addPage(copiedPage);
    currentTotalSize += currentPageSize;

    // Handle last page
    if (i === totalPages - 1) {
      const pdfBytes = await currentPdf.save();
      splitPdfs.push({
        name: `part${partNum}_pages${startPageIdx + 1}-${i + 1}.pdf`,
        bytes: pdfBytes,
      });
    }
  }

  return splitPdfs;
}

Algorithm explanation:

  1. Pre-calculation: Calculate size of each page individually
  2. Greedy grouping: Add pages to current chunk until size limit reached
  3. Chunk creation: Save current chunk and start new one
  4. Naming: Generate descriptive filenames with page ranges
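
The grouping step can be separated from pdf-lib entirely and exercised on plain numbers. The sketch below is not the article's worker code; groupPagesBySize and PageRange are hypothetical names, and the function only reproduces the greedy decision of which pages end up in which chunk.

```typescript
// Zero-based, inclusive page range of one chunk
interface PageRange {
  start: number;
  end: number;
}

// Hypothetical helper: greedy grouping of pages by estimated size
function groupPagesBySize(pageSizes: number[], maxBytes: number): PageRange[] {
  const ranges: PageRange[] = [];
  let start = 0; // first page of the chunk being built
  let total = 0; // estimated size of the chunk being built

  pageSizes.forEach((size, i) => {
    // Close the current chunk if this page would overflow it,
    // but never close an empty chunk (i > start means it has pages)
    if (total + size > maxBytes && i > start) {
      ranges.push({ start, end: i - 1 });
      start = i;
      total = 0;
    }
    total += size;
  });

  // Flush the last chunk
  if (pageSizes.length > 0) {
    ranges.push({ start, end: pageSizes.length - 1 });
  }
  return ranges;
}

console.log(groupPagesBySize([40, 40, 40, 100, 10], 100));
console.log(groupPagesBySize([10, 20, 30], 0)); // limit 0: one page per chunk
```

With a limit of 0 every page lands in its own chunk, which matches what the UI does when it resets the size to 0 for the "pages" option.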

3. ZIP Packaging

Packages multiple PDFs into a single ZIP file:

// hooks/pdflib.worker.js
async function zipSplitPdfs(splitPdfs, originalFileName) {
  const zip = new JSZip();

  // Add each PDF to the ZIP
  for (const pdf of splitPdfs) {
    zip.file(pdf.name, pdf.bytes);
  }

  // Generate ZIP with compression
  const zipBlob = await zip.generateAsync({
    type: "arraybuffer",
    compression: "DEFLATE",
    compressionOptions: { level: 6 },
  });

  console.log(
    `ZIP generated: ${splitPdfs.length} PDF files`,
    zipBlob.byteLength,
  );

  return Comlink.transfer(zipBlob, [zipBlob]);
}

JSZip configuration:

  • type: "arraybuffer": Output as ArrayBuffer for easy transfer
  • compression: "DEFLATE": Standard ZIP compression
  • level: 6: Balanced compression (1-9 scale)

4. Vertical Page Splitting

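Splits each page into a top half and a bottom half, each on its own page. pdf-lib places the origin at the bottom-left corner, so a negative y offset shifts the embedded page down and leaves its top half visible: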

// hooks/pdflib.worker.js
async function splitPagesVertically(file) {
  const inputPdfBytes = await file.arrayBuffer();
  const inputPdfDoc = await PDFDocument.load(inputPdfBytes);
  const outputPdfDoc = await PDFDocument.create();

  // Copy pages using embedPages for efficient rendering
  const copiedPages = await outputPdfDoc.embedPages(inputPdfDoc.getPages());
  const helveticaFont = await outputPdfDoc.embedFont(StandardFonts.Helvetica);

  copiedPages.forEach((originalPage, pageIndex) => {
    // Get original page dimensions
    const { width: originalWidth, height: originalHeight } = inputPdfDoc
      .getPage(pageIndex)
      .getSize();

    // New page dimensions: same width, half height
    const newPageWidth = originalWidth;
    const newPageHeight = originalHeight / 2;

    // Create top half page
    const topPage = outputPdfDoc.addPage([newPageWidth, newPageHeight]);
    topPage.drawPage(originalPage, {
      x: 0,
      y: -newPageHeight,  // Offset to show top half
      width: originalWidth,
      height: originalHeight,
    });
    topPage.drawText(` ${pageIndex * 2 + 1} `, {
      x: 10,
      y: newPageHeight - 20,
      size: 10,
      font: helveticaFont,
    });

    // Create bottom half page
    const bottomPage = outputPdfDoc.addPage([newPageWidth, newPageHeight]);
    bottomPage.drawPage(originalPage, {
      x: 0,
      y: 0,  // No offset, shows bottom half
      width: originalWidth,
      height: originalHeight,
    });
    bottomPage.drawText(` ${pageIndex * 2 + 2} `, {
      x: 10,
      y: newPageHeight - 20,
      size: 10,
      font: helveticaFont,
    });
  });

  return outputPdfDoc.save();
}

Visual explanation:

Original Page (A4):
+------------------+
|                  |
|    TOP HALF      |  --> New Page 1
|                  |
+------------------+
|                  |
|   BOTTOM HALF    |  --> New Page 2
|                  |
+------------------+

5. Horizontal Page Splitting

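Splits each page into a left half and a right half, each on its own page. A negative x offset shifts the embedded page left and leaves its right half visible: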

// hooks/pdflib.worker.js
async function splitPagesHorizontally(file) {
  const inputPdfBytes = await file.arrayBuffer();
  const inputPdfDoc = await PDFDocument.load(inputPdfBytes);
  const outputPdfDoc = await PDFDocument.create();

  const copiedPages = await outputPdfDoc.embedPages(inputPdfDoc.getPages());
  const helveticaFont = await outputPdfDoc.embedFont(StandardFonts.Helvetica);

  copiedPages.forEach((originalPage, pageIndex) => {
    const { width: originalWidth, height: originalHeight } = inputPdfDoc
      .getPage(pageIndex)
      .getSize();

    // New page dimensions: half width, same height
    const newPageWidth = originalWidth / 2;
    const newPageHeight = originalHeight;

    // Create left half page
    const leftPage = outputPdfDoc.addPage([newPageWidth, newPageHeight]);
    leftPage.drawPage(originalPage, {
      x: 0,
      y: 0,
      width: originalWidth,
      height: originalHeight,
    });
    leftPage.drawText(` ${pageIndex * 2 + 1} `, {
      x: 10,
      y: newPageHeight - 20,
      size: 10,
      font: helveticaFont,
    });

    // Create right half page
    const rightPage = outputPdfDoc.addPage([newPageWidth, newPageHeight]);
    rightPage.drawPage(originalPage, {
      x: -newPageWidth,  // Offset to show right half
      y: 0,
      width: originalWidth,
      height: originalHeight,
    });
    rightPage.drawText(` ${pageIndex * 2 + 2} `, {
      x: 10,
      y: newPageHeight - 20,
      size: 10,
      font: helveticaFont,
    });
  });

  return outputPdfDoc.save();
}

Visual explanation:

Original Page (A4):
+--------+--------+
|        |        |
|  LEFT  | RIGHT  |
|  HALF  |  HALF  |
|        |        |
+--------+--------+
    |         |
    v         v
 New Page 1  New Page 2
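
The two splitting functions differ only in the new page size and the draw offsets, so the geometry can be captured in one pure function. This is a sketch, not part of the original worker: halfPlacements, HalfPlacement, and SplitDirection are hypothetical names. It assumes the page origin is at the bottom-left corner and that the embedded page is always drawn at its full original size.

```typescript
type SplitDirection = "vertical" | "horizontal";

interface HalfPlacement {
  pageSize: [number, number]; // [width, height] of the new half page
  x: number;                  // x offset at which to draw the full-size page
  y: number;                  // y offset at which to draw the full-size page
}

// Hypothetical helper: returns the placement for both halves of one page
function halfPlacements(
  width: number,
  height: number,
  direction: SplitDirection,
): [HalfPlacement, HalfPlacement] {
  if (direction === "vertical") {
    const half = height / 2;
    return [
      { pageSize: [width, half], x: 0, y: -half }, // top half: shift the page down
      { pageSize: [width, half], x: 0, y: 0 },     // bottom half: no shift
    ];
  }
  const half = width / 2;
  return [
    { pageSize: [half, height], x: 0, y: 0 },     // left half: no shift
    { pageSize: [half, height], x: -half, y: 0 }, // right half: shift the page left
  ];
}

console.log(halfPlacements(600, 800, "vertical"));
console.log(halfPlacements(600, 800, "horizontal"));
```

Each placement maps directly onto the calls in the article: pageSize goes to addPage, and x and y go to drawPage together with the original width and height.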

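Complete Processing Flow

  1. The user selects a PDF and a split type in the UI component
  2. The component calls the matching worker function (split, splitPagesVertically, or splitPagesHorizontally) through usePdflib
  3. The worker loads the file with pdf-lib and produces either a ZIP of size-limited parts or a single PDF of half pages
  4. The result returns to the main thread, and autoDownloadBlob saves it as split.zip or split.pdf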

Key Technical Decisions

1. Why Pre-calculate Page Sizes?

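Pre-calculating individual page sizes gives the grouping loop a size estimate for every page before any chunk is built, which allows for:

  • Size-based splitting without saving each candidate chunk to measure it
  • Greedy page grouping in a single pass
  • Fewer oversized chunks

The figures are estimates: the saved size of a multi-page chunk can differ from the sum of its single-page sizes, and the 5% buffer in the code absorbs part of that difference. A single page that is larger than the limit still ends up in a chunk of its own that exceeds it.

Without pre-calculation, we'd have to guess or use a trial-and-error approach.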
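
Because the greedy loop never leaves a chunk empty, a page whose own size exceeds the limit is still written out, alone, in a chunk that is too large. A pre-check over the same size array can surface those pages to the user. This is a sketch; findOversizedPages is a hypothetical helper that the original tool does not include.

```typescript
// Hypothetical helper: returns the 1-based numbers of pages whose
// estimated size alone exceeds the chunk limit
function findOversizedPages(pageSizes: number[], maxBytes: number): number[] {
  const oversized: number[] = [];
  pageSizes.forEach((size, i) => {
    if (size > maxBytes) {
      oversized.push(i + 1);
    }
  });
  return oversized;
}

// Page 2 cannot fit into any chunk under a 1000-byte limit
console.log(findOversizedPages([100, 5000, 200], 1000)); // [ 2 ]
```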

2. Why JSZip?

JSZip provides:

  • Pure JavaScript ZIP creation (no server needed)
  • Compression to reduce download size
  • Browser-compatible ArrayBuffer output
  • Easy file organization

3. Why embedPages()?

The embedPages() method from pdf-lib:

  • Efficiently embeds existing pages into new documents
  • Preserves the drawn page content (text, images, vector graphics); interactive annotations such as links and form fields are not part of the embedded page
  • Allows positioning with x/y offsets for splitting
  • Better performance than copying page content manually

4. Page Numbering

Each split page gets a visible page number:

page.drawText(` ${pageIndex * 2 + 1} `, {
  x: 10,
  y: newPageHeight - 20,
  size: 10,
  font: helveticaFont,
  color: rgb(0, 0, 0),
});

This helps users keep track of the original page order.
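
The numbering formula maps each original page index to two consecutive labels, and the mapping can be inverted to find where a split page came from. The sketch below uses hypothetical names (labelFor, sourceOf, Half); "first" stands for the top or left half and "second" for the bottom or right half.

```typescript
type Half = "first" | "second";

// Hypothetical helper: label printed on a split page (same formula as the article)
function labelFor(pageIndex: number, half: Half): number {
  return pageIndex * 2 + (half === "first" ? 1 : 2);
}

// Hypothetical helper: inverse mapping from a printed label
// back to the zero-based original page index and the half
function sourceOf(label: number): { pageIndex: number; half: Half } {
  return {
    pageIndex: Math.floor((label - 1) / 2),
    half: label % 2 === 1 ? "first" : "second",
  };
}

console.log(labelFor(2, "second")); // 6
console.log(sourceOf(6));           // { pageIndex: 2, half: 'second' }
```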

Benefits of This Architecture

  1. Privacy First: Files never leave the browser
  2. No Upload Limits: File size is bounded only by the memory available to the browser
  3. Multiple Strategies: Size-based, vertical, or horizontal splitting
  4. Automatic Packaging: ZIP file for multiple outputs
  5. Page Tracking: Visible page numbers on split pages
  6. Responsive UI: Web Workers prevent blocking

Try It Yourself

Want to split your PDFs without uploading them to a server? Try our free browser-based tool:

Split PDF Online →

All processing happens locally in your browser - your files never leave your computer!


Conclusion

Building a browser-based PDF splitting tool demonstrates how pdf-lib combined with JSZip can handle complex PDF manipulation tasks entirely client-side. The three different splitting strategies (size-based, vertical, horizontal) showcase the flexibility of the pdf-lib library.

This approach is ideal for:

  • Splitting large PDFs for email attachments
  • Creating printer-friendly page layouts
  • Managing document archives
  • Preparing documents for mobile viewing

The automatic ZIP packaging makes it easy to download multiple split files, while the visible page numbering helps users maintain document organization.
