DEV Community

monkeymore studio

Converting PDF Pages to Images: A Client-Side Rendering Approach

Introduction

Converting PDF pages to images is a common need, whether you want to extract visuals for presentations, create thumbnails for a gallery, or share specific pages on social media. In this article, we'll build a purely browser-side PDF-to-image converter that renders each page as a lossless PNG image and packages the results into a downloadable ZIP file.

Why Browser-Side Conversion?

Traditional PDF to image conversion typically requires:

  1. Server Uploads: Sending your PDF to external servers
  2. Processing Queues: Waiting for server-side rendering
  3. Quality Limitations: Compressed or watermarked outputs
  4. Privacy Concerns: Documents stored on third-party servers

Browser-side processing offers significant advantages:

  • Documents never leave your device
  • No upload or queue time; processing starts as soon as the file is selected
  • Full quality preservation (lossless PNG)
  • Complete privacy and security

Architecture Overview

Our implementation uses PDF.js for rendering and JSZip for packaging.

Note: Unlike other features in this application, PDF to image conversion happens entirely in the main thread (not a Web Worker), with only PDF.js using its own dedicated worker for parsing.
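
The overall flow can be sketched with the PDF.js and JSZip steps hidden behind two callbacks. This is a minimal sketch, not the application's actual code: `convertPages`, `RenderPage`, and `AddFile` are hypothetical names introduced here so the control flow can be read on its own.

```typescript
// Hypothetical callback types: stand-ins for the PDF.js render step and the
// JSZip "add file" step (neither type comes from those libraries).
type RenderPage = (pageNumber: number) => Promise<Uint8Array>;
type AddFile = (name: string, data: Uint8Array) => void;

// Render pages 1..numPages in order and hand each encoded image to the archive.
async function convertPages(
  numPages: number,
  renderPage: RenderPage,
  addFile: AddFile,
): Promise<number> {
  let converted = 0;
  for (let pageNumber = 1; pageNumber <= numPages; pageNumber++) {
    // Await each page before starting the next, so a shared canvas is
    // never drawn on by two pages at the same time.
    const png = await renderPage(pageNumber);
    addFile(`page${pageNumber}.png`, png);
    converted++;
  }
  return converted;
}
```

In the real hook shown later, the render callback would wrap `page.render(...)` plus `canvas.toBlob(...)`, and the add callback would wrap `zip.file(...)`.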

Core Technologies

1. PDF.js - Mozilla's PDF Library

PDF.js is a widely used open-source JavaScript library for PDF rendering, maintained by Mozilla:

  • Canvas-based rendering: High-quality output
  • Web Worker support: Non-blocking PDF parsing
  • Text extraction: Can extract text alongside images
  • CMap support: Proper handling of CJK (Chinese, Japanese, Korean) characters

2. JSZip - In-Browser ZIP Creation

JSZip allows creating ZIP archives entirely in the browser:

  • No server required: Generate ZIPs client-side
  • Compression options: Configurable compression levels
  • Streaming support: Handle large files efficiently

Implementation

1. Entry Point - Page Component

// pdf2jpg/page.tsx
import { type Metadata } from "next";
import { getTranslations } from "next-intl/server";
import { seoConfig } from "../_components/seo-config";
import { Organize } from "@/app/[locale]/_components/qpdf/pdf2image";

export async function generateMetadata({
  params,
}: {
  params: Promise<{ locale: string }>;
}): Promise<Metadata> {
  const { locale } = await params;
  const seo =
    seoConfig[locale as keyof typeof seoConfig]?.pdf2jpg ||
    seoConfig["en-us"].pdf2jpg;

  return {
    title: seo.title,
    description: seo.description,
  };
}

export default async function Page() {
  return <Organize />;
}
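
The locale fallback in `generateMetadata` can be isolated into a small helper. The sketch below assumes a simplified shape for `seoConfig` (the real object is not shown in this article); `SeoEntry`, `SeoConfig`, `resolveSeo`, and the sample data are hypothetical.

```typescript
// Hypothetical shape of the SEO configuration; the real seoConfig may differ.
type SeoEntry = { title: string; description: string };
type SeoConfig = Record<string, { pdf2jpg?: SeoEntry }>;

// Return the entry for `locale`, falling back to a default locale when the
// requested locale is unknown or has no pdf2jpg entry.
function resolveSeo(
  config: SeoConfig,
  locale: string,
  fallbackLocale = "en-us",
): SeoEntry {
  const entry = config[locale]?.pdf2jpg ?? config[fallbackLocale]?.pdf2jpg;
  if (!entry) {
    throw new Error(`No pdf2jpg SEO entry for "${locale}" or "${fallbackLocale}"`);
  }
  return entry;
}

// Sample data for illustration only.
const sampleSeo: SeoConfig = {
  "en-us": { pdf2jpg: { title: "PDF to Image", description: "Convert PDF pages to PNG" } },
  "zh-cn": {},
};
```

Using `??` instead of `||` only falls back when the entry is `null` or `undefined`, which is usually what a missing-translation check wants.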

2. Main Component - PDF to Image Converter

// _components/qpdf/pdf2image.tsx
"use client";

import { useState } from "react";
import { useTranslations } from "next-intl";
import { PdfPage } from "../pdfpage";
import { usePdfjs } from "@/hooks/usepdfjs";
import { autoDownloadBlob } from "@/utils/pdf";

export const Organize = () => {
  const [files, setFiles] = useState<File[]>([]);
  const { page2image, isLoading } = usePdfjs();
  const t = useTranslations("Pdf2Jpg");

  const mergeInMain = async () => {
    // Only the first selected file is converted
    const file = files[0];
    if (!file) {
      console.warn("No PDF selected");
      return;
    }
    console.log("Converting PDF to images:", file.name);

    // Convert PDF pages to images
    const outputFile = await page2image(file);

    if (outputFile) {
      autoDownloadBlob(
        new Blob([outputFile], { type: "application/zip" }),
        "images.zip",
      );
    }
  };

  const onPdfFiles = (files: File[]) => {
    console.log("Files selected");
    files.forEach((e) => console.log(e.name));
    setFiles(files);
  };

  return (
    <PdfPage
      title={t("title")}
      onFiles={onPdfFiles}
      process={mergeInMain}
      processDisabled={isLoading}
    >
      <div className="text-sm text-gray-600">
        {t("description")}
      </div>
    </PdfPage>
  );
};
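
Only the first selected file is converted, so it can help to pick that file deliberately. A minimal sketch, assuming a hypothetical helper `pickFirstPdf` that is not part of the original component:

```typescript
// Minimal structural type, so the helper accepts browser File objects
// without depending on DOM typings.
type FileLike = { name: string; type: string };

// Return the first selected file that looks like a PDF, or null if none does.
function pickFirstPdf<T extends FileLike>(files: readonly T[]): T | null {
  for (const file of files) {
    const byMime = file.type === "application/pdf";
    // Some platforms report an empty MIME type, so also check the extension.
    const byName = file.name.toLowerCase().endsWith(".pdf");
    if (byMime || byName) return file;
  }
  return null;
}
```

The handler could then return early when the result is `null` instead of passing an undefined file to `page2image`.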

3. Core Conversion Logic - usePdfjs Hook

// hooks/usepdfjs.ts
import { useEffect, useRef, useState } from "react";
import JSZip from "jszip";
import * as pdfjs from "pdfjs-dist";

type PdfjsLibType = {
  getDocument: typeof pdfjs.getDocument;
  GlobalWorkerOptions: typeof pdfjs.GlobalWorkerOptions;
};

export const usePdfjs = () => {
  const pdfjsRef = useRef<PdfjsLibType | null>(null);
  const [loaded, setLoaded] = useState(false);
  const [isLoading, setIsLoading] = useState(false);

  // Dynamically load PDF.js
  useEffect(() => {
    if (typeof globalThis === "undefined") {
      (window as any).globalThis = window;
    }

    const script = document.createElement("script");
    script.src = "/pdf/pdf.min.mjs";
    script.type = "module";
    script.async = true;
    script.onload = () => {
      console.log("pdfjs-dist loaded");
      const typedPdfjs = (window as any).pdfjsLib as PdfjsLibType;
      typedPdfjs.GlobalWorkerOptions.workerSrc = "/pdf/pdf.worker.min.mjs";
      pdfjsRef.current = typedPdfjs;
      setLoaded(true);
    };
    document.head.appendChild(script);

    return () => {
      document.head.removeChild(script);
      pdfjsRef.current = null;
    };
  }, []);

  const page2image = async (file: File): Promise<ArrayBuffer | null> => {
    if (!pdfjsRef.current) {
      console.error("pdfjs not ready yet");
      return null;
    }

    setIsLoading(true);
    const canvas = document.createElement("canvas");

    try {
      // Read the file inside the try block so isLoading is reset even if reading fails
      const arrayBuffer = await file.arrayBuffer();

      // Load PDF document with CMap support for Chinese characters
      const pdfDoc = await pdfjsRef.current.getDocument({
        data: new Uint8Array(arrayBuffer),
        cMapUrl: "https://cdn.jsdelivr.net/npm/pdfjs-dist@5.4.149/cmaps/",
        cMapPacked: true, // Essential for CJK (Chinese/Japanese/Korean) PDFs
      }).promise;

      const zip = new JSZip();

      // Render each page to canvas and save as PNG
      for (let i = 1; i <= pdfDoc.numPages; ++i) {
        const page = await pdfDoc.getPage(i);
        const viewport = page.getViewport({ scale: 1 });

        // Set canvas size to match PDF page
        canvas.width = viewport.width;
        canvas.height = viewport.height;

        // Render PDF page to canvas
        await page.render({
          canvasContext: canvas.getContext("2d")!,
          viewport: viewport,
        }).promise;

        // Convert canvas to PNG blob
        const pngBlob = await new Promise<Blob>((resolve, reject) => {
          canvas.toBlob((blob) => {
            if (!blob) {
              reject(new Error("Failed to create blob"));
            } else {
              resolve(blob);
            }
          }, "image/png"); // Lossless PNG format
        });

        // Add to ZIP archive
        zip.file(`page${i}.png`, pngBlob);
      }

      // Generate ZIP with compression
      const zipBuffer = await zip.generateAsync({
        type: "arraybuffer",
        compression: "DEFLATE",
        compressionOptions: { level: 6 },
      });

      return zipBuffer;
    } finally {
      setIsLoading(false);
    }
  };

  return { page2image, isLoading, loaded };
};
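
The hook names entries `page1.png`, `page2.png`, and so on. Tools that sort names lexically will list `page10.png` before `page2.png`, so zero-padding the number is one possible tweak. `pageFileName` below is a hypothetical helper, not part of the hook:

```typescript
// Build a ZIP entry name whose number is zero-padded to the width of the
// largest page number, so lexical order matches page order.
function pageFileName(pageNumber: number, totalPages: number): string {
  const width = String(totalPages).length;
  return `page${String(pageNumber).padStart(width, "0")}.png`;
}

// pageFileName(3, 120) gives "page003.png"
```

Inside the loop this would replace the template string passed to `zip.file(...)`, with `pdfDoc.numPages` as the second argument.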

Key Implementation Details:

  1. Dynamic Loading: PDF.js is loaded dynamically from /pdf/pdf.min.mjs
  2. Worker Configuration: PDF parsing happens in a worker via GlobalWorkerOptions.workerSrc
  3. CMap Support: Essential for rendering PDFs with Chinese, Japanese, or Korean text
  4. Scale 1: Renders one canvas pixel per PDF point (1/72 inch), which is about 72 DPI
  5. PNG Format: Lossless compression for maximum quality
  6. ZIP Packaging: All pages packaged with DEFLATE compression
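
Because a PDF point is 1/72 inch, a target DPI maps to a viewport scale of `dpi / 72`. The helpers below are a small sketch of that arithmetic; `scaleForDpi` and `pixelSize` are hypothetical names, and the rounding choice is an assumption.

```typescript
const PDF_POINTS_PER_INCH = 72;

// Viewport scale that yields the requested output resolution.
function scaleForDpi(dpi: number): number {
  if (!(dpi > 0)) throw new RangeError("dpi must be a positive number");
  return dpi / PDF_POINTS_PER_INCH;
}

// Output size in pixels for a page measured in PDF points.
// Multiplying before dividing keeps common sizes exact (no float drift).
function pixelSize(
  widthPt: number,
  heightPt: number,
  dpi: number,
): { width: number; height: number } {
  return {
    width: Math.round((widthPt * dpi) / PDF_POINTS_PER_INCH),
    height: Math.round((heightPt * dpi) / PDF_POINTS_PER_INCH),
  };
}

// US Letter is 612 x 792 points:
// pixelSize(612, 792, 300) gives { width: 2550, height: 3300 }
```

The result of `scaleForDpi` can be passed as `scale` to `page.getViewport({ scale })`.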

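4. Canvas Rendering Process

For each page, the hook fetches the page, builds a scale-1 viewport, resizes the shared canvas to match, renders into its 2D context, encodes the canvas as a PNG blob, and adds that blob to the ZIP.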

5. ZIP Generation Helper

For extracting embedded images from PDFs (different from page rendering):

// lib/parsePdfImage.js
import JSZip from "jszip";

export async function zipImageBitmaps(data) {
  const zip = new JSZip();

  // Process each image
  for (let i = 0; i < data.length; i++) {
    const bitmap = data[i];

    // Convert ImageBitmap to PNG Blob
    const pngBlob = await imageBitmapToPngBlob(
      bitmap.data,
      bitmap.width,
      bitmap.height,
    );

    console.log("Image blob size", pngBlob.size);

    // Add to ZIP with original name
    zip.file(bitmap.name, pngBlob);
  }

  // Generate compressed ZIP
  const zipBuffer = await zip.generateAsync({
    type: "arraybuffer",
    compression: "DEFLATE",
    compressionOptions: { level: 6 },
  });

  return zipBuffer;
}

// Convert ImageBitmap to PNG using canvas
export async function imageBitmapToPngBlob(data, width, height) {
  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;

  const ctx = canvas.getContext("2d");
  if (!ctx) {
    throw new Error("2D canvas context is not available");
  }
  ctx.drawImage(data, 0, 0);

  return new Promise((resolve, reject) => {
    canvas.toBlob((blob) => {
      if (!blob) {
        // Reject with an Error and stop, instead of falling through to resolve
        reject(new Error("Failed to create PNG blob"));
        return;
      }
      resolve(blob);
    }, "image/png");
  });
}
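
One thing to watch in `zipImageBitmaps`: if two embedded images share a name, the later `zip.file(...)` call replaces the earlier entry. A minimal sketch of one way to avoid that, assuming a hypothetical `uniqueName` helper that tracks names already used:

```typescript
// Return `name` if it is unused, otherwise append -2, -3, ... before the
// extension until the name is free. The chosen name is recorded in `used`.
function uniqueName(name: string, used: Set<string>): string {
  if (!used.has(name)) {
    used.add(name);
    return name;
  }
  const dot = name.lastIndexOf(".");
  const base = dot > 0 ? name.slice(0, dot) : name;
  const ext = dot > 0 ? name.slice(dot) : "";
  let counter = 2;
  let candidate = `${base}-${counter}${ext}`;
  while (used.has(candidate)) {
    counter++;
    candidate = `${base}-${counter}${ext}`;
  }
  used.add(candidate);
  return candidate;
}
```

In the loop, `zip.file(uniqueName(bitmap.name, used), pngBlob)` with a `Set` created before the loop would keep every image.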

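Complete User Flow

The user selects a PDF, starts the conversion, waits while each page is rendered and zipped in the browser, and receives images.zip as a download.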

Technical Highlights

1. CMap Support for CJK Characters

const pdfDoc = await pdfjsRef.current.getDocument({
  data: new Uint8Array(arrayBuffer),
  cMapUrl: "https://cdn.jsdelivr.net/npm/pdfjs-dist@5.4.149/cmaps/",
  cMapPacked: true, // Essential for Chinese/Japanese/Korean PDFs
}).promise;

Why CMaps Matter:

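  • PDFs whose fonts reference predefined CMaps (common for Chinese, Japanese, and Korean text) need those character mapping tables at render time
  • Without them, the affected text can render as wrong or missing glyphs
  • Here the CMaps are fetched from a CDN, so only the CMap files are requested over the network; the PDF itself stays in the browser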
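
If you would rather avoid the CDN request, the CMap files can be served from your own origin. The sketch below only builds the two options; the `/pdf/cmaps` path is an assumption (the files would need to be copied from the `cmaps` directory of the `pdfjs-dist` package), and `cMapOptions` is a hypothetical helper.

```typescript
type CMapOptions = { cMapUrl: string; cMapPacked: boolean };

// Build the CMap-related getDocument options for a given base URL.
// PDF.js expects cMapUrl to end with a trailing slash, so add one if missing.
function cMapOptions(baseUrl: string): CMapOptions {
  const cMapUrl = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
  return { cMapUrl, cMapPacked: true };
}

// Hypothetical self-hosted location, next to the PDF.js files under /pdf/.
const selfHostedCMaps = cMapOptions("/pdf/cmaps");
```

These options could then be spread into the call, for example `getDocument({ data, ...selfHostedCMaps })`.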

2. Canvas to Blob Conversion

const pngBlob = await new Promise<Blob>((resolve, reject) => {
  canvas.toBlob((blob) => {
    if (!blob) reject(new Error("Failed"));
    else resolve(blob);
  }, "image/png"); // Explicit PNG format
});

PNG vs JPG:

  • PNG: Lossless, larger files, no encoding artifacts
  • JPG: Lossy, smaller files, quality degradation
  • This implementation uses PNG for maximum fidelity
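
If smaller files matter more than exact pixels, the format could be made configurable. A minimal sketch, assuming a hypothetical `encodeSettings` helper; the 0.92 default quality is an arbitrary choice, not a value from the original code.

```typescript
type ImageFormat = "png" | "jpeg";
type EncodeSettings = { mimeType: string; extension: string; quality?: number };

// Map a format choice to the arguments canvas.toBlob() needs.
// The quality argument only applies to lossy formats, so PNG omits it.
function encodeSettings(format: ImageFormat, quality = 0.92): EncodeSettings {
  if (format === "png") {
    return { mimeType: "image/png", extension: "png" };
  }
  // toBlob expects a quality between 0 and 1; clamp anything outside.
  const clamped = Math.min(1, Math.max(0, quality));
  return { mimeType: "image/jpeg", extension: "jpg", quality: clamped };
}
```

The settings would be used as `canvas.toBlob(callback, settings.mimeType, settings.quality)`, with `settings.extension` used for the ZIP entry name.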

3. ZIP Compression Configuration

const zipBuffer = await zip.generateAsync({
  type: "arraybuffer",
  compression: "DEFLATE",
  compressionOptions: { level: 6 }, // 1 (fastest) to 9 (smallest); 6 is a middle ground
});

Compression Levels:

  • 1: Fastest, least compression (JSZip documents levels 1 through 9; for no compression, use compression: "STORE" instead)
  • 6: Balanced (good compression, reasonable speed)
  • 9: Maximum compression (slowest, smallest files)
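
PNG data is already DEFLATE-compressed internally, so compressing it again in the ZIP usually saves little. One option is to store such entries uncompressed. The sketch below builds the options object only; `zipOptions` and `ZipOptions` are hypothetical names, and the clamping to 1-9 reflects the range in JSZip's documentation.

```typescript
type ZipOptions =
  | { type: "arraybuffer"; compression: "STORE" }
  | {
      type: "arraybuffer";
      compression: "DEFLATE";
      compressionOptions: { level: number };
    };

// Choose generateAsync options: skip compression for content that is
// already compressed (PNG, JPEG), otherwise use DEFLATE at a clamped level.
function zipOptions(contentAlreadyCompressed: boolean, level = 6): ZipOptions {
  if (contentAlreadyCompressed) {
    return { type: "arraybuffer", compression: "STORE" };
  }
  const clamped = Math.min(9, Math.max(1, Math.round(level)));
  return {
    type: "arraybuffer",
    compression: "DEFLATE",
    compressionOptions: { level: clamped },
  };
}
```

Calling `zip.generateAsync(zipOptions(true))` would trade a slightly larger archive for faster generation; it is worth measuring on real documents before switching.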

4. Memory Management

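// Excerpt from page2image: pdfDoc and zip come from the hook above, and
// canvasToBlob stands for the Promise wrapper around canvas.toBlob shown earlier.

// Create a single canvas and 2D context, and reuse them for every page
const canvas = document.createElement("canvas");
const canvasContext = canvas.getContext("2d")!;

for (let i = 1; i <= pdfDoc.numPages; ++i) {
  const page = await pdfDoc.getPage(i);
  const viewport = page.getViewport({ scale: 1 });

  // Assigning width/height resets the canvas, so the previous page is cleared
  canvas.width = viewport.width;
  canvas.height = viewport.height;

  // Render and convert
  await page.render({ canvasContext, viewport }).promise;
  const blob = await canvasToBlob(canvas);
  zip.file(`page${i}.png`, blob);
}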

Benefits:

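  • A single canvas instance avoids creating a new canvas element per page
  • Fewer short-lived objects for the garbage collector to reclaim
  • Peak canvas memory stays at roughly one page, which helps with large PDFs (the encoded PNGs still accumulate in the ZIP)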

Browser Compatibility

Requirements:

  • Canvas API - For rendering PDF pages
  • PDF.js - PDF parsing and rendering
  • JSZip - ZIP archive creation
  • ES6+ - Modern JavaScript features
  • File API - For reading PDF files

Supported in all modern browsers (Chrome, Firefox, Safari, Edge).

Performance Considerations

1. Resolution and Scale

// Current: scale 1 (one pixel per PDF point, about 72 DPI)
const viewport = page.getViewport({ scale: 1 });

// For higher resolution, pass a larger scale instead:
const hiResViewport = page.getViewport({ scale: 2 }); // 2x width and height, about 144 DPI

Trade-offs:

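  • Higher scale = Sharper output, but the pixel count (and roughly the memory and file size) grows with the square of the scale
  • Lower scale = Smaller files, but text and fine lines look pixelated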
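
Since pixel count grows with the square of the scale, very large pages can exceed what a browser canvas will allocate. A minimal sketch of capping the scale against a pixel budget; `clampScale` is a hypothetical helper, and the budget itself is an assumption because canvas size limits vary by browser and device.

```typescript
// Return desiredScale if the rendered page fits within maxPixels,
// otherwise the largest scale that does.
function clampScale(
  widthPt: number,
  heightPt: number,
  desiredScale: number,
  maxPixels: number,
): number {
  const pagePoints = widthPt * heightPt;
  const pixels = pagePoints * desiredScale * desiredScale;
  if (pixels <= maxPixels) return desiredScale;
  // pixels = pagePoints * scale^2, so solve for scale at the budget.
  return Math.sqrt(maxPixels / pagePoints);
}

// A 1000 x 1000 pt page at scale 4 needs 16 million pixels; with a budget
// of 4 million pixels the scale is reduced to 2.
```

The returned value can be passed to `page.getViewport({ scale })` in place of a fixed number.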

2. Batch Processing

For very large PDFs, consider:

  • Streaming ZIP generation
  • Progress indicators
  • Page-by-page downloads
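
One way to approach these options is to split the page range into batches, then report progress or trigger a separate download after each batch. `pageBatches` below is a hypothetical helper that only computes the page numbers; wiring it to rendering and downloads is left out.

```typescript
// Split pages 1..numPages into consecutive batches of at most batchSize pages.
function pageBatches(numPages: number, batchSize: number): number[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: number[][] = [];
  for (let start = 1; start <= numPages; start += batchSize) {
    const batch: number[] = [];
    const end = Math.min(start + batchSize - 1, numPages);
    for (let page = start; page <= end; page++) {
      batch.push(page);
    }
    batches.push(batch);
  }
  return batches;
}

// pageBatches(5, 2) gives [[1, 2], [3, 4], [5]]
```

After each batch, the fraction of completed batches gives a simple progress value for a progress bar.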

3. Memory Usage

All rendered pages are held in memory before ZIP generation:

  • 10-page PDF at 100KB per page = ~1MB memory
  • 100-page PDF at 500KB per page = ~50MB memory
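
The figures above cover the encoded PNGs. The canvas itself also needs memory while a page is being rendered, typically around 4 bytes per pixel for RGBA. The sketch below works through both estimates; the helper names are hypothetical and the 4-bytes-per-pixel figure is an approximation, since browsers manage canvas storage internally.

```typescript
// Approximate size of an uncompressed RGBA bitmap.
function rgbaBytes(widthPx: number, heightPx: number): number {
  return widthPx * heightPx * 4;
}

// Total size of the encoded pages held before ZIP generation.
function encodedTotalBytes(pages: number, averageBytesPerPage: number): number {
  return pages * averageBytesPerPage;
}

function toMiB(bytes: number): number {
  return bytes / (1024 * 1024);
}

// US Letter (612 x 792 pt) at scale 2 is 1224 x 1584 px:
// rgbaBytes(1224, 1584) gives 7755264 bytes, about 7.4 MiB for one canvas.
// 100 pages at 500 KB each: encodedTotalBytes(100, 500 * 1000) gives 50000000 bytes.
```

With a single reused canvas, the bitmap cost is paid once rather than per page, while the encoded total grows with the page count.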

Conclusion

Building a browser-side PDF to image converter demonstrates the power of modern web APIs. By combining:

  • PDF.js for high-quality PDF rendering
  • Canvas API for image generation
  • JSZip for client-side packaging
  • CMap support for international text

We've created a tool that offers:

  • Complete privacy - Documents never leave your device
  • Maximum quality - Lossless PNG output
  • International support - CJK character rendering
  • Instant processing - No server delays
  • Convenient packaging - ZIP download with all pages

The ability to convert PDF pages to images entirely in the browser makes document sharing and editing more accessible than ever.


Need to convert your PDF pages to images? Try our free online tool at Free Online PDF Tools - convert each page to high-quality PNG images packaged in a ZIP file, all processed locally in your browser for complete privacy!
