sunshey

Posted on Jun 29

How I Compress PDFs in the Browser with Vue 3 and pdf-lib

#javascript #vue #pdf #webdev

Compressing a PDF sounds trivial. Drag a file, pick a level, get a smaller one back. But building it client-side — in the browser, no server — has a few subtleties worth documenting.

I built en.sotool.top/compress/ to handle this. Here's how it works under the hood with Vue 3 and pdf-lib.

Why Client-Side Compression?

PDFs often contain sensitive information. Contracts, financial statements, medical records. Compressing a file on a server means trusting someone else with your document.

Client-side compression gives you:

No upload bandwidth or size limits
Instant processing — the file never leaves your device
Works offline after the page loads
Zero quality loss beyond what you explicitly choose
No server costs to maintain

The tradeoff? You're limited by what the browser can do. For compression, that is enough: Canvas re-encodes the images, and pdf-lib rewrites the PDF with the compressed images and stripped metadata, which covers most real-world cases.

The Stack

Vue 3 — UI and state management
pdf-lib — Load, modify, and save PDFs
HTML Canvas — Re-encode images at lower quality
File API — Read the uploaded PDF

npm install pdf-lib

Loading the PDF

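First, keep a reference to the selected file and record its size. The bytes are read into an ArrayBuffer and loaded with pdf-lib later, when the user starts the compression.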

import { ref } from 'vue'
import { PDFDocument } from 'pdf-lib'

const pdfFile = ref<File | null>(null)
const compressionLevel = ref<'low' | 'medium' | 'high'>('medium')
const processing = ref(false)
const originalSize = ref(0)
const compressedSize = ref(0)

// Called from the file input's change event
function handleFile(event: Event) {
  const input = event.target as HTMLInputElement
  const selected = input.files?.[0]
  if (!selected) return
  pdfFile.value = selected
  originalSize.value = selected.size
}
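
Before handing the bytes to pdf-lib, it can help to reject files that are clearly not PDFs. The following is a minimal sketch, not code from the tool itself: `looksLikePdf` and its 1024-byte search window are assumptions made for illustration. It only checks for the `%PDF-` marker near the start of the file, so it does not prove the file is a valid PDF.

```typescript
// Hypothetical helper: not part of pdf-lib or the original tool.
// Returns true when the "%PDF-" marker appears near the start of the bytes.
function looksLikePdf(bytes: Uint8Array, searchWindow = 1024): boolean {
  const marker = [0x25, 0x50, 0x44, 0x46, 0x2d] // "%PDF-"
  // Last index at which the full marker could still fit inside the window
  const limit = Math.min(bytes.length, searchWindow) - marker.length
  for (let i = 0; i <= limit; i++) {
    if (marker.every((byte, j) => bytes[i + j] === byte)) return true
  }
  return false
}
```

In the browser, the first bytes can be read cheaply with `new Uint8Array(await file.slice(0, 1024).arrayBuffer())`, so the check does not require loading the whole file.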

Mapping Compression Levels to Quality

Three user-facing levels map to different image quality and resize targets:

function getCompressionSettings(level: string) {
  switch (level) {
    case 'low':
      return { jpegQuality: 0.9, maxWidth: 2048 }
    case 'medium':
      return { jpegQuality: 0.7, maxWidth: 1400 }
    case 'high':
      return { jpegQuality: 0.45, maxWidth: 1024 }
    default:
      return { jpegQuality: 0.7, maxWidth: 1400 }
  }
}

The key insight: in image-heavy PDFs, the images are where roughly 90% of the file size comes from. Text, fonts, and metadata contribute, but a single high-res photo can add 5–50MB on its own. Rescaling and re-encoding images is the big win.
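
To make the resize targets concrete, here is a small worked sketch of what each preset does to one image. The 4096 x 3072 photo is a hypothetical input, and `targetSize` and `pixelsKept` are illustrative names; the scaling rule is the same "shrink only if wider than maxWidth" rule used in this post.

```typescript
interface Size {
  width: number
  height: number
}

// Shrink only when the image is wider than maxWidth; keep the aspect ratio.
function targetSize(original: Size, maxWidth: number): Size {
  const scale = Math.min(1, maxWidth / original.width)
  return {
    width: Math.floor(original.width * scale),
    height: Math.floor(original.height * scale),
  }
}

// Fraction of the original pixel count that survives the resize.
function pixelsKept(original: Size, maxWidth: number): number {
  const t = targetSize(original, maxWidth)
  return (t.width * t.height) / (original.width * original.height)
}

// Hypothetical input: a 4096 x 3072 photo embedded at full resolution.
const photo: Size = { width: 4096, height: 3072 }
for (const [level, maxWidth] of [['low', 2048], ['medium', 1400], ['high', 1024]] as const) {
  const t = targetSize(photo, maxWidth)
  const kept = (pixelsKept(photo, maxWidth) * 100).toFixed(1)
  console.log(`${level}: ${t.width} x ${t.height}, ${kept}% of the original pixels`)
}
```

At the high preset this photo becomes 1024 x 768, one sixteenth of the original pixel count, before the JPEG quality setting is applied at all.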

Compressing Images with Canvas

The core trick: extract each image from the PDF, draw it onto a Canvas at a smaller size or lower quality, then embed it back.

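async function compressImageBytes(
  imageBytes: Uint8Array,
  quality: number,
  maxWidth: number
): Promise<Uint8Array> {
  // createImageBitmap only accepts complete image files (JPEG, PNG, ...)
  const bitmap = await createImageBitmap(new Blob([imageBytes]))

  // Scale down only if the image exceeds maxWidth
  const scale = Math.min(1, maxWidth / bitmap.width)
  const width = Math.max(1, Math.floor(bitmap.width * scale))
  const height = Math.max(1, Math.floor(bitmap.height * scale))

  const canvas = document.createElement('canvas')
  canvas.width = width
  canvas.height = height

  const ctx = canvas.getContext('2d')
  if (!ctx) throw new Error('2D canvas context is unavailable')

  // JPEG has no alpha channel, so paint a white background first
  ctx.fillStyle = '#fff'
  ctx.fillRect(0, 0, width, height)
  ctx.drawImage(bitmap, 0, 0, width, height)
  bitmap.close() // release the decoded pixels

  // Re-encode as JPEG at the target quality; toBlob passes null on failure
  const compressedBlob = await new Promise<Blob>((resolve, reject) => {
    canvas.toBlob(
      (b) => (b ? resolve(b) : reject(new Error('JPEG encoding failed'))),
      'image/jpeg',
      quality
    )
  })

  return new Uint8Array(await compressedBlob.arrayBuffer())
}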

A few things worth noting:

createImageBitmap decodes the image efficiently in the browser, off the main thread when possible
We scale down only if the image exceeds maxWidth — preserving aspect ratio
JPEG quality lets the user pick their compression level
We convert to JPEG regardless of original format. That suits photos and scans, which benefit from lossy compression; images with transparency or sharp-edged line art fare worse, because JPEG has no alpha channel

Walking the PDF to Find Images

pdf-lib doesn't have a high-level "compress all images" API. You need to walk each page's resources and pick out the image XObjects yourself.

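import { PDFDocument, PDFDict, PDFName, PDFRawStream, PDFRef } from 'pdf-lib'

async function compressPdfImages(
  pdfDoc: PDFDocument,
  quality: number,
  maxWidth: number
): Promise<void> {
  // Images shared between pages are re-encoded once and then reused
  const replaced = new Map<PDFRef, PDFRef>()

  for (const page of pdfDoc.getPages()) {
    const resources = page.node.Resources()
    if (!resources) continue

    const xObjects = resources.lookupMaybe(PDFName.of('XObject'), PDFDict)
    if (!xObjects) continue

    for (const [name, value] of xObjects.entries()) {
      if (!(value instanceof PDFRef)) continue

      const alreadyDone = replaced.get(value)
      if (alreadyDone) {
        xObjects.set(name, alreadyDone)
        continue
      }

      const stream = pdfDoc.context.lookup(value)
      if (!(stream instanceof PDFRawStream)) continue
      if (stream.dict.get(PDFName.of('Subtype')) !== PDFName.of('Image')) continue

      // Only DCTDecode streams hold a complete JPEG file the browser can decode.
      // Flate-encoded images are raw pixel data and need separate handling.
      if (stream.dict.get(PDFName.of('Filter')) !== PDFName.of('DCTDecode')) continue

      // Skip images with masks; replacing them would drop the transparency
      if (stream.dict.has(PDFName.of('SMask')) || stream.dict.has(PDFName.of('Mask'))) continue

      const originalBytes = stream.getContents()
      const compressedBytes = await compressImageBytes(originalBytes, quality, maxWidth)

      // Keep the original when re-encoding does not actually save space
      if (compressedBytes.length >= originalBytes.length) continue

      const embeddedImage = await pdfDoc.embedJpg(compressedBytes)
      xObjects.set(name, embeddedImage.ref)
      replaced.set(value, embeddedImage.ref)

      // pdf-lib writes every object in the context, so remove the old stream.
      // This assumes the image is only referenced from page-level XObject dictionaries.
      pdfDoc.context.delete(value)
    }
  }
}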

This is a simplified version. Production code handles:

PNG-sourced images — PDFs store these as Flate-compressed pixel data rather than as PNG files, so they have to be detected and decoded separately
CMYK color spaces — Common in scanned documents from printers
Image masks — Transparency overlays that complicate replacement
Tiled images — Large images split across multiple stream objects
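
Format detection, the first item in that list, can start with a check of the leading bytes of each image stream. This is a minimal sketch rather than the production code: `sniffImageFormat` is a hypothetical helper that only recognizes the standard JPEG and PNG file signatures. Anything it reports as unknown, such as the raw pixel data in a Flate-encoded image stream, cannot be passed to createImageBitmap directly.

```typescript
type ImageFormat = 'jpeg' | 'png' | 'unknown'

// Hypothetical helper: identifies a complete image file by its signature bytes.
function sniffImageFormat(bytes: Uint8Array): ImageFormat {
  const startsWith = (signature: number[]) =>
    bytes.length >= signature.length &&
    signature.every((byte, i) => bytes[i] === byte)

  // JPEG files begin with the SOI marker FF D8 followed by another marker (FF ..)
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpeg'

  // PNG files begin with a fixed 8-byte signature
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png'

  return 'unknown'
}
```

A caller could skip or specially handle every stream where the result is unknown instead of letting the decode step throw.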

Saving the Compressed PDF

const pdfBytes = await pdfDoc.save({
  useObjectStreams: true,
  addDefaultPage: false,
})

Two options matter here:

useObjectStreams: true — Packs objects more efficiently, reducing file size by 5–15%. This is already pdf-lib's default; setting it explicitly documents the intent
addDefaultPage: false — Prevents pdf-lib from adding an empty page if the input had none

The UI: Before/After Comparison

Users need to trust the result. Showing before/after file size is critical:

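<script setup lang="ts">
import { ref } from 'vue'
import { PDFDocument } from 'pdf-lib'
// getCompressionSettings and compressPdfImages are the functions shown earlier in this post

const file = ref<File | null>(null)
const level = ref<'low' | 'medium' | 'high'>('medium')
const processing = ref(false)
const originalSize = ref(0)
const compressedSize = ref(0)

// Called from the file input's change event
function handleFile(event: Event) {
  file.value = (event.target as HTMLInputElement).files?.[0] ?? null
  originalSize.value = file.value?.size ?? 0
  compressedSize.value = 0 // hide the previous result
}

async function handleCompress() {
  if (!file.value) return
  processing.value = true

  try {
    const bytes = await file.value.arrayBuffer()
    const pdfDoc = await PDFDocument.load(bytes)

    const settings = getCompressionSettings(level.value)
    await compressPdfImages(pdfDoc, settings.jpegQuality, settings.maxWidth)

    const compressed = await pdfDoc.save({
      useObjectStreams: true,
      addDefaultPage: false,
    })

    originalSize.value = file.value.size
    compressedSize.value = compressed.byteLength

    downloadBlob(new Blob([compressed], { type: 'application/pdf' }), 'compressed.pdf')
  } catch (e) {
    console.error(e)
    alert('Compression failed. Check the file.')
  } finally {
    processing.value = false
  }
}

function downloadBlob(blob: Blob, filename: string) {
  const url = URL.createObjectURL(blob)
  const a = document.createElement('a')
  a.href = url
  a.download = filename
  a.click()
  // Revoke on a later tick so the browser has time to start the download
  setTimeout(() => URL.revokeObjectURL(url), 0)
}

// Used by the template to display sizes
function formatSize(bytes: number): string {
  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`
  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`
}
</script>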

<template>
  <div class="compression-ui">
    <input type="file" accept=".pdf" @change="handleFile" />

    <div class="levels">
      <button :class="{ active: level === 'low' }" @click="level = 'low'">
        Low — Best quality
      </button>
      <button :class="{ active: level === 'medium' }" @click="level = 'medium'">
        Medium — Recommended
      </button>
      <button :class="{ active: level === 'high' }" @click="level = 'high'">
        High — Smallest file
      </button>
    </div>

    <button @click="handleCompress" :disabled="processing || !file">
      {{ processing ? 'Compressing...' : 'Compress PDF' }}
    </button>

    <div v-if="originalSize && compressedSize" class="results">
      <span>{{ formatSize(originalSize) }}</span>
      <span>→</span>
      <span>{{ formatSize(compressedSize) }}</span>
      <span class="saved">({{ Math.round((1 - compressedSize / originalSize) * 100) }}% smaller)</span>
    </div>
  </div>
</template>
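
One edge case in the percentage shown above: if the output is not smaller than the input, the expression goes to zero or negative. The sketch below is one possible way to guard against that, not what the live tool does. `summarizeCompression` and its return shape are hypothetical names chosen for illustration.

```typescript
interface CompressionSummary {
  percentSaved: number // whole percent, never negative
  keepOriginal: boolean // true when the output is not smaller than the input
}

// Hypothetical helper: turns the two sizes into something safe to display.
function summarizeCompression(
  originalSize: number,
  compressedSize: number
): CompressionSummary {
  // No savings (or no valid input): recommend keeping the original file
  if (originalSize <= 0 || compressedSize >= originalSize) {
    return { percentSaved: 0, keepOriginal: true }
  }
  return {
    percentSaved: Math.round((1 - compressedSize / originalSize) * 100),
    keepOriginal: false,
  }
}
```

The template could then show the "% smaller" label only when keepOriginal is false, and offer the untouched file otherwise.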

Lessons Learned

Image extraction is messy. PDFs store images in wild ways — rotated, inverted, with masks, in CMYK, split into tiles. Handling every case perfectly is harder than it sounds. The production code is roughly 3× longer than the simplified version above.

JPEG re-encoding is the big win. For scan-heavy PDFs, re-encoding images at 70% quality can cut file size by 80% with barely visible quality loss. This single technique handles 90% of compression requests.

Text-only PDFs don't compress much. If a PDF is mostly text and already optimized with object streams, there's little a browser tool can do. 5–10% reduction is realistic.

Always let users pick the balance. Different documents need different compression. A text-heavy report can handle aggressive compression. A portfolio with screenshots needs a lighter touch. Three presets cover most cases.

Show before/after size. Users need to trust the result. Showing the original and compressed file sizes side by side builds confidence.

Try It

The compression tool is live at en.sotool.top/compress/.

Free, no signup, nothing uploads to a server.

Full source is on GitHub. The compression logic is in src/views/Compress.vue.

Want More Advanced PDF Tools?

If you need OCR, form editing, digital signatures, or batch processing hundreds of files, Wondershare PDFelement is a solid desktop option. It keeps everything local.

This post contains affiliate links.

Have you built client-side PDF tools? What did you use — pdf-lib, PDF.js, or something else?

DEV Community