DEV Community

monkeymore studio

Compressing PDFs in the Browser: A WebAssembly-Powered Solution

Introduction

PDF files often contain large images, embedded fonts, and uncompressed data that can make them unnecessarily bulky. Traditional PDF compression services require uploading files to a server, which raises privacy concerns and creates dependency on network connectivity.

In this article, we'll explore how we built a pure client-side PDF compression tool that runs entirely in the browser using WebAssembly. By leveraging Ghostscript compiled to WASM and running it in a Web Worker, we achieved professional-grade PDF compression without ever sending user files to a server.

Why Browser-Based PDF Compression?

1. Privacy First

Your PDF files never leave your device. This is essential for sensitive documents like contracts, medical records, or financial statements.

2. No Server Infrastructure

Once deployed, the application runs entirely on the client side. No server costs for PDF processing operations.

3. Instant Processing

No upload/download delays. Compression happens immediately on your device.

4. Works Offline

After the initial page load, the tool works without internet connectivity.

5. No Server-Side Size Limits

Process large PDFs without server upload limits or API quotas. The practical ceiling is your device's memory and the WebAssembly heap available to the browser.

Architecture Overview

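Our implementation uses a multi-layered architecture: a React component (compress.tsx) drives a hook (usegs.ts), which talks through Comlink to a Web Worker (gs.worker.js) that runs Ghostscript compiled to WebAssembly.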

Core Components

1. The Main Component: compress.tsx

The entry point is a clean React component that orchestrates the compression workflow:

"use client";

import { useState } from "react";
import { useGs } from "@/hooks/usegs";
import { useTranslations } from "next-intl";
import { autoDownloadBlob } from "@/utils/pdf";
import { PdfPage } from "@/app/[locale]/_components/pdfpage";

export const Compress = () => {
  const [files, setFiles] = useState<File[]>([]);
  const { compress, loading } = useGs();
  const t = useTranslations("Compress");

  // Compress the first selected file and trigger a download of the result
  const compressInMain = async () => {
    const file = files[0];
    if (!file) return; // nothing selected yet

    const outputFile = await compress(file);
    console.log("ArrayBuffer result received by the page", outputFile?.byteLength);

    if (outputFile) {
      autoDownloadBlob(
        new Blob([outputFile], { type: "application/pdf" }),
        "compress.pdf"
      );
    }
  };

  const onPdfFiles = (files: File[]) => {
    console.log("File count or order changed");
    files.forEach((e) => console.log(e.name));
    setFiles(files);
  };

  return (
    <PdfPage
      title={t("title")}
      onFiles={onPdfFiles}
      desp={t("desp")}
      process={compressInMain}
      loading={loading}
    >
      <div>{t("title")}</div>
    </PdfPage>
  );
};

Key Design Decisions:

  1. Loading State Management: The hook provides a loading state that the UI uses to show processing indicators
  2. Single File Focus: Currently optimized for single file compression
  3. Automatic Download: Uses autoDownloadBlob utility for seamless file downloads

2. The Hook: usegs.ts

This hook manages the Web Worker lifecycle and provides a clean API for compression operations:

import { useRef, useEffect, useState } from "react";
import * as Comlink from "comlink";
import GsWorker from "worker-loader!./gs.worker.js";

interface WorkerFunctions {
  init: () => Promise<void>;
  compress: (file: File) => Promise<ArrayBuffer | null>;
}

export const useGs = () => {
  // The ref starts as null, so null must be part of its type to stay mutable
  const workerRef = useRef<Comlink.Remote<WorkerFunctions> | null>(null);
  const [loading, setLoading] = useState<boolean>(false);

  useEffect(() => {
    const worker = new GsWorker();

    worker.onerror = (error) => {
      console.error("Worker error:", error);
    };

    worker.onmessageerror = (event) => {
      console.error("Worker message error:", event);
    };

    // Use Comlink to create a proxy for the worker
    const remote = Comlink.wrap<WorkerFunctions>(worker);
    workerRef.current = remote;

    console.log("init in main");
    remote.init().catch((e) => console.error("Worker init failed:", e));

    // The cleanup must be returned from the effect itself (not from an
    // async helper), otherwise React never terminates the worker
    return () => {
      workerRef.current = null;
      worker.terminate();
    };
  }, []);

  const compress = async (file: File): Promise<ArrayBuffer | null> => {
    console.log("Calling compress on the main thread", workerRef.current);

    if (!workerRef.current) return null;

    setLoading(true);
    try {
      const r = await workerRef.current.compress(file);
      console.log("compress returned on the main thread", r);
      return r;
    } finally {
      // Reset the flag even if the worker call rejects
      setLoading(false);
    }
  };

  return { compress, loading };
};

Why Comlink?

Comlink is a library from Google Chrome Labs that makes Web Workers enjoyable. Instead of dealing with low-level postMessage APIs and maintaining message type mappings, Comlink allows you to use the worker as if it were a regular JavaScript object:

// Without Comlink - verbose and error-prone
worker.postMessage({ type: 'compress', file: buffer });
worker.onmessage = (e) => {
  if (e.data.type === 'result') {
    // handle result
  }
};

// With Comlink - clean and type-safe
const result = await workerRef.current.compress(file);
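
To make the comparison concrete, the sketch below shows roughly the kind of request/response bookkeeping that a library like Comlink takes off your hands. It is not Comlink's actual implementation or wire format; `RpcClient`, `RpcRequest`, and `RpcResponse` are hypothetical names used only for illustration.

```typescript
// Hypothetical message shapes; Comlink's real wire format is different.
type RpcRequest = { id: number; method: string; args: unknown[] };
type RpcResponse = { id: number; result?: unknown; error?: string };

type Pending = {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
};

class RpcClient {
  private nextId = 1;
  private pending = new Map<number, Pending>();
  private post: (msg: RpcRequest) => void;

  // `post` would typically be (msg) => worker.postMessage(msg)
  constructor(post: (msg: RpcRequest) => void) {
    this.post = post;
  }

  // Send a request and return a promise that settles when the reply arrives
  call(method: string, ...args: unknown[]): Promise<unknown> {
    const id = this.nextId++;
    return new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.post({ id, method, args });
    });
  }

  // Wire this to worker.onmessage; returns false for unknown ids
  handleResponse(res: RpcResponse): boolean {
    const entry = this.pending.get(res.id);
    if (!entry) return false;
    this.pending.delete(res.id);
    if (res.error !== undefined) {
      entry.reject(new Error(res.error));
    } else {
      entry.resolve(res.result);
    }
    return true;
  }

  // Number of calls still waiting for a reply
  get pendingCount(): number {
    return this.pending.size;
  }
}
```

Every call needs a unique id, a pending-promise table, and error plumbing. With Comlink, all of that sits behind the proxy object returned by `Comlink.wrap`.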

3. The Worker: gs.worker.js

This is where the heavy lifting happens. The worker encapsulates Ghostscript WebAssembly and handles PDF compression:

Initializing the WASM Module

async function init() {
  try {
    console.log("onload");

    // Initialize with version check
    const args = ["-v"];

    console.log("Ghostscript args:", args);

    const Module = {
      onRuntimeInitialized: function () {
        console.log("wasm loaded");
      },
      arguments: args,
      print: function (text) {
        console.log("GS:", text);
      },
      printErr: function (text) {
        console.error("GS Error:", text);
      },
      totalDependencies: 0,
      noExitRuntime: 1,
    };

    self.Module = Module;
    loadScript(); // Loads gs-worker.js which contains the WASM
  } catch (e) {
    console.error("Error in init:", e);
  }
}
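
As written, init() returns as soon as the script load has been started, so a caller awaiting it cannot tell when the WASM runtime is actually ready. One possible refinement, sketched here under assumptions, is to turn Emscripten's `onRuntimeInitialized` hook into a promise. `createModuleConfig` and `GsModuleConfig` are hypothetical names, and the script loading itself (the article's loadScript()) is left out.

```typescript
// Subset of the Emscripten Module options used in the article
interface GsModuleConfig {
  arguments: string[];
  noExitRuntime: boolean;
  print: (text: string) => void;
  printErr: (text: string) => void;
  onRuntimeInitialized: () => void;
}

// Hypothetical helper: builds the config plus a promise that resolves once
// Emscripten calls onRuntimeInitialized.
function createModuleConfig(args: string[]): {
  config: GsModuleConfig;
  ready: Promise<void>;
} {
  let markReady: () => void = () => {};
  const ready = new Promise<void>((resolve) => {
    markReady = () => resolve();
  });

  const config: GsModuleConfig = {
    arguments: args,
    noExitRuntime: true,
    print: (text) => console.log("GS:", text),
    printErr: (text) => console.error("GS Error:", text),
    onRuntimeInitialized: () => markReady(),
  };

  return { config, ready };
}

// Inside the worker, init() could then look like this (assumed usage):
//   const { config, ready } = createModuleConfig(["-v"]);
//   self.Module = config;
//   loadScript();   // the article's loader for the Ghostscript glue script
//   await ready;    // resolves only after the WASM runtime is up
```

With this shape, `await workerRef.current.init()` on the main thread would only resolve once Ghostscript can accept work.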

The Compression Algorithm

function compressInternal(dataStruct, responseCallback) {
  try {
    const {
      operation,
      customCommand,
      pdfSetting,
      files,
      advancedSettings,
      showTerminalOutput,
      showProgressBar,
    } = dataStruct;

    try {
      console.log("onload");

      // Build Ghostscript arguments
      let args = [];

      if (customCommand && customCommand.trim()) {
        // Parse custom command for advanced users
        args = parseCommandArgs(customCommand.trim());
        validateArgs(args, operation);
      } else {
        // Use predefined compression settings
        args = [
          "-sDEVICE=pdfwrite",           // Output device: PDF
          "-dCompatibilityLevel=1.4",    // PDF 1.4 compatibility
          "-dNOPAUSE",                   // Don't pause between pages
          "-dBATCH",                     // Exit after processing
          "-sOutputFile=output.pdf",     // Output filename
        ];

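        if (files.length === 0) {
          // Nothing to compress: report it instead of running Ghostscript
          // against an input file that was never provided
          responseCallback({ error: "No input file provided" });
          return;
        }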

        // Add QUIET mode unless showing output
        if (!showTerminalOutput && !showProgressBar) {
          args.splice(4, 0, "-dQUIET");
        }

        // Apply compression preset
        if (operation === "compress" && pdfSetting) {
          args.splice(2, 0, `-dPDFSETTINGS=${pdfSetting}`);
        }

        // Apply advanced settings if provided
        if (advancedSettings) {
          args = buildAdvancedArgs(advancedSettings, args);
        }

        args.push("input.pdf");
      }

      console.log("Ghostscript args:", args);

      // Pre-run: Write input file to virtual filesystem.
      // A failure here throws, skips the Ghostscript run, and is reported
      // once by the surrounding catch block.
      const preRun = function () {
        console.log("prerun", self);
        self.Module.FS.writeFile("input.pdf", new Uint8Array(files[0]));
      };

      // Post-run: Read output and cleanup
      const postRun = function () {
        try {
          var uarray = self.Module.FS.readFile("output.pdf");
          responseCallback({ data: uarray });

          // Cleanup filesystem
          try {
            self.Module.FS.unlink("input.pdf");
            self.Module.FS.unlink("output.pdf");
          } catch (cleanupError) {
            console.warn("Cleanup warning:", cleanupError);
          }
        } catch (e) {
          console.error("Error reading output file:", e);
          responseCallback({
            error: "Failed to generate output file: " + e.message,
          });
        }
      };

      preRun();
      self.Module["callMain2"](args);
      postRun();
    } catch (e) {
      console.error("Error in processing:", e);
      responseCallback({ error: "Processing error: " + e.message });
    }
  } catch (e) {
    console.error("Error in compressInternal:", e);
    responseCallback({ error: "Initialization error: " + e.message });
  }
}
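
The default branch above relies on positional `splice` calls, which are easy to break when a flag is added or removed. As an illustration only, the same flags can be assembled in order by a small pure function. `buildCompressArgs`, `CompressOptions`, and `PdfPreset` are hypothetical names; the flags and their order are the ones the worker code produces.

```typescript
type PdfPreset = "/screen" | "/ebook" | "/printer" | "/prepress" | "/default";

interface CompressOptions {
  preset?: PdfPreset;          // maps to -dPDFSETTINGS
  quiet?: boolean;             // maps to -dQUIET, on by default
  compatibilityLevel?: string; // maps to -dCompatibilityLevel
}

// Builds the Ghostscript argument list front to back, without index math
function buildCompressArgs(options: CompressOptions = {}): string[] {
  const { preset, quiet = true, compatibilityLevel = "1.4" } = options;

  const args: string[] = [
    "-sDEVICE=pdfwrite",
    `-dCompatibilityLevel=${compatibilityLevel}`,
  ];
  if (preset) args.push(`-dPDFSETTINGS=${preset}`);
  args.push("-dNOPAUSE", "-dBATCH");
  if (quiet) args.push("-dQUIET");
  args.push("-sOutputFile=output.pdf", "input.pdf");
  return args;
}

console.log(buildCompressArgs({ preset: "/ebook" }).join(" "));
```

Because the function has no side effects, it can be unit-tested without loading the WASM module.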

4. Compression Presets

Ghostscript provides several predefined compression profiles:

/*
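/screen: highest compression, for on-screen viewing (low resolution, about 72 dpi)
/ebook: balances compression and quality, for e-books (medium resolution, about 150 dpi)
/printer: higher quality, for printing (high resolution, about 300 dpi)
/prepress: highest quality, for prepress (about 300 dpi, preserves color information)
/default: default settings, a general-purpose balance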
*/

These presets control:

  • Image resolution downsampling
  • Compression algorithms (JPEG, Flate, etc.)
  • Font embedding strategies
  • Color space conversions

5. Advanced Settings Builder

For power users, we provide fine-grained control over compression parameters:

function buildAdvancedArgs(advancedSettings, baseArgs) {
  let args = [...baseArgs];

  if (!advancedSettings) {
    return args;
  }

  // Set PDF compatibility level (only when one was provided, otherwise
  // the flag would become "-dCompatibilityLevel=undefined")
  if (advancedSettings.compatibilityLevel !== undefined) {
    const compatArg =
      `-dCompatibilityLevel=${advancedSettings.compatibilityLevel}`;
    const compatIndex = args.findIndex((arg) =>
      arg.startsWith("-dCompatibilityLevel=")
    );
    if (compatIndex >= 0) {
      args[compatIndex] = compatArg;
    } else {
      args.splice(2, 0, compatArg);
    }
  }

  // Color image settings
  if (advancedSettings.colorImageSettings) {
    const colorSettings = advancedSettings.colorImageSettings;

    // Add downsample setting
    if (colorSettings.downsample !== undefined) {
      args.splice(-1, 0, `-dDownsampleColorImages=${colorSettings.downsample}`);
    }

    // Add resolution if downsampling enabled
    if (colorSettings.downsample && colorSettings.resolution) {
      args.splice(-1, 0, `-dColorImageResolution=${colorSettings.resolution}`);
    }
  }

  return args;
}
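
The builder above covers color images only. Ghostscript's pdfwrite device has matching parameters for grayscale and monochrome images (`-dDownsampleGrayImages`, `-dGrayImageResolution`, `-dDownsampleMonoImages`, `-dMonoImageResolution`), so the same pattern could be generalized. The following is a sketch under that assumption; `ImageSettings`, `imageArgs`, and `allImageArgs` are hypothetical names that are not part of the article's code.

```typescript
interface ImageSettings {
  downsample?: boolean;
  resolution?: number; // target resolution in DPI
}

type ImageKind = "Color" | "Gray" | "Mono";

// Flags for one image kind, mirroring the color-image logic in the article
function imageArgs(kind: ImageKind, settings?: ImageSettings): string[] {
  if (!settings) return [];
  const args: string[] = [];
  if (settings.downsample !== undefined) {
    args.push(`-dDownsample${kind}Images=${settings.downsample}`);
  }
  // A resolution only makes sense when downsampling is enabled
  if (settings.downsample && settings.resolution) {
    args.push(`-d${kind}ImageResolution=${settings.resolution}`);
  }
  return args;
}

// Flags for every image kind that has settings, in a fixed order
function allImageArgs(
  settings: Partial<Record<ImageKind, ImageSettings>>
): string[] {
  const kinds: ImageKind[] = ["Color", "Gray", "Mono"];
  const args: string[] = [];
  for (const kind of kinds) {
    args.push(...imageArgs(kind, settings[kind]));
  }
  return args;
}

console.log(
  allImageArgs({
    Color: { downsample: true, resolution: 150 },
    Gray: { downsample: false },
  })
);
```

The returned flags would still need to be placed before the input file in the final argument list.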

6. Command Argument Parser

For users who want to provide custom Ghostscript commands:

function parseCommandArgs(commandStr) {
  const args = [];
  let current = "";
  let inQuotes = false;
  let quoteChar = "";

  for (let i = 0; i < commandStr.length; i++) {
    const char = commandStr[i];

    if ((char === '"' || char === "'") && !inQuotes) {
      inQuotes = true;
      quoteChar = char;
    } else if (char === quoteChar && inQuotes) {
      inQuotes = false;
      quoteChar = "";
    } else if (char === " " && !inQuotes) {
      if (current.trim()) {
        args.push(current.trim());
        current = "";
      }
    } else {
      current += char;
    }
  }

  if (current.trim()) {
    args.push(current.trim());
  }

  return args;
}

This handles complex commands with quoted arguments like:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sOutputFile="output file.pdf"
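
One detail worth noting: the example command starts with `gs`, and a tokenizer returns that as the first argument. Since the arguments are handed straight to the WASM entry point, the program name most likely needs to be dropped first. The sketch below shows a compact tokenizer with the same quoting behavior plus such a step; `tokenize` and `stripProgramName` are hypothetical helpers, not the article's code.

```typescript
// Splits a command line on whitespace, keeping quoted spans together
function tokenize(command: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let quote: string | null = null;

  for (let i = 0; i < command.length; i++) {
    const ch = command.charAt(i);
    if (quote !== null) {
      // Inside quotes: only the matching quote character ends the span
      if (ch === quote) quote = null;
      else current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === " " || ch === "\t") {
      if (current) {
        tokens.push(current);
        current = "";
      }
    } else {
      current += ch;
    }
  }
  if (current) tokens.push(current);
  return tokens;
}

// Drops a leading "gs" so only real arguments reach Ghostscript
function stripProgramName(tokens: string[]): string[] {
  return tokens[0] === "gs" ? tokens.slice(1) : tokens;
}

const parsed = stripProgramName(
  tokenize('gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sOutputFile="output file.pdf"')
);
console.log(parsed);
```

The quoted filename survives as a single argument, `-sOutputFile=output file.pdf`, with the quotes removed.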

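Complete Data Flow

The user selects a PDF, the component calls compress() from the hook, and Comlink forwards the call to the worker. The worker writes the bytes to input.pdf in the virtual filesystem, runs Ghostscript, reads output.pdf back, and returns the result to the main thread, where autoDownloadBlob saves it.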

WebAssembly Integration Architecture

Ghostscript Compression Commands Explained

The worker includes detailed comments about various compression strategies:

/**
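 * All of the commands below can compress a PDF, but they work poorly on text-only PDFs (the output may even grow); image-heavy PDFs shrink the most.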
 * 
 * # Basic compression for screen viewing (highest compression)
 * gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen 
 *    -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
 * 
 * # High quality prepress (lossless Flate image filters)
 * gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/prepress 
 *    -dColorImageFilter=/FlateEncode -dGrayImageFilter=/FlateEncode 
 *    -dMonoImageFilter=/FlateEncode -dNOPAUSE -dQUIET -dBATCH 
 *    -sOutputFile=output.pdf input.pdf
 * 
 * # Force Flate compression on all images
 * gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH 
 *    -dAutoFilterColorImages=false -dAutoFilterGrayImages=false 
 *    -dAutoFilterMonoImages=false -dColorImageFilter=/FlateEncode 
 *    -dGrayImageFilter=/FlateEncode -dMonoImageFilter=/FlateEncode 
 *    -sOutputFile=output.pdf input.pdf
 * 
 * # Optimize fonts
 * gs -sDEVICE=pdfwrite -dEmbedAllFonts=true -dSubsetFonts=true 
 *    -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
 * 
 * # Alternative: Using qpdf for linearization
 * qpdf --linearize --object-streams=generate input.pdf output.pdf
 * qpdf --compress-streams=y --object-streams=preserve input.pdf output.pdf
 */

Performance Considerations

Memory Management

  • Virtual File System: Ghostscript WASM uses an in-memory filesystem. We clean up files immediately after processing
  • ArrayBuffer Transfer: Large files are handled as ArrayBuffers to minimize copies
  • Worker Lifecycle: The worker persists across compressions to avoid WASM reload overhead
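
Whether a buffer is copied or moved between threads depends on whether it is listed as a transferable. By default, postMessage (and therefore a plain Comlink call) structured-clones its arguments, which copies them; Comlink offers `Comlink.transfer(value, [transferables])` to mark a value for transfer instead. The sketch below demonstrates the underlying semantics with the standard `structuredClone` function so it can run without a worker. `moveBuffer` is a hypothetical helper.

```typescript
// Moves a buffer instead of copying it, the same way a transferred
// postMessage argument behaves
function moveBuffer(buffer: ArrayBuffer): ArrayBuffer {
  return structuredClone(buffer, { transfer: [buffer] });
}

const original = new ArrayBuffer(4);
new Uint8Array(original).set([37, 80, 68, 70]); // the bytes of "%PDF"

const moved = moveBuffer(original);

console.log(original.byteLength); // 0: the source buffer is detached
console.log(moved.byteLength);    // 4: the data now lives in the new buffer
```

After a transfer the sender can no longer read the buffer, so the worker should only transfer results it does not need to keep.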

Optimization Strategies

  1. Lazy Loading: The WASM module is only loaded when first needed
  2. Worker Persistence: Keep the worker alive for multiple operations
  3. Streaming Output: For very large files, consider chunked processing
  4. Progress Indicators: Show compression progress for large documents

Error Handling & Validation

The implementation validates custom commands before handing them to Ghostscript:

function validateArgs(args, operation) {
  if (!args || args.length === 0) {
    throw new Error("No arguments provided");
  }

  // Check for required Ghostscript parameters
  const hasDevice = args.some((arg) => arg.startsWith("-sDEVICE="));
  const hasOutput = args.some((arg) => arg.startsWith("-sOutputFile="));

  if (!hasDevice) {
    throw new Error("Missing -sDEVICE parameter in command");
  }

  if (!hasOutput) {
    throw new Error("Missing -sOutputFile parameter in command");
  }

  return true;
}
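
The check above confirms that the flags exist, but the worker always writes input.pdf and reads output.pdf, so a custom command that names other files would pass validation and still fail later. A stricter check could look like the following sketch; `checkVirtualPaths` is a hypothetical addition, and the two path constants simply repeat the filenames used in the worker code.

```typescript
const INPUT_PATH = "input.pdf";
const OUTPUT_PATH = "output.pdf";
const OUTPUT_FLAG = "-sOutputFile=";

// Returns a list of problems; an empty list means the paths line up
// with what the worker writes and reads
function checkVirtualPaths(args: string[]): string[] {
  const problems: string[] = [];

  const output = args.find((arg) => arg.startsWith(OUTPUT_FLAG));
  if (output !== undefined && output !== OUTPUT_FLAG + OUTPUT_PATH) {
    const actual = output.slice(OUTPUT_FLAG.length);
    problems.push(`Output file must be ${OUTPUT_PATH}, got ${actual}`);
  }

  if (args.indexOf(INPUT_PATH) === -1) {
    problems.push(`Command must read ${INPUT_PATH}`);
  }

  return problems;
}

console.log(
  checkVirtualPaths(["-sDEVICE=pdfwrite", "-sOutputFile=out.pdf", "in.pdf"])
);
```

Returning a list instead of throwing lets the UI show every problem at once.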

Browser Compatibility

Our implementation works in all modern browsers:

  • Chrome/Edge: Full support for WebAssembly and Web Workers
  • Firefox: Full support with excellent WASM performance
  • Safari: Full support (iOS 13+, macOS 10.15+)
  • Mobile: Works on iOS Safari and Android Chrome

Requirements:

  • WebAssembly support (all modern browsers)
  • Web Workers support
  • ES6 module support (for Comlink)

Security Considerations

  1. CSP Compliance: Dynamic script loading requires proper Content Security Policy
  2. Isolated Worker: PDF processing happens in an isolated worker thread
  3. No External Requests: Once loaded, the application works offline
  4. Memory Safety: WASM provides memory isolation from the main JavaScript context

Build Configuration

To use worker-loader with Comlink, your webpack configuration needs:

module.exports = {
  module: {
    rules: [
      {
        test: /\.worker\.js$/,
        use: { loader: 'worker-loader' }
      }
    ]
  }
};

And TypeScript declarations:

declare module "worker-loader!*" {
  const Worker: new (options?: WorkerOptions) => Worker;
  export default Worker;
}

Conclusion

We've demonstrated how to build a professional-grade PDF compression tool that runs entirely in the browser. It combines:

  • Ghostscript WebAssembly for PDF processing
  • Web Workers for non-blocking execution
  • Comlink for elegant worker communication
  • Virtual File System for file operations

We created a solution that offers:

  • 🔒 Complete privacy - files never leave the device
  • Instant processing - no upload/download delays
  • 🌍 Offline capable - works without internet
  • 💰 Zero server costs - pure client-side processing
  • 🎛️ Professional quality - industry-standard Ghostscript engine

Try It Yourself

Ready to compress your PDFs? Visit our online tool:

👉 Compress PDF Online

Our tool is completely free, requires no registration, and processes everything locally in your browser. Reduce your PDF file size instantly while keeping your documents private and secure.


Built with ❤️ using Next.js, Ghostscript WebAssembly, and modern web technologies.
