DEV Community

Max/Wang

Browser-based DOCX to PDF converter using Pandoc and Typst (WASM)

Hi friends,

I built a Word-to-PDF converter that runs 100% in your browser. No files ever touch a server.

Link: https://toolkuai.com/word-to-pdf

The Stack

The challenge was handling the complex layout of .docx files without a heavy backend. I settled on a "double-engine" approach using WebAssembly:

  • Pandoc (WASM): Parses the DOCX structure and converts it into Typst markup. A WASI shim lets Pandoc's WASM build (compiled from Haskell) run in the browser.

  • Typst (WASM): Instead of a heavy TeX engine, Typst's modern rendering engine compiles the intermediate markup into a polished PDF.
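
As a rough sketch (not the actual code behind the tool), the two engines can be modelled as two async stages chained together. The type names and function signatures below are hypothetical placeholders. They are not the real Pandoc or Typst WASM APIs, which depend on the bindings you pick.

```typescript
// Hypothetical stage signatures: placeholders, NOT the real Pandoc/Typst WASM APIs.
type MediaFiles = Map<string, Uint8Array>;

interface TypstDocument {
  source: string;    // Typst markup produced by the first stage
  media: MediaFiles; // extracted images, keyed by the path the markup refers to
}

// Stage 1 (assumed to wrap Pandoc): DOCX bytes -> Typst markup plus media.
type DocxToTypst = (docx: Uint8Array) => Promise<TypstDocument>;

// Stage 2 (assumed to wrap Typst): Typst markup plus media -> PDF bytes.
type TypstToPdf = (doc: TypstDocument) => Promise<Uint8Array>;

// Chain the two engines. Both are injected, so the pipeline itself
// does not need to know how either WASM module is loaded.
async function convertDocxToPdf(
  docx: Uint8Array,
  toTypst: DocxToTypst,
  toPdf: TypstToPdf,
): Promise<Uint8Array> {
  if (docx.length === 0) {
    throw new Error("empty DOCX input");
  }
  const intermediate = await toTypst(docx);
  return toPdf(intermediate);
}
```

Injecting the stages keeps the pipeline testable with stub engines and makes it easy to swap either side later.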

Technical Highlights

  • Media Extraction: The tool extracts images from the DOCX file via Pandoc’s --extract-media, maps them into a virtual file system (WASI), and then passes them to Typst for final rendering.

  • Custom Fonts: To support multilingual text (especially CJK), I’m side-loading a custom font (GenYoGothic) into the Typst compiler within the worker.

  • Svelte 5: The UI is built with Svelte 5, using its new runes ($state, $derived) to keep the file queue snappy and reactive.

  • Web Workers: All the heavy lifting (WASM instantiation and conversion) happens in a dedicated worker to keep the UI at a buttery 60fps.
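
To make the media step more concrete, here is a minimal sketch of what it might look like inside the worker. The Pandoc flags (--from, --to, --extract-media, --output) are real CLI options. The assumptions are that the WASM build accepts CLI-style arguments and that the virtual file system can be read back as an in-memory map of path to bytes. MEDIA_DIR, buildPandocArgs and collectMedia are hypothetical names.

```typescript
// Assumed directory name passed to --extract-media; any name would do.
const MEDIA_DIR = "media";

// Build CLI-style arguments for the DOCX -> Typst step.
// The flags are real Pandoc options; the paths are illustrative.
function buildPandocArgs(inputPath: string, outputPath: string): string[] {
  return [
    inputPath,
    "--from=docx",
    "--to=typst",
    `--extract-media=${MEDIA_DIR}`,
    `--output=${outputPath}`,
  ];
}

// After Pandoc has run, pick the extracted media out of a snapshot of the
// virtual file system (assumed here to be a Map of path -> bytes) so the
// files can be handed to the Typst compiler under the same relative paths.
function collectMedia(files: Map<string, Uint8Array>): Map<string, Uint8Array> {
  const media = new Map<string, Uint8Array>();
  files.forEach((bytes, path) => {
    // Normalise "./media/x.png" and "/media/x.png" to "media/x.png".
    const normalized = path.replace(/^\.?\//, "");
    if (normalized.startsWith(`${MEDIA_DIR}/`)) {
      media.set(normalized, bytes);
    }
  });
  return media;
}
```

Keeping the paths identical on both sides matters because the generated Typst markup refers to images by the paths Pandoc wrote them to.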

Why Pandoc + Typst?

Pandoc is the Swiss Army knife of document conversion, but its default PDF output requires a LaTeX distribution. By routing it through Typst, I can get high-quality PDF output with a much smaller WASM footprint and faster rendering times.

I’d love to hear your thoughts on the performance or any edge cases you find with complex DOCX layouts!
