DEV Community

Digitalofen

I Tried Running File Conversion Fully in the Browser (WASM, LibreOffice, FFmpeg)

Intro – The Problem

Most online file converters work the same way: you upload your file, wait, then download the result. That raises privacy concerns, and for large files the upload-plus-download round trip roughly doubles the transfer time.

Personal context: I'm an audio producer and needed to convert some legacy audio formats for a project. I tried 5 different converter sites, but they either didn't support the formats I needed or hit me with endless pop-ups and CAPTCHA walls. The whole experience felt sketchy, so I thought: there has to be a better way. There is, but it's not as simple as I thought.

I wondered: how much of this can realistically run in the browser?

Turns out: more than I expected, but with hard limits nobody talks about.

The LibreOffice WASM Dream (That Didn't Work)

My first idea: compile LibreOffice headless to WebAssembly. It's the Swiss Army knife for document conversion (DOCX, ODT, PDF, PPTX...). If I could get it running in the browser, I'd be done.

Reality check:

  • Minimum binary size: ~150MB (even stripped)
  • Startup time: 10-15 seconds just to initialize
  • Memory usage: 200-300MB for a single DOCX→PDF conversion
  • Browser memory ceiling: ~500MB before crashes

For converting one file? Not realistic today in a production browser environment.


What Actually Works in the Browser

Instead, I went format-by-format. Here's what I got working:

Video/Audio

  • FFmpeg.wasm (~20MB WASM binary)
  • Performance: ~10-20% of native FFmpeg
  • Example: MP3 conversion takes ~10s (native: 1s)
  • Trade-off: Slower, but files never leave device

Images

  • Sharp.wasm + image-js + canvg
  • Performance: Image resize ~2s (native: 0.2s)
  • Handles: PNG, JPEG, WebP, SVG→PNG, GIF

PDFs

  • pdf-lib (creation/manipulation)
  • pdf.js (rendering/text extraction)
  • Works great for simple operations (merge, split, text extraction)
  • Fails on complex PDFs with embedded fonts/forms

Spreadsheets

  • SheetJS (xlsx.full.min.js)
  • Parses/creates Excel, CSV, ODS in browser
  • Limitations: Large files (>10MB) freeze UI

Archives

  • JSZip + pako + tar.js
  • ZIP/GZIP/TAR creation/extraction
  • Fast enough for most use cases

Optimization strategy:

// Lazy-load the FFmpeg module on first use and cache the promise,
// so repeated or concurrent calls trigger only one download
let ffmpegModulePromise = null;
const loadFFmpeg = () => {
  if (!ffmpegModulePromise) {
    ffmpegModulePromise = import('@ffmpeg/ffmpeg');
  }
  return ffmpegModulePromise;
};

// Total WASM payload: ~30MB
// Initial bundle: 150KB
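The same idea extends to the other converter categories. Below is a minimal sketch, assuming one dynamic import per category; `createLazyLoaders` and the category names are hypothetical and not part of any library. It memoizes the pending promise per category and drops it again if the download fails, so a later attempt can retry.

```javascript
// Sketch: memoize one dynamic import per converter category.
// `createLazyLoaders` and the category keys are hypothetical names.
function createLazyLoaders(importers) {
  const cache = new Map();

  return function load(category) {
    if (!Object.hasOwn(importers, category)) {
      return Promise.reject(new Error(`No loader registered for "${category}"`));
    }
    if (!cache.has(category)) {
      const pending = importers[category]().catch((err) => {
        cache.delete(category); // allow a retry after a failed download
        throw err;
      });
      cache.set(category, pending);
    }
    return cache.get(category);
  };
}

// In an app, the importers would be dynamic imports, for example:
// const load = createLazyLoaders({
//   video: () => import('@ffmpeg/ffmpeg'),
//   pdf: () => import('pdf-lib'),
// });
// const ffmpegModule = await load('video');
```

Because the promise itself is cached, two conversions started at the same time share a single download instead of racing each other.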

The Performance Win I Didn't Expect

Here's a trick that saved me:

Problem: MP4→MOV conversion was timing out after more than 10 minutes. Full re-encoding with FFmpeg.wasm is just too slow.

The Solution: Container remux instead of re-encode.

// Containers that can usually hold the same streams (e.g. H.264 + AAC)
const compatibleContainers = {
  mp4: ['mov', 'mkv'],
  mov: ['mp4', 'mkv'],
  mkv: ['mp4', 'mov']
};

// True if `to` is listed as a remux target for `from`
const areCompatibleContainers = (from, to) =>
  (compatibleContainers[from] ?? []).includes(to);

if (areCompatibleContainers(from, to)) {
  // Stream copy instead of re-encode
  ffmpegOptions.push('-c', 'copy');
  // Result: <10s instead of 10min (100× speedup)
}

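This works because MP4 and MOV can carry the same codecs (H.264, AAC), so you're only changing the container wrapper. If the source uses a codec the target container doesn't support, stream copy fails and you still have to re-encode.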
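To make the decision concrete, here is a small sketch that builds the FFmpeg argument list from the input and output file names. The helper names (`extensionOf`, `buildFfmpegArgs`) and the set of remux-friendly containers are assumptions for illustration. Only `-i` and `-c copy` are real FFmpeg options; how the array is passed to FFmpeg.wasm depends on the library version you use.

```javascript
// Hypothetical helper: choose stream copy vs. re-encode from file extensions.
// Assumption: these containers usually carry compatible streams (H.264/AAC).
const REMUX_FRIENDLY = new Set(['mp4', 'mov', 'mkv']);

function extensionOf(fileName) {
  const dot = fileName.lastIndexOf('.');
  return dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
}

function buildFfmpegArgs(inputName, outputName) {
  const from = extensionOf(inputName);
  const to = extensionOf(outputName);
  const canRemux =
    from !== to && REMUX_FRIENDLY.has(from) && REMUX_FRIENDLY.has(to);

  const args = ['-i', inputName];
  if (canRemux) {
    // Copy all streams as-is; only the container changes
    args.push('-c', 'copy');
  }
  // Without '-c copy', FFmpeg re-encodes with its defaults for the output format
  args.push(outputName);
  return { args, canRemux };
}

const plan = buildFfmpegArgs('clip.mp4', 'clip.mov');
console.log(plan.args.join(' ')); // -i clip.mp4 -c copy clip.mov
```

Checking extensions is only a heuristic. A more careful version would inspect the actual codecs, or try the stream copy first and fall back to a re-encode if FFmpeg reports an error.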


The Hard Parts

1. Browser Memory Limits

Safari: ~500MB, then crash

Chrome: ~1GB, then slow-down

Firefox: ~800MB

Large video files? Forget it. You hit the ceiling fast.
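One way to act on these numbers is a pre-flight check before starting a conversion. The sketch below uses the rough ceilings listed above and assumes a peak memory factor of 3× the file size (input buffer, copy inside the WASM file system, output buffer). That factor and the function name `fitsInBrowserMemory` are assumptions, not measured values.

```javascript
const MB = 1024 * 1024;

// Rough ceilings from the list above; observed values, not guarantees
const MEMORY_CEILING_BYTES = new Map([
  ['safari', 500 * MB],
  ['chrome', 1024 * MB],
  ['firefox', 800 * MB],
]);

// Assumption: peak usage is about 3x the input size
// (input buffer + copy in the WASM file system + output buffer)
function fitsInBrowserMemory(fileSizeBytes, browser, peakFactor = 3) {
  const lowestCeiling = Math.min(...MEMORY_CEILING_BYTES.values());
  // Unknown browsers get the most conservative ceiling
  const ceiling = MEMORY_CEILING_BYTES.get(browser) ?? lowestCeiling;
  return fileSizeBytes * peakFactor <= ceiling;
}

console.log(fitsInBrowserMemory(100 * MB, 'safari')); // true  (300MB <= 500MB)
console.log(fitsInBrowserMemory(200 * MB, 'safari')); // false (600MB > 500MB)
```

A check like this cannot prevent every crash, but it lets you route a file to a fallback path before the tab runs out of memory.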

2. Safari vs Chromium Hell

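  • Chrome: SharedArrayBuffer works (multi-threading possible), but since Chrome 92 only on cross-origin isolated pages
  • Safari: SharedArrayBuffer likewise requires COOP/COEP headers (which break CDN assets that lack CORP/CORS headers)
  • Solution: Detect and fall back to single-threaded mode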
const canUseMultithreading = 
  (typeof SharedArrayBuffer !== 'undefined') &&
  (window.crossOriginIsolated === true);

const threadCount = canUseMultithreading ? '4' : '1';
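For reference, `crossOriginIsolated` only becomes true when the page is served with two specific response headers. A minimal sketch of those headers follows; the function name is hypothetical, and where you set them (CDN rules, a Node/Express middleware, a static host config) depends on your setup.

```javascript
// The two response headers that enable cross-origin isolation.
// `crossOriginIsolationHeaders` is a hypothetical helper name.
function crossOriginIsolationHeaders() {
  return {
    'Cross-Origin-Opener-Policy': 'same-origin',
    'Cross-Origin-Embedder-Policy': 'require-corp',
  };
}

// Possible use in an Express app (assuming Express is what serves the page):
// app.use((req, res, next) => {
//   res.set(crossOriginIsolationHeaders());
//   next();
// });
```

With `require-corp`, every cross-origin resource the page loads must opt in via CORS or a `Cross-Origin-Resource-Policy` header, which is why third-party CDN assets tend to break.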

3. Large File Streaming

You can't load a 500MB video into memory. Browsers will kill the tab.

I tried: File streaming via ReadableStream

The reality: FFmpeg.wasm doesn't support streaming input yet (2025)
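Chunked reading still helps for steps that do accept incremental input, such as hashing a file or uploading it to a server fallback piece by piece. Here is a minimal sketch using `Blob.slice`, which works on `File` objects as well; the function name `readInChunks` and the 8MB default are assumptions.

```javascript
// Read a Blob/File piece by piece so only one chunk is in memory at a time.
// `readInChunks` is a hypothetical helper; the chunk size is an arbitrary default.
async function* readInChunks(blob, chunkSize = 8 * 1024 * 1024) {
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const slice = blob.slice(offset, offset + chunkSize);
    yield new Uint8Array(await slice.arrayBuffer());
  }
}

// Usage with a file from an <input type="file"> element:
// for await (const chunk of readInChunks(file)) {
//   hasher.update(chunk); // any consumer that accepts partial input
// }
```

This does not lift the FFmpeg.wasm limitation, because the whole input still has to be written to its in-memory file system before a conversion can start.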

4. UI Freezing

Even with Web Workers, I saw the main thread block while the WASM module initialized.

The solution: show a loading screen, then wait about 100ms before starting initialization so the UI can render first.
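The loading-screen-then-delay pattern can be sketched like this. The function and callback names (`initWithLoadingScreen`, `showLoading`, `hideLoading`, `init`) are hypothetical; the 100ms default is the delay mentioned above.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Show the loading UI, give the browser time to paint it,
// then run the heavy initialization.
async function initWithLoadingScreen({ showLoading, hideLoading, init, delayMs = 100 }) {
  showLoading();
  await sleep(delayMs); // lets the loading state render before the thread gets busy
  try {
    return await init();
  } finally {
    hideLoading(); // runs even if initialization throws
  }
}

// Usage (assuming a loadFFmpeg function and two DOM helpers exist):
// const ffmpeg = await initWithLoadingScreen({
//   showLoading: () => spinner.removeAttribute('hidden'),
//   hideLoading: () => spinner.setAttribute('hidden', ''),
//   init: loadFFmpeg,
// });
```

The fixed delay is a blunt tool. Waiting for the next animation frame is another option, but a short timeout is simple and behaves the same everywhere.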


The Hybrid Model (What I Actually Ended Up With. For Now.)

After weeks of experiments, here's what works:

Architecture split:

const shouldUseServer = (format, category, fileSize, isComplexPdf = false) => {
  // Browser-side (90% of conversions)
  if (category === 'image') return false;
  if (category === 'audio') return false;
  if (category === 'video' && fileSize < 50_000_000) return false;

  // Server-side (10% of conversions)
  if (category === 'video') return true; // 50MB and up: native FFmpeg
  if (format.includes('docx')) return true; // LibreOffice
  if (format.includes('pdf') && isComplexPdf) return true; // Poppler
  if (fileSize > 100_000_000) return true; // Too large

  return false;
};

Server-side stack:

  • LibreOffice (headless) for Office docs
  • Pandoc for Markdown/LaTeX/EPUB
  • Native FFmpeg for large videos
  • Poppler (pdftoppm, pdftotext) for complex PDFs
  • ClamAV malware scanning on all uploads (auto-reject infected files)
  • Auto-delete: immediately, or after 5 minutes at the latest

Why hybrid?

  • Pure client-side = unrealistic for complex formats
  • Pure server-side = privacy nightmare + infrastructure costs
  • Hybrid = best UX + privacy trade-off

The Real Trade-offs

Client-Side Pros:

  • Privacy (files never leave device)
  • Zero infrastructure costs
  • Instant start (no upload wait)

Client-Side Cons:

  • 10-20% of native performance
  • Memory limits (~50MB practical file-size ceiling)
  • Browser compatibility hell
  • Some formats impossible (Office docs)

Server-Side Pros:

  • Full performance (native tools)
  • Unlimited file size
  • Complex formats supported

Server-Side Cons:

  • Privacy concerns (uploads)
  • Infrastructure costs
  • Upload/download overhead

Client
├── FFmpeg.wasm
├── Image tooling
├── PDF manipulation
└── SheetJS
↓ (fallback)
Server
├── LibreOffice
├── Pandoc
├── Native FFmpeg
└── Poppler

Key Takeaways

  1. Pure browser-side is possible for ~90% of conversions (images, audio, simple video)
  2. LibreOffice WASM is still unrealistic / unstable in 2026 (too large, too slow, too browser/platform-dependent)
  3. Hybrid architecture is the pragmatic solution (browser-first, server fallback)
  4. Container remux > re-encode when possible (100× speedup)
  5. Memory limits are the real bottleneck, not CPU

If you're building something similar: start with WASM, fall back to server only when necessary, and be honest with users about what runs where.


What I Built

Based on these experiments, I ended up packaging this hybrid approach into a small project to test it in production.

Tech stack:

  • Frontend: Next.js (static), FFmpeg.wasm, Sharp.wasm, pdf-lib, SheetJS
  • Backend: Node/Express, LibreOffice, Pandoc, ClamAV, Redis
  • Hosting: Cloudflare CDN (frontend), EU server (backend)

→ If you're curious: anythingconverter.com


Questions? I'm especially curious if anyone has cracked LibreOffice WASM or found better ways to handle large files in the browser.

Top comments (2)

Matthew Hou

This is really cool. The privacy angle alone makes browser-based conversion worth pursuing — no file ever leaves the user's machine.

Curious about the performance ceiling though. For video conversion with FFmpeg WASM, how does it compare to native FFmpeg on the same hardware? Last time I tried FFmpeg WASM it was about 5-10x slower, which is fine for small files but rough for anything over 100MB.

Digitalofen • Edited

Thanks! Yes, for video you're spot on: I'm seeing 5-10× slower than native, sometimes worse. A simple MP4→MOV transcode that takes 30 seconds native can take 5-10 minutes in WASM.
And for now I'm stuck with single-threaded execution because FFmpeg.wasm's multi-threading support (or the browser implementation?) is still experimental/buggy. At least from what I've observed.

Audio is actually fast (MP3 conversions finish in seconds), but video is where WASM really struggles. The bottlenecks are:

  1. Single-threaded WASM core (no multi-threading yet)
  2. Memory ceiling (~500MB before browsers crash on my side)
  3. H.264 encoding is just CPU-heavy

So in this regard, not much seems to have changed since the last time you tried.
If you find any tricks/workarounds I missed, I'd be truly thankful.