Muhammad Omer Mirza
Convert Word and Excel to PDF in the browser (no server, no upload)

Office-to-PDF converters almost always upload your file to a server. For an invoice, a contract or a salary sheet, that's a lot of trust for a format change. But you can convert Word and Excel to PDF entirely in the browser: read the file with the right parser, rebuild it as clean HTML, and let the browser's own print engine produce a PDF with real, selectable text.

Here's the pattern for both, plus the one trick that makes the PDF output rock-solid instead of blank.

The idea: parse → HTML → native print

There's no need for a heavyweight PDF-drawing library. Browsers already have a great PDF renderer behind window.print() ("Save as PDF"). So the job is just: get the document into clean HTML, then print that HTML off-screen.
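The first step of that pipeline is deciding which parser a file needs. Here is a minimal sketch, assuming you route on the file extension; `ROUTES` and `routeFor` are hypothetical names for illustration, and the landscape default for spreadsheets is a choice, not a requirement:

```javascript
// Hypothetical routing table: extension -> parser name and default page orientation.
const ROUTES = new Map([
  ["docx", { parser: "mammoth", orientation: "portrait" }],
  ["xlsx", { parser: "sheetjs", orientation: "landscape" }],
  ["xls", { parser: "sheetjs", orientation: "landscape" }],
  ["csv", { parser: "sheetjs", orientation: "landscape" }],
]);

// Returns the route for a file name, or null when the type is unsupported.
function routeFor(fileName) {
  const match = /\.([a-z0-9]+)$/i.exec(fileName);
  if (!match) return null;
  return ROUTES.get(match[1].toLowerCase()) ?? null;
}
```

A `null` result is the place to show an "unsupported file" message instead of handing the file to a parser that cannot read it.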

The reliable HTML→PDF helper

Render the HTML in an off-screen <iframe> and call print() on it. This avoids the blank pages that html2canvas-based approaches sometimes produce, and it keeps real text (selectable, searchable) instead of rasterizing:

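function printHtmlAsPdf(html, { format = "a4", orientation = "portrait", margin = 12 } = {}) {
  // @page sets paper size and margins; print-color-adjust asks the browser to keep backgrounds and colors
  const pageCss =
    `<style>@page{size:${format} ${orientation};margin:${margin}mm}
     html,body{margin:0;background:#fff;-webkit-print-color-adjust:exact;print-color-adjust:exact}</style>`;
  const doc = `<!DOCTYPE html><html><head><meta charset="utf-8">${pageCss}</head><body>${html}</body></html>`;

  // Position the iframe far off-screen; unlike display:none it is still laid out and rendered
  const iframe = document.createElement("iframe");
  iframe.style.cssText = "position:fixed;left:-10000px;top:0;width:820px;height:1160px;border:0;";
  document.body.appendChild(iframe);

  iframe.onload = () => {
    // Wait 300 ms after load before opening the print dialog
    setTimeout(() => {
      iframe.contentWindow.focus();
      iframe.contentWindow.print();
      // Remove the iframe two minutes later, leaving time to finish "Save as PDF"
      setTimeout(() => iframe.remove(), 120000);
    }, 300);
  };

  // Write the document into the iframe; closing it triggers the load handler above
  const d = iframe.contentWindow.document;
  d.open();
  d.write(doc);
  d.close();
}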

The user picks "Save as PDF" in the print dialog. That's the only UX trade-off, and in return you get the browser's native pagination and crisp, selectable text for free.
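The helper interpolates its options straight into CSS, so a bad value either breaks the @page rule silently or lets a caller inject extra CSS. One possible guard, sketched here under the assumption that you only need the standard CSS page-size keywords; `buildPageCss` is a hypothetical helper, not part of any library, and it covers only the @page rule:

```javascript
// Page-size keywords defined for the CSS @page `size` descriptor.
const PAGE_SIZES = new Set(["a3", "a4", "a5", "b4", "b5", "letter", "legal", "ledger"]);
const ORIENTATIONS = new Set(["portrait", "landscape"]);

// Hypothetical helper: returns the @page <style> block, or throws on bad options.
function buildPageCss({ format = "a4", orientation = "portrait", margin = 12 } = {}) {
  const size = String(format).toLowerCase();
  const dir = String(orientation).toLowerCase();
  if (!PAGE_SIZES.has(size)) {
    throw new RangeError(`Unsupported page size: ${format}`);
  }
  if (!ORIENTATIONS.has(dir)) {
    throw new RangeError(`Unsupported orientation: ${orientation}`);
  }
  if (typeof margin !== "number" || !Number.isFinite(margin) || margin < 0) {
    throw new RangeError(`Margin must be a non-negative number of millimetres: ${margin}`);
  }
  return `<style>@page{size:${size} ${dir};margin:${margin}mm}</style>`;
}
```

Failing loudly on a typo such as `format: "A4 "` is easier to debug than a PDF that quietly falls back to the printer's default paper size.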

Word (.docx) → PDF with mammoth.js

mammoth.js converts .docx into semantic HTML — headings, lists, tables, bold/italic, embedded images. It deliberately ignores fiddly Word styling, which is exactly what you want for a clean PDF.

<script src="https://cdnjs.cloudflare.com/ajax/libs/mammoth/1.6.0/mammoth.browser.min.js"></script>

Then read the file, convert it, and print:

const buf = await file.arrayBuffer();                // .docx read locally
const { value: html } = await mammoth.convertToHtml({ arrayBuffer: buf });

const styled = `
  <style>
    body{font-family:Georgia,serif;font-size:12pt;line-height:1.5;color:#111}
    h1,h2,h3{font-family:Arial,sans-serif;line-height:1.25}
    table{border-collapse:collapse;width:100%}
    td,th{border:1px solid #999;padding:6px 8px}
    img{max-width:100%;height:auto}
  </style>${html}`;

printHtmlAsPdf(styled, { format: "a4", margin: 16 });
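Besides `value`, the object that `mammoth.convertToHtml` resolves to also carries a `messages` array, where each entry has a `type` (such as "warning" or "error") and a `message` string. A small sketch of how you might group those before deciding whether to print; `summarizeMessages` is a hypothetical helper, and what counts as blocking is your call:

```javascript
// Groups mammoth-style messages ({ type, message }) into warnings, errors and everything else.
function summarizeMessages(messages = []) {
  const summary = { warning: [], error: [], other: [] };
  for (const { type, message } of messages) {
    const bucket = type === "warning" || type === "error" ? type : "other";
    summary[bucket].push(message);
  }
  return summary;
}
```

In the snippet above you would destructure `const { value: html, messages } = await mammoth.convertToHtml(...)` and, for example, show the warnings to the user next to the print button.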

Gotcha: .docx only

mammoth handles the modern .docx (Open XML) format, not the legacy binary .doc. Detect it and tell the user to re-save:

if (!/\.docx$/i.test(file.name)) {
  alert("Please use a .docx file (open old .doc in Word and Save As .docx).");
  return; // stop before calling mammoth (assumes this check runs inside your file-input handler)
}
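
The extension check trusts the file name, and a renamed .doc will slip through. A minimal sketch of a content-based check, assuming you read the first eight bytes of the file yourself: .docx and .xlsx are ZIP containers that start with the bytes `PK\x03\x04`, while legacy .doc and .xls files use the OLE compound format with its own eight-byte signature. `startsWith` and `sniffContainer` are hypothetical helpers, not part of mammoth:

```javascript
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04"
const OLE_MAGIC = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];

// True when `bytes` begins with every byte in `magic`.
function startsWith(bytes, magic) {
  return bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b);
}

// Returns "zip" (modern .docx/.xlsx container), "ole" (legacy .doc/.xls), or "unknown".
function sniffContainer(bytes) {
  if (startsWith(bytes, ZIP_MAGIC)) return "zip";
  if (startsWith(bytes, OLE_MAGIC)) return "ole";
  return "unknown";
}
```

In the browser you could call it as `sniffContainer(new Uint8Array(await file.slice(0, 8).arrayBuffer()))`. Note that "zip" only identifies the container: any ZIP archive passes, so the parser can still reject the file.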

Excel (.xlsx/.csv) → PDF with SheetJS

SheetJS reads .xlsx, .xls and .csv, and can emit an HTML table per sheet:

<script src="https://cdnjs.cloudflare.com/ajax/libs/xlsx/0.18.5/xlsx.full.min.js"></script>

Then read the workbook and build one table per sheet:

const bytes = new Uint8Array(await file.arrayBuffer());   // workbook read locally
const wb = XLSX.read(bytes, { type: "array" });

// Sheet names are user data, so escape them before placing them in HTML
const esc = (s) => s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

let sections = "";
for (const name of wb.SheetNames) {
  const fullHtml = XLSX.utils.sheet_to_html(wb.Sheets[name]);
  // sheet_to_html returns a whole document, so pull out just the <table>
  const table = (/<table[\s\S]*<\/table>/i.exec(fullHtml) || [fullHtml])[0];
  sections += `<h2>${esc(name)}</h2>${table}`;
}

const styled = `
  <style>
    body{font-family:Arial,sans-serif;font-size:10pt}
    table{border-collapse:collapse;width:100%}
    td,th{border:1px solid #b3b3b3;padding:4px 7px;white-space:nowrap}
    tr{page-break-inside:avoid;break-inside:avoid}
  </style>${sections}`;

printHtmlAsPdf(styled, { format: "a4", orientation: "landscape", margin: 12 });

Gotcha: sheet_to_html returns a full document

XLSX.utils.sheet_to_html() gives you a complete <html> page, not a fragment. If you concatenate several of those you get nested documents. Extract just the <table> (regex above) before stitching sheets together. Also: default to landscape — spreadsheets are wide and clip badly in portrait.
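
The stitching step can be factored into pure functions, which makes it easy to unit-test without loading SheetJS. This is a sketch under the assumption that you have already collected each sheet's name and its `sheet_to_html` output; `escapeHtml`, `extractTable` and `buildSections` are hypothetical names:

```javascript
// Escape text for safe use inside HTML element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns everything from the first <table to the last </table>, or "" if there is no table.
function extractTable(fullHtml) {
  const match = /<table[\s\S]*<\/table>/i.exec(fullHtml);
  return match ? match[0] : "";
}

// sheets: array of { name, html }, where html is one sheet's full-document output.
function buildSections(sheets) {
  return sheets
    .map(({ name, html }) => `<h2>${escapeHtml(name)}</h2>${extractTable(html)}`)
    .join("");
}
```

Returning an empty string when no table is found keeps a stray `<html>` wrapper out of the combined output; whether you would rather skip the sheet's heading too is a design choice.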

What carries over

| Source | Preserved | Dropped |
|---|---|---|
| Word .docx | headings, lists, tables, images, basic styling | text boxes, footnotes, complex columns |
| Excel .xlsx | every sheet's values + table structure | charts, conditional formatting, cell colors |

The tables and text are rebuilt from content, so the result is clean and readable rather than a pixel copy — and for sharing a finished document, that's the point.

  • Privacy: files are parsed and rendered locally; nothing is uploaded.
  • Cost: pure static hosting.
  • Text stays text: real, selectable PDF text — not a screenshot.

I built both of these as free tools — Word to PDF and Excel to PDF — running fully in the browser with no upload. The whole free PDF toolkit is here. Happy to talk through the mammoth/SheetJS edge cases in the comments.