Michael Lip

Posted on Mar 25 • Originally published at zovo.one

From Markdown to PDF Without Losing Your Formatting

#markdown #webdev #productivity #tutorial

I write everything in markdown. Blog posts, documentation, meeting notes, project proposals. But clients and stakeholders want PDFs. The conversion should be simple, but anyone who has tried it knows the formatting often breaks in ways that range from annoying to document-destroying.

Why the conversion is harder than it looks

Markdown is a text format designed for HTML output. PDF is a page-based format designed for print. These are fundamentally different rendering models.

HTML flows. Content wraps at the viewport width. There are no page boundaries. Markdown inherits this model.

PDF is paginated. Content must fit within specific page dimensions. Text that overflows must break at page boundaries. Headers, footers, and page numbers exist in PDF but have no equivalent in markdown.

The conversion must bridge these two models, and the bridge has cracks.

Code block overflow

The most common formatting casualty is code blocks. Markdown code blocks can be arbitrarily wide. In HTML, they scroll horizontally. In PDF, there is no horizontal scroll. A line that extends past the page margin either gets clipped (invisible code), wraps mid-syntax (broken readability), or shrinks the font to fit (unreadable).

The best approach is to keep code blocks under 80 characters wide when you know the output will be PDF. This is good practice anyway, but it becomes essential for PDF output.

// This fits on a standard PDF page at 10pt monospace
function short() { return true; }

// This will overflow and cause problems
function thisIsAVeryLongFunctionNameThatDemonstratesWhyYouShouldKeepLinesShort(parameterOne, parameterTwo, parameterThree) { return null; }
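One way to enforce the 80-character guideline is to scan the markdown before converting it. The sketch below is a minimal check under a few assumptions: `findLongCodeLines` is a hypothetical helper name, it only recognizes backtick-fenced blocks (not indented or tilde-fenced ones), and it counts string length rather than rendered width.

```javascript
// Report lines inside backtick-fenced code blocks that exceed a width limit.
// The 80-character default mirrors the guideline above; adjust it to match
// your page size and font.
const FENCE = '`'.repeat(3); // the three-backtick fence marker

function findLongCodeLines(markdown, maxWidth = 80) {
  const problems = [];
  let inFence = false;

  markdown.split('\n').forEach((line, index) => {
    // A line that starts with the fence marker opens or closes a block.
    if (line.trimStart().startsWith(FENCE)) {
      inFence = !inFence;
      return;
    }
    if (inFence && line.length > maxWidth) {
      problems.push({ line: index + 1, width: line.length });
    }
  });

  return problems;
}
```

Each entry gives the 1-based line number in the markdown source and the offending width, which is enough to fail a build or print a warning before the PDF step runs.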

Table rendering

Markdown tables are the second major pain point. Wide tables that look fine on a web page will overflow a PDF page or compress columns until cell content is unreadable.

For PDF output, limit tables to 4-5 columns maximum and keep cell content concise. If you need a wider table, consider splitting it into multiple narrower tables or rotating the page to landscape orientation.
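The column limit can be checked the same way as line width. This is a rough sketch, assuming pipe tables with a standard delimiter row; `findWideTables` is a hypothetical name, and the check ignores escaped pipes inside cells.

```javascript
// Flag pipe tables whose column count exceeds a limit.
// A table is detected by its delimiter row, e.g. |---|:---:|---|.
function findWideTables(markdown, maxColumns = 5) {
  const delimiterRow = /^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$/;
  const wide = [];

  markdown.split('\n').forEach((line, index) => {
    if (!line.includes('|') || !delimiterRow.test(line)) return;
    // Strip the optional outer pipes, then count the remaining cells.
    const cells = line.trim().replace(/^\|/, '').replace(/\|$/, '').split('|');
    if (cells.length > maxColumns) {
      wide.push({ line: index + 1, columns: cells.length });
    }
  });

  return wide;
}
```

The reported line number is that of the delimiter row, which sits directly under the table header.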

Image sizing

In HTML, images are sized by CSS and the browser viewport. In PDF, images need explicit dimensions. A full-width markdown image that looks fine on screen might appear enormous in a PDF, pushing everything else to subsequent pages.

Some markdown-to-PDF tools, Pandoc among them, accept an attribute syntax for image sizing:

![Alt text](image.png){width=50%}

But this syntax is not standardized across parsers. The safest approach is to resize images before including them in the document.
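If resizing the files is not practical, another option is to rewrite the attribute syntax into plain HTML before parsing, so the width survives parsers that do not understand it. The sketch below assumes the converter passes raw HTML through; `expandImageWidths` and `escapeAttribute` are hypothetical helpers, and only `%` and `px` widths are handled.

```javascript
// Escape characters that would break out of an HTML attribute value.
function escapeAttribute(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Rewrite ![alt](src){width=N%} into an <img> tag with an inline width.
// Images without a width attribute are left untouched.
function expandImageWidths(markdown) {
  const pattern = /!\[([^\]]*)\]\(([^)\s]+)\)\{width=(\d+(?:\.\d+)?(?:%|px))\}/g;
  return markdown.replace(pattern, (match, alt, src, width) =>
    `<img src="${escapeAttribute(src)}" alt="${escapeAttribute(alt)}" style="width: ${width}">`
  );
}
```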

The conversion pipeline

The typical markdown-to-PDF pipeline has two conversion steps:

Markdown → HTML → PDF

Step 1 (Markdown to HTML) uses a standard parser like marked, markdown-it, or commonmark.js (the reference implementation of the CommonMark spec).

Step 2 (HTML to PDF) uses a rendering engine. The main options:

Headless Chrome / Puppeteer produces the highest-fidelity output because it uses a real browser engine. The downside is that it needs a Chrome binary (Puppeteer downloads one on install by default) and is resource-heavy.

const puppeteer = require('puppeteer');

async function htmlToPdf(html, outputPath) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setContent(html);
  await page.pdf({
    path: outputPath,
    format: 'A4',
    margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
    printBackground: true
  });
  await browser.close();
}

wkhtmltopdf is a standalone tool that renders with an older Qt WebKit build. It is lighter than Puppeteer, but the project is no longer actively maintained and its support for modern CSS is limited.

Pandoc takes a different approach: by default it converts markdown to PDF via LaTeX. This produces excellent typography but requires a LaTeX installation and has a steeper learning curve.
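For the browser-based route, note that a markdown parser usually emits an HTML fragment, while the htmlToPdf function above renders whatever string it is given. Wrapping the fragment in a full document with a charset and a stylesheet avoids encoding surprises. The following is a minimal sketch; `buildHtmlDocument` and `escapeHtmlText` are hypothetical helper names, not part of any library.

```javascript
// Escape text destined for an HTML text node such as <title>.
function escapeHtmlText(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wrap a parser's HTML fragment in a complete document.
// `css` is inserted verbatim, so it should come from a trusted source.
function buildHtmlDocument(bodyHtml, { title = 'Document', css = '' } = {}) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head>',
    '<meta charset="utf-8">',
    `<title>${escapeHtmlText(title)}</title>`,
    `<style>${css}</style>`,
    '</head>',
    '<body>',
    bodyHtml,
    '</body>',
    '</html>',
  ].join('\n');
}
```

The result can be passed straight to htmlToPdf as its `html` argument.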

Page breaks

Markdown has no concept of page breaks. For PDF output, you need to force them manually. The most common approach is a CSS class:

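/* page-break-after is the legacy property; break-after is its current replacement */
.page-break { page-break-after: always; break-after: page; }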

And in the markdown:

## Chapter 1
Content here.

<div class="page-break"></div>

## Chapter 2
Content here.

This works when the conversion tool processes raw HTML in markdown. Some tools strip HTML, in which case you need tool-specific syntax.
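When every chapter should start on a new page, the break markers can be inserted automatically instead of by hand. This sketch assumes chapters are level-2 headings (`## `) and that the tool keeps raw HTML; `breakBeforeChapters` is a hypothetical helper, and it skips headings that appear inside backtick-fenced code blocks.

```javascript
const PAGE_BREAK = '<div class="page-break"></div>';

// Insert a page-break div before every level-2 heading except the first.
function breakBeforeChapters(markdown) {
  const fence = '`'.repeat(3); // the three-backtick fence marker
  const out = [];
  let inFence = false;
  let seenChapter = false;

  for (const line of markdown.split('\n')) {
    if (line.trimStart().startsWith(fence)) inFence = !inFence;
    if (!inFence && /^## /.test(line)) {
      // No break before the first chapter; the document starts there.
      // The blank line after the div lets the parser see the heading.
      if (seenChapter) out.push(PAGE_BREAK, '');
      seenChapter = true;
    }
    out.push(line);
  }
  return out.join('\n');
}
```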

Headers and footers

PDF documents typically have headers and footers with page numbers, document titles, and dates. These cannot be specified in markdown and must be configured in the conversion tool.

With Puppeteer:

await page.pdf({
  displayHeaderFooter: true,
  headerTemplate: '<div style="font-size:8px;text-align:center;width:100%">Document Title</div>',
  footerTemplate: '<div style="font-size:8px;text-align:center;width:100%"><span class="pageNumber"></span> / <span class="totalPages"></span></div>',
  // Templates are drawn inside the page margins, so leave room for them
  margin: { top: '20mm', bottom: '20mm' }
});
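Because the title is interpolated into an HTML template, it is worth escaping it and keeping the related options together. The sketch below builds an options object that can be spread into `page.pdf`; `headerFooterOptions` and `escapeTemplateText` are hypothetical names, and the margin values are assumptions copied from the earlier example.

```javascript
// Escape text before placing it inside a header or footer template.
function escapeTemplateText(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the header/footer-related options for Puppeteer's page.pdf().
// pageNumber and totalPages are classes that Chrome fills in at print time.
function headerFooterOptions(title, { fontSize = '8px' } = {}) {
  const style = `font-size:${fontSize};text-align:center;width:100%`;
  return {
    displayHeaderFooter: true,
    headerTemplate: `<div style="${style}">${escapeTemplateText(title)}</div>`,
    footerTemplate:
      `<div style="${style}">` +
      '<span class="pageNumber"></span> / <span class="totalPages"></span>' +
      '</div>',
    // Header and footer are drawn in the margins, so they must be non-zero.
    margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
  };
}
```

A call such as `page.pdf({ path: outputPath, format: 'A4', ...headerFooterOptions('Project Proposal') })` would then apply all four settings at once.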

Styling the output

A raw markdown-to-PDF conversion looks bland. Adding a CSS stylesheet transforms the output from "text dump" to "professional document."

Key styling targets for PDF:

Body font: a serif font (Georgia, Times) for readability in print
Code font: a monospace font with a slightly smaller size
Headings: clear hierarchy with adequate spacing
Links: styled but also showing the URL (a link cannot be clicked on a printed page)
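As a starting point, the targets above could translate into a stylesheet like the one below, written as a JavaScript string so it can be injected into the HTML before rendering. The font stacks and sizes are assumptions chosen for illustration, not recommendations from any tool, and `PRINT_CSS` is a hypothetical name.

```javascript
// A small print stylesheet covering body text, code, headings, and links.
const PRINT_CSS = `
  body {
    font-family: Georgia, 'Times New Roman', serif;
    font-size: 11pt;
    line-height: 1.5;
  }
  h1, h2, h3 {
    margin: 1.5em 0 0.5em;
    line-height: 1.2;
  }
  pre, code {
    font-family: Menlo, Consolas, monospace;
    font-size: 9pt;
  }
  /* Wrap long code lines instead of clipping them at the page edge. */
  pre {
    white-space: pre-wrap;
    overflow-wrap: anywhere;
  }
  /* Print the target URL after each external link. */
  a[href^="http"]::after {
    content: " (" attr(href) ")";
    font-size: 0.9em;
  }
`;
```

The `attr(href)` rule is what makes links usable on paper: the visible text stays as written, and the address follows it in parentheses.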

For converting markdown to well-formatted PDF without setting up a local pipeline, I built a converter at zovo.one/free-tools/markdown-to-pdf. It handles code blocks, tables, and images correctly, adds page numbers, and applies clean typography. Paste your markdown, download the PDF.

I'm Michael Lip. I build free developer tools at zovo.one. 500+ tools, all private, all free.
