Emrah G.

Posted on Jan 20

A GitBook-Style Docs Site with Next.js (App Router), Markdown, SEO, and Cloudflare Workers

#webdev #documentation #gitbook #tutorial

Why build your own docs site?

If you already write your documentation in Markdown, the best docs site is often the one that:

Requires zero code changes when you add a new .md file
Feels like GitBook: sidebar navigation, table of contents, next/previous links
Ships with great SEO defaults: canonical URLs, Open Graph, Twitter cards, sitemap, robots, structured data
Deploys as static assets so it’s fast, cheap, and easy to host

That’s exactly what we built for RouteBot’s documentation, available here:

RouteBot Docs: https://docs.routebot.com/

This article walks through the architecture and key implementation details so you can replicate (or adapt) it for your own product.

What we built (feature checklist)

Markdown-powered content pipeline
- Auto-discovers Markdown files from a content/ directory
- Uses filename prefixes (e.g., 01-, 02-) for ordering
- Extracts title/description automatically if frontmatter is missing
GitBook-like UI
- Collapsible category sidebar
- “On this page” table of contents (H2–H4)
- Breadcrumbs
- Prev/Next navigation
Search
- Build-time generated JSON index
- Client-side search modal
SEO
- Per-page metadata (title/description/canonical)
- Open Graph & Twitter cards
- JSON-LD (article + breadcrumbs)
- Static robots.txt + sitemap.xml
Great UX
- Scroll resets to top on navigation
- Support email is copy-to-clipboard (not mailto:)
Deployment
- Next.js static export (out/)
- Cloudflare Workers with Static Assets (wrangler.jsonc)

Tech stack

Next.js 14 (App Router) with static export
Tailwind CSS + @tailwindcss/typography for clean Markdown rendering
remark + remark-gfm + remark-html for Markdown → HTML
gray-matter for optional frontmatter
Cloudflare Workers + Static Assets for hosting the static output

Project structure (high level)

This is the mental model:

content/<variant>/<language>/*.md — your Markdown source of truth
src/lib/docs.js — content loader + TOC extraction + navigation model
src/app/[slug]/page.js — the doc page route (SSG)
src/app/sitemap.js + src/app/robots.js — SEO essentials
src/app/search-index.json/route.js — build-time JSON search index endpoint
src/components/* — UI pieces (sidebar, TOC, search modal, etc.)
next.config.js — output: "export" for static export
wrangler.jsonc — Cloudflare Workers static assets config

1) Markdown pipeline: auto-discovery, slugs, titles, and descriptions

The heart of the system is a function that scans the content directory, then derives everything needed to generate pages:

slug from filename (removing the numeric prefix)
order from filename prefix (so navigation is stable)
title from frontmatter or the first # H1
description from frontmatter or the first non-heading paragraph
lastModified from the filesystem (used for sitemap + “last updated”)

Here’s the core idea (simplified from our implementation):

import fs from "fs";
import path from "path";
import matter from "gray-matter";

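// CONTENT_DIR selects the variant (e.g. "admin/en"); fall back to content/ itself if it is unset,
// because path.join throws when handed undefined.
const DOCS_DIRECTORY = path.join(process.cwd(), "content", process.env.CONTENT_DIR ?? "");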

export async function getAllDocs() {
  const files = fs.readdirSync(DOCS_DIRECTORY);

  return files
    .filter((file) => file.endsWith(".md") && file !== "README.md")
    .map((file) => {
      const filePath = path.join(DOCS_DIRECTORY, file);
      const fileContent = fs.readFileSync(filePath, "utf-8");
      const { data: frontmatter, content } = matter(fileContent);

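      // Title: frontmatter wins, then the first "# " heading, then the bare filename.
      const titleMatch = content.match(/^#\s+(.+)$/m);
      const title = frontmatter.title || (titleMatch ? titleMatch[1] : file.replace(/\.md$/, ""));

      // Description: frontmatter wins, then the first line that is not a heading (capped at 160 chars).
      const descriptionMatch = content.match(/^[^#\n].+$/m);
      const description =
        frontmatter.description || (descriptionMatch ? descriptionMatch[0].slice(0, 160) : "");

      // "01-Welcome.md" -> slug "welcome", order 1; files without a numeric prefix sort last.
      const slug = file.replace(/\.md$/, "").replace(/^\d+-/, "").toLowerCase();
      const orderMatch = file.match(/^(\d+)-/);
      const order = orderMatch ? parseInt(orderMatch[1], 10) : 999;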

      const stats = fs.statSync(filePath);

      return { slug, title, description, fileName: file, order, lastModified: stats.mtime.toISOString() };
    })
    .sort((a, b) => a.order - b.order);
}

Why filename prefixes beat manual nav config

This approach avoids maintaining a hard-coded nav tree in code.

Add 31-new-feature.md → it appears automatically
Rename it to 05-new-feature.md → it moves automatically
You can still add frontmatter later, but you don’t need it to get started

2) Markdown → HTML rendering (with GFM) + Table of Contents

For output, we convert Markdown into HTML at build time using remark:

import { remark } from "remark";
import remarkGfm from "remark-gfm";
import remarkHtml from "remark-html";

const processedContent = await remark()
  .use(remarkGfm)
  .use(remarkHtml, { sanitize: false }) // raw HTML passes through unsanitized; only do this for trusted Markdown
  .process(markdownContent);

const html = processedContent.toString();

For the TOC, we extract headings (H2–H4) and generate stable IDs:

function extractHeadings(content) {
  const headingRegex = /^(#{2,4})\s+(.+)$/gm;
  const headings = [];
  let match;

  while ((match = headingRegex.exec(content)) !== null) {
    const level = match[1].length;
    const text = match[2];
    const id = text.toLowerCase().replace(/[^\w\s-]/g, "").replace(/\s+/g, "-");
    headings.push({ level, text, id });
  }

  return headings;
}
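
One thing to watch: remark-html does not put id attributes on headings by itself, so the TOC links only resolve if the rendered HTML gets matching IDs. A minimal sketch of one way to do that, assuming headings are rendered as bare <h2> to <h4> tags and contain no Markdown links (slugifyHeading and addHeadingIds are illustrative names, not part of any library):

```javascript
// Same slug rule as extractHeadings, factored out so TOC and HTML agree.
function slugifyHeading(text) {
  return text.toLowerCase().replace(/[^\w\s-]/g, "").replace(/\s+/g, "-");
}

// Hypothetical helper: add id="..." to every <h2>-<h4> in the rendered HTML.
function addHeadingIds(html) {
  return html.replace(/<h([2-4])>([\s\S]*?)<\/h\1>/g, (_match, level, inner) => {
    const text = inner
      .replace(/<[^>]+>/g, "")    // drop inline tags such as <code>
      .replace(/&[#\w]+;/g, "");  // drop entities; the Markdown side strips "&" and "<" too
    return `<h${level} id="${slugifyHeading(text)}">${inner}</h${level}>`;
  });
}
```

Two headings with identical text would still produce the same ID; a plugin such as rehype-slug handles that case if you move to a rehype-based pipeline.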

Tailwind Typography (prose) makes the HTML look great with almost no extra work.

3) Navigation: categories derived from filename patterns

To get a GitBook-like sidebar, we group pages into categories based on filename prefixes.

Instead of annotating every Markdown file, we define category rules once:

const CATEGORIES = [
  { name: "Getting Started", pattern: /^(01|02)/, order: 1 },
  { name: "User Management", pattern: /^(05|06|07|08)/, order: 2 },
  { name: "Help & Support", pattern: /^(27|28)/, order: 3 },
];

Then build a navigation model:

export async function getNavigation() {
  const docs = await getAllDocs();
  const categoryMap = new Map();

  docs.forEach((doc) => {
    const category = CATEGORIES.find((cat) => cat.pattern.test(doc.fileName)) || { name: "Other", order: 999 };
    if (!categoryMap.has(category.name)) categoryMap.set(category.name, { name: category.name, order: category.order, items: [] });
    categoryMap.get(category.name).items.push(doc);
  });

  categoryMap.forEach((cat) => cat.items.sort((a, b) => a.order - b.order));
  return Array.from(categoryMap.values()).sort((a, b) => a.order - b.order);
}

This keeps authoring dead simple: you just drop Markdown files into content/ and follow a naming convention.
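
The same navigation model can drive the Prev/Next links. A minimal sketch, assuming the array shape returned by getNavigation() above (getPrevNext is a hypothetical helper name):

```javascript
// Flatten the category tree in sidebar order, then look up the neighbours.
function getPrevNext(navigation, slug) {
  const ordered = navigation.flatMap((category) => category.items);
  const index = ordered.findIndex((doc) => doc.slug === slug);
  if (index === -1) return { prev: null, next: null };
  return {
    prev: ordered[index - 1] ?? null, // null on the first page
    next: ordered[index + 1] ?? null, // null on the last page
  };
}
```

Because the list is flattened in category order, "next" on the last page of one category points at the first page of the following category, which matches how the sidebar reads.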

4) Routing: clean URLs without redundant prefixes

We serve docs at:

https://docs.routebot.com/welcome/

not:

https://docs.routebot.com/docs/welcome/

In Next.js App Router, that’s as simple as placing the route at:

src/app/[slug]/page.js

And for static builds, we generate params from the Markdown files:

export async function generateStaticParams() {
  const docs = await getAllDocs();
  return docs.map((doc) => ({ slug: doc.slug }));
}

5) SEO: metadata, canonical URLs, and JSON-LD structured data

We generate metadata per page using the Markdown-derived title/description, and always emit a canonical URL:

export async function generateMetadata({ params }) {
  const doc = await getDocBySlug(params.slug);
  const url = `${process.env.NEXT_PUBLIC_SITE_URL}/${params.slug}`;

  return {
    title: doc.title,
    description: doc.description,
    alternates: { canonical: url },
    openGraph: { title: doc.title, description: doc.description, url, type: "article" },
    twitter: { card: "summary", title: doc.title, description: doc.description },
  };
}

JSON-LD (TechArticle + BreadcrumbList)

Structured data helps search engines understand what your pages represent. We include:

TechArticle with publish/modified times
BreadcrumbList matching the UI

The pattern looks like:

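<script
  type="application/ld+json"
  /* Escaping "<" keeps a title containing "</script>" from closing the tag early. */
  dangerouslySetInnerHTML={{ __html: JSON.stringify({ "@context": "https://schema.org", "@type": "TechArticle", headline: title }).replace(/</g, "\\u003c") }}
/>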
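
To make both types concrete, here is a rough sketch of building the two objects from a doc record. It assumes the doc shape returned by getAllDocs() and a two-level breadcrumb (docs home, then the page); buildJsonLd and serializeJsonLd are illustrative names, and only dateModified is filled in because the filesystem gives us nothing better for a publish date.

```javascript
// Hypothetical helper: TechArticle + BreadcrumbList objects for one doc.
function buildJsonLd(doc, siteUrl) {
  const pageUrl = `${siteUrl}/${doc.slug}`;
  const article = {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    headline: doc.title,
    description: doc.description,
    dateModified: doc.lastModified,
    mainEntityOfPage: pageUrl,
  };
  const breadcrumbs = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: [
      { "@type": "ListItem", position: 1, name: "Docs", item: siteUrl },
      { "@type": "ListItem", position: 2, name: doc.title, item: pageUrl },
    ],
  };
  return [article, breadcrumbs];
}

// Serialize for a <script type="application/ld+json"> tag; "<" is escaped so
// content such as "</script>" cannot terminate the tag early.
function serializeJsonLd(data) {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

Each object would go into its own script tag, using the same pattern as the snippet above.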

6) sitemap.xml + robots.txt generated from content

Because we already have a complete list of docs (and their timestamps), sitemap generation is trivial.

In src/app/sitemap.js we return an array of URLs:

import { getAllDocs } from "@/lib/docs";
import { SITE_URL } from "@/lib/config";

export default async function sitemap() {
  const docs = await getAllDocs();
  return [
    { url: SITE_URL, lastModified: new Date(), changeFrequency: "weekly", priority: 1.0 },
    ...docs.map((doc) => ({
      url: `${SITE_URL}/${doc.slug}`,
      lastModified: doc.lastModified,
      changeFrequency: "weekly",
      priority: doc.order <= 2 ? 1.0 : 0.8,
    })),
  ];
}

And robots.txt points to it:

export default function robots() {
  return {
    rules: { userAgent: "*", allow: "/" },
    sitemap: `${process.env.NEXT_PUBLIC_SITE_URL}/sitemap.xml`,
  };
}

7) Build-time search index + client-side search modal

To keep hosting simple, we generate a JSON search index at build time and ship it with the static output.

Search index endpoint

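src/app/search-index.json/route.js returns a JSON array. With output: "export", the response body is written to a static file at build time, so the Cache-Control header below only takes effect when a Next.js server handles the request; on static hosting, caching is governed by the host's own settings: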

export async function GET() {
  const searchIndex = await getSearchIndex();
  return Response.json(searchIndex, { headers: { "Cache-Control": "public, max-age=3600" } });
}

export const dynamic = "force-static";
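
getSearchIndex() is not shown here. As a rough idea of what each entry could contain, the sketch below strips common Markdown syntax with a few regexes and caps the content length. The function names, the entry shape, and the 2000-character cap are all assumptions for illustration, not the actual implementation.

```javascript
// Reduce Markdown to searchable plain text (regex-based, intentionally approximate).
function markdownToPlainText(markdown) {
  return markdown
    .replace(/`{3}[\s\S]*?`{3}/g, " ")          // fenced code blocks
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")   // images -> alt text
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1")    // links -> link text
    .replace(/^#{1,6}\s+/gm, "")                // heading markers
    .replace(/^>\s?/gm, "")                     // blockquote markers
    .replace(/[*`]/g, "")                       // emphasis and inline-code markers
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical entry shape: just enough for title and content matching.
function buildSearchEntry(doc, markdown, maxLength = 2000) {
  return {
    slug: doc.slug,
    title: doc.title,
    description: doc.description,
    content: markdownToPlainText(markdown).slice(0, maxLength),
  };
}
```

Capping the content keeps the JSON file small, at the cost of missing matches deep in long pages.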

Client search UX

On the client, a modal loads /search-index.json once, then scores matches by:

title hits (high score)
content hits (bounded)

This gives you instant, offline-ish docs search with no external services.
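
A minimal sketch of that scoring, assuming index entries with title and content strings. The weights (10 per title hit, at most 5 per term from content) and the name searchDocs are illustrative choices, not the values used on the live site.

```javascript
// Score each entry per query term: title hits weigh heavily, content hits are capped.
function searchDocs(index, query, limit = 10) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return [];

  return index
    .map((entry) => {
      const title = entry.title.toLowerCase();
      const content = entry.content.toLowerCase();
      let score = 0;
      for (const term of terms) {
        if (title.includes(term)) score += 10;
        const hits = content.split(term).length - 1; // occurrences of term
        score += Math.min(hits, 5);                  // bound so long pages don't dominate
      }
      return { entry, score };
    })
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((result) => result.entry);
}
```

Bounding the content score is what keeps a long page that mentions a word many times from outranking the page whose title actually matches.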

8) UX details that matter

Scroll to top on navigation

Static docs sites often feel “off” if the scroll position is preserved between pages. A tiny client component fixes it:

"use client";
import { useEffect } from "react";
import { usePathname } from "next/navigation";

export function ScrollToTop() {
  const pathname = usePathname();
  useEffect(() => {
    window.scrollTo({ top: 0, left: 0, behavior: "instant" });
  }, [pathname]);
  return null;
}

Copy-to-clipboard support email (no mailto)

Many users don’t want mailto: links. In the sidebar we display the email in a monospace pill and copy it on click:

await navigator.clipboard.writeText(supportEmail);

9) Static export configuration (Next.js)

In next.config.js, we enable static export:

const nextConfig = {
  output: "export",
  images: { unoptimized: true },
  trailingSlash: true,
  poweredByHeader: false,
};

module.exports = nextConfig;

This generates a fully static site in out/ — perfect for CDN hosting.

10) Deployment to Cloudflare Workers with Static Assets

Cloudflare Workers can host static sites by uploading your out/ directory as assets.

Our wrangler.jsonc is intentionally minimal:

{
  "$schema": "https://json.schemastore.org/wrangler.json",
  "compatibility_date": "2026-01-19",
  "assets": {
    "directory": "./out"
  }
}

Then deploy:

npm run build
npx wrangler deploy

Notes:

If you have multiple deployments (different domains/content variants), it can be convenient to omit a hard-coded name in wrangler.jsonc and let your Cloudflare project settings define it.
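Windows PowerShell 5.1 does not support &&, so npm run build && npx wrangler deploy needs ; in its place there. Note that ; runs the deploy even if the build fails. PowerShell 7 and later accept &&.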

11) Multi-variant docs from one codebase (content type + language)

Even if you don’t need it today, it’s worth designing for multiple docs “variants”:

product docs vs. customer help
different domains per locale
separate SEO identities

We drive that with environment variables:

NEXT_PUBLIC_SITE_URL="https://docs.routebot.com"
NEXT_PUBLIC_LANGUAGE="en"
NEXT_PUBLIC_CONTENT_TYPE="admin"
CONTENT_DIR="admin/en"

The rest of the system (content loading, sitemap, metadata, UI strings) reads from those values.
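
As an illustration of reading those values in one place, a config module along these lines would work. This is a sketch only: loadSiteConfig and the fallback defaults are assumptions, not the contents of the real src/lib/config.

```javascript
// Hypothetical config loader: normalise env values once, import them everywhere else.
function loadSiteConfig(env) {
  // Strip trailing slashes so `${siteUrl}/${slug}` never produces "//".
  const siteUrl = (env.NEXT_PUBLIC_SITE_URL ?? "http://localhost:3000").replace(/\/+$/, "");
  const language = env.NEXT_PUBLIC_LANGUAGE ?? "en";
  const contentType = env.NEXT_PUBLIC_CONTENT_TYPE ?? "admin";
  // Default the content directory to "<contentType>/<language>" when not set explicitly.
  const contentDir = env.CONTENT_DIR ?? `${contentType}/${language}`;
  return { siteUrl, language, contentType, contentDir };
}
```

In the app this would be called once with process.env, with the resulting values exported for the loader, sitemap, and metadata code to share.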

Final notes + next improvements

If you want to take this even further:

Add syntax highlighting (e.g., rehype-pretty-code)
Add “Edit this page” links (pointing to your Git repo)
Add versioned docs (/v1/, /v2/) with separate sitemaps
Add Algolia/Meilisearch for large doc sets (1000+ pages)

If you’d like to browse the live result, start here:

RouteBot Docs: https://docs.routebot.com/
