FADI MAMAR for Konstruction Group Inc.

Posted on May 21

I Built a Dynamic llms.txt for Next.js. Then Google Said Don't Bother.

#nextjs #ai #seo #webdev

Two weeks ago I shipped a dynamic llms.txt and llms-full.txt for a Next.js 16 site I run. Hourly revalidation, pulls from the same content sources as the sitemap, auto-categorizes by URL pattern, returns proper text/plain. Took about 40 minutes to build.

Last week Google published their AI Search optimization guide.

The relevant line:

Don't create llms.txt files and other "special" markup.

So now I have a dynamic llms.txt that the search engine I most care about explicitly says not to bother with.

I'm keeping it anyway. Here's why, and here's the code in case you want to do the same.

What llms.txt is supposed to do

The llms.txt proposal was published in 2024 by Jeremy Howard at Answer.AI as a way for sites to give AI assistants a condensed, machine-readable summary of their content. Sites that adopt it typically ship two files:

/llms.txt — short, table-of-contents style. Top-level pages, brief descriptions.
/llms-full.txt — long-form. The actual content for AI ingestion.

The pitch: LLMs that crawl your site to answer user questions get a curated index instead of having to crawl every page.

In practice, it never became a real standard. Anthropic, OpenAI, and Perplexity have never officially committed to reading these files. Google has now publicly said it doesn't either.

But several smaller AI engines and developer-focused indexers do consume llms.txt when present. And the cost to maintain a properly designed one is approximately zero. So the cost/benefit math is interesting.

My implementation in Next.js 16 App Router

Three files. Total: about 90 lines.

Step 1: a renderer that pulls content from your CMS

I keep mine in lib/llms.ts. It reads from Sanity, but the same pattern works with anything: Markdown files, MDX, a database, a JSON config.

// lib/llms.ts
import { getAllBlogPosts, getAllResources } from '@/lib/sanity'

const SITE_URL = 'https://example.com'

function resourceCategory(slug: string): 'permit' | 'comparison' | 'glossary' | 'general' {
  if (slug.startsWith('building-permit-guide-')) return 'permit'
  if (slug.includes('-vs-')) return 'comparison'
  if (slug.startsWith('what-is-')) return 'glossary'
  return 'general'
}

export async function renderLlmsTxt(): Promise<string> {
  // The two queries are independent, so fetch them in parallel
  const [posts, resources] = await Promise.all([getAllBlogPosts(), getAllResources()])

  const lines: string[] = []
  lines.push('# Example Company')
  lines.push('')
  lines.push('> Toronto-based contractor specializing in framing, drywall, and insulation.')
  lines.push('')
  lines.push('## Core Pages')
  lines.push(`- [Home](${SITE_URL})`)
  lines.push(`- [Services](${SITE_URL}/services)`)
  lines.push(`- [About](${SITE_URL}/about)`)
  lines.push(`- [Contact](${SITE_URL}/contact-us)`)
  lines.push('')

  const permits = resources.filter(r => resourceCategory(r.slug.current) === 'permit')
  const comparisons = resources.filter(r => resourceCategory(r.slug.current) === 'comparison')
  const glossary = resources.filter(r => resourceCategory(r.slug.current) === 'glossary')

  if (permits.length) {
    lines.push('## Building Permit Guides')
    for (const r of permits) {
      lines.push(`- [${r.title}](${SITE_URL}/resources/${r.slug.current})`)
    }
    lines.push('')
  }

  if (comparisons.length) {
    lines.push('## Comparison Pages')
    for (const r of comparisons) {
      lines.push(`- [${r.title}](${SITE_URL}/resources/${r.slug.current})`)
    }
    lines.push('')
  }

  if (glossary.length) {
    lines.push('## Glossary')
    for (const r of glossary) {
      lines.push(`- [${r.title}](${SITE_URL}/resources/${r.slug.current})`)
    }
    lines.push('')
  }

  if (posts.length) {
    lines.push('## Blog')
    for (const p of posts) {
      lines.push(`- [${p.title}](${SITE_URL}/blog/${p.slug.current})`)
    }
  }

  return lines.join('\n')
}

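export async function renderLlmsFullTxt(): Promise<string> {
  // Same shape as renderLlmsTxt, but inline the actual body content per
  // resource/post. Pull body from your CMS, convert to plaintext, append
  // under each section header.
  // Omitted for brevity. The throw keeps this stub compiling (a function
  // typed Promise<string> cannot simply fall off the end) and makes the
  // gap fail loudly instead of serving an empty file.
  throw new Error('renderLlmsFullTxt is not implemented')
}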
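
One way the omitted body of renderLlmsFullTxt might be filled in, as a minimal sketch. It assumes each document arrives with a Portable Text style `body` array (blocks containing spans with a `text` field) and that flattening those blocks into plain paragraphs is good enough. The `FullDoc` type and the `blocksToPlainText` and `renderFullSection` helpers are hypothetical names for illustration, not part of the code above.

```typescript
// Hypothetical shapes: adjust to whatever your CMS query actually returns.
type Span = { _type: string; text?: string }
type Block = { _type: string; children?: Span[] }
type FullDoc = { title: string; url: string; body: Block[] }

// Flatten Portable Text style blocks into plain paragraphs.
// Non-text blocks (images, embeds) are skipped rather than rendered.
function blocksToPlainText(blocks: Block[]): string {
  return blocks
    .filter(b => b._type === 'block' && Array.isArray(b.children))
    .map(b => (b.children ?? []).map(s => s.text ?? '').join('').trim())
    .filter(p => p.length > 0)
    .join('\n\n')
}

// Render one section: a heading, then each document's title, URL, and body.
function renderFullSection(heading: string, docs: FullDoc[]): string {
  if (docs.length === 0) return ''
  const parts: string[] = [`## ${heading}`, '']
  for (const d of docs) {
    parts.push(`### ${d.title}`)
    parts.push(`Source: ${d.url}`)
    parts.push('')
    parts.push(blocksToPlainText(d.body))
    parts.push('')
  }
  return parts.join('\n')
}

// Sample data standing in for a CMS response.
const sampleDocs: FullDoc[] = [
  {
    title: 'Steel vs Wood Studs',
    url: 'https://example.com/resources/steel-vs-wood-studs',
    body: [
      { _type: 'block', children: [
        { _type: 'span', text: 'Steel studs ' },
        { _type: 'span', text: 'resist moisture.' },
      ] },
      { _type: 'image' },
      { _type: 'block', children: [{ _type: 'span', text: 'Wood is cheaper.' }] },
    ],
  },
]

const fullText = renderFullSection('Comparison Pages', sampleDocs)
console.log(fullText)
```

A full renderLlmsFullTxt would call something like `renderFullSection` once per category and join the results, in the same order renderLlmsTxt uses.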

The category split is my preference. The spec is loose. What matters is that a machine reading the file gets a clean hierarchical view of your site.
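
The three near-identical loops in renderLlmsTxt can also be driven from a small table, which makes the category split easier to change. The sketch below is one possible refactor, not the code running on my site: the `Resource` type, the `SECTIONS` table, and `renderResourceSections` are hypothetical names. It also adds an "Other Resources" heading as an assumption, because the version above never lists resources that fall into the 'general' category.

```typescript
// Hypothetical minimal shape of a resource; mirrors the fields used above.
type Resource = { title: string; slug: { current: string } }
type Category = 'permit' | 'comparison' | 'glossary' | 'general'

const SITE_URL = 'https://example.com'

function resourceCategory(slug: string): Category {
  if (slug.startsWith('building-permit-guide-')) return 'permit'
  if (slug.includes('-vs-')) return 'comparison'
  if (slug.startsWith('what-is-')) return 'glossary'
  return 'general'
}

// Section order and headings live in one table instead of three copied loops.
// The 'general' heading is an assumption; the original omits that bucket.
const SECTIONS: Array<[Category, string]> = [
  ['permit', 'Building Permit Guides'],
  ['comparison', 'Comparison Pages'],
  ['glossary', 'Glossary'],
  ['general', 'Other Resources'],
]

function renderResourceSections(resources: Resource[]): string[] {
  const lines: string[] = []
  for (const [category, heading] of SECTIONS) {
    const matches = resources.filter(r => resourceCategory(r.slug.current) === category)
    if (matches.length === 0) continue // skip empty sections entirely
    lines.push(`## ${heading}`)
    for (const r of matches) {
      lines.push(`- [${r.title}](${SITE_URL}/resources/${r.slug.current})`)
    }
    lines.push('') // blank line between sections
  }
  return lines
}

// Sample data standing in for the CMS query result.
const demoResources: Resource[] = [
  { title: 'Steel vs Wood Studs', slug: { current: 'steel-vs-wood-studs' } },
  { title: 'What Is R-Value?', slug: { current: 'what-is-r-value' } },
  { title: 'Drywall Finishing Levels', slug: { current: 'drywall-finishing-levels' } },
]

const sectionLines = renderResourceSections(demoResources)
console.log(sectionLines.join('\n'))
```

Because resourceCategory checks its patterns in order, a slug such as what-is-x-vs-y lands in comparisons, not the glossary. If that matters for your URLs, reorder the checks.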

Step 2: the route handlers

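In the Next.js 16 App Router, folders inside /app map to URL segments, and a route.ts file inside a folder defines the handler for that path. Folder names can contain a dot, so a folder named llms.txt serves /llms.txt: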

// app/llms.txt/route.ts
import { renderLlmsTxt } from '@/lib/llms'

export const revalidate = 3600 // 1 hour cache

export async function GET() {
  const body = await renderLlmsTxt()
  return new Response(body, {
    headers: { 'content-type': 'text/plain; charset=utf-8' },
  })
}

// app/llms-full.txt/route.ts
import { renderLlmsFullTxt } from '@/lib/llms'

export const revalidate = 3600

export async function GET() {
  const body = await renderLlmsFullTxt()
  return new Response(body, {
    headers: { 'content-type': 'text/plain; charset=utf-8' },
  })
}

The revalidate = 3600 line is the key. It tells Next.js to cache the response and treat it as fresh for an hour. The first request after that hour still gets the cached copy and triggers a regeneration in the background; requests after that get the new version. So whether you publish 10 blog posts a day or one a week, new content shows up roughly an hour after publishing, plus however long it takes for the next request to arrive. No rebuild or redeploy required.
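
To reason about when a new post actually shows up, here is a toy model of time-based revalidation. It is a simplified sketch of stale-while-revalidate behaviour, not Next.js internals: `simulateRequests` and its types are hypothetical, regeneration is treated as instantaneous, and each regeneration just bumps a version counter.

```typescript
type CacheState = { generatedAt: number; version: number }
type Served = { at: number; version: number; stale: boolean }

// Simulate requests (timestamps in seconds) against a cache with a
// revalidation window. The cache starts with version 1 generated at t=0.
function simulateRequests(requestTimes: number[], revalidateSeconds: number): Served[] {
  let cache: CacheState = { generatedAt: 0, version: 1 }
  const served: Served[] = []
  for (const at of requestTimes) {
    const stale = at - cache.generatedAt >= revalidateSeconds
    // The caller always receives whatever is currently cached...
    served.push({ at, version: cache.version, stale })
    // ...and a stale hit triggers regeneration for later requests.
    if (stale) cache = { generatedAt: at, version: cache.version + 1 }
  }
  return served
}

// Four requests: one early, two just after the hour, one much later.
const requestLog = simulateRequests([10, 3700, 3701, 9000], 3600)
for (const r of requestLog) {
  console.log(`t=${r.at}s served v${r.version}${r.stale ? ' (stale, regenerating)' : ''}`)
}
```

In this model the request at t=3700 still sees version 1, and only the request at t=3701 sees version 2. On an endpoint that gets very little traffic, that means fresh content needs the window to elapse and then two requests to arrive before anyone sees it.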

Step 3: deploy and verify

curl https://example.com/llms.txt

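This should return your generated Markdown-flavored text within a second or two. Add -i to the curl command to see the response headers and confirm the content-type is text/plain.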
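
If you would rather check this from a script than by eye, a small validator along these lines can run in CI. It is a sketch under assumptions: `validateLlmsResponse` is a hypothetical helper, and the checks (200 status, text/plain, an H1 on the first non-empty line, at least one Markdown link item) simply mirror what renderLlmsTxt above produces.

```typescript
// Return a list of problems; an empty list means the response looks sane.
function validateLlmsResponse(
  status: number,
  contentType: string | null,
  body: string,
): string[] {
  const problems: string[] = []
  if (status !== 200) problems.push(`expected status 200, got ${status}`)
  if (!contentType || !contentType.toLowerCase().startsWith('text/plain')) {
    problems.push(`expected text/plain, got ${contentType ?? 'no content-type'}`)
  }
  // renderLlmsTxt starts the file with a "# Site name" line.
  const firstLine = body.split('\n').find(l => l.trim().length > 0) ?? ''
  if (!firstLine.startsWith('# ')) problems.push('first non-empty line should be an H1 title')
  // Every section in the generated file is a list of "- [title](url)" items.
  if (!/^- \[[^\]]+\]\([^)]+\)/m.test(body)) problems.push('no markdown link list items found')
  return problems
}

// A response shaped like the generated file passes...
const goodProblems = validateLlmsResponse(
  200,
  'text/plain; charset=utf-8',
  '# Example Company\n\n## Core Pages\n- [Home](https://example.com)\n',
)
// ...while an HTML 404 page fails every check.
const badProblems = validateLlmsResponse(404, 'text/html; charset=utf-8', '<!doctype html>')

console.log(goodProblems)
console.log(badProblems)
```

To use it against a live site, pass in res.status, res.headers.get('content-type'), and await res.text() from a fetch call to your /llms.txt URL.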

That's it. Total code: about 90 lines. Total time: 40 minutes if you've used Next.js before.

Then Google said don't bother

Google's AI optimization guide (published this month) is explicit:

Don't create llms.txt files and other "special" markup.

The reasoning, per Google: their AI features (AI Overviews, AI Mode) ingest the same content humans see. They crawl pages, parse the rendered HTML, and synthesize. They don't read llms.txt. Creating one is "unnecessary effort."

That's an honest position. Google isn't penalizing you for having an llms.txt. They're telling you it's wasted work from their perspective.

But Google isn't the only AI engine.

Why I'm keeping mine anyway

Four reasons.

1. The maintenance cost is zero

Because the file is dynamic and pulls from the same content source as the rest of the site, it updates itself. There's no separate workflow to maintain, no monthly chore, no risk of going stale. The code sat in the repo for two weeks with zero attention required. The next time I add a blog post or resource page, the file picks it up within an hour.

If maintenance cost were measured in minutes per month, llms.txt would be at zero. There's nothing to abandon.

2. Not every AI engine has spoken

Perplexity, ChatGPT, Claude, and several smaller AI search products have not officially said whether they read llms.txt. Some open-source RAG indexers do. Some agent frameworks default to checking for it.

If even one of these tools reads my llms.txt and produces a citation that a human user clicks, the file paid for itself. The opportunity cost of having it is approximately the bytes Vercel serves on the rare requests that hit the endpoint.

3. It functions as a structured fallback

The llms.txt is one of the cleanest, most human-readable summaries of my site that exists anywhere. When someone asks me for an overview of my site's content, I can paste the llms.txt into a chat and have an instant brief. It's a free side effect of building it.

4. Building it was a useful exercise

Going through the exercise of categorizing every page, deciding what's important, and giving each section a clean summary is exactly the kind of content audit most sites should do anyway. The output (the llms.txt file) is the artifact. The thinking is the value.

When to skip it

If your site is purely commerce, your content is highly dynamic (millions of product pages), or your content lives mostly behind login, llms.txt has less value. AI engines aren't going to ingest your product catalog from a flat text file.

If you're a content site, a docs site, a portfolio, an editorial publication, or anything where the page-level content matters and the URL structure has meaning, llms.txt is worth the 40 minutes.

Working example

You can see the live output at:

Both regenerate every hour. Both pull from the same Sanity CMS that drives the rest of the site. Total runtime memory cost: indistinguishable from zero. Pageviews: low but non-zero.

The bigger point

Google publishing guidance that says "don't bother with X" is not the same as Google penalizing X. Sometimes Google is telling you the optimal allocation of your effort. Sometimes you should agree. Sometimes you should keep doing X because the cost is zero and the upside, while small, is nonzero.

llms.txt is in the second bucket. Build it once, let it run, ignore the file for six months, see what happens.

If in 18 months Google publishes "we now use llms.txt to inform AI Overviews," you'll be glad you left it running. If they don't, the file cost you 40 minutes and approximately zero ongoing attention.

The optimal play with low-cost, low-confidence bets is to make a lot of them and let outcomes resolve themselves.

Build the llms.txt.

DEV Community