TL;DR — AI bots (ChatGPT, Claude, Perplexity, Google's AI Overviews) now make up a huge slice of your website traffic. They don't want your beautifully styled HTML. They want clean Markdown — and they'll thank you with citations. This post shows you exactly how to serve it using a 27-year-old HTTP feature that suddenly matters again: content negotiation.
The Day I Realized My Real Audience Wasn't Human
A few months ago I shipped two side projects. Nothing fancy — a transcription tool and a small productivity app. I did the usual: meta tags, Open Graph, sitemap, Lighthouse score in the 90s, the works. The 2018 SEO playbook, basically.
Then I opened my Cloudflare analytics one morning and almost spilled my coffee.
In the last 24 hours alone:
- `Googlebot` → 99 hits
- `Applebot` → 48 hits
- `ClaudeBot` → 19 hits
- `PerplexityBot`, `GPTBot`, `Bingbot`, `ClaudeSearchBot` → dozens more
A meaningful chunk of my "traffic" wasn't humans clicking blue links. It was machines. Specifically, it was AI agents scraping my content to answer somebody else's question on ChatGPT or Claude.
And here's the kicker: I had optimized my entire site for humans and Googlebot — and neither of those is the visitor that decides my visibility anymore.
That's when it hit me: the old SEO is dead. Or more accurately, it's been demoted. The new game is optimizing for the bots that feed Large Language Models. And the way you do that is nothing like what you learned in 2015.
Let me walk you through what I figured out — and the small middleware change that can make your site dramatically more visible inside ChatGPT, Claude, and Perplexity.
The Three Acronyms You Need to Know in 2026
Before we get to the fix, here's the vocabulary nobody handed you:
| Term | What it optimizes for | Who reads your content |
|---|---|---|
| SEO (Search Engine Optimization) | Ranking on Google's blue links | Humans, via Googlebot |
| AEO (Answer Engine Optimization) | Getting quoted in AI Overviews, featured snippets, voice search | AI summarizers |
| GEO (Generative Engine Optimization) | Being cited by ChatGPT, Claude, Perplexity, Gemini | LLMs synthesizing answers |
Gartner has forecast that traditional search engine volume will drop ~25% by 2026 as AI chatbots and answer engines absorb informational queries. BrightEdge has reported AI Overviews appearing on over 40% of Google results pages. Roughly 60% of Google searches now end without a click.
Translation: even when you "rank," fewer people are clicking. The AI ate your snippet and walked away.
So winning in 2026 isn't about ranking number one. It's about being the source the AI cites when it answers. And to be that source, the AI has to be able to read you efficiently. That's where the real story begins.
The Token Problem (Or: Why AI Bots Hate Your HTML)
Here's something that took me embarrassingly long to internalize:
Everything in AI is tokens.
Cost? Tokens. Context window? Tokens. Speed? Tokens. Accuracy? Tokens. Whether the model can even fit your page into its working memory? Tokens.
So when an AI agent crawls your site, every wasted token is wasted money, wasted context, and a worse answer for the user. And HTML is spectacularly wasteful.
Let me prove it. Suppose all you want to communicate is the phrase "Hello World."
In plain text, that's 11 characters. About 3 tokens.
In real-world HTML, the same phrase looks more like this:
<div class="hero hero--centered" data-testid="greeting-block" id="welcome">
  <h1 class="text-4xl font-bold tracking-tight text-slate-900 dark:text-white">
    Hello World
  </h1>
</div>
That's ~200 characters and easily 60+ tokens to convey the same 3 tokens of actual meaning. That's a 20× tax the model pays to extract one phrase. Multiply that across an entire blog post buried inside a Tailwind component tree, a sticky nav, a cookie banner, three modal portals, and a footer with 47 links — and you start to see why an LLM treats most websites the way you'd treat a 400-page contract written in Comic Sans.
The bot is trying to help its user answer a question. You're handing it a pile of <div> confetti and asking it to find the meaning.
So the AI's response is rational: "Don't give me HTML. Give me the words."
"Just Send Plain Text" — The Trap That Doesn't Work
The obvious next thought is: fine, I'll just strip the HTML and serve raw text.
Don't. There are two problems.
Problem 1: Browsers need HTML. You can't host your site as a .txt file. Visitors expect a real, rendered page. Strip the CSS off your homepage and it looks like a 1996 Geocities accident.
Problem 2: Plain text loses structure. Was that line a heading? A list item? A code block? An external link? An image caption? Plain text flattens all of it. The AI gets your words but loses the relationships between them. Information loss in, hallucinations out.
You need a format that's:
- ✅ Structured enough to preserve meaning (headings, lists, links, code)
- ✅ Compact enough to be token-cheap
- ✅ Already trained into every major LLM as a first-class language
There's exactly one format that ticks all three boxes.
Enter Markdown: The Native Tongue of LLMs
If you've ever pasted a doc into ChatGPT, you've already used it. Markdown is what every major LLM is trained on, what they output by default, and what they parse most reliably.
A heading is #. A list is -. Bold is **. Code is backticks. That's it. No attributes, no nested wrappers, no div soup.
Same "Hello World," in Markdown:
# Hello World
That's 13 characters. Maybe 4 tokens. The AI gets the exact same semantic information — "this is a top-level heading saying Hello World" — for less than 10% of the cost.
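If you'd like to verify those counts yourself, here's a quick sketch using the `js-tiktoken` package (exact numbers vary slightly between tokenizers):

```js
// npm install js-tiktoken
import { getEncoding } from 'js-tiktoken';

const enc = getEncoding('cl100k_base'); // the GPT-4-era tokenizer

const plain = 'Hello World';
const markdown = '# Hello World';
const html = `<div class="hero hero--centered" data-testid="greeting-block" id="welcome">
  <h1 class="text-4xl font-bold tracking-tight text-slate-900 dark:text-white">
    Hello World
  </h1>
</div>`;

console.log('plain   :', enc.encode(plain).length);    // 2-3 tokens
console.log('markdown:', enc.encode(markdown).length); // 3-4 tokens
console.log('html    :', enc.encode(html).length);     // ~60 tokens
```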
Now multiply that savings across an entire knowledge base. It's not a small efficiency gain. It's the difference between an LLM being able to read your whole product documentation in a single context window vs. catching fragments and guessing at the rest.
Cloudflare, Vercel, Stripe, Coinbase, Mintlify, Fern — every developer-tools company that takes AI agents seriously is now serving Markdown alongside HTML. According to Webflow, the number of sites adopting llms.txt (a related Markdown standard) grew 1,835% in roughly a year. This isn't a fringe idea anymore. It's becoming infrastructure.
But here's the question: how do you serve Markdown to bots without breaking the experience for humans?
The Magic Trick: Content Negotiation (a.k.a. "Markdown Negotiation")
The solution is delightfully old-school. HTTP has had content negotiation baked into it since 1999. It's the reason the same URL can return JSON to your mobile app, RSS to a feed reader, and HTML to a browser.
The mechanism is the Accept header. Every HTTP request includes one. Browsers send:
Accept: text/html,application/xhtml+xml,...
So the server happily returns HTML. But an AI agent crawler can send:
Accept: text/markdown
…and your server, if you've configured it correctly, can return a clean Markdown version of the exact same page. Same URL. Same canonical content. Different format.
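You can watch this happen from any script. A minimal check using Node 18+'s built-in `fetch` (the URL is hypothetical):

```js
// Same URL, two formats; the Accept header decides.
const url = 'https://yoursite.com/blog/hello-world'; // hypothetical page

const htmlRes = await fetch(url, { headers: { Accept: 'text/html' } });
console.log(htmlRes.headers.get('content-type')); // "text/html; charset=utf-8"

const mdRes = await fetch(url, { headers: { Accept: 'text/markdown' } });
console.log(mdRes.headers.get('content-type')); // "text/markdown; charset=utf-8"
```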
This is what Cloudflare now calls "Markdown for Agents," and it's a one-toggle feature on their platform. But you don't need Cloudflare to do it — you can implement it yourself in any framework in about 15 minutes.
Implementation: A Universal Pattern
Here's the mental model. You write your content once (ideally in MDX or Markdown to begin with). Your server has a thin middleware that looks at the Accept header and decides what to send back.
The Express.js Version
// content-negotiation.js
import express from 'express';
import { renderHtml, renderMarkdown } from './renderers.js';
import { getPostBySlug } from './posts.js'; // wherever your content lives
import { estimateTokens } from './tokens.js'; // see the sketch below

const app = express();

app.get('/blog/:slug', async (req, res) => {
  const post = await getPostBySlug(req.params.slug);
  if (!post) return res.status(404).send('Not found');

  const accept = req.headers.accept || '';

  // AI agent? Serve Markdown.
  if (accept.includes('text/markdown')) {
    res.set('Content-Type', 'text/markdown; charset=utf-8');
    // Optional but nice: tell the bot how big this is in tokens
    res.set('x-markdown-tokens', String(estimateTokens(post.markdown)));
    return res.send(renderMarkdown(post));
  }

  // Browser? Serve HTML as usual.
  res.set('Content-Type', 'text/html; charset=utf-8');
  res.send(renderHtml(post));
});

app.listen(3000);
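That `estimateTokens` helper doesn't need to be clever. A deliberately rough sketch, assuming the usual ~4-characters-per-token rule of thumb for English text (swap in a real tokenizer if the number must be exact):

```js
// tokens.js: advisory token estimate, assuming ~4 chars per token
export function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}
```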
The Next.js (App Router) Version
// app/blog/[slug]/route.ts — handles GET requests
import { NextRequest } from 'next/server';
import { getPostBySlug } from '@/lib/posts';
import { renderHtmlShell } from '@/lib/render'; // your HTML renderer

export async function GET(
  req: NextRequest,
  { params }: { params: { slug: string } }
) {
  const post = await getPostBySlug(params.slug);
  if (!post) return new Response('Not found', { status: 404 });

  const accept = req.headers.get('accept') || '';

  if (accept.includes('text/markdown')) {
    return new Response(post.markdown, {
      headers: {
        'Content-Type': 'text/markdown; charset=utf-8',
        'Cache-Control': 'public, max-age=3600',
      },
    });
  }

  // Fall through to your normal page render
  return new Response(renderHtmlShell(post), {
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
  });
}
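One version note: the signature above matches Next.js 13/14. From Next.js 15 onward, `params` arrives as a Promise, so you'd type it `{ params: Promise<{ slug: string }> }` and `await params` before reading the slug.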
⚠️ Important: don't try to convert your HTML to Markdown on the fly with a library. The whole point is to ship a separate, hand-curated, token-minimal file. Skip the navbar. Skip the footer. Skip the "related posts" widget. Strip the images unless they're load-bearing. Keep only what an AI needs to answer questions about that page: title, intro, key facts, conclusion.
The "Just Drop a Companion File" Pattern
If middleware feels like overkill, follow the pattern Cloudflare and Vercel use: expose every page as /page-url/index.md or append .md to any URL.
https://yoursite.com/docs/getting-started → HTML
https://yoursite.com/docs/getting-started.md → Markdown
https://yoursite.com/docs/getting-started/index.md → also Markdown
This is dead simple to host (it's literally a static file) and works perfectly with bots that don't yet send the Markdown Accept header. Belt and suspenders.
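On Express, the companion-file pattern is a few lines. A sketch, assuming a build step pre-renders every page's Markdown into a `dist-md/` folder (the folder name is illustrative):

```js
// Serve pre-built companion files for any URL ending in .md
app.get(/\.md$/, (req, res) => {
  res.type('text/markdown; charset=utf-8');
  // sendFile with a root option also protects against path traversal
  res.sendFile(req.path, { root: './dist-md' }, (err) => {
    if (err) res.status(404).send('Not found');
  });
});
```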
The Cherry on Top: llms.txt
While we're here, let's talk about the proposal that's quietly becoming the robots.txt of the AI era: llms.txt.
Proposed by Jeremy Howard (FastAI / AnswerAI) in late 2024, llms.txt is a single Markdown file at the root of your domain — yourdomain.com/llms.txt — that tells AI agents the shape of your site. Think of it as a hand-curated table of contents written specifically for LLMs.
A minimal example:
# Acme Widgets
> Acme Widgets makes open-source developer tools for shipping
> faster with fewer bugs. We focus on TypeScript, edge runtimes,
> and observability.
## Docs
- [Getting Started](https://acme.dev/docs/getting-started.md): 5-minute setup guide.
- [API Reference](https://acme.dev/docs/api.md): Full endpoint and SDK reference.
- [Pricing](https://acme.dev/pricing.md): Tiers, limits, and SLA.
## Guides
- [Deploying to Vercel](https://acme.dev/guides/vercel.md)
- [Self-hosting on AWS](https://acme.dev/guides/aws.md)
## Optional
- [Changelog](https://acme.dev/changelog.md)
- [Brand assets](https://acme.dev/brand.md)
And often a sibling file, llms-full.txt, which inlines the entire text of every important page in one Markdown blob — perfect for an AI agent that wants to load your whole knowledge base into a single context window.
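Generating that blob is a natural build step. A minimal sketch, assuming your per-page Markdown already lives in a `dist-md/` folder (Node 20+ for recursive `readdir`):

```js
// build-llms-full.js: concatenate every page's Markdown into one file
import { readdir, readFile, writeFile } from 'node:fs/promises';
import { join } from 'node:path';

const dir = './dist-md'; // assumed location of pre-rendered Markdown
const files = (await readdir(dir, { recursive: true }))
  .filter((f) => f.endsWith('.md'))
  .sort();

let blob = '# Acme Widgets: full site content\n';
for (const file of files) {
  blob += '\n\n---\n\n' + (await readFile(join(dir, file), 'utf8'));
}
await writeFile('./public/llms-full.txt', blob);
```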
Companies already shipping llms.txt: Cloudflare, Anthropic, Vercel, Mintlify, ElevenLabs, Cash App, Stripe-adjacent docs sites, and a fast-growing list. If your competitors aren't doing this yet, you have a window.
Don't Forget the Doorman: robots.txt
This part is unglamorous but matters. Your robots.txt is where you decide which AI bots are allowed to read you in the first place. With AI training and AI search now distinct activities, you can be granular:
# Allow the AI bots you want reading you (search bots cite you)
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Applebot-Extended
Allow: /
User-agent: Google-Extended
Allow: /
# (Optional) block scrapers you don't want
User-agent: CCBot
Disallow: /
Sitemap: https://yoursite.com/sitemap.xml
A few rules of thumb:
- If you want citations in ChatGPT/Claude/Perplexity, allow their search-time bots.
- If you don't want your content used for model training, you can selectively block training-time bots (e.g. `GPTBot` for OpenAI, `Google-Extended` for Google's training).
- If you block a bot in `robots.txt`, you're also blocking it from reaching your shiny new Markdown endpoints. Consistency matters.
The 2026 Checklist for Shipping AI-Friendly Pages
If you do nothing else today, do these five things:
- Audit who's actually crawling you. Filter your access logs by user-agent. You'll be surprised how much of your "traffic" is `ClaudeBot` and `GPTBot`. (A quick sketch for this follows the list.)
- Add Markdown content negotiation. Either via the `Accept: text/markdown` header, or via companion `.md` URLs, or both.
- Hand-write an `llms.txt` at your domain root. Don't auto-generate it from your sitemap — curate it. Treat it like a conversation with a smart, busy intern.
- Update `robots.txt` to explicitly allow the AI bots you want to be cited by, and block the ones you don't.
- Strip your Markdown ruthlessly. No nav, no footer, no related-posts, no "subscribe to newsletter" CTAs. Just the answer. AI doesn't care about your conversion funnel.
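For that first item, you don't even need an analytics product. A rough audit sketch in Node (the log path and bot list are illustrative):

```js
// count-bots.js: tally AI crawler hits in an access log by user-agent
import { readFileSync } from 'node:fs';

const BOTS = ['GPTBot', 'ClaudeBot', 'PerplexityBot', 'Googlebot', 'Applebot', 'Bingbot'];
const lines = readFileSync('./access.log', 'utf8').split('\n');

const counts = Object.fromEntries(BOTS.map((b) => [b, 0]));
for (const line of lines) {
  for (const bot of BOTS) {
    if (line.includes(bot)) counts[bot]++;
  }
}
console.table(counts);
```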
Do this and you give yourself a real shot at showing up as a cited source in Perplexity and Claude answers within weeks. Skip it, and those citations go to your competitors by default.
A Last Honest Word
I want to be careful here. I'm not telling you SEO is dead in the literal sense. Google still drives the bulk of direct traffic for most sites. Strong fundamentals — fast pages, clear structure, real authority, original content — still matter. In fact, they matter more, because AI engines lean on the same trust signals (a Princeton study from 2024 found a ~92% overlap between pages cited in AI Overviews and pages already ranking in Google's top 10).
What's dead is the idea that optimizing for one audience is enough. In 2026, every page you ship has at least three readers: a human, a search crawler, and an AI agent. They each need a different version of the truth.
The good news? Once you set up the middleware, it's mostly free. You write the content once. The server decides who gets what. The AI bots eat their clean Markdown. The humans get their pretty page. And you get cited in the answers that are quietly replacing the search box.
Welcome to the agentic web. Ship the Markdown.
If this was useful
Drop a ❤️, follow for more deep-dives on the practical edge of AI engineering, and tell me in the comments: have you started seeing AI bots in your analytics? Which one is hitting you hardest? I'm curious whether my Claude-heavy mix is normal or just a function of my niche.
If you're shipping a side project right now, this is the cheapest, highest-leverage change you can make this weekend. Go look at your access logs. Then go write an llms.txt. Your future self — and a few thousand AI agents — will thank you.
Tags: #webdev #ai #seo #javascript #nextjs #llm #agents #markdown