Posted on Jun 10

Anthropic’s engineer just told you to stop using markdown. Here’s what’s actually going on.

#ai #webdev #programming #claude

The “HTML vs Markdown” war broke the internet last week. Both sides got it wrong and the real answer was buried in the footnotes the whole time.

Last week, the engineering lead for Claude Code dropped a post called “The Unreasonable Effectiveness of HTML” and the dev internet split in half before the coffee finished brewing.

Thariq Shihipar, who runs engineering on Anthropic’s Claude Code, published 20 working examples showing why AI agents should output HTML instead of Markdown. Interactive navigation. Collapsible sections. Color-coded code reviews. Embedded visualizations. Shareable links. The post hit 4.4 million views in 16 hours.

The response was exactly what you’d expect from a community that treats tooling preferences like religion.

Team HTML declared Markdown dead. Team Markdown called it a security risk wrapped in a token tax. Threads filled up. Quote-tweets went sideways. Someone definitely posted a “this is why we can’t have nice things” reply. The usual.

Here’s the problem: both sides were arguing about the wrong question entirely.

The HTML camp got the direction right but hand-waved the costs: the 3–5x token overhead, the AI-generated JavaScript risks, and the slightly awkward fact that Anthropic profits directly from you using more tokens. The Markdown camp identified real risks but is defending a set of constraints that haven’t been real since context windows hit a million tokens. They’re optimizing for a 2022 problem in a 2026 world.

The actual question, the one neither camp bothered asking, is simpler than any of that: who reads this output, and what do they do with it?

That’s it. That’s the whole framework. Everything else is noise.

So let’s talk about how Markdown became the default, why three of its core assumptions are quietly rotting, what the token math actually looks like when you run it, and why the format war was always a distraction from the decision tree that was sitting there the whole time.

How markdown became the default AI output (nobody chose it; it was inherited)

Markdown didn’t win a format war. It just kept showing up at the right time, three times in a row, until nobody questioned it anymore.

The first wave was developers. John Gruber built Markdown in 2004 as a way to write readable plain text that converted cleanly to HTML. Convenient tool for bloggers. Then GitHub adopted it for READMEs, issues, and pull request descriptions, and overnight every open-source project on earth was writing Markdown. Not because it was evaluated and selected. Because it was already there.

The second wave was knowledge workers. Through the 2010s, Notion, Obsidian, and Jekyll built their entire editing experience around it. It became the default for wikis, note-taking, and static sites. The pitch was the same every time: human-readable and machine-parseable. Write it in any text editor, render it anywhere. Simple enough that anyone could pick it up in an afternoon, powerful enough that you never really needed anything else.

The third wave was AI. When ChatGPT launched in late 2022, it rendered responses in Markdown. Not because OpenAI ran a format evaluation. Because the training data was saturated with it: GitHub repos, technical docs, wikis, blog posts, READMEs as far as the eye could see. Markdown was what the model had seen most, so Markdown was what the model produced. Every chatbot and coding assistant since has followed the same default.

I still have READMEs in repos from 2016 that look structurally identical to my Claude outputs today. Same heading hierarchy. Same bullet pattern. Same code block style. That should’ve been the first clue that something was on autopilot.

Three waves. Each one reinforcing the last. Nobody evaluated Markdown for AI output and decided it was the best fit. It inherited the job because it was already wearing the right shirt from the previous three interviews.

That inheritance is the problem. Because the world Markdown was designed for and the world we’re actually building in now are not the same world. And three assumptions that were completely reasonable when Markdown took over are breaking at the same time.

Three assumptions baked into markdown that are quietly rotting

Markdown became the default AI output under three premises. All three made sense in 2022. None of them really hold in 2026.

Premise 1: Humans edit the output.

Markdown was designed for people who write and revise their own text. That’s still how READMEs, docs, and blog posts work: someone opens the file, rewrites a paragraph, pushes a commit. But agent output is different. You send a prompt. The agent generates a 2,000-word implementation plan, a code review, a competitive analysis. You read it. Maybe you share it. You almost never open it in an editor and start rewriting paragraphs.

When was the last time you actually did that?

Took a Claude output, opened it in VS Code, and edited the prose?

The format’s core value proposition (easy to write and revise by hand) no longer matches the actual use case. The agent wrote it. You’re just the reader now.

Premise 2: Content is small.

A 500-word doc renders fine in Markdown. A 3,000-word agent-generated architecture decision with trade-off tables, code samples, and implementation notes does not. Past roughly 100 lines, Markdown becomes a wall. No navigation, no collapsible sections, no way to jump to the part you actually care about without scrolling through everything you don’t.

Thariq’s observation on this is blunt: nobody really reads a Markdown file longer than 100 lines. They skim, miss things, and close it. The format that was perfect for a README is actively fighting you when the output is a full technical report.

Premise 3: Output is read-only.

The old workflow was linear.

Prompt → generate → read → close.

Done. But the agent era is pushing toward something different. Filter a table. Adjust a parameter. Compare two options side by side. Export a subset. Feed the result back into the next prompt as structured input. Markdown can’t carry any of that. It’s a one-way street with no exits.

Here’s the reframe that cuts through all three premises at once: Markdown is a report. HTML is an interface. You read a report and close it. You operate on an interface and feed the result forward.

That distinction matters more than any token cost calculation. But since token cost is the number everyone keeps citing, let’s actually run it.

The token math nobody actually ran

The main ammunition Team Markdown keeps loading is the token overhead. HTML costs 3–5x more tokens. They say it like it ends the conversation.

Almost nobody checks what that actually means in dollars.

Same 2,000-word report, three formats. Plain Markdown comes in around 3,000 output tokens. Lean semantic HTML (proper structure, no heavy styling) runs about 7,200, or 2.4x. Full HTML with CSS, embedded charts, and interactive sections hits roughly 14,400, or 4.8x. The “3–5x” range you’ve seen quoted is real at the rich end: for full HTML, you’re burning close to 5x the tokens.

Here’s what that costs per report at current Anthropic API pricing:

Markdown report: ~$0.072 on Claude Sonnet
Lean HTML: ~$0.17
Full HTML with styling: ~$0.34
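If you want to rerun those three numbers with your own rates, here is a minimal sketch. The token counts are the estimates above. The per-token rate is an assumption: it is back-computed from the Markdown line ($0.072 for 3,000 tokens), not read off any price sheet, so swap in whatever your provider actually charges. The function names are made up for illustration.

```python
# Token counts are the article's estimates for one 2,000-word report.
TOKENS = {"markdown": 3_000, "lean_html": 7_200, "full_html": 14_400}

# Assumption: a flat output rate back-computed from the Markdown figure
# ($0.072 for 3,000 tokens). Replace with your provider's real rate.
USD_PER_TOKEN = 0.072 / 3_000


def report_cost(fmt: str) -> float:
    """Cost in USD of generating one report in the given format."""
    return TOKENS[fmt] * USD_PER_TOKEN


def overhead_vs_markdown(fmt: str) -> float:
    """Extra USD per report compared with the Markdown version."""
    return report_cost(fmt) - report_cost("markdown")


if __name__ == "__main__":
    for fmt in TOKENS:
        ratio = TOKENS[fmt] / TOKENS["markdown"]
        print(f"{fmt:10s} {ratio:.1f}x  ${report_cost(fmt):.3f}  "
              f"(+${overhead_vs_markdown(fmt):.3f})")
```

On these inputs the lean HTML report costs roughly ten cents more than the Markdown one, and the full version roughly twenty-seven cents more.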

The overhead on a single lean HTML report is about ten cents.

You need to generate about ten lean HTML reports on Claude Sonnet to spend one extra dollar compared to Markdown. One dollar. That’s the number people are building their entire format philosophy around.

This is what I’d call the Token Trap. Optimizing for a cost that’s a rounding error in your actual engineering budget while ignoring the cost that actually matters.

But the math has a second act, and Team Markdown deserves credit for it.

Scale it up and the numbers shift. At 100 reports per day, the HTML overhead on Claude Sonnet runs roughly $300 to $800 a month extra, depending on how rich the HTML is. At enterprise volume, with thousands of agent calls daily across a whole platform, you’re looking at real line items, not pocket change. Team Markdown isn’t wrong about this. They’re just applying it everywhere instead of where it actually applies.
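Plugging the per-report prices quoted earlier into a 30-day month gives a range rather than a single number, because it depends on how heavy the HTML is. A quick sketch, where the function name and the 30-day month are my assumptions:

```python
def monthly_overhead(reports_per_day: int, html_cost: float,
                     markdown_cost: float, days: int = 30) -> float:
    """Extra USD per month from generating HTML instead of Markdown."""
    return reports_per_day * days * (html_cost - markdown_cost)


if __name__ == "__main__":
    # Per-report prices are the rounded figures quoted in this article.
    lean = monthly_overhead(100, html_cost=0.17, markdown_cost=0.072)
    full = monthly_overhead(100, html_cost=0.34, markdown_cost=0.072)
    print(f"lean HTML: ${lean:.0f}/month extra")
    print(f"full HTML: ${full:.0f}/month extra")
```

At 100 reports a day that lands at about $294 for lean HTML and about $804 for fully styled HTML. Multiply the first argument by ten or a hundred and you can see where the line items come from.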

Here’s what both camps keep skipping: human attention has a price too.

A senior engineer earns somewhere between $75 and $150 an hour. Fifteen minutes spent scrolling a Markdown wall (hunting for the architecture decision buried in paragraph nine, re-reading a table that should have been filterable, copy-pasting a section into Slack because there’s no shareable link) costs between $19 and $38 in engineer time.

The token cost for that same report in lean HTML? Seventeen cents on Sonnet.

The Token Trap runs in both directions. Individual developers waste time debating $0.17. Enterprise teams burn thousands in engineer attention to save hundreds in token costs. In both cases, the format decision is being made on the wrong variable entirely.
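The attention side of the ledger is simple enough to check yourself. This sketch uses the hourly range above and assumes a token overhead of about ten cents per report (the lean HTML price minus the Markdown price). Both helper names are hypothetical.

```python
def attention_cost(minutes: float, hourly_rate: float) -> float:
    """USD value of the engineer time spent reading."""
    return minutes / 60 * hourly_rate


def break_even_seconds(token_overhead_usd: float, hourly_rate: float) -> float:
    """Seconds of engineer time worth the same as the token overhead."""
    return token_overhead_usd / hourly_rate * 3600


if __name__ == "__main__":
    for rate in (75, 150):
        wasted = attention_cost(15, rate)
        seconds = break_even_seconds(0.10, rate)
        print(f"${rate}/h: 15 min = ${wasted:.2f}, "
              f"HTML pays for itself after {seconds:.1f} s saved")
```

Under those assumptions, the HTML version only has to save the reader a few seconds before it has covered its own token bill.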

The right variable is simpler. It’s not what the tokens cost. It’s who reads the output and what they do with it.

Which is exactly where we’re going next.

The decision tree that ends the format war

Every agent output has one of three audiences. The format choice follows directly from which one you’re dealing with. That’s the whole framework.

Reader 1: A human.

Your stakeholder opens a browser tab. They scan for the section they care about, screenshot a chart for Slack, share a link with the team, click through a collapsible architecture section without reading the parts that don’t apply to them. This is the use case Thariq built 20 examples around: code reviews with inline severity colors, implementation plans with jump navigation, design system comparisons with live swatches you can actually interact with.

HTML wins here because the output is a destination. The reader navigates it, operates on it, shares it forward. Markdown flattens all of that into a scroll and hopes for the best.

Reader 2: Another agent.

Your output feeds a downstream pipeline. An agent reads the analysis, extracts structured data, makes a decision, triggers the next step. No human ever sees it. This is where Markdown still wins cleanly: lightweight, parseable, diffable, and processable without any rendering overhead. Other models consume it without friction. Git tracks it. CI pipelines process it without choking.

Using HTML for agent-to-agent communication is like printing a spreadsheet, laminating it, and handing it to someone who’s going to retype all the numbers anyway.

Reader 3: Both.

This is the most common case in real engineering workflows, and it’s the one neither camp bothers addressing. A developer generates a PR review: they read it themselves, and they also want it tracked in the repo. A team lead generates a weekly status report: stakeholders view it in the browser, and the data feeds into next week’s planning prompt. Human and machine, same output, different needs.

The answer here isn’t picking a side. It’s: Markdown source, HTML artifact.

Keep Markdown as the editable, diffable, git-tracked source of truth. Generate an HTML companion for the humans who need to navigate and share it. This is actually what Thariq recommends in his own post. It got buried under the tribal response, but it’s in there: keep Markdown in repositories, generate HTML as the companion artifact for review.
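To make “Markdown source, HTML artifact” concrete, here is a toy sketch of the build step. It is not a real Markdown parser: it only understands headings and paragraphs separated by blank lines, and `render_artifact` is a name I made up. In practice you would swap in a proper Markdown library. The point is the shape of the workflow: the Markdown stays in git, and the HTML with jump navigation is generated from it.

```python
import html
import re


def render_artifact(markdown_source: str, title: str = "Report") -> str:
    """Render a tiny Markdown subset (headings, paragraphs) to HTML.

    Toy converter for illustration only. Blocks must be separated by
    blank lines. Each heading gets an id and an entry in a jump-link nav.
    """
    nav, body = [], []
    for block in re.split(r"\n\s*\n", markdown_source.strip()):
        text = " ".join(line.strip() for line in block.splitlines())
        match = re.match(r"(#{1,6})\s+(.*)", text)
        if match:
            level, heading = len(match.group(1)), match.group(2)
            anchor = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
            label = html.escape(heading)
            nav.append(f'<li><a href="#{anchor}">{label}</a></li>')
            body.append(f'<h{level} id="{anchor}">{label}</h{level}>')
        else:
            body.append(f"<p>{html.escape(text)}</p>")
    nav_html = "".join(nav)
    body_html = "\n".join(body)
    return (
        '<!doctype html>\n<html lang="en">\n<head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body>\n<nav><ul>{nav_html}</ul></nav>\n"
        f"{body_html}\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    source = "# Summary\n\nAll good & well.\n\n## Next steps\n\nShip it."
    print(render_artifact(source, title="Weekly status"))
```

The generated file has no scripts and no external resources, so the artifact can be opened straight from disk or attached to a review.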

Both camps were arguing against a recommendation that never said what they thought it said.

The decision tree is three questions.

Does only a human read this? Use HTML.
Does only an agent read this? Use Markdown.
Do both read it? Markdown source, HTML artifact.

Screenshot that and you never have to read another format war thread again.
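If you would rather keep the tree in code than in a screenshot, it fits in one small function. The enum and function names here are mine, purely for illustration.

```python
from enum import Enum


class OutputPlan(Enum):
    HTML = "html"
    MARKDOWN = "markdown"
    MARKDOWN_SOURCE_HTML_ARTIFACT = "markdown source + html artifact"


def choose_format(human_reads: bool, agent_reads: bool) -> OutputPlan:
    """Pick an output plan from who consumes the output."""
    if human_reads and agent_reads:
        return OutputPlan.MARKDOWN_SOURCE_HTML_ARTIFACT
    if human_reads:
        return OutputPlan.HTML
    if agent_reads:
        return OutputPlan.MARKDOWN
    raise ValueError("output has no reader; nothing to format")


if __name__ == "__main__":
    print(choose_format(human_reads=True, agent_reads=False).value)
    print(choose_format(human_reads=False, agent_reads=True).value)
    print(choose_format(human_reads=True, agent_reads=True).value)
```

The useful part is that the function forces you to answer the two questions before you write the prompt.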

The framework is clean. The real world, predictably, is not.

Where it actually breaks, and who profits from this shift

The decision tree is clean. Before you rewrite your CLAUDE.md to default HTML output, here are the risks Team Markdown got right and one they didn't mention at all.

Security is the real concern, not the token bill.

AI-generated HTML can include JavaScript. JavaScript means potential XSS vulnerabilities, local data leaks, and code execution you never asked for and definitely didn’t audit. This isn’t a theoretical edge case. If you’re generating HTML for internal tools, dashboards, or anything that touches real user data, you need either a strict no-JS constraint baked into your prompt or a review step before anything hits production.

Thariq’s own guidelines for generating HTML are pretty direct about this: no external CDN links, no unpkg imports, system fonts only, zero network calls at runtime. The vision is clean. The default behavior of most AI-generated HTML is not. You have to prompt for the guardrails explicitly, and most people don’t.
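Prompting for guardrails is one half; checking that the output obeyed them is the other. Below is a rough sketch of a pre-publish check using Python’s standard `html.parser`. It flags script tags, inline event handlers, `javascript:` URLs, and resources loaded from external hosts. The class name, function name, and the exact rule set are my assumptions, and this is a lint, not a sanitizer: it does not look inside CSS or catch every trick, so treat a clean result as “passed a basic check”, not “safe”.

```python
from html.parser import HTMLParser

EXTERNAL_PREFIXES = ("http:", "https:", "//")


class GuardrailChecker(HTMLParser):
    """Collects violations of a no-JS, no-network policy."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "script":
            self.violations.append("script tag")
        for name, value in attrs:
            value = (value or "").strip().lower()
            if name.startswith("on"):
                self.violations.append(f"inline handler {name} on <{tag}>")
            if value.startswith("javascript:"):
                self.violations.append(f"javascript: URL in <{tag} {name}>")
            # src on any tag, or href on <link>, triggers a fetch at load.
            loads_resource = name == "src" or (tag == "link" and name == "href")
            if loads_resource and value.startswith(EXTERNAL_PREFIXES):
                self.violations.append(f"external resource in <{tag} {name}>")


def check_html(document: str) -> list:
    """Return a list of guardrail violations found in the document."""
    checker = GuardrailChecker()
    checker.feed(document)
    checker.close()
    return checker.violations


if __name__ == "__main__":
    sample = '<script src="https://unpkg.com/x.js"></script><p onclick="go()">hi</p>'
    for problem in check_html(sample):
        print(problem)
```

Wire something like this into the step between generation and publishing, and fail loudly when the list is not empty.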

Accessibility doesn’t come for free.

AI-generated HTML misses WCAG compliance by default. No alt text on images, inconsistent focus order, contrast ratios that would fail a basic audit. If your outputs go anywhere near a public-facing interface or a team with accessibility requirements, you have to ask for it explicitly: WCAG 2.2 AA, descriptive alt text, logical focus order, minimum 4.5:1 contrast. It’s solvable. It’s just not automatic, and the HTML enthusiasm tends to skip this part entirely.
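The 4.5:1 number is checkable, not a vibe. Here is a small sketch of the WCAG 2.x contrast calculation (relative luminance of each color, then the ratio of the lighter to the darker). The function names are mine, and it assumes six-digit hex colors only.

```python
def _linear(channel_8bit: int) -> float:
    """Convert one sRGB channel (0-255) to linear light, per WCAG 2.x."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a six-digit hex color such as '#767676'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_normal_text(foreground: str, background: str) -> bool:
    """True if the pair meets the 4.5:1 minimum for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5


if __name__ == "__main__":
    for fg in ("#000000", "#767676", "#777777"):
        ratio = contrast_ratio(fg, "#ffffff")
        print(f"{fg} on white: {ratio:.2f}:1, AA ok: {ratio >= 4.5}")
```

Run it over the palette your agent picked and you will find out quickly whether the light gray body text it loves actually clears the bar.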

Reviewability needs a pattern, not a format change.

HTML diffs are noisy. A one-line content change can generate 50 lines of diff because surrounding markup shifts. For teams that live in pull requests, this is real friction. The fix is the template-plus-data pattern: keep the HTML structure static, store variable content in a JSON payload, diff only the JSON. Clean version control, rich visual output. Slightly more setup. Worth it if your team reviews agent output in git.
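A minimal sketch of template-plus-data, assuming a made-up payload shape (`title` plus a list of `findings` with `severity` and `text`). The template is a constant that rarely changes; the agent only ever emits the JSON, and that is the file you commit and diff.

```python
import html
import json
from string import Template

# Static structure: lives in the repo and changes rarely.
REPORT_TEMPLATE = Template("""<!doctype html>
<html lang="en">
<head><meta charset="utf-8"><title>$title</title></head>
<body>
<h1>$title</h1>
<ul>
$findings
</ul>
</body>
</html>
""")


def render_report(payload_json: str) -> str:
    """Fill the static template from a JSON payload (hypothetical schema)."""
    data = json.loads(payload_json)
    items = "\n".join(
        '<li class="{}">{}</li>'.format(
            html.escape(finding["severity"]), html.escape(finding["text"])
        )
        for finding in data["findings"]
    )
    return REPORT_TEMPLATE.substitute(
        title=html.escape(data["title"]), findings=items
    )


if __name__ == "__main__":
    payload = json.dumps({
        "title": "PR review",
        "findings": [{"severity": "high", "text": "Unescaped <input>"}],
    })
    print(render_report(payload))
```

Escaping every value on the way in also means the payload cannot smuggle markup into the page, which helps with the security concern as a side effect.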

Now the part most coverage skipped.

Anthropic profits directly from this shift. HTML output burns 3–5x more tokens than Markdown. More tokens means more API revenue. And beyond the immediate billing, HTML output creates ecosystem stickiness: once your team builds workflows around Claude-generated interactive reports and dashboards, switching to another model means rebuilding all of those workflows from scratch.

This isn’t a conspiracy theory. It’s just an incentive structure worth understanding before you adopt a recommendation wholesale. The engineer making the case works at the company that gets paid per token. That doesn’t make him wrong. Thariq’s examples are genuinely compelling and the use cases are real.

It just means you should read the footnotes.

The argument for HTML is strong for the right contexts. The tooling, the guardrails, and the security patterns are still catching up to the vision. Both things are true at the same time.

Markdown isn’t dying. It’s being promoted.

Here’s the reframe that actually resolves this: Markdown was always better as a machine-readable format than a human-readable one. The agent era just made that obvious.

Think about what Markdown actually is at its core. Structured plain text with lightweight syntax. Easy to parse, easy to diff, easy to version control, easy to feed into the next system in the pipeline. That’s not a display format. That was always a protocol. We just kept using it as a display format because nothing better had shown up yet for the AI output layer and because the training data made it the path of least resistance.

HTML isn’t the future of everything. It’s the future of output that a human actually needs to read, navigate, and act on. For everything else (agent pipelines, git-tracked docs, machine-consumed reports, anything that feeds forward into another model) Markdown stays. It just stops pretending to be something it isn’t.

The skill that actually matters now isn’t picking the right format. It’s knowing your reader before you write the prompt. Everything else follows from that one question. Who opens this output? What do they do with it? Do they scroll it, share it, diff it, or pipe it into something else? Answer that and the format choice becomes obvious every time.

The format war was always a distraction. Two camps arguing about tooling aesthetics while the actual decision was sitting there quietly the whole time, waiting for someone to ask the right question.

Thariq’s post wasn’t a declaration of war on Markdown. It was a reminder that the default was never chosen; it was inherited. And inherited defaults are worth questioning, especially when the use cases have moved on.

So question it. Not because an Anthropic engineer said to. Because your outputs are being read by humans who deserve better than a 3,000-word scroll with no navigation, and by agents who deserve clean structured text without presentation overhead baked in.

Give each reader what they actually need. That’s the whole job.

Drop a comment if your team has already made the switch or if you’ve hit the security issues nobody’s talking about yet. Would genuinely like to know what patterns people are landing on in production.