Mohammad Oveisi

Posted on Feb 21 • Edited on Feb 22

Serve AI-Ready Markdown from Your Symfony App

#php #symfony #markdown #ai

Introduction

AI agents don't read your website the way humans do.

They don't care about your Tailwind or any CSS classes!
They ignore your navigation menus!
They don't execute your JavaScript!

They just want clean, structured content. As tools like ChatGPT, Claude, Perplexity, and autonomous crawlers increasingly browse the web, serving them raw HTML becomes inefficient. HTML contains layout noise (headers, footers, scripts, styles, cookie banners), all of which wastes tokens and reduces semantic clarity.

What AI systems prefer is simple:

Clean structure!
Clear headings!
Minimal markup!
Content without presentation noise!

Instead of duplicating templates or maintaining a second content pipeline, I built a solution.

Introducing `symfony-markdown-response-bundle`

👉 GitHub: https://github.com/soleinjast/symfony-markdown-response-bundle

📦 Packagist: https://packagist.org/packages/soleinjast/symfony-markdown-response-bundle

A lightweight Symfony bundle that transparently converts your existing HTML responses into Markdown — only when requested.

No duplicate routes!
No separate Markdown controllers!
No extra maintenance!

Just AI-ready Symfony responses.

Why This Matters

For 20+ years, we optimized the web for browsers.

Now we also need to optimize it for machines that read and reason.

When an AI agent visits your site:

It consumes tokens.
It processes structure.
It extracts meaning.

Serving full HTML with layout wrappers increases token usage and reduces signal quality.

Markdown, on the other hand:

Preserves headings and hierarchy.
Keeps paragraphs intact.
Removes presentation noise.
Improves semantic clarity.
Is much lighter, because layout markup, scripts, and styles are gone.

What This Bundle Does

symfony-markdown-response-bundle intercepts Symfony responses and, when appropriate:

Detects whether the request prefers Markdown.
Cleans the HTML output.
Converts it to Markdown.
Returns text/markdown instead of text/html.

All transparently.

Your controllers remain unchanged.
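The four steps above can be sketched in plain PHP. This is only an illustration of the flow, not the bundle's actual code: the function name `negotiateBody`, the injected `$clean` and `$convert` callables, and the rule that only `text/html` responses are converted are all assumptions.

```php
<?php

/**
 * Hypothetical response transformer mirroring the four steps above.
 * Not the bundle's API: the name, signature, and HTML-only rule are assumptions.
 *
 * @param callable(string): string $clean   strips layout markup from the HTML
 * @param callable(string): string $convert turns the cleaned HTML into Markdown
 *
 * @return array{contentType: string, body: string}
 */
function negotiateBody(
    bool $prefersMarkdown,
    string $contentType,
    string $body,
    callable $clean,
    callable $convert
): array {
    // Leave ordinary browser requests and non-HTML responses untouched.
    if (!$prefersMarkdown || !str_starts_with(strtolower($contentType), 'text/html')) {
        return ['contentType' => $contentType, 'body' => $body];
    }

    // Clean first, then convert, then switch the content type.
    return [
        'contentType' => 'text/markdown; charset=UTF-8',
        'body' => $convert($clean($body)),
    ];
}
```

Because the cleaning and conversion steps are passed in, the controller that produced the HTML never needs to know that a Markdown representation exists.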

Conversion Drivers

The bundle supports two conversion backends.

`local` (default)

Uses league/html-to-markdown.

Conversion happens in-process with no external dependencies.
Before conversion, the following nodes are stripped automatically:

head
script
style
nav
footer
aside

`cloudflare`

You can offload conversion to a Cloudflare Workers AI endpoint.

symfony_markdown_response:
    driver: cloudflare
    cloudflare_endpoint: 'https://your-worker.example.workers.dev/to-markdown'

Requires symfony/http-client.

The HTML is posted to your Worker endpoint, and the response body is returned as Markdown.

Smart Caching

Converted Markdown is cached by default using a PSR-6 cache pool.

Cache key: xxh3 hash of the preprocessed HTML
Default TTL: 3600 seconds
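Keying the cache on a hash of the preprocessed HTML means identical pages share one entry and any content change produces a new key. The sketch below illustrates that idea with a plain array standing in for the PSR-6 pool. The class name `MarkdownCacheSketch` and the `markdown_` key prefix are assumptions, and `hash('xxh3', ...)` needs PHP 8.1 or later.

```php
<?php

/**
 * Hypothetical sketch of content-addressed caching with a TTL.
 * A plain array stands in for the PSR-6 pool; the key prefix is an assumption.
 */
final class MarkdownCacheSketch
{
    /** @var array<string, array{expires: int, value: string}> */
    private array $items = [];

    public function __construct(private int $ttl = 3600)
    {
    }

    /**
     * @param callable(string): string $convert turns HTML into Markdown
     * @param int|null                 $now     injectable clock, defaults to time()
     */
    public function get(string $preprocessedHtml, callable $convert, ?int $now = null): string
    {
        $now ??= time();

        // Same HTML, same key; changed HTML yields a different key.
        $key = 'markdown_' . hash('xxh3', $preprocessedHtml);

        if (isset($this->items[$key]) && $this->items[$key]['expires'] > $now) {
            return $this->items[$key]['value'];
        }

        // Miss or expired entry: convert once and remember the result.
        $markdown = $convert($preprocessedHtml);
        $this->items[$key] = ['expires' => $now + $this->ttl, 'value' => $markdown];

        return $markdown;
    }
}
```

One consequence of hashing the HTML itself is that the page still has to be rendered on every request; only the conversion step is skipped on a cache hit.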

Cache resolution order:

cache_service (if configured)
cache.app
cache.system
No caching
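The resolution order reads as a simple first-match fallback. The sketch below shows one way to express it; `resolveCachePool` is a hypothetical helper, `$available` stands in for the service ids the container knows about, and falling through when a configured service is missing is an assumption rather than documented behavior.

```php
<?php

/**
 * Hypothetical resolver: returns the first candidate service id that exists.
 *
 * @param string|null  $configured value of cache_service, if any
 * @param list<string> $available  service ids known to the container (stand-in)
 */
function resolveCachePool(?string $configured, array $available): ?string
{
    // array_filter drops the null entry when cache_service is not configured.
    $candidates = array_filter([$configured, 'cache.app', 'cache.system']);

    foreach ($candidates as $id) {
        if (in_array($id, $available, true)) {
            return $id;
        }
    }

    // Nothing matched: conversion runs without caching.
    return null;
}
```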

To disable caching:

symfony_markdown_response:
    cache_enabled: false

How It Works

1️⃣ Opt-In via Attribute

use Soleinjast\MarkdownResponseBundle\Attribute\ProvideMarkdownResponse;

#[Route('/about')]
#[ProvideMarkdownResponse(enabled: true)]
public function about(): Response
{
    return $this->render('about.html.twig');
}

2️⃣ Smart Request Detection

The bundle determines whether to serve Markdown based on:

Accept: text/markdown
.md URL suffix (e.g. /about.md)
Known AI user agents (GPTBot, ClaudeBot, ChatGPT-User, etc.)

Normal browser traffic continues receiving HTML.
AI agents get Markdown automatically.
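The three signals can be combined in a small predicate. This is a sketch under assumptions, not the bundle's detector: `prefersMarkdown` is a hypothetical name, the matching is a simple case-insensitive substring check, and the user-agent list contains only the examples named above.

```php
<?php

/**
 * Hypothetical helper: decides whether a request should receive Markdown.
 * The agent list is illustrative and limited to the examples named in the text.
 */
function prefersMarkdown(string $acceptHeader, string $path, string $userAgent): bool
{
    // 1. Explicit content negotiation via the Accept header.
    if (stripos($acceptHeader, 'text/markdown') !== false) {
        return true;
    }

    // 2. A ".md" suffix on the URL path, e.g. /about.md.
    if (str_ends_with(strtolower($path), '.md')) {
        return true;
    }

    // 3. A known AI crawler in the User-Agent string.
    foreach (['GPTBot', 'ClaudeBot', 'ChatGPT-User'] as $agent) {
        if (stripos($userAgent, $agent) !== false) {
            return true;
        }
    }

    return false;
}
```

For example, `prefersMarkdown('text/html', '/about.md', 'Mozilla/5.0')` returns `true` because of the suffix, while the same call with `/about` returns `false`.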

3️⃣ Clean Conversion Pipeline

Before conversion, the bundle:

Removes layout noise
Keeps semantic content intact

Then it converts the cleaned HTML into structured Markdown.

Optional caching ensures performance remains optimal.

Installation

composer require soleinjast/symfony-markdown-response-bundle

Then enable the bundle in your Symfony project and add the `#[ProvideMarkdownResponse]` attribute to the routes that should serve Markdown.

Final Thoughts

The web is no longer consumed only by humans.

If your application already produces meaningful HTML, why not let machines consume it in the format they understand best?

symfony-markdown-response-bundle bridges that gap: cleanly, transparently, and with zero duplication.

Top comments (6)

Harsh • Feb 23

This is such an underrated topic! Finally someone talking about AI-first content architecture.

I've noticed the same thing while testing how ChatGPT reads my blog posts — it completely ignores the fancy UI and just grabs the raw text. Makes me wonder if we should all start serving parallel markdown versions of our content.

Question: Do you think this will eventually become a standard like sitemaps or robots.txt? Something like ai-content.md or content.llm.txt? 🤔

Mohammad Oveisi • Feb 23

Exactly.

AI systems don’t care about layout — they care about semantic clarity and token efficiency.

HTML was optimized for humans and browsers.
Markdown is closer to what machines need.

I genuinely think we’ll see some convention emerge — maybe not ai-content.md, but something equivalent to what robots.txt did for search engines.

The web evolved once for search.
It’s probably about to evolve again for AI.

david duymelinck • Feb 22 • Edited

I have seen a similar library in the Laravel ecosystem. And I think your library has the same problem, namely assuming the HTML content is the right content for an LLM.

HTML pages are meant for humans and search bots. We could get away with adding extra information for search bots because their main goal is to make the pages discoverable in a search.
AI scrapers use the content to gain knowledge for their output, which is a totally different use case.

Converting HTML output to markdown is only a partial solution. The content needs to be as meaningful for the LLM as possible because you don't want it to fill the context window with useless information.

I saw you added a preprocessor. For me, manipulating content should not be the task of a Markdown output library; it should be the task of the controller.
You can still have a single controller for multiple output types. That is how API endpoint output has worked for years.

Mohammad Oveisi • Feb 22

Thanks for the thoughtful feedback — I agree that HTML designed for humans is not automatically optimal for LLM consumption.
That said, this bundle does not aim to optimise or redesign domain content for LLMs. Its responsibility is intentionally narrow: handling content negotiation and transforming an already-rendered HTML response into a Markdown representation.
It does not rewrite, summarise, or semantically restructure content. Controllers remain fully responsible for deciding what data is exposed and how it is structured.
If an application requires LLM-specific shaping or restructuring, that logic should live at the controller or domain level. The bundle simply provides extension points for cases where additional processing is explicitly desired.
I appreciate the architectural perspective — it’s a valuable distinction.

david duymelinck • Feb 22 • Edited

The main reason I reacted is that this solution is becoming mainstream: Cloudflare has functionality to do it, and so does Laravel Cloud. Because it is getting more of the spotlight, I think developers are going to assume it is the best way forward.

I think the better option could be as simple as `return $this->render($markdownOutput ? 'home.md.twig' : 'home.html.twig', $data);`. This gives you the most options.