Introduction
AI agents don't read your website the way humans do.
They don't care about your Tailwind or any CSS classes!
They ignore your navigation menus!
They don't execute your JavaScript!
They just want clean, structured content. As tools like ChatGPT, Claude, Perplexity, and autonomous crawlers increasingly browse the web, serving raw HTML to them becomes inefficient. HTML contains layout noise --- headers, footers, scripts, styles, cookie banners --- all of which waste tokens and reduce semantic clarity.
What AI systems prefer is simple:
Clean structure!
Clear headings!
Minimal markup!
Content without presentation noise!
Instead of duplicating templates or maintaining a second content pipeline, I built a solution.
Introducing symfony-markdown-response-bundle
GitHub: https://github.com/soleinjast/symfony-markdown-response-bundle
Packagist: https://packagist.org/packages/soleinjast/symfony-markdown-response-bundle
A lightweight Symfony bundle that transparently converts your existing HTML responses into Markdown, but only when requested.
No duplicate routes!
No separate Markdown controllers!
No extra maintenance!
Just AI-ready Symfony responses.
Why This Matters
For 20+ years, we optimized the web for browsers.
Now we also need to optimize it for machines that read and reason.
When an AI agent visits your site:
It consumes tokens.
It processes structure.
It extracts meaning.
Serving full HTML with layout wrappers increases token usage and reduces signal quality.
Markdown, on the other hand:
Preserves headings and hierarchy.
Keeps paragraphs intact.
Removes presentation noise.
Improves semantic clarity.
Is dramatically lighter.
What This Bundle Does
symfony-markdown-response-bundle intercepts Symfony responses and, when appropriate:
Detects whether the request prefers Markdown.
Cleans the HTML output.
Converts it to Markdown.
Returns text/markdown instead of text/html.
All transparently.
Your controllers remain unchanged.
Conversion Drivers
The bundle supports two conversion backends.
local (default)
Uses league/html-to-markdown.
Conversion happens in-process with no external dependencies.
Before conversion, the following nodes are stripped automatically:
- head
- script
- style
- nav
- footer
- aside
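To make the stripping step concrete, here is a minimal sketch using PHP's built-in DOM extension. The tag list is the one above; the function name `stripLayoutNodes` and the DOM-based approach are my own assumptions for illustration and may not match how the bundle does it internally.

```php
<?php

// Sketch only: stripLayoutNodes() is a hypothetical helper, not the bundle's API.
function stripLayoutNodes(string $html): string
{
    // Tags listed in the article as removed before conversion.
    $tagsToStrip = ['head', 'script', 'style', 'nav', 'footer', 'aside'];

    $dom = new DOMDocument();
    // Collect parser warnings (e.g. for HTML5 tags) instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make loadHTML() treat input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($tagsToStrip as $tag) {
        // Copy the live node list to an array before removing nodes from the tree.
        foreach (iterator_to_array($dom->getElementsByTagName($tag)) as $node) {
            if ($node->parentNode !== null) {
                $node->parentNode->removeChild($node);
            }
        }
    }

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize only what is left inside <body>.
    $result = '';
    foreach ($body->childNodes as $child) {
        $result .= $dom->saveHTML($child);
    }

    return trim($result);
}
```

Calling `stripLayoutNodes()` on a full page should leave only the main content markup, which is what then gets handed to the Markdown converter.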
cloudflare
You can offload conversion to a Cloudflare Workers AI endpoint.
symfony_markdown_response:
    driver: cloudflare
    cloudflare_endpoint: 'https://your-worker.example.workers.dev/to-markdown'
Requires symfony/http-client.
The HTML is posted to your Worker endpoint, and the response body is returned as Markdown.
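If you want a feel for what that round trip looks like, here is a rough sketch. It uses plain PHP streams rather than symfony/http-client so that it runs on its own. The helper names, the `Content-Type` and `Accept` headers, and the timeout are all assumptions on my part; check what your Worker actually expects.

```php
<?php

// Hypothetical helper: builds stream-context options for POSTing HTML.
// The headers and the timeout are assumptions about the Worker's contract.
function buildWorkerRequestOptions(string $html, float $timeoutSeconds = 5.0): array
{
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: text/html; charset=utf-8\r\nAccept: text/markdown",
            'content' => $html,
            'timeout' => $timeoutSeconds,
        ],
    ];
}

// Hypothetical helper: returns the response body as Markdown, or null when
// the request fails (network error or an HTTP error status).
function convertViaWorker(string $endpoint, string $html): ?string
{
    $context = stream_context_create(buildWorkerRequestOptions($html));
    $body = @file_get_contents($endpoint, false, $context);

    return $body === false ? null : $body;
}
```

Returning `null` on failure leaves it to the caller to decide whether to fall back to the original HTML response.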
Smart Caching
Converted Markdown is cached by default using a PSR-6 cache pool.
- Cache key: xxh3 hash of the preprocessed HTML
- Default TTL: 3600 seconds
Cache resolution order:
- cache_service (if configured)
- cache.app
- cache.system
- No caching
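Here is a small sketch of the caching idea: key by an xxh3 hash of the preprocessed HTML, keep the result for 3600 seconds. To keep it self-contained I use a tiny in-memory class instead of a real PSR-6 pool; `ArrayMarkdownCache`, `markdownCacheKey()`, `cachedConvert()` and the `markdown_` key prefix are hypothetical names of mine, not the bundle's. It needs PHP 8.1+ for `hash('xxh3', ...)`.

```php
<?php

// In-memory stand-in for a PSR-6 pool, for illustration only.
final class ArrayMarkdownCache
{
    /** @var array<string, array{value: string, expiresAt: int}> */
    private array $items = [];

    public function get(string $key, int $now): ?string
    {
        $item = $this->items[$key] ?? null;
        // Treat missing and expired entries the same way: a cache miss.
        if ($item === null || $item['expiresAt'] <= $now) {
            return null;
        }

        return $item['value'];
    }

    public function set(string $key, string $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expiresAt' => $now + $ttl];
    }
}

// The key depends only on the preprocessed HTML, so identical pages share an entry.
function markdownCacheKey(string $preprocessedHtml): string
{
    return 'markdown_' . hash('xxh3', $preprocessedHtml);
}

// Convert on a miss, reuse the stored Markdown on a hit.
function cachedConvert(
    ArrayMarkdownCache $cache,
    string $preprocessedHtml,
    callable $convert,
    int $now,
    int $ttl = 3600
): string {
    $key = markdownCacheKey($preprocessedHtml);
    $hit = $cache->get($key, $now);
    if ($hit !== null) {
        return $hit;
    }

    $markdown = $convert($preprocessedHtml);
    $cache->set($key, $markdown, $ttl, $now);

    return $markdown;
}
```

Because the key is derived from the content, any change to the rendered HTML produces a new key, and stale entries simply expire with the TTL.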
To disable caching:
symfony_markdown_response:
    cache_enabled: false
How It Works
1. Opt-In via Attribute
use Soleinjast\MarkdownResponseBundle\Attribute\ProvideMarkdownResponse;

// Inside an existing controller class: the attribute opts this route in.
#[Route('/about')]
#[ProvideMarkdownResponse(enabled: true)]
public function about(): Response
{
    return $this->render('about.html.twig');
}
2. Smart Request Detection
The bundle determines whether to serve Markdown based on:
- Accept: text/markdown
- .md URL suffix (e.g. /about.md)
- Known AI user agents (GPTBot, ClaudeBot, ChatGPT-User, etc.)
Normal browser traffic continues receiving HTML.
AI agents get Markdown automatically.
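The three signals above can be sketched as one small function. This is an illustration under assumptions: `prefersMarkdown()` is a name I made up, the agent list is just the three examples mentioned here, and the bundle's real matching rules (including the order of the checks) may differ. It needs PHP 8+ for `str_ends_with()`.

```php
<?php

// Hypothetical helper, not the bundle's API.
function prefersMarkdown(string $path, string $acceptHeader, string $userAgent): bool
{
    // Illustrative subset of AI user agents; the bundle's list may be longer.
    $aiAgents = ['GPTBot', 'ClaudeBot', 'ChatGPT-User'];

    // 1. Explicit content negotiation via the Accept header.
    if (stripos($acceptHeader, 'text/markdown') !== false) {
        return true;
    }

    // 2. A .md suffix on the URL path, e.g. /about.md.
    if (str_ends_with($path, '.md')) {
        return true;
    }

    // 3. Case-insensitive substring match against known AI user agents.
    foreach ($aiAgents as $agent) {
        if (stripos($userAgent, $agent) !== false) {
            return true;
        }
    }

    return false;
}
```

Note that the simple substring check on the Accept header ignores quality values such as `q=0`; a stricter version would parse the header properly.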
3. Clean Conversion Pipeline
Before conversion, the bundle:
- Removes layout noise
- Keeps semantic content intact
Then it converts the cleaned HTML into structured Markdown.
Because converted output is cached by default, repeated requests for the same page skip the conversion step.
Installation
composer require soleinjast/symfony-markdown-response-bundle
Then enable it in your Symfony project and annotate your routes.
Final Thoughts
The web is no longer consumed only by humans.
If your application already produces meaningful HTML, why not let machines consume it in the format they understand best?
symfony-markdown-response-bundle bridges that gap --- cleanly,
transparently, and with zero duplication.
Top comments (6)
This is such an underrated topic! Finally someone talking about AI-first content architecture.
I've noticed the same thing while testing how ChatGPT reads my blog posts: it completely ignores the fancy UI and just grabs the raw text. Makes me wonder if we should all start serving parallel markdown versions of our content.
Question: Do you think this will eventually become a standard like sitemaps or robots.txt? Something like ai-content.md or content.llm.txt?
Exactly.
AI systems don't care about layout; they care about semantic clarity and token efficiency.
HTML was optimized for humans and browsers.
Markdown comes closer to being optimized for machines.
I genuinely think we'll see some convention emerge. Maybe not ai-content.md, but something equivalent to what robots.txt did for search engines.
The web evolved once for search.
It's probably about to evolve again for AI.
I have seen a similar library in the Laravel ecosystem. And I think your library has the same problem, namely assuming the HTML content is the right content for an LLM.
HTML pages are meant for humans and search bots. We could get away with adding extra information for search bots because their main goal is to make the pages discoverable in a search.
AI scrapers use the content to get knowledge for their output, and that is a totally different use case.
Converting HTML output to markdown is only a partial solution. The content needs to be as meaningful for the LLM as possible because you don't want it to fill the context window with useless information.
I saw you added a preprocessor. For me, manipulating content should not be the task of a markdown output library. It should be the task of the controller.
You can still have a single controller for multiple output types. That is how API endpoint output worked for years.
Thanks for the thoughtful feedback. I agree that HTML designed for humans is not automatically optimal for LLM consumption.
That said, this bundle does not aim to optimise or redesign domain content for LLMs. Its responsibility is intentionally narrow: handling content negotiation and transforming an already-rendered HTML response into a Markdown representation.
It does not rewrite, summarise, or semantically restructure content. Controllers remain fully responsible for deciding what data is exposed and how it is structured.
If an application requires LLM-specific shaping or restructuring, that logic should live at the controller or domain level. The bundle simply provides extension points for cases where additional processing is explicitly desired.
I appreciate the architectural perspective; it's a valuable distinction.
The main reason I reacted is that this solution is becoming mainstream: Cloudflare has functionality to do it, and so does Laravel Cloud. Because it is getting more of the spotlight, I think developers are going to assume it is the best way forward.
I think the better option could be as simple as
return $this->render($markdownOutput ? 'home.md.twig' : 'home.html.twig', $data);
This gives you the most options.
Very useful