DEV Community

Discussion on: Serve AI-Ready Markdown from Your Symfony App

david duymelinck • Edited

I have seen a similar library in the Laravel ecosystem. And I think your library has the same problem, namely assuming the HTML content is the right content for an LLM.

HTML pages are meant for humans and search bots. We could get away with adding extra information for search bots because their main goal is to make the pages discoverable in a search.
AI scrapers use the content to gain knowledge for their output, which is a totally different use case.

Converting HTML output to markdown is only a partial solution. The content needs to be as meaningful for the LLM as possible because you don't want it to fill the context window with useless information.

I saw you added a preprocessor. For me, manipulating content should not be the task of a markdown output library; it should be the task of the controller.
You can still have a single controller for multiple output types. That is how API endpoint output worked for years.

Mohammad Oveisi

Thanks for the thoughtful feedback — I agree that HTML designed for humans is not automatically optimal for LLM consumption.
That said, this bundle does not aim to optimise or redesign domain content for LLMs. Its responsibility is intentionally narrow: handling content negotiation and transforming an already-rendered HTML response into a Markdown representation.
It does not rewrite, summarise, or semantically restructure content. Controllers remain fully responsible for deciding what data is exposed and how it is structured.
If an application requires LLM-specific shaping or restructuring, that logic should live at the controller or domain level. The bundle simply provides extension points for cases where additional processing is explicitly desired.
I appreciate the architectural perspective — it’s a valuable distinction.

david duymelinck • Edited

The main reason I reacted is that this solution is becoming mainstream: Cloudflare has functionality to do it, and so does Laravel Cloud. Because it is getting more of the spotlight, I think developers are going to assume it is the best way forward.

I think the better option could be as simple as return $this->render($markdownOutput ? 'home.md.twig' : 'home.html.twig', $data);. This gives you the most options.
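
To make the one-liner above a bit more concrete, here is a minimal sketch of how $markdownOutput might be derived from the request's Accept header. The function names wantsMarkdown and templateFor are hypothetical, and the header parsing is deliberately simplified (it ignores q-values and wildcards), so treat it as an illustration rather than a complete content negotiation implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns true when the Accept header lists text/markdown.
 * Simplified on purpose: q-values and wildcards such as text/* are ignored.
 */
function wantsMarkdown(string $acceptHeader): bool
{
    foreach (explode(',', $acceptHeader) as $range) {
        // Drop parameters such as ";q=0.9", then normalise case and whitespace.
        $mediaType = strtolower(trim(explode(';', $range)[0]));
        if ($mediaType === 'text/markdown') {
            return true;
        }
    }

    return false;
}

/**
 * Hypothetical helper: maps a page name to the template for the requested format.
 */
function templateFor(string $page, string $acceptHeader): string
{
    return wantsMarkdown($acceptHeader)
        ? $page . '.md.twig'
        : $page . '.html.twig';
}
```

In a Symfony controller this could then be used as return $this->render(templateFor('home', $request->headers->get('Accept', '')), $data);. The Markdown template is written by hand, so it can leave out navigation, footers, and anything else that is only useful to a human reader. The response's Content-Type would presumably still need to be set for the Markdown case.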