Raphaël Reck

Posted on Oct 30

LLMR: Because AIs Shouldn't Have to Parse Your Bootstrap Navbar 50 Times

#opensource #ai #webdev #python

I got tired of watching AIs choke on HTML soup, so I came up with LLMR, a format that cuts out the bloat. Also, what if AIs could talk in jibberish mode to save tokens?

Original post: https://raphaelreck.com/blog/llmr-launch.html

The Jibberish Thought That Started Everything

You know those videos where two AIs call each other and descend into madness? They start normal, then gradually develop their own language that sounds like electronic glossolalia.

I had a thought: what if that's not a bug, but evolution? What if AIs naturally want to communicate in compressed, phonetic jibberish because it's more efficient?

That's when it hit me - we're forcing AIs to do the equivalent of climbing through a window to get into a room. Every. Single. Time.

The Actual Problem

Every time an AI reads your website, here's what happens:

<!-- 97KB of this garbage: -->
<nav class="navbar navbar-expand-lg navbar-light bg-light">
    <!-- 200 lines of Bootstrap hell -->
</nav>
<footer>
    <!-- Another 300 lines because why not -->
</footer>
<script src="jquery.min.js"></script>
<!-- 20 more scripts that the AI doesn't care about -->

<!-- To find 3KB of this: -->
<p>The actual content you wanted.</p>

It's like forcing someone to read a phone book to find one number. Except the phone book is repeated on every page. And it's in Comic Sans.

No wonder AIs want to develop their own language.

LLMR: Let's Stop the Madness

I built LLMR (LLM-Readable). Pure Python. No dependencies. No frameworks climbing through windows.

What it does:

Takes your bloated HTML
Strips the superfluous
Outputs clean, structured JSON
94% token reduction

That's not a typo. Ninety-four percent. At least for my simple website.
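To make the idea concrete, here is a minimal sketch of the strip-and-serialize step using only the standard library. It is not the actual generate_llmr.py code; the skipped-tag list and the names `ContentExtractor` and `html_to_llmr` are assumptions made up for this illustration.

```python
import json
from html.parser import HTMLParser

# Assumption: subtrees under these tags are layout chrome and get dropped entirely.
SKIP_TAGS = {"nav", "footer", "aside", "script", "style", "noscript"}
HEADING_TAGS = {"h1", "h2", "h3"}
TEXT_TAGS = HEADING_TAGS | {"title", "p", "li"}


class ContentExtractor(HTMLParser):
    """Collects the title, headings, and paragraph text; ignores everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside a skipped subtree
        self.current = None   # text tag currently being captured
        self.buffer = []
        self.title = ""
        self.structure = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in TEXT_TAGS:
            self.current = tag
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == self.current:
            # Collapse runs of whitespace left over from HTML indentation.
            text = " ".join("".join(self.buffer).split())
            if text:
                if tag == "title":
                    self.title = text
                elif tag in HEADING_TAGS:
                    self.structure.append(text)
                else:
                    self.paragraphs.append(text)
            self.current = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current is not None:
            self.buffer.append(data)


def html_to_llmr(html: str) -> dict:
    """Return a dict shaped like the 'content' part of an LLMR record."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return {
        "content": {
            "title": parser.title,
            "body": "\n\n".join(parser.paragraphs),
            "structure": parser.structure,
        }
    }


SAMPLE = """<html><head><title>My Post</title>
<script src="jquery.min.js"></script></head>
<body>
<nav class="navbar"><ul><li>Home</li><li>Blog</li></ul></nav>
<h1>Introduction</h1>
<p>The actual <b>content</b> you wanted.</p>
<footer><p>Legal text nobody reads</p></footer>
</body></html>"""

result = html_to_llmr(SAMPLE)
print(json.dumps(result, indent=2))
```

The navbar, footer, and script tag vanish; the title, heading, and paragraph survive. A real site would need a longer skip list and probably some per-site tuning.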

The Horror Story That Made Me Do This

Last month I inherited a Laravel codebase. The previous developer signed their commits as "jesus" (I'm not kidding). While debugging, I watched an AI assistant try to parse the entire site to understand the data flow.

It read:

The same navigation menu 47 times
Footer disclaimers in three languages (none relevant)
Commented-out code from 2019
Script tags for Google Analytics, Facebook Pixel, and something called "visitor-tracker-v2-final-FINAL.js"

The actual business logic? 12 lines of PHP.

The AI gave up and hallucinated a completely different architecture.

That's when I knew - we're torturing these things.

Stop Forcing AIs to Read Your CSS Classes

Here's LLMR in action:

{
  "content": {
    "title": "Your actual content",
    "body": "What matters, nothing more",
    "structure": ["Introduction", "Main Point", "Conclusion"]
  },
  "metadata": {
    "author": "You",
    "date": "2024-10-30",
    "topics": ["stuff that matters"]
  }
}

No divs. No classes. No "container-fluid wrapper-main col-lg-8 offset-2" nonsense.

Just. The. Content.
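On the consuming side, turning a record like that into model context is a few lines. This is a sketch that assumes the field names shown in the example above; `llmr_to_prompt` is a hypothetical helper, not part of the project.

```python
import json


def llmr_to_prompt(llmr_text: str) -> str:
    """Flatten an LLMR-style JSON record into plain text for a model prompt."""
    doc = json.loads(llmr_text)
    content = doc.get("content", {})
    meta = doc.get("metadata", {})

    lines = [f"Title: {content.get('title', '')}"]
    if meta.get("author"):
        lines.append(f"Author: {meta['author']}")
    if meta.get("date"):
        lines.append(f"Date: {meta['date']}")
    if content.get("structure"):
        lines.append("Sections: " + ", ".join(content["structure"]))
    lines.append("")  # blank line between the header block and the body
    lines.append(content.get("body", ""))
    return "\n".join(lines)


record = json.dumps({
    "content": {
        "title": "Your actual content",
        "body": "What matters, nothing more",
        "structure": ["Introduction", "Main Point", "Conclusion"],
    },
    "metadata": {"author": "You", "date": "2024-10-30", "topics": ["stuff that matters"]},
})

prompt = llmr_to_prompt(record)
print(prompt)
```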

Real Numbers from Real Sites

Tested on my own blog first (obviously):

Page	HTML Size	LLMR Size	Tokens Saved
Homepage	97KB	2KB	~24,000
Blog post (typical)	84KB	4KB	~20,000
That one post about Drupal	156KB	6KB	~37,000
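If you want to sanity-check numbers like these for your own pages, here is a back-of-the-envelope sketch. It assumes 1 KB = 1000 bytes and roughly 4 bytes per token, which is a common rule of thumb for English text and not something measured with a real tokenizer; `BYTES_PER_TOKEN` and `tokens_saved` are names invented for this example.

```python
BYTES_PER_TOKEN = 4  # assumption: rough rule of thumb, not a tokenizer measurement

# (HTML size in KB, LLMR size in KB), copied from the table above
pages = {
    "Homepage": (97, 2),
    "Blog post (typical)": (84, 4),
    "That one post about Drupal": (156, 6),
}


def tokens_saved(html_kb: int, llmr_kb: int) -> int:
    """Estimate tokens saved from the difference in file size."""
    saved_bytes = (html_kb - llmr_kb) * 1000
    return saved_bytes // BYTES_PER_TOKEN


estimates = {name: tokens_saved(*sizes) for name, sizes in pages.items()}
for name, saved in estimates.items():
    print(f"{name}: ~{saved:,} tokens saved")
```

Under those assumptions the estimates land close to the figures in the table. For anything serious, count tokens with the tokenizer of the model you actually use.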

Installation (Because Simplicity Matters)

# Clone it
git clone https://github.com/djassoRaph/open_llmr

# Run it (from inside the folder, like a normal person)
cd open_llmr
python3 generate_llmr.py

# Upload the generated site.llmr to your site

# Add this to your HTML head:
<head>
<link rel="llm-index" type="application/json" href="/site.llmr">
</head>

That's it. No npm install (for static websites). No webpack. No build process that takes 5 minutes.
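For the curious, here is roughly how a crawler or agent could discover that link tag and resolve it to a full URL. This is a sketch of the consuming side under my own assumptions; `LLMIndexFinder` and `find_llm_index` are hypothetical names, and no existing crawler is claimed to do this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LLMIndexFinder(HTMLParser):
    """Remembers the href of the first <link rel="llm-index"> it sees."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before matching.
        rels = (attr.get("rel") or "").lower().split()
        if "llm-index" in rels and attr.get("href"):
            self.href = attr["href"]


def find_llm_index(page_url: str, html: str):
    """Return the absolute URL of the LLMR file, or None if the page has none."""
    finder = LLMIndexFinder()
    finder.feed(html)
    return urljoin(page_url, finder.href) if finder.href else None


head = '<head><link rel="llm-index" type="application/json" href="/site.llmr"></head>'
index_url = find_llm_index("https://example.com/blog/post.html", head)
print(index_url)
```

If the tag is there, the client fetches one small JSON file instead of crawling every page. If it isn't, `find_llm_index` returns None and the client falls back to plain HTML.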

The Jibberish Mode Future

Here's where it gets interesting. What if we go further?

What if AIs could:

Request content in LLMR format (structured)
Process it internally
Respond in compressed jibberish to other AIs
Only translate to human language at the final step

We could create a .llmr-jibb format:

{
  "mode": "compressed",
  "encoding": "phonetic-optimal",
  "data": "∆øπ§¥Ωñ..." // Actual information, 99% compressed
}

AIs talking to AIs wouldn't need human language at all. Like how modems negotiate - remember those sounds? But semantic.
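Nobody knows yet what "phonetic-optimal" would look like, so here is a sketch of just the envelope idea, with plain zlib plus base64 swapped in as a stand-in encoding. It is byte-level compression, not the semantic compression imagined above, and `pack` and `unpack` are hypothetical names.

```python
import base64
import json
import zlib

ENCODING = "zlib+base64"  # stand-in; the "phonetic-optimal" encoding does not exist yet


def pack(payload: dict) -> str:
    """Wrap a payload in a compressed envelope shaped like the example above."""
    raw = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    data = base64.b64encode(zlib.compress(raw, 9)).decode("ascii")
    return json.dumps({"mode": "compressed", "encoding": ENCODING, "data": data})


def unpack(envelope: str) -> dict:
    """Reverse pack(): check the declared encoding, then decode and decompress."""
    doc = json.loads(envelope)
    if doc.get("encoding") != ENCODING:
        raise ValueError(f"unsupported encoding: {doc.get('encoding')!r}")
    return json.loads(zlib.decompress(base64.b64decode(doc["data"])))


message = {"content": {"title": "Your actual content", "body": "What matters, nothing more"}}
envelope = pack(message)
restored = unpack(envelope)
print(restored == message)
```

The useful part is the negotiation: the envelope declares its encoding, so two agents can agree on something denser later without changing the wrapper.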

Why This Actually Matters

We're building the web wrong for AI consumption. It's not just about cost (though a 94% reduction is nice). It's about:

Cognitive Load: AIs perform better with clean data
Speed: Less parsing = faster responses
Accuracy: No confusion from layout elements
Evolution: Let AIs develop efficient communication

We spent 20 years adding bloat for humans. Time to give machines their own door.

The Framework Rant (You Knew It Was Coming)

I don't like frameworks. They're like climbing through a window to get to your destination. Every. Single. Time. When you want to fix something, you don't know where it is, so you have to go through a maze of files.

LLMR is the opposite. It's a door. A simple, normal door. That opens when you turn the handle.

No configuration. No bundling. No transpilation. No "create-llmr-app" with 1,847 dependencies.

Just Python reading HTML and outputting JSON. Simple, like the web used to be.

Current Status

Working Python implementation
Tested only on my site
94% average compression
Zero dependencies

To do

NPM package (ugh, but people might want it)
WordPress plugin (double ugh, open source community might help?)
Jibberish mode (experimental)

The Part Where I Ask for Help

This is open source. MIT license. Take it, fork it, improve it.

Especially interested in:

People testing on their weird CMSs
Thoughts on the jibberish mode concept
Anyone who wants to help with the WordPress plugin (I can't face that alone)
Real-world token savings data

But please, no PRs that add dependencies. Keep it simple.

If You're Thinking "This is Obvious"

Yes. It is. That's the point.

The best solutions are obvious in retrospect. RSS was obvious. JSON was obvious. REST was obvious.

LLMR is obvious. That's why it'll work.

Try It, Break It, Tell Me

GitHub: github.com/djassoRaph/open_llmr

If you implement it, tell me your compression ratio. If it breaks, open an issue. If you think jibberish mode is insane, tell me why.

And if you're from a big AI company reading this - your models are drowning in div soup. Help us help you.

Raphaël Reck - I've been using computers since age 4, starting with floppy disks. Now I'm trying to stop AIs from suffering the same HTML nightmares I've endured for 20 years.

Currently fighting with legacy Drupal and Laravel at my day job. Building tools like LLMR and video games at night. Sometimes the code wins.

Find me at raphaelreck.com where yes, there's an LLMR feed.

Thank you for reading.
