I had a thought the other day that won't leave me alone.
We've spent decades building the web for humans. HTML gives us structure — headings, tables, buttons, forms. A human looks at a page and immediately understands what's going on.
Machines don't.
An LLM sees a `<table>` and has no idea if it's pricing tiers, analytics data, a leaderboard, or a comparison tool. It just sees a table. It has to guess why it's there.
And that guess becomes the foundation for AI browsing, accessibility systems, personalized interfaces, and agents operating software — basically everything we're trying to build next.
The web gives machines structure, but not intent. And I think that's the missing layer.
The Semantic Web Tried This. It Failed.
This isn't a new observation. Schema.org, RDFa, microformats, the entire Semantic Web movement — they all tried to make web content machine-readable.
They all struggled with adoption. And they all struggled for the same reason:
The developer does the work. Someone else gets the value.
A developer adds `itemprop="author"` or `typeof="Product"` and... Google's search results get richer. A knowledge graph gets smarter. But the developer? They got nothing they could feel.
There's a second reason they struggled: the machines of that era couldn't read natural language. So every standard required structured, rigid, machine-parseable formats. Controlled vocabularies. Ontologies. Committee-approved taxonomies.
You had to learn the vocabulary. You had to use it exactly right. And if your use case didn't fit the taxonomy, you were stuck.
That constraint no longer exists.
LLMs Changed the Consumer
This is the part that matters.
The machine reading the web is fundamentally different now. An LLM doesn't need `type="Product"`. It can understand "this section sells shoes."
That single shift changes everything about what a machine-readable web could look like.
Previous standards required rigid formats because the parsers on the other end were rigid. Schema.org needed exact property names. RDFa needed exact predicates. The entire Semantic Web was built around the limitation that machines couldn't interpret — they could only match.
LLMs don't have that limitation. They interpret. They reason. They understand context, nuance, and implied meaning. A sentence describing why a UI element exists carries more signal to an LLM than any structured attribute ever could.
The machines changed. The web hasn't caught up.
That's the gap. And the fix might be simpler than anyone expected.
Intent: The Missing Primitive
What if web elements carried a description of their purpose — not as a structured label, but as a natural language statement that an LLM can understand?
```html
<intent>
This section helps the user compare pricing tiers and decide which
plan fits their needs. The table shows monthly vs annual pricing
with feature breakdowns. The primary action is selecting a plan.
</intent>
```
Or:
```html
<intent>
This modal confirms subscription cancellation and attempts to retain
the user with a discount offer. It's a critical conversion moment —
the user should not be rushed past this.
</intent>
```
Or:
```html
<intent>
This section onboards new users while simultaneously building trust
through social proof. The testimonials are positioned here to reduce
signup hesitation.
</intent>
```
No controlled vocabulary. No schema to memorize. No committee to wait on. A developer just describes what the section does and why it exists — in their own words.
The LLM on the other end handles the semantics. That's its job. And for the first time in the history of the web, the consumer is actually capable of doing that job.
You don't even need a new HTML element to start. An HTML comment prefixed with `intent:` would work just as well — it's already invisible to browsers, already readable by crawlers, and already a habit developers have. The format barely matters. What matters is that intent is expressed, and that something on the other end can now understand it.
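As an illustration, the comment form could look like this (the surrounding form markup is just a placeholder):

```html
<!-- intent: This form collects shipping information for checkout.
     Completing it is the primary goal of this page. -->
<form action="/checkout" method="post">
  ...
</form>
```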
Why This Is Different From Everything Before
Every previous attempt at machine-readable web content required structured formats because the machines couldn't understand language. They needed rigid schemas because parsers needed exact keys.
Intent doesn't need any of that. The consumer is an LLM. The format is natural language. The authoring cost is writing a sentence.
This removes every barrier that killed previous standards:
1. No standardization bottleneck. There's no taxonomy to agree on. A developer describes what the section does in their own words. The LLM figures out the classification, the relationships, the context — all of it.
2. Authoring is trivially easy. Writing "this form collects shipping info for checkout" isn't a new skill. Developers already write comments, documentation, and PR descriptions. This is the same muscle — applied to the DOM.
3. It scales with the LLM. As models get better, the same intent descriptions become more useful — not less. A better model extracts richer meaning from the same sentence. The investment compounds over time without the developer changing anything.
What This Actually Unlocks
Once intent exists on a page, the UI no longer has to be a fixed artifact. It becomes interpretable. Adaptable.
Imagine browsing Reddit, but you choose how you experience it:
- TikTok-style vertical feed
- Twitter-style stream
- Minimal reading mode
- Research mode with everything expanded
Same content. Completely different UI. The content stays the same because the intent stays the same — only the presentation changes based on who you are and how you prefer to consume information.
Intent becomes the bridge between content and experience.
When Is Intent Even Necessary?
If the HTML is already semantically clear — a `<table>` inside a section titled "Pricing" — a machine can probably infer that. You don't need to annotate everything.
Intent shines when the purpose isn't obvious from structure alone:
- "This table isn't just data — it's designed to push users toward the annual plan by making monthly look expensive."
- "This section looks like filler, but it's the primary trust-building mechanism on the page."
- "This modal is the most important conversion moment in the entire flow."
That level of nuance can't come from a `type="Product"` label. It can only come from natural language — and it can only be consumed by something that understands natural language.
That's the key insight. Intent as a concept only becomes viable when the consumer can interpret free-form descriptions. We finally have that consumer.
Architecture: Three Pillars
The system breaks into three parts.
1. User Profile
A user defines how they want to experience the web. Simple structured preferences:
```yaml
preferred_layout: feed
reading_density: high
modal_tolerance: low
theme: dark
interaction_style: keyboard-first
```
What do you want to see when you browse? How do you want to consume content? This is just an onboarding step — straightforward preference collection.
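Concretely, such a profile could be sketched as a small record type. The field names here mirror the example above and are purely illustrative:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class UserProfile:
    """Structured browsing preferences collected once at onboarding."""
    preferred_layout: str = "feed"         # feed | stream | reader | research
    reading_density: str = "high"          # low | medium | high
    modal_tolerance: str = "low"           # low | medium | high
    theme: str = "dark"
    interaction_style: str = "keyboard-first"

# A profile is just data — serializable, cacheable, easy to ship
# alongside a request or store in an extension.
profile = UserProfile()
print(asdict(profile))
```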
2. Web Intent Index
This is the real work. Sites are crawled and their intent is mapped.
Two paths exist:
A — Explicit Intent (Best Case)
Developers add natural language intent descriptions to their markup. Rich, contextual, high-signal. The LLM on the indexing side ingests these and builds a purpose map of the site.
B — Inferred Intent
If intent annotations aren't present, a crawler scrapes the page, maps the structure, and uses an LLM to infer intent — then hashes the result and stores a semantic index.
Here's where things get interesting: the explicit and inferred paths aren't competing — they compound. A crawler's inferred intent might say "this looks like a pricing table." A developer's explicit intent says "this table exists to push users toward the annual plan by making the monthly price look bad." The explicit version carries far more signal — and an LLM can reconcile both.
You could also inject inferred intent into existing pages and let site owners refine it later if they want to participate. That gives you immediate coverage without requiring anyone to change their code first.
This becomes a semantic sitemap of the web. Not just links — purpose.
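A minimal sketch of the explicit-intent indexing step, using only the standard library and leaving embedding out. `IntentExtractor` and `index_page` are hypothetical names; a real indexer would also store vector embeddings of each description:

```python
import hashlib
from html.parser import HTMLParser

class IntentExtractor(HTMLParser):
    """Collects the text content of every <intent> element on a page."""
    def __init__(self):
        super().__init__()
        self._in_intent = False
        self.intents = []

    def handle_starttag(self, tag, attrs):
        if tag == "intent":
            self._in_intent = True
            self.intents.append("")

    def handle_endtag(self, tag):
        if tag == "intent":
            self._in_intent = False

    def handle_data(self, data):
        if self._in_intent:
            self.intents[-1] += data

def index_page(url: str, html: str) -> dict:
    # Hash the raw markup so a cached intent map can be
    # invalidated when the page structure changes.
    page_hash = hashlib.sha256(html.encode()).hexdigest()
    parser = IntentExtractor()
    parser.feed(html)
    return {
        "url": url,
        "page_hash": page_hash,
        "intents": [" ".join(i.split()) for i in parser.intents],
    }

page = ('<intent>This section helps the user compare pricing tiers.</intent>'
        '<table>...</table>')
entry = index_page("https://example.com/pricing", page)
```

The inferred-intent path would feed the scraped structure to an LLM instead of reading `<intent>` elements, but it would write into the same index shape.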
3. Rendering Engine
When a user visits a page:
- A browser extension intercepts the request
- A page hash lookup occurs
- The intent map is retrieved
- The user profile is matched
- The best rendering path is selected
The crucial property: none of this needs LLM reasoning at runtime.
All the heavy work — the crawling, the inference, the intent mapping — already happened during indexing. At runtime, it's just similarity matching:
User vector → Intent vectors → Closest match → Render
No slow AI calls. Cached intent maps, hashed page structures, vector similarity. You could put this on a CDN. You could put this at the DNS level. At that point, it's just numbers and lookups — not reasoning.
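The runtime lookup can be sketched as plain cosine similarity over precomputed vectors. The embeddings below are toy 3-d numbers, not real model output, and the rendering-path names are illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pick_rendering(user_vec, intent_map):
    """intent_map: rendering-path name -> precomputed intent vector."""
    return max(intent_map, key=lambda name: cosine(user_vec, intent_map[name]))

# Toy vectors standing in for embeddings produced during indexing.
intent_map = {
    "feed":     [0.9, 0.1, 0.0],
    "reader":   [0.1, 0.9, 0.2],
    "research": [0.0, 0.3, 0.9],
}
user_vec = [0.8, 0.2, 0.1]
print(pick_rendering(user_vec, intent_map))  # feed
```

Nothing here requires a model call: the expensive interpretation happened at indexing time, and the runtime step is arithmetic that caches trivially.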
This Doesn't Require Rebuilding the Web
Intent is an overlay. It sits on top of whatever already exists.
The supporting infrastructure could be:
- A crawler that indexes intent annotations
- A CDN layer that caches intent maps
- A browser extension that matches users to intents
- A developer habit: describe what your sections do
Existing sites continue to work. Nobody rewrites anything. React, Svelte, plain HTML — doesn't matter. Intent is framework-agnostic by nature because it describes purpose, not implementation.
Where This Idea Lives
This sits somewhere between the Semantic Web, search indexing, AI browsing, and adaptive UI systems. But instead of encoding data relationships, it encodes user experience purpose.
Not:
"This is a Product."
But:
"This helps the user decide which product to buy — and it's designed to make the premium option feel like the obvious choice."
The Semantic Web tried to make the web machine-readable at the data level using structured formats for machines that needed structured formats.
This is about making the web machine-readable at the experience level using natural language for machines that now understand natural language.
Different era. Different consumer. Different approach.
The Actual Point
Right now, the web tells machines what things are.
It doesn't tell them why they exist.
Previous attempts to fix this required rigid schemas because the machines couldn't read language. That constraint is gone. LLMs can interpret natural language descriptions of purpose — and that changes what's possible.
If the next generation of browsing is going to be powered by agents, assistants, and adaptive interfaces, the missing layer might just be intent — written in plain language, for machines that can finally understand it.
I'm calling it IntentML until someone gives me a better name.