Raw HTML is noise. Screenshots burn tokens. Accessibility trees lose visual context.
So we built SiFR — a structured format that gives LLMs usable runtime UI context.
This post explains what's inside.
What is SiFR?
SiFR stands for Semantic Information for Representation.
(And yes — it's also meant to sound like "see far".)
SiFR is a JSON schema that captures the runtime state of a web page in a way that's:
- Token-efficient (often 10–50× smaller than raw HTML on complex pages)
- Semantically structured (models can reason over it without reconstructing the UI from markup)
- Visually aware (preserves layout relationships without pixels)
It's not a scraper. It's not an accessibility tree.
It's a preprocessing layer that sits between the DOM and your AI — turning "what the browser rendered" into "what the model can reason about".
Why not just send HTML?
Let's use a real-world example: large e-commerce pages.
Raw HTML commonly contains:
- deeply nested layout wrappers
- duplicated markup for responsive layouts
- client-side frameworks with non-semantic containers
- hidden / disabled / off-screen elements that still exist in the DOM
So when you send HTML to an LLM, you're asking it to do two jobs:
- reconstruct runtime UI state
- then solve the task
Most failures happen in the first job: the model misreads the runtime state before it ever gets to the task.
Here's what a typical "find the button" path looks like in raw markup:
div > div > div > div > div > div > ... > button
With SiFR, the same interface is described structure first, with the important elements called out explicitly.
For example:
{
  "id": "btn042",
  "text": "Add to Cart",
  "actions": ["clickable"],
  "salience": "high",
  "cluster": "product-actions"
}
The LLM sees what it is, how it behaves, and which part of the page it belongs to — without reverse-engineering UI meaning from markup.
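As a rough sketch of how a tool might use records like this one, the snippet below picks out the primary actions from a flat list of nodes. The list shape and the second node (`div317`) are assumptions made for illustration; only `btn042` comes from the example above.

```python
import json

# Hypothetical flat list of SiFR-style node records; only btn042 is from the post.
snapshot_nodes = json.loads("""
[
  {"id": "btn042", "text": "Add to Cart", "actions": ["clickable"],
   "salience": "high", "cluster": "product-actions"},
  {"id": "div317", "text": "", "actions": [],
   "salience": "low", "cluster": "layout"}
]
""")


def primary_actions(nodes):
    """Return high-salience nodes that expose at least one action."""
    return [n for n in nodes if n.get("salience") == "high" and n.get("actions")]


for node in primary_actions(snapshot_nodes):
    print(f'{node["id"]}: "{node["text"]}" in cluster {node["cluster"]}')
```

The filter never touches markup: it reads only the fields the record already carries.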
Anatomy of a SiFR Document
Every SiFR snapshot has five sections:
1) METADATA
Page-level context: URL, viewport size, capture timestamp, and capture stats.
{
  "url": "https://www.costco.com/...",
  "viewport": { "width": 1920, "height": 1080 },
  "stats": {
    "totalNodes": 2847,
    "salienceCounts": { "high": 12, "med": 89, "low": 2746 }
  }
}
This is the "frame" the model needs before it reads anything else.
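A consumer can sanity-check this block before reading anything else. The sketch below, using the values from the example above, verifies that the salience counts add up to `totalNodes` and reports each level's share. The helper name `salience_shares` is made up for this post and is not part of SiFR.

```python
metadata = {
    "url": "https://www.costco.com/...",
    "viewport": {"width": 1920, "height": 1080},
    "stats": {
        "totalNodes": 2847,
        "salienceCounts": {"high": 12, "med": 89, "low": 2746},
    },
}


def salience_shares(meta):
    """Check that salience counts add up, then return each level's share of all nodes."""
    stats = meta["stats"]
    counts = stats["salienceCounts"]
    total = stats["totalNodes"]
    if sum(counts.values()) != total:
        raise ValueError("salienceCounts do not sum to totalNodes")
    return {level: count / total for level, count in counts.items()}


shares = salience_shares(metadata)
print(f'high-salience share: {shares["high"]:.2%}')
```

In this capture, 12 of 2,847 nodes are high salience, which is well under 1% of the page.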
2) NODES
The structural skeleton — hierarchy without heavy details.
Think of it as the page's table of contents: what regions exist, what contains what, and what the high-level UI shape is.
3) SUMMARY
High-level layout blocks. This is where SiFR becomes "structure-first".
{
  "layoutBlocks": [
    { "role": "header", "contains": ["logo", "nav", "search"] },
    { "role": "sidebar", "contains": ["filters", "categories"] },
    { "role": "main", "contains": ["product-grid"] }
  ]
}
Before the model sees thousands of elements, it already has the page skeleton:
header at top, sidebar on the side, main content in the center.
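One way a tool might use the layout blocks is as a lookup from an item to its region. This is a minimal sketch over the example above; `region_index` is a hypothetical helper, and it assumes each item appears in only one block.

```python
summary = {
    "layoutBlocks": [
        {"role": "header", "contains": ["logo", "nav", "search"]},
        {"role": "sidebar", "contains": ["filters", "categories"]},
        {"role": "main", "contains": ["product-grid"]},
    ]
}


def region_index(summary_section):
    """Map each listed item to the role of the layout block that contains it."""
    return {
        item: block["role"]
        for block in summary_section["layoutBlocks"]
        for item in block["contains"]
    }


index = region_index(summary)
print(index["search"])   # which region holds the search box
print(index["filters"])  # which region holds the filters
```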
4) DETAILS
Element-specific data: selectors, text, runtime visibility, interaction state, and relevant computed info.
{
  "btn042": {
    "selector": "button.add-to-cart",
    "text": "Add to Cart",
    "actions": ["clickable"],
    "styles": { "visible": true, "disabled": false }
  }
}
This is where "runtime truth" matters: visible vs hidden, enabled vs disabled, actual text content, etc.
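To make the "runtime truth" point concrete, here is a small sketch that keeps only elements an agent could actually click right now. The second entry (`btn043`) is invented to show the filter rejecting a disabled button. The defaults for missing fields are also assumptions, not something the spec is known to define.

```python
details = {
    "btn042": {
        "selector": "button.add-to-cart",
        "text": "Add to Cart",
        "actions": ["clickable"],
        "styles": {"visible": True, "disabled": False},
    },
    # Hypothetical entry, added only to show a disabled element being filtered out.
    "btn043": {
        "selector": "button.notify-me",
        "text": "Notify Me",
        "actions": ["clickable"],
        "styles": {"visible": True, "disabled": True},
    },
}


def actionable(details_section):
    """Yield (id, selector) for elements that are clickable, visible, and enabled."""
    for element_id, info in details_section.items():
        styles = info.get("styles", {})
        if (
            "clickable" in info.get("actions", [])
            and styles.get("visible", False)
            and not styles.get("disabled", False)
        ):
            yield element_id, info["selector"]


print(list(actionable(details)))
```

Both buttons exist in the DOM, but only one survives the runtime check.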
5) RELATIONS
Spatial relationships between important elements.
Not pixel coordinates — semantic positioning.
{
  "btn042": {
    "inside": "card-product-123",
    "below": "price-display",
    "rightOf": "quantity-selector"
  }
}
The model can reason: "the Add to Cart button is inside the product card, below the price" — without seeing a single pixel.
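The same reasoning can be spelled out mechanically. This sketch turns the relation entry above into plain sentences. It handles only the three keys shown in the example (`inside`, `below`, `rightOf`); the phrasing table is my own, and any other keys are skipped.

```python
relations = {
    "btn042": {
        "inside": "card-product-123",
        "below": "price-display",
        "rightOf": "quantity-selector",
    }
}

# Phrases for the three relation keys shown in the post; other keys are skipped.
PHRASES = {
    "inside": "is inside",
    "below": "is below",
    "rightOf": "is to the right of",
}


def describe(relations_section):
    """Turn each relation entry into one plain sentence per known relation key."""
    sentences = []
    for element_id, links in relations_section.items():
        for key, target in links.items():
            if key in PHRASES:
                sentences.append(f"{element_id} {PHRASES[key]} {target}.")
    return sentences


for sentence in describe(relations):
    print(sentence)
```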
Key Concepts
Visual Salience
Not all nodes matter equally.
SiFR assigns salience so the model focuses on signal:
- High: primary actions, main content, user inputs
- Medium: secondary nav, supporting info
- Low: wrappers, containers, decorative elements
This is one of the biggest reasons SiFR stays usable on very large pages.
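On the consumer side, salience also gives a simple way to trim a capture to a budget. The sketch below is not how SiFR assigns salience. It only shows one possible policy: admit whole levels, most important first, until a node budget is reached. The counts are the ones from the METADATA example, and `levels_within_budget` is a hypothetical name.

```python
# Order in which levels are admitted; short names follow the "salienceCounts" keys.
LEVELS = ["high", "med", "low"]


def levels_within_budget(salience_counts, max_nodes):
    """Admit whole salience levels, most important first, until the budget is hit."""
    kept, used = [], 0
    for level in LEVELS:
        count = salience_counts.get(level, 0)
        if used + count > max_nodes:
            break
        kept.append(level)
        used += count
    return kept, used


counts = {"high": 12, "med": 89, "low": 2746}
print(levels_within_budget(counts, max_nodes=200))
```

With a budget of 200 nodes, the high and medium levels fit (101 nodes) and the 2,746 low-salience wrappers are left out.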
Layout Block Summarization
Instead of listing 3000 elements immediately, SiFR begins with a map:
PAGE STRUCTURE:
├── Header (logo, nav, search, cart)
├── Sidebar (filters)
└── Main
    ├── Breadcrumbs
    ├── Product Grid (24 items)
    └── Pagination
Models do not reason well by scanning markup line by line. They first need a picture of how the page is organized.
This map gives them that structure up front.
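A map like this is easy to render from nested data. The sketch below assumes a simple `(label, children)` representation, chosen here for illustration rather than taken from the SiFR schema, and prints it as a tree.

```python
# Hypothetical nested form of the page map; each entry is (label, children).
page_map = [
    ("Header (logo, nav, search, cart)", []),
    ("Sidebar (filters)", []),
    ("Main", [
        ("Breadcrumbs", []),
        ("Product Grid (24 items)", []),
        ("Pagination", []),
    ]),
]


def render_tree(entries, prefix=""):
    """Return the lines of a box-drawing tree for nested (label, children) pairs."""
    lines = []
    for position, (label, children) in enumerate(entries):
        is_last = position == len(entries) - 1
        lines.append(prefix + ("└── " if is_last else "├── ") + label)
        # Children of the last entry get blank padding, others a vertical rule.
        child_prefix = prefix + ("    " if is_last else "│   ")
        lines.extend(render_tree(children, child_prefix))
    return lines


print("PAGE STRUCTURE:")
print("\n".join(render_tree(page_map)))
```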
Adaptive Complexity
A simple blog post doesn't need the same capture density as a complex dashboard.
SiFR adjusts automatically — more detail where it matters, less where it doesn't.
The goal is stable signal-to-noise, not maximal completeness.
Real Numbers
Here are representative examples from our internal benchmarks (token counts vary by capture options and page state):
| Site | HTML Tokens | SiFR Tokens | Compression |
|---|---|---|---|
| Costco | ~1,280,000 | ~24,000 | ~53× |
| Amazon | ~600,000 | ~50,000 | ~12× |
On complex pages, SiFR makes LLM workflows practical where raw HTML often doesn't fit in context.
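The compression column is just the ratio of the two token counts. Here is the arithmetic as a short sketch, using the approximate figures from the table:

```python
# Approximate token counts copied from the table: (HTML tokens, SiFR tokens).
benchmarks = {
    "Costco": (1_280_000, 24_000),
    "Amazon": (600_000, 50_000),
}


def compression_ratio(html_tokens, sifr_tokens):
    """How many times smaller the SiFR capture is than the raw HTML."""
    return html_tokens / sifr_tokens


for site, (html_tokens, sifr_tokens) in benchmarks.items():
    ratio = compression_ratio(html_tokens, sifr_tokens)
    print(f"{site}: ~{ratio:.0f}x smaller")
```

The same function works on your own captures: count tokens for the raw HTML and for the snapshot with whatever tokenizer your model uses, then divide.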
Try It Yourself
SiFR is implemented in Element to LLM — a free browser extension:
- Chrome Web Store (also works on Arc, Brave, Edge)
- Firefox Add-ons
If you want to stress-test the format, try these two pages:
- costco.com — a realistic, framework-heavy enterprise UI
- arngren.net — extreme visual density and chaotic layout
Capture a snapshot and share:
- what compression ratio did you get?
- could your LLM reason about the structure?
- did you find a site where SiFR struggles?
If it breaks — that's useful data. Seriously.
What SiFR Enables
With structured runtime UI context, LLMs can:
- Debug layouts — paste JSON → spot z-index / visibility / layout issues
- Generate selectors — Playwright/Cypress tests based on real DOM structure
- Navigate autonomously — agents that understand "where to click" without screenshots
- Recreate components — translate UI structure into React/Tailwind scaffolds
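As one illustration of the selector use case, the sketch below turns a DETAILS entry into a line of Playwright (Python) source. It only builds a string and does not run a browser. `playwright_click_line` is a hypothetical helper, and the generated line assumes a Playwright `page` object exists in the test it is pasted into.

```python
def playwright_click_line(element_id, details_section):
    """Build one line of Playwright (Python) source that clicks the given element."""
    selector = details_section[element_id]["selector"]
    # repr() quotes and escapes the selector so the generated line is valid Python.
    return f"page.locator({selector!r}).click()"


details = {"btn042": {"selector": "button.add-to-cart", "text": "Add to Cart"}}
print(playwright_click_line("btn042", details))
```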
The Standard Question
We're actively developing SiFR as an open specification. Current version: v2.
The schema is strict and versioned, designed for automation pipelines — not just one-off prompt experiments.
If you're building LLM-powered UI tools, we'd love feedback on the format:
- What feels missing?
- What feels redundant?
- What would make this more useful in your workflow?
Found a site that breaks SiFR? Drop it in the comments. That's the fastest way to improve the spec.