TL;DR: MCP burns 45k+ tokens on tool descriptions before your first prompt. E2LLM = 0 tokens until you paste the UI snapshot. For CSS debugging — scalpel, not Swiss Army knife.
Runtime Snapshots is a series about what happens when you give LLMs the actual runtime state of a UI — not the HTML source, not a screenshot, not a description. Start from #1 or jump in here.
We covered the basic MCP cost argument back in September. This is the architectural explanation of why it happens.
Before your AI assistant reads a single line of your code, it has often already consumed 40,000–50,000 tokens.
That's not a bug in your setup. That's MCP working as designed.
What MCP Actually Loads
Model Context Protocol is a genuinely useful standard. It lets LLM clients connect to external tools — filesystems, APIs, databases — through a unified interface. But "unified" comes with a heavy tax.
When you connect a typical MCP server to your AI client (like Claude Desktop or Cursor), the protocol negotiates a session. During that negotiation, it sends the client every tool the server exposes:
- names
- descriptions
- input schemas
- output formats
All of it, upfront, for every new session.
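To make the tax concrete, here's a sketch of a single tool definition in the shape MCP's `tools/list` response uses, plus a back-of-the-envelope token estimate. The tool name, description, and the ~4-characters-per-token heuristic are illustrative assumptions, not measurements from any real server:

```python
import json

# Hypothetical tool definition in the general shape MCP's tools/list
# returns. Real servers ship dozens of these, often with much longer
# descriptions and richer schemas.
tool = {
    "name": "read_file",
    "description": (
        "Read the complete contents of a file from the filesystem. "
        "Handles various text encodings and returns an error if the "
        "path does not exist or is not readable."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Absolute path to the file"}
        },
        "required": ["path"],
    },
}

# Rough heuristic: ~4 characters per token for English JSON.
def estimate_tokens(obj) -> int:
    return len(json.dumps(obj)) // 4

per_tool = estimate_tokens(tool)
print(f"~{per_tool} tokens for one small tool")
print(f"~{per_tool * 40} tokens for a 40-tool setup at this size")
```

Even this deliberately small example lands around a hundred tokens; multiply by a few dozen tools with verbose descriptions and the 40k+ figure stops looking surprising.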
A modest MCP setup — filesystem access, a browser tool, and a code search tool — generates a system prompt between 40,000 and 50,000 tokens before you type a single character. Larger configurations go higher.
Since most LLM APIs charge per token regardless of whether those tokens were "useful," you're paying this tax on every single conversation.
Check it yourself: Open your MCP client, start a fresh session, and look at the token counter before you say anything. The number is real.
The Surgical Alternative
E2LLM was built to solve a specific, painful problem: explaining runtime DOM state to an AI assistant.
Not the HTML source. Not the static structure. The live state — computed styles, visibility flags, ARIA roles, z-index stacks, responsive quirks.
The gap between what the HTML says and what the browser renders is where most annoying bugs live.
The standard workflow was brutal:
- screenshot → 3 paragraphs description → hope AI understands
- OR full page HTML → context window filled with nav/footer noise
E2LLM does one thing: click element → get structured JSON snapshot.
No server. No session. No overhead. Zero tokens until paste.
```json
{
  "tag": "button",
  "text": "Submit",
  "computedStyles": {
    "display": "none",
    "visibility": "hidden",
    "opacity": "0"
  },
  "ariaRole": "button",
  "ariaDisabled": "true",
  "boundingRect": { "width": 0, "height": 0 }
}
```
The AI sees the element's actual state. Runtime reality.
Two Different Philosophies
MCP = Swiss Army knife
→ broad persistent access to environment
→ agentic workflows ("fix entire codebase")
→ upfront cost OK when agent roams freely
E2LLM = scalpel
→ one precise cut: exact runtime context
→ no persistent connection, no session
→ pay only for what you send
The mistake: using a Swiss Army knife for surgery.
Button not clickable? You don't need filesystem, git, or DB access. You need to see the `pointer-events: none` and the parent sitting at `z-index: -1`. That's 200 tokens, not 50k.
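As a minimal sketch of that idea: given a runtime snapshot like the JSON shown earlier (field names mirror that example, not E2LLM's actual schema), a few lines are enough to explain why an element can't be clicked:

```python
# Diagnose why an element isn't clickable or visible from a runtime
# snapshot dict. Field names follow the JSON snapshot shown earlier
# in the post; treat this as an illustration, not E2LLM's real API.
def why_not_clickable(snap: dict) -> list[str]:
    reasons = []
    styles = snap.get("computedStyles", {})
    if styles.get("display") == "none":
        reasons.append("display: none removes it from layout")
    if styles.get("visibility") == "hidden":
        reasons.append("visibility: hidden hides it")
    if styles.get("opacity") == "0":
        reasons.append("opacity: 0 makes it fully transparent")
    if styles.get("pointer-events") == "none":
        reasons.append("pointer-events: none swallows clicks")
    rect = snap.get("boundingRect", {})
    if rect.get("width") == 0 or rect.get("height") == 0:
        reasons.append("zero-size bounding box")
    return reasons

snapshot = {
    "tag": "button",
    "computedStyles": {"display": "none", "visibility": "hidden", "opacity": "0"},
    "boundingRect": {"width": 0, "height": 0},
}
for reason in why_not_clickable(snapshot):
    print(reason)
```

The point isn't the checker itself; it's that a few hundred tokens of structured runtime state carry everything this diagnosis needs, with no server in the loop.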
The Real Cost Comparison
| Scenario | Session overhead | Query tokens | Share of overhead that's useful |
|---|---|---|---|
| MCP (DOM debug) | ~45k tokens | 2k–10k | ~0% |
| E2LLM snapshot | 0 tokens | 150–800 | ~100% |
At $3 per million tokens, 20 MCP DOM-debugging sessions a day means $2.70 of overhead daily.
Every day. Tokens that never touch a bug.
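The arithmetic behind that number, as a quick sketch (the price, overhead, and session count are the assumptions stated above):

```python
# Daily cost of MCP session-negotiation overhead for DOM debugging.
PRICE_PER_TOKEN = 3 / 1_000_000   # $3 per million tokens
OVERHEAD_PER_SESSION = 45_000     # tokens loaded before the first prompt
SESSIONS_PER_DAY = 20

daily = PRICE_PER_TOKEN * OVERHEAD_PER_SESSION * SESSIONS_PER_DAY
print(f"${daily:.2f} per day")        # $2.70 per day
print(f"${daily * 30:.2f} per month") # $81.00 over a 30-day month
```

Scale any of the three constants to your own setup; the structure of the cost stays the same.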
When to Use Each
Use MCP when:
- agentic workflow (multi-file/API/systems)
- dynamic tool discovery needed
- building automation pipelines
Use E2LLM when:
- debugging specific UI/CSS issue
- showing computed element state
- precise snapshot, no context burn
The Broader Point
MCP standardized agent-system connections. Important.
But standardization ≠ optimization.
Current MCP default: "load everything, always."
This = money + latency + distraction.
Scalpel doesn't replace Swiss Army knife.
For surgery, use scalpel.
Previous: #12 — Reflection in the Code
E2LLM — free Chrome/Firefox extension. Local only.
GitHub | Chrome