<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: WP</title>
    <description>The latest articles on DEV Community by WP (@11pyo).</description>
    <link>https://dev.to/11pyo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3891475%2F270d8df5-9051-4d6a-9113-16e3e17e895f.png</url>
      <title>DEV Community: WP</title>
      <link>https://dev.to/11pyo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/11pyo"/>
    <language>en</language>
    <item>
      <title>AI Manifest: How I Cut AI Agent Tokens by 82% on Multi-Step Web UIs</title>
      <dc:creator>WP</dc:creator>
      <pubDate>Wed, 22 Apr 2026 00:35:56 +0000</pubDate>
      <link>https://dev.to/11pyo/ai-manifest-how-i-cut-ai-agent-tokens-by-82-on-multi-step-web-uis-57n</link>
      <guid>https://dev.to/11pyo/ai-manifest-how-i-cut-ai-agent-tokens-by-82-on-multi-step-web-uis-57n</guid>
      <description>&lt;p&gt;AI Manifest: How I Cut AI Agent Tokensai, webdev, opensource, standards by 82% on Multi-## TL;DRToday I shipped an open protocol that lets AI agents (Claude, MCP clients, Playwright-driven bots) execute multi-step web workflows without repeatedly analyzing the DOM.&lt;strong&gt;Benchmark results&lt;/strong&gt; (30 iterations, ERP-style two-step order entry, tiktoken &lt;code&gt;cl100k_base&lt;/code&gt;):| Metric | Baseline (DOM analysis) | AI Manifest | Improvement ||---|:---:|:---:|:---:|| Mean input tokens | 1887.6 | &lt;strong&gt;341.0&lt;/strong&gt; | &lt;strong&gt;−81.9%&lt;/strong&gt; || Task success rate | 20% (6/30) | &lt;strong&gt;100%&lt;/strong&gt; (30/30) | &lt;strong&gt;+80 %p&lt;/strong&gt; |Released today under MIT license (code) + FRAND terms (patent claims):- Repo: &lt;a href="https://github.com/11pyo/AINavManifest" rel="noopener noreferrer"&gt;github.com/11pyo/AINavManifest&lt;/a&gt;- IETF Internet-Draft: &lt;a href="https://datatracker.ietf.org/doc/draft-han-ai-manifest/" rel="noopener noreferrer"&gt;draft-han-ai-manifest-00&lt;/a&gt;- Korean Patent Application: KR 10-2026-0071716---## The problemWatch any modern AI agent operate a complex web UI — an ERP transaction, a journal submission form, a government e-service — and you see the same pattern:1. Load the page2. Read the entire DOM (or screenshot) into the LLM context3. Ask the LLM: "which element should I click?"4. Click it5. Repeat for every single stepEvery step burns thousands of input tokens on DOM content the agent has already seen. And even then, the agent misidentifies ambiguous form fields and fails on roughly 80% of multi-step transactions in my benchmarks.It's inefficient and unreliable — both problems with the same root cause.## The insightFor many workflows, the website operator &lt;strong&gt;already knows&lt;/strong&gt; the correct UI operation sequence. There is no ambiguity on their side. 
The ambiguity only exists inside the agent, which is rediscovering deterministic knowledge every single session.So instead of making the agent guess, let the site publish an &lt;strong&gt;executable declaration&lt;/strong&gt; of the workflow.## The protocol*&lt;em&gt;AI Manifest&lt;/em&gt;* is a JSON document the site embeds in (or alongside) the page:&lt;br&gt;
&lt;br&gt;
&lt;code&gt;json{  "version": "1.0",  "publisher": "acme.com",  "manifestId": "po_submission_v1",  "registry_url": "https://registry.aimanifest.io/verify",  "task": {    "id": "submit_purchase_order",    "steps": [      {"step": 1, "action": "fill",  "selector": "#vendor",        "value": "{vendor}"},      {"step": 2, "action": "fill",  "selector": "#item_code",     "value": "{item_code}"},      {"step": 3, "action": "fill",  "selector": "#quantity",      "value": "{quantity}"},      {"step": 4, "action": "click", "selector": "#btn_next"},      {"step": 5, "action": "fill",  "selector": "#ship_to",       "value": "{ship_to}"},      {"step": 6, "action": "fill",  "selector": "#delivery_date", "value": "{delivery_date}"},      {"step": 7, "action": "click", "selector": "#btn_submit"}    ]  }}&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;
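The spec defines only the JSON above; a consuming agent still has to substitute its own task parameters into the &lt;code&gt;{vendor}&lt;/code&gt;-style placeholders before driving the browser. A minimal sketch, assuming simple &lt;code&gt;str.format&lt;/code&gt;-style substitution (the &lt;code&gt;resolve_steps&lt;/code&gt; helper is my own illustration, not part of the protocol):

```python
import json

# Hypothetical helper (not part of the spec): resolve the {placeholder}
# values in a manifest's steps array against the agent's task parameters,
# producing concrete actions a browser driver (e.g. Playwright) could run.
def resolve_steps(manifest: dict, params: dict) -> list:
    resolved = []
    for step in sorted(manifest["task"]["steps"], key=lambda s: s["step"]):
        action = {"action": step["action"], "selector": step["selector"]}
        if "value" in step:
            # "{vendor}" becomes params["vendor"]; literal values pass through
            action["value"] = step["value"].format(**params)
        resolved.append(action)
    return resolved

manifest = json.loads("""{
  "task": {"id": "submit_purchase_order", "steps": [
    {"step": 1, "action": "fill",  "selector": "#vendor", "value": "{vendor}"},
    {"step": 2, "action": "click", "selector": "#btn_next"}
  ]}
}""")

actions = resolve_steps(manifest, {"vendor": "ACME Corp"})
```

Because the steps are deterministic once resolved, no LLM call is needed at execution time.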

&lt;h3&gt;Three ways to embed it&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Method A — Well-Known URI&lt;/strong&gt; (recommended):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;meta name="ai-manifest" content="/.well-known/ai-manifest.json"&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with the JSON served at &lt;code&gt;/.well-known/ai-manifest.json&lt;/code&gt; per IETF RFC 8615.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method B — Hidden DOM element:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;div id="ai-manifest" style="display:none" aria-hidden="true"
     data-manifest='{"version":"1.0", ...}'&amp;gt;&amp;lt;/div&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Method C — HTTP response header:&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;X-AI-Manifest: url=/.well-known/ai-manifest.json; hash=sha256:...&lt;/code&gt;
&lt;br&gt;
Sites can combine methods; A + C is particularly clean, since one request-response confirms both the URL and the hash before the body is fetched.&lt;/p&gt;
&lt;h3&gt;Agent-side flow&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Fetch headers, check for the meta tag, then search the DOM (in that priority order)&lt;/li&gt;
&lt;li&gt;If a manifest is found, compute its SHA-256 hash over a canonical form (keys sorted, UTF-8 encoded)&lt;/li&gt;
&lt;li&gt;POST &lt;code&gt;{publisher, manifestId, hash}&lt;/code&gt; to the registry URL over HTTPS&lt;/li&gt;
&lt;li&gt;The registry returns &lt;code&gt;{status: "white" | "black" | "unknown"}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;On &lt;code&gt;white&lt;/code&gt;: execute the &lt;code&gt;steps&lt;/code&gt; array in declared order, skipping any additional DOM-based LLM reasoning&lt;/li&gt;
&lt;li&gt;On &lt;code&gt;black&lt;/code&gt;: refuse and warn the user&lt;/li&gt;
&lt;li&gt;On &lt;code&gt;unknown&lt;/code&gt;: warn the user and optionally fall back to DOM inference&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Why this matters: prompt injection defense&lt;/h2&gt;
&lt;p&gt;The obvious worry with "just execute what the page tells you" is that a malicious page could tell the agent to exfiltrate credentials or click the wrong button. The &lt;strong&gt;central trust registry&lt;/strong&gt; is the mitigation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Publishers pre-register their manifests (hash-only; the registry doesn't need the body).&lt;/li&gt;
&lt;li&gt;The registry performs &lt;strong&gt;static analysis&lt;/strong&gt; at registration time: it rejects or blacklists manifests whose &lt;code&gt;steps&lt;/code&gt; contain suspicious patterns, such as selectors targeting &lt;code&gt;iframe&lt;/code&gt; for cross-origin form submission, actions outside the registered safe set (&lt;code&gt;fill&lt;/code&gt; / &lt;code&gt;click&lt;/code&gt; / &lt;code&gt;select&lt;/code&gt; / &lt;code&gt;upload&lt;/code&gt;), or &lt;code&gt;value&lt;/code&gt; fields that would trigger requests to external URLs.&lt;/li&gt;
&lt;li&gt;A community reporting channel exists for blacklisting compromised manifests.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Multiple interoperable registries can coexist; each manifest declares which one is authoritative via &lt;code&gt;registry_url&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;The benchmark in detail&lt;/h2&gt;
&lt;p&gt;The experimental setup:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A Flask server that implements an ERP-style two-step order-entry flow (vendor, item_code, quantity, currency → delivery info → submit)&lt;/li&gt;
&lt;li&gt;Two AI agents: a &lt;strong&gt;baseline&lt;/strong&gt; that reads the full DOM into the LLM context at every step, and a &lt;strong&gt;manifest-aware&lt;/strong&gt; agent that fetches the manifest, verifies it with the registry, and executes deterministically&lt;/li&gt;
&lt;li&gt;30 iterations per agent with seeds 1000-1029&lt;/li&gt;
&lt;li&gt;Input tokens counted with &lt;code&gt;tiktoken&lt;/code&gt; using the &lt;code&gt;cl100k_base&lt;/code&gt; encoding&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Raw numbers:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Baseline mean&lt;/th&gt;&lt;th&gt;Baseline std&lt;/th&gt;&lt;th&gt;Manifest mean&lt;/th&gt;&lt;th&gt;Manifest std&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Input tokens&lt;/td&gt;&lt;td&gt;1887.6&lt;/td&gt;&lt;td&gt;634.8&lt;/td&gt;&lt;td&gt;&lt;strong&gt;341.0&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;0.0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Success rate&lt;/td&gt;&lt;td&gt;20.0%&lt;/td&gt;&lt;td&gt;—&lt;/td&gt;&lt;td&gt;&lt;strong&gt;100.0%&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;—&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Mean LLM calls&lt;/td&gt;&lt;td&gt;1.4&lt;/td&gt;&lt;td&gt;—&lt;/td&gt;&lt;td&gt;1.0&lt;/td&gt;&lt;td&gt;—&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Wall time (ms)&lt;/td&gt;&lt;td&gt;28.6&lt;/td&gt;&lt;td&gt;16.2&lt;/td&gt;&lt;td&gt;54.2&lt;/td&gt;&lt;td&gt;8.6&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The manifest agent is slightly slower in wall-clock time because it makes, on average, 3.4 additional HTTP requests (the registry verification). But each of those requests carries a payload two orders of magnitude smaller than DOM-based inference, and the result is cacheable by hash, so the real cost is dominated by tokens, which dropped 5.5×.&lt;/p&gt;
&lt;p&gt;The full benchmark harness is in the repo: clone it, &lt;code&gt;pip install -r requirements.txt&lt;/code&gt;, then &lt;code&gt;python benchmark/run_benchmark.py --repeats 30&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Licensing model&lt;/h2&gt;
&lt;p&gt;The code and schema are MIT-licensed. The patent claims are offered under &lt;strong&gt;Fair, Reasonable, and Non-Discriminatory (FRAND)&lt;/strong&gt; terms: any good-faith implementer of the published specification is covered, with defensive termination only if someone sues the applicant over technology covered by this spec.&lt;/p&gt;
&lt;p&gt;Rationale: the value of a protocol like this is in its ubiquity, not in a rent on each implementation. 
The patent's role is to keep anyone else from enclosing it. This mirrors how Apple offered FRAND terms on H.264-era codec patents, how Microsoft did on some OOXML patents, and how the Apache and Mozilla foundations coordinate with their members: broad freedom to implement, with the defensive shield preserved.&lt;/p&gt;
&lt;h2&gt;What's next&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;IETF draft revisions&lt;/strong&gt;: draft-01 in ~6 months, based on implementer feedback.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reference registry deployment&lt;/strong&gt;: &lt;code&gt;registry.aimanifest.io&lt;/code&gt; goes live once I have feedback from at least one serious browser-agent implementer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SDK auto-generators&lt;/strong&gt;: tools that read a site's UI and propose a starting manifest.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Integration with MCP / OpenAI Agents SDK / Gemini Deep Research&lt;/strong&gt;: the protocol composes cleanly with all of them; looking for partners.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Feedback wanted&lt;/h2&gt;
&lt;p&gt;If you're building agent systems, running a site with known AI agent traffic, or thinking about agent-web interop standards, please open a GitHub issue, comment below, or email me (&lt;a href="mailto:pk102h@naver.com"&gt;pk102h@naver.com&lt;/a&gt;). Specific things I'd love input on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Registry governance: single operator, federated, or fully site-declared?&lt;/li&gt;
&lt;li&gt;Manifest versioning strategy when a site's UI changes&lt;/li&gt;
&lt;li&gt;Interaction with existing &lt;code&gt;agents.txt&lt;/code&gt; / &lt;code&gt;llms.txt&lt;/code&gt; / &lt;code&gt;ai-plugin.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Benchmark scenarios beyond ERP (ScholarOne submission, SAP MM01, e-gov); send yours&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Repo&lt;/strong&gt;: &lt;a href="https://github.com/11pyo/AINavManifest" rel="noopener noreferrer"&gt;github.com/11pyo/AINavManifest&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;JSON Schema&lt;/strong&gt;: &lt;a href="https://github.com/11pyo/AINavManifest/blob/main/ai-manifest.schema.json" rel="noopener noreferrer"&gt;ai-manifest.schema.json&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IETF Internet-Draft&lt;/strong&gt;: &lt;a href="https://datatracker.ietf.org/doc/draft-han-ai-manifest/" rel="noopener noreferrer"&gt;draft-han-ai-manifest-00&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;FRAND declaration&lt;/strong&gt;: &lt;a href="https://github.com/11pyo/AINavManifest/blob/main/docs/FRAND.md" rel="noopener noreferrer"&gt;docs/FRAND.md&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Benchmark code&lt;/strong&gt;: &lt;a href="https://github.com/11pyo/AINavManifest/tree/main/validation" rel="noopener noreferrer"&gt;validation/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If this is useful to you, a GitHub star helps with visibility for the IETF process.&lt;/p&gt;
&lt;p&gt;— Won-pyo&lt;/p&gt;
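Postscript for implementers: step 2 of the agent-side flow (SHA-256 over a canonical form, keys sorted, UTF-8 encoded) fits in a few lines. This is a sketch under an assumed canonicalization (plain &lt;code&gt;json.dumps&lt;/code&gt; with sorted keys and compact separators); the Internet-Draft, not this sketch, defines the normative rules:

```python
import hashlib
import json

# Sketch of the agent-side verification payload (steps 2-3 of the flow).
# Assumes canonical form = JSON with sorted keys, compact separators, UTF-8;
# check the Internet-Draft for the normative canonicalization rules.
def canonical_hash(manifest: dict) -> str:
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def registry_payload(manifest: dict) -> dict:
    # This dict would be POSTed to manifest["registry_url"] over HTTPS;
    # the registry answers {"status": "white" | "black" | "unknown"}.
    return {
        "publisher": manifest["publisher"],
        "manifestId": manifest["manifestId"],
        "hash": "sha256:" + canonical_hash(manifest),
    }

m = {"version": "1.0", "publisher": "acme.com", "manifestId": "po_submission_v1"}
payload = registry_payload(m)
```

Because the hash is computed over a key-sorted form, cosmetic reordering of the manifest's JSON keys does not change its identity at the registry.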

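One more implementer note: the Method C &lt;code&gt;X-AI-Manifest&lt;/code&gt; header is trivially parseable. A sketch assuming the exact &lt;code&gt;url=...; hash=...&lt;/code&gt; shape shown in the post (the draft may specify a stricter grammar):

```python
# Sketch: parse the X-AI-Manifest response header from Method C.
# Assumes the "url=...; hash=sha256:..." shape shown in the post;
# the Internet-Draft may define a stricter grammar.
def parse_ai_manifest_header(value: str) -> dict:
    fields = {}
    for part in value.split(";"):
        # partition on the first "=" so hash values like "sha256:..." survive
        key, _, val = part.strip().partition("=")
        fields[key] = val
    return fields

hdr = "url=/.well-known/ai-manifest.json; hash=sha256:abc123"
parsed = parse_ai_manifest_header(hdr)
```

An agent would fetch the URL, compute the canonical hash of the body, and compare it against the header's &lt;code&gt;hash&lt;/code&gt; field before contacting the registry.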
</description>
      <category>ai</category>
      <category>webdev</category>
      <category>opensource</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
