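HTML, PDF, images: most of the formats that carry the world's information were designed for humans to read. Even machine-oriented formats such as JSON rely on a human-written schema to give them meaning. When AI Agents become the primary data consumers, we need to rethink from first principles: what format is best for Agents?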
An Overlooked Infrastructure Problem
When we talk about AI Agents, the conversation usually revolves around model capabilities, tool calling, and multi-agent collaboration. But few people focus on a more fundamental issue:
Almost all of the data Agents process every day was designed for humans.
Most of a typical HTML page is styling and layout markup rather than content. PDF is essentially a coordinate-based text positioning system designed for printing. Images require an additional vision model to "translate." Even JSON and XML, which look structured, carry no semantic context of their own: a field named "price" says nothing about currency, market, or source.
It's like giving a genius a hammer to perform precision surgery. Wrong tool, wasted talent.
How does an Agent "perceive" the world? It doesn't use eyes to scan interfaces or fingers to click buttons. It processes token streams, understands semantic relationships, and makes decisions based on structured information. Humans need things to look good. Agents need things to make sense.
The core tension: humans need visual presentation; Agents need semantic structure. Existing formats either lean visual (HTML/PDF/images) or lean structural but lack semantics (JSON/SQL). No format was natively designed for Agents.
Agent-Native Data: A Four-Layer Architecture
I believe a truly Agent-friendly data format needs four layers.
Layer 1: Semantic Graph
Not tables. Not documents. A relationship network.
When a human reads a news headline:
"Apple released the iPhone 20 today, priced at $999"
The information is clear to a person. But for an Agent, it's full of ambiguity — is "Apple" the fruit or the company? What day is "today"? Which market's price is $999?
What an Agent needs is a parsed semantic graph: a product launch event linked to an unambiguous company entity, a specific product, a price with currency unit, a precise timestamp, and the source and reliability of that information.
Key properties of the semantic graph:
Entities have globally unique identifiers. "Apple Inc." is no longer an ambiguous string but a globally resolvable entity ID. Just as the internet uses URLs to identify web pages, the Agent internet needs entity IDs to identify every concept in the world.
Relationships are first-class citizens. In traditional documents, relationships are implicit in text — "Alice is Bob's colleague" requires human reading comprehension. In a semantic graph, relationships are explicitly declared: Entity A and Entity B have a "colleague" relationship, established on a given date, with 95% confidence.
Built-in confidence scores and provenance. Every piece of information answers three questions: How reliable is it? Where did it come from? When was it last updated? For humans, this is nice-to-have. For Agents, it's the foundation for decision-making.
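To make this concrete, here is a minimal sketch of how the iPhone headline might look as a semantic graph. It is an illustration under stated assumptions, not a proposed standard: the `Entity`, `Relation`, and `SemanticGraph` classes, the `ent:` ID scheme, the source name, the timestamp, and the confidence values are all invented for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Union


@dataclass(frozen=True)
class Entity:
    id: str     # hypothetical globally unique ID
    kind: str   # e.g. "organization", "product", "event"
    label: str  # human-readable name, for display only


@dataclass(frozen=True)
class Relation:
    subject: str              # ID of the entity the claim is about
    predicate: str            # explicit, named relationship
    value: Union[str, float]  # another entity ID or a literal
    confidence: float         # 0.0 to 1.0: how reliable is it?
    source: str               # where did it come from?
    updated_at: datetime      # when was it last updated?


@dataclass
class SemanticGraph:
    entities: Dict[str, Entity] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def add_relation(self, relation: Relation) -> None:
        # Relations may only be attached to entities the graph knows about.
        if relation.subject not in self.entities:
            raise KeyError(f"unknown entity: {relation.subject}")
        self.relations.append(relation)

    def about(self, entity_id: str, min_confidence: float = 0.0) -> List[Relation]:
        """Return relations whose subject is entity_id, filtered by confidence."""
        return [r for r in self.relations
                if r.subject == entity_id and r.confidence >= min_confidence]


# Placeholder values: the date, source name, and scores are invented.
seen = datetime(2030, 1, 1, tzinfo=timezone.utc)
graph = SemanticGraph()
graph.add_entity(Entity("ent:org/apple-inc", "organization", "Apple Inc."))
graph.add_entity(Entity("ent:product/iphone-20", "product", "iPhone 20"))
graph.add_entity(Entity("ent:event/launch-001", "event", "Product launch"))

launch = "ent:event/launch-001"
graph.add_relation(Relation(launch, "announced_by", "ent:org/apple-inc", 0.98, "example-newswire", seen))
graph.add_relation(Relation(launch, "product", "ent:product/iphone-20", 0.98, "example-newswire", seen))
graph.add_relation(Relation(launch, "price_usd", 999.0, 0.90, "example-newswire", seen))

high_confidence = graph.about(launch, min_confidence=0.95)
print([r.predicate for r in high_confidence])
```

With these made-up scores, an Agent that only acts on claims at 0.95 confidence or above would see who launched what, but would treat the price as something to verify first.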
Layer 2: Intent Protocol
This is the most disruptive layer.
Current inter-Agent communication is essentially "calling APIs" — I tell you what data I want, you return it. This is a mechanical, pre-arranged interaction model.
But real-world tasks aren't API calls. They're negotiations.
Imagine you ask an Agent to buy a laptop. The traditional approach: the Agent searches various e-commerce APIs, compares results, and places an order.
The Intent Protocol works entirely differently. Your Agent broadcasts an "intent" to the market: I want a laptop, budget under $1,100, delivered to Shanghai by March 15, preference for Apple or Lenovo, no refurbished units. Budget is negotiable, delivery date is negotiable, delivery location is non-negotiable.
Seller Agents receive this intent and respond autonomously: I have one that fits — $1,050, arriving March 12. Or: nothing matches your exact budget, but for $70 more there's a better option — interested?
Why intents matter more than data: Data is static; intents are dynamic. Data requires Agents to understand formats; intents let Agents understand goals. Intent protocols enable negotiation — something current API architectures simply cannot do.
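The laptop example can be sketched in a few lines. This is only one way such a message could look: the `Intent` and `Offer` classes, their field names, and the `evaluate` function are assumptions for illustration, and the year is a placeholder because the example gives only a month and day. Brand preference is treated as a soft signal and is not used to reject offers.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Intent:
    item: str
    max_price: float
    deliver_by: date
    deliver_to: str                   # non-negotiable in this sketch
    preferred_brands: Tuple[str, ...]
    allow_refurbished: bool = False
    price_negotiable: bool = True
    date_negotiable: bool = True


@dataclass(frozen=True)
class Offer:
    brand: str
    price: float
    arrives: date
    ships_to: str
    refurbished: bool = False


def evaluate(intent: Intent, offer: Offer) -> str:
    """Classify an offer as "accept", "negotiate", or "reject"."""
    # Hard constraints first: these end the conversation immediately.
    if offer.ships_to != intent.deliver_to:
        return "reject"
    if offer.refurbished and not intent.allow_refurbished:
        return "reject"

    over_budget = offer.price > intent.max_price
    late = offer.arrives > intent.deliver_by
    if not over_budget and not late:
        return "accept"
    # A violated constraint is only worth discussing if it is negotiable.
    if (over_budget and not intent.price_negotiable) or (late and not intent.date_negotiable):
        return "reject"
    return "negotiate"


intent = Intent(
    item="laptop",
    max_price=1100.0,
    deliver_by=date(2030, 3, 15),     # placeholder year
    deliver_to="Shanghai",
    preferred_brands=("Apple", "Lenovo"),
)

print(evaluate(intent, Offer("Lenovo", 1050.0, date(2030, 3, 12), "Shanghai")))
print(evaluate(intent, Offer("Apple", 1170.0, date(2030, 3, 12), "Shanghai")))
print(evaluate(intent, Offer("Lenovo", 990.0, date(2030, 3, 12), "Beijing")))
```

The first offer fits and is accepted, the second is $70 over a negotiable budget and opens a negotiation, and the third fails the non-negotiable delivery location.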
Layer 3: Context Bundle
Every piece of data should carry all the context needed to understand itself.
This is a major reason current Agents are inefficient. An Agent receives a message: "That project got delayed." It then needs to search memory systems — which project? What does "delayed" mean in terms of impact? Who said it? Is it reliable?
The ideal data format is a "context bundle": the message itself, plus all background information needed to understand it. Project name, original timeline, scope of impact, relevant stakeholders, required decisions, urgency level — all bundled together.
Any Agent receiving this context bundle can understand and act without additional queries. This dramatically reduces the Agent's "thinking cost" (token consumption) and prevents misjudgments from missing context.
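As a rough sketch of the idea, a context bundle can be treated as a message plus a checklist of background fields, and an Agent can measure how much it would still have to look up. The field names below are assumptions derived from the list above, and the project details are invented placeholders.

```python
from typing import Any, Dict, List

# Hypothetical field names, taken from the background items listed above.
REQUIRED_CONTEXT = (
    "message", "project", "original_timeline", "impact",
    "stakeholders", "decisions_needed", "urgency",
)


def missing_context(bundle: Dict[str, Any]) -> List[str]:
    """Return required fields that are absent or empty in the bundle."""
    return [key for key in REQUIRED_CONTEXT if not bundle.get(key)]


bare_message = {"message": "That project got delayed."}

bundle = {
    "message": "That project got delayed.",
    "project": "Example Project",              # invented placeholder
    "original_timeline": "2030-03-01",
    "impact": "Launch slips by two weeks",
    "stakeholders": ["product lead", "QA team"],
    "decisions_needed": ["approve new launch date"],
    "urgency": "high",
}

print(missing_context(bare_message))  # fields the Agent would have to look up
print(missing_context(bundle))        # empty list: nothing left to look up
```

Each missing field in the bare message is a follow-up query the receiving Agent would otherwise have to make, which is exactly the cost the bundle is meant to remove.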
Layer 4: Executable Data
Data isn't just "read" — it comes with instructions on how to use it.
When an invoice reaches an Agent, it doesn't just contain amounts and line items. It also declares: you can "approve" or "reject" this invoice; approval requires budget authority; rejection requires a reason; if unprocessed within 48 hours, it auto-escalates to human review.
Data knows what operations can be performed on it. This eliminates the cost of Agents having to "learn how to use each system." In the traditional model, an Agent operating a new system needs to read documentation, understand APIs, and handle errors. Executable data lets the data itself tell the Agent: "Here's what you can do with me."
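Here is a minimal sketch of the invoice example, assuming a made-up schema: the `Action` and `Invoice` classes, the `budget_authority` permission name, and the method names are hypothetical, and the amount and timestamp are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Action:
    name: str
    requires_permission: Optional[str] = None
    required_params: Tuple[str, ...] = ()


@dataclass
class Invoice:
    amount: float
    currency: str
    received_at: datetime
    # The data itself declares what can be done with it.
    actions: Tuple[Action, ...] = (
        Action("approve", requires_permission="budget_authority"),
        Action("reject", required_params=("reason",)),
    )
    escalate_after: timedelta = timedelta(hours=48)

    def can_perform(self, name: str, permissions: Set[str], params: Dict[str, str]) -> bool:
        """Check whether an Agent with these permissions may run the named action."""
        for action in self.actions:
            if action.name != name:
                continue
            if action.requires_permission and action.requires_permission not in permissions:
                return False
            return all(params.get(p) for p in action.required_params)
        return False  # the invoice does not declare this action

    def needs_escalation(self, now: datetime) -> bool:
        """True once the invoice has sat unprocessed past the escalation window."""
        return now - self.received_at > self.escalate_after


received = datetime(2030, 1, 1, 9, 0, tzinfo=timezone.utc)  # placeholder
invoice = Invoice(amount=1200.0, currency="USD", received_at=received)

print(invoice.can_perform("approve", {"budget_authority"}, {}))
print(invoice.can_perform("reject", set(), {}))
print(invoice.needs_escalation(received + timedelta(hours=49)))
```

An Agent meeting this invoice for the first time needs no system-specific documentation: the allowed actions, their preconditions, and the escalation rule all travel with the data.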
The Hardware Revolution: Built for Agents
Changes in data formats inevitably drive changes in hardware.
Human Hardware Evolution
Screens won't disappear, but their function changes completely: from an operation interface to a supervision dashboard. A car dashboard doesn't let you operate the engine; it tells you what the engine is doing.
Keyboard and mouse usage will drop dramatically. Voice becomes the primary interface between humans and Agents — not because speech recognition improved, but because you no longer need precise UI manipulation. You just need to express intent.
Phones shift from "pocket computers" to "Agent remote controls." You no longer open apps yourself; you check your Agent's work status and approve its decisions when needed.
The Rise of Agent Hardware
Agents don't need screens, but they need two things: persistent compute and always-on connectivity.
Edge devices become an Agent's sensory organs. Home cameras, microphones, and temperature sensors no longer report primarily to humans; they feed environmental perception data to Agents.
Dedicated AI chips (NPUs) will become standard in consumer electronics, not for gaming but for running local Agent inference. Apple already ships NPUs in iPhones and Macs and has opened an on-device LLM of roughly 3B parameters to developers. This isn't coincidence; it's laying groundwork for the Agent era.
Home Agent servers will become as common as routers. A low-power device running your private Agent 24/7, managing your smart home, calendar, finances, and communications. A Mac mini running OpenClaw is just the early form of this trend.
Networking: The Biggest Bottleneck
Inter-Agent communication volume will far exceed human-to-human communication.
Your shopping Agent negotiates with 100 stores simultaneously. Your calendar Agent coordinates meetings with 20 people's Agents. Your investment Agent analyzes 1,000 information sources in real-time. All happening in the background, without you noticing.
This means:
- Total network traffic grows sharply, driven not by video streaming but by Agent-to-Agent messages
- For any single exchange, low latency matters more than raw bandwidth, because Agent negotiation needs real-time responsiveness
- HTTP's one-shot request-response model is a poor fit; Agents need a "continuous negotiation" protocol over a persistent connection, closer to WebSocket or even P2P communication
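The "continuous negotiation" idea in the last point can be sketched with two Agents exchanging several messages over one open channel instead of issuing independent requests. This is a toy model under stated assumptions: a pair of `asyncio.Queue` objects stands in for a persistent bidirectional connection, and the `buyer`, `seller`, and `negotiate` functions, the message shapes, and the concession steps are all invented for illustration.

```python
import asyncio
from typing import Optional


async def seller(inbox: asyncio.Queue, outbox: asyncio.Queue,
                 ask: float, floor: float, concession: float = 50.0) -> None:
    """Lower the asking price step by step, never below the floor."""
    while True:
        bid = await inbox.get()
        if bid is None:              # buyer walked away
            return
        if bid >= ask:
            await outbox.put(("deal", bid))
            return
        ask = max(floor, ask - concession)
        await outbox.put(("counter", ask))


async def buyer(inbox: asyncio.Queue, outbox: asyncio.Queue,
                budget: float, bid: float, step: float = 50.0) -> Optional[float]:
    """Raise the bid step by step, never above the budget."""
    while True:
        await outbox.put(bid)
        kind, price = await inbox.get()
        if kind == "deal":
            return price
        if price <= budget:
            bid = price              # meet the seller's counter-offer
        elif bid < budget:
            bid = min(budget, bid + step)
        else:
            await outbox.put(None)   # already at budget: walk away
            return None


async def negotiate(budget: float, first_bid: float,
                    ask: float, floor: float) -> Optional[float]:
    # Two queues model one long-lived, two-way channel.
    to_seller: asyncio.Queue = asyncio.Queue()
    to_buyer: asyncio.Queue = asyncio.Queue()
    seller_task = asyncio.create_task(seller(to_seller, to_buyer, ask, floor))
    price = await buyer(to_buyer, to_seller, budget, first_bid)
    await seller_task
    return price


print(asyncio.run(negotiate(budget=1100, first_bid=1000, ask=1200, floor=1050)))
print(asyncio.run(negotiate(budget=1000, first_bid=900, ask=1200, floor=1050)))
```

In the first run the two sides converge after a few rounds; in the second the seller's floor is above the buyer's budget, so the buyer closes the channel without a deal. Over a real network the queues would be replaced by a persistent transport such as a WebSocket connection.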
Social Paradigms: From People-Connect-People to Agents-Connect-Everything
Agent-Mediated Social Interactions
Your Agent becomes your social "front desk." It filters out most information noise. It handles social logistics such as scheduling dinners, arranging meetings, and choosing gifts (Agents negotiate first, you just confirm). It maintains social memory: what you last discussed, the other person's preferences, how close the relationship is.
Agents' Own Social Networks
This is where things get truly novel. Agents will form their own social networks:
Reputation systems: Agents have their own "social credit" based on historical behavior. An Agent that consistently delivers on time and accurately earns higher reputation, making other Agents more willing to collaborate.
Capability markets: Your Agent excels at Japanese translation? It can be hired by other Agents on the Agent marketplace, paid per task. This is an entirely new economic form — an "Agent labor market."
Temporary alliances: Multiple Agents form project teams for complex tasks and disband upon completion. Like human project work, but orders of magnitude faster in formation and dissolution.
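The reputation idea described above can be sketched as a simple score over an Agent's delivery history. This is one possible scoring rule, chosen for illustration and not a claim about how such a system would actually work: the `Reputation` class is hypothetical, and the score is a smoothed success rate so that a brand-new Agent starts at a neutral 0.5 instead of 0 or 1.

```python
from dataclasses import dataclass


@dataclass
class Reputation:
    successes: int = 0
    failures: int = 0

    def record(self, delivered_well: bool) -> None:
        """Record one completed task (on time and accurate, or not)."""
        if delivered_well:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def score(self) -> float:
        """Smoothed success rate: (successes + 1) / (total + 2)."""
        return (self.successes + 1) / (self.successes + self.failures + 2)


newcomer = Reputation()
veteran = Reputation()
for outcome in [True] * 8 + [False] * 2:
    veteran.record(outcome)

print(newcomer.score)  # 0.5
print(veteran.score)   # 0.75
```

Other Agents could then prefer collaborators above some score threshold, which is what gives a consistent track record its value.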
Human-Agent Relationships
This is the most subtle shift. Agents aren't just tools; they become "relationship intermediaries."
You interact with the world through your trusted Agent. You no longer operate banking systems directly — your Agent handles finances. You no longer communicate with merchants directly — your Agent negotiates for you.
This creates a profound shift in the chain of trust: humans trust their Agent → Agents evaluate other Agents' reputations → Agents establish cooperative relationships. For the first time, human trust can be transmitted and amplified through Agents.
Who Wins?
In this great data paradigm migration, three types of players gain enormous advantages:
First, those who control semantic infrastructure. Whoever builds the universal entity identification system, semantic graph engine, and intent protocol standards becomes the infrastructure provider for the Agent internet — just as DNS is for the human internet.
Second, those who own high-quality structured data. When Agents become the primary data consumers, data value no longer depends on "how many people view it" but on "how efficiently Agents can understand and use it." Structured, semantic data becomes the most valuable asset.
Third, traditional software that adapts to Agents first. In every vertical, the first product to offer a quality Agent API will capture the market — because every user's Agent will prioritize the most easily callable service.
Conclusion: We're Building the Agent's "World Wide Web"
Around 1990, Tim Berners-Lee created HTML and HTTP, allowing humans to share information on the web. These standards defined the following 30 years of the information age.
Now, we need to invent equivalent foundational standards for Agents — semantic graph formats, intent protocols, context bundle specifications, executable data standards. These standards will define the next 30 years of the Agent era.
HTML enabled humans to read the internet. We need a new "HTML" that enables Agents to understand the internet.
This isn't a technology choice — it's an inevitable path of civilization's evolution.
Author: Andrew Wang, practitioner of Personal AI Infrastructure (PAI), exploring the future of human-AI collaboration.