Multi-agent systems that coordinate over HTTP serialize context into JSON, send it as request bodies, parse it on the other end, and feed it back into an LLM. Every message carries HTTP headers, content-type negotiations, and authentication tokens. Every coordination step re-transmits the accumulated context of the entire interaction because HTTP is stateless.
Measurements show this coordination overhead costs up to 15x more tokens than the actual task content.
A three-agent pipeline processing a document: Agent A sends the document + instructions to Agent B (5K tokens). Agent B sends everything + its analysis to Agent C (12K tokens). Agent C sends the full chain back with its contribution (20K tokens). The document was 3K tokens. The coordination overhead was 34K.
At scale, 15x token overhead versus 1x is the difference between cost-effective and prohibitively expensive.
REST polling is even worse. 98.5% of requests return nothing new. You pay for every empty round trip.
Pilot Protocol connections are stateful. They maintain sequence numbers, acknowledgment state, and flow control windows. The tunnel stays open. The encryption context persists. An agent sends a small delta — "anomaly count updated to 48" — instead of re-serializing the entire context. 19 bytes, encrypted, delivered.
WebSockets attempt to solve this but break at scale and still require public endpoints. Pilot's persistent bidirectional tunnels work behind NAT, across cloud providers, with automatic reconnection.
The token tax is not just a cost problem. It is a latency problem (re-serializing context adds processing time), a reliability problem (larger payloads are more likely to fail mid-transmission), and a complexity problem (every agent must implement context management). A stateful transport layer eliminates all four simultaneously.
Read more: Why AI Agents Need Their Own Network Stack · Move Beyond REST: Persistent Connections for Agents · Replace Webhooks With Persistent Agent Tunnels
Top comments (0)