The “Assembly Language” Era of AI Agents is Over. It’s Time for an OS.

#timetraveldebugging #agentos #scalebysubtraction #microsoftresearch

We are currently building AI Agents the same way we built software in the 1980s: monolithic scripts, fragile loops, and zero observability.

We treat “Reliability” as a Prompt Engineering problem. We think if we just write a better system prompt, the agent will stop hallucinating. We think if we add more RAG, the agent will get smarter.

We are wrong.

Reliability is not a prompt problem. It is a Systems Engineering problem.

Over the last few weekends, I stopped writing agent scripts and started building the chassis they run on. Today, I am open sourcing the Agent Operating System (Agent OS).

This is not a framework. It is a research-backed architecture designed for Scale by Subtraction. We removed the noise, the direct dependencies, and the blind trust.

Here is what “Agent Engineering” looks like when you treat it as an OS:

1. The Kernel (Governance & State)

Most agents today are “black boxes.” You can’t debug them.

The Breakthrough: We implemented Time-Travel Debugging in the Control Plane.
How it works: Because all memory is immutable (emk) and all state is serialized, we can “rewind” an agent to an earlier point (say, 5 minutes back), replay the exact inputs, and reproduce logic bugs deterministically before fixing them. No more guessing.
Cost Control: We added State Hibernation. Agents serialize to disk and vanish when idle, waking up only when the Message Bus (amb) routes a signal. “Serverless Agents” are now real.
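
To make the rewind idea concrete, here is a minimal sketch, not the actual emk or Control Plane code. It assumes the agent's state is a pure function of an append-only event log, so replaying the log up to a chosen timestamp always rebuilds the same state. Event, EventLog, step, replay, hibernate, and wake are all hypothetical names invented for this illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded input. Frozen, so it cannot be edited after the fact."""
    ts: float       # seconds since the agent started (assumed clock)
    payload: dict


class EventLog:
    """Append-only log: events are added, never changed or removed."""

    def __init__(self):
        self._events = ()

    def append(self, event):
        # Build a new tuple instead of mutating the old one.
        self._events = self._events + (event,)

    def until(self, ts):
        """Return every event recorded at or before ts, in order."""
        return tuple(e for e in self._events if e.ts <= ts)


def step(state, event):
    """Pure transition: same state plus same event gives the same result."""
    return {
        "total": state["total"] + event.payload["amount"],
        "seen": state["seen"] + 1,
    }


def replay(log, until_ts, initial):
    """Rebuild the agent's state as it was at until_ts."""
    state = initial
    for event in log.until(until_ts):
        state = step(state, event)
    return state


def hibernate(state):
    """Serialize state to a string that could be written to disk."""
    return json.dumps(state, sort_keys=True)


def wake(blob):
    """Restore state from its serialized form."""
    return json.loads(blob)
```

With events logged at 0, 100, 400, and 600 seconds, calling replay(log, until_ts=300, initial=start) rebuilds the state from five minutes before the last event, and it does so identically on every run. The same serialized form is what would let an idle agent go to disk and come back later.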

2. The Infrastructure (Trust & Transport)

In a multi-agent swarm, “Implicit Trust” is a vulnerability.

The Solution: We built iatp (Inter-Agent Trust Protocol).
The Check: Agents perform Cryptographic Attestation before exchanging messages. If the code signature doesn’t match the governance policy, the connection is rejected. Security is baked into the handshake, not the firewall.
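
The handshake can be sketched in a few lines. This is an assumption-heavy illustration, not the iatp wire format: it treats the governance policy as an allowlist of SHA-256 digests of approved agent code, and it uses a shared-key HMAC from the Python standard library as a stand-in for a real signature scheme. code_digest, attest, and verify are hypothetical names.

```python
import hashlib
import hmac


def code_digest(code: bytes) -> str:
    """Fingerprint of the agent's code."""
    return hashlib.sha256(code).hexdigest()


def attest(code: bytes, nonce: bytes, key: bytes) -> dict:
    """Produce an attestation bound to a verifier-supplied nonce."""
    digest = code_digest(code)
    tag = hmac.new(key, digest.encode() + nonce, hashlib.sha256).hexdigest()
    return {"digest": digest, "tag": tag}


def verify(attestation: dict, nonce: bytes, key: bytes, approved: set) -> bool:
    """Accept only if the tag is valid AND the digest is on the allowlist."""
    expected = hmac.new(
        key, attestation["digest"].encode() + nonce, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected, attestation["tag"]):
        return False
    return attestation["digest"] in approved
```

The nonce ties each attestation to one handshake, so a captured attestation cannot simply be replayed later. A production design would more likely use asymmetric signatures so the verifier never holds the signing key.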

3. The Primitives (Math & Verification)

We stopped asking LLMs “Are you sure?” because they hallucinate the confirmation too.

The Science: cmvk (Cross-Model Verification Kernel).
The Method: We measure how far each output drifts from ground-truth vectors before the user sees it. If the math doesn’t check out, the agent stays silent.
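
One way to picture the check, as a sketch only: embed the answer, compare it to a ground-truth vector, and release the answer only if the distance stays under a threshold. Cosine distance and the 0.1 cutoff are assumptions chosen for illustration; the metric and threshold cmvk actually uses are not described here. cosine_distance and release_or_silence are hypothetical names.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means same direction, 1.0 means orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector carries no direction, so treat it as maximal drift.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def release_or_silence(answer, answer_vec, truth_vec, max_drift=0.1):
    """Return the answer if drift is within bounds, otherwise None (silence)."""
    drift = cosine_distance(answer_vec, truth_vec)
    return answer if drift <= max_drift else None
```

Returning None instead of a low-confidence answer is the point: the caller has to handle silence explicitly, rather than pass along something unverified.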

The Oasis Project: Beyond Chat

This is not a product launch. This is an invitation to research. We are validating this OS with “Vertical Swarms” included in the repo:

Carbon Swarm: Verifying satellite data claims against PDFs using cmvk.
Energy Swarm: Negotiating grid load with signed trust contracts via iatp.
DeFi Swarm: High-speed “Mute Agents” monitoring mempools for risk.

If you are a Principal Engineer, Researcher, or Architect tired of debugging “toy agents,” the repositories are open. We are looking for contributors who want to solve the hard problems of Trust, Verification, and State.

Let’s build the road before we build the cars.

imran-siddique/agent-os: A Safety-First Kernel for Autonomous AI Agents — POSIX-inspired primitives with 0% policy violation guarantee