RFC v0.3 — design proposal, not a shipping product. CC0 licensed. Feedback and critique welcome.
GitHub: MouseRider/agentmanifest-rfc
When you run AI agents across more than one role, the execution environment turns out to matter more than it first appears. The model gets most of the attention — benchmarks, leaderboards, capability comparisons — but the harness shapes runtime behavior in ways that model selection alone doesn’t account for.
A personal assistant, an ops monitor, a coding agent, a trading bot: these aren’t the same agent with different prompts. They need different memory models, different autonomy levels, different guardrail enforcement, different lifecycle behaviors. Current agent harnesses are mostly either finished platforms you adopt wholesale, or open-ended toolkits that reward deep specialisation. There’s no standardised, composable layer in between: a way to declare what an agent needs, select the right harness for its role, and assemble the configuration portably.
AgentManifest is a design proposal for that missing layer.
This is part of an ongoing series on building persistent AI agents. Article 1 covered TSVC — context isolation across topics. Article 2 covered agent epistemology — how an agent knows what it knows. AgentManifest grew out of the same body of work: a production personal assistant running on OpenClaw, and the questions that surface when you push a system like that into real daily use.
The Spec
Dockerfile-like syntax. FROM selects the harness — the primary design decision in any manifest.
# Personal Assistant
FROM openclaw:latest
MODEL claude-opus
ROLE personal-assistant
TOOLS browser, email, calendar, file-system, sub-agents
MEMORY persistent, cross-session
PERSONALITY ./soul.md
GUARDRAILS approval-for-external-sends, budget-cap-daily=5.00
AUTONOMY high
HEARTBEAT interval=30m, quiet-hours=23:00-08:00
CHANNELS telegram=in-out, email=in-out, twitter=out
SPENDING daily-cap=50.00, per-transaction-cap=20.00
IDENTITY did:web:agents.example.com:assistant
DEPLOY always-on
RESTART on-failure
# Ops Monitor
# Same harness. Completely different agent.
FROM openclaw:latest
MODEL claude-haiku
ROLE ops-monitor
TOOLS file-system, ssh, docker, http, alerting
MEMORY session-only
GUARDRAILS strict-instructions, no-generative-output, read-only-by-default
AUTONOMY medium
HEARTBEAT interval=5m
ALERT_CHANNEL telegram-ops-thread
ON_ERROR alert-and-retry, max-retries=3
DEPLOY always-on
RESOURCES memory=256m
Same base harness. Completely different agent. The spec makes the differences explicit, auditable, and portable — without requiring both to fit a single one-size-fits-all runtime.
Swap the harness and the same directives target a different execution environment:
FROM langgraph:latest
# or
FROM claude-code:latest
# or
FROM crewai:latest
Why Harness Selection Belongs in the Spec
Model selection is reasonably well-served by existing tooling — benchmarks, leaderboards, capability comparisons are all mature. Harness selection is less well-served, and it has more influence over runtime behavior than the current tooling reflects.
Here’s a concrete distinction worth making explicit. Writing “always ask for approval before deleting files” in a system prompt is a soft constraint — the model follows it as part of its instruction-following behavior. A deterministic guardrail at the harness level enforces the same rule unconditionally, independent of context length or task complexity. Both are valid approaches; they’re not equivalent, and the choice between them is a meaningful design decision that currently lives in implementation rather than in the agent definition.
Different roles suit different harness configurations:
- A coding agent fits Claude Code — git integration, sandboxed terminal, pre-commit guardrails in the infrastructure
- A research pipeline fits LangGraph — graph-native execution, defined workflow shape, explicit checkpoints
- A personal assistant fits OpenClaw — persistent memory, heartbeat behavior, cross-session continuity, sub-agent delegation (see the TSVC article for what running this in production actually looks like)
- A team workflow fits CrewAI — role-based agent structure, structured task handoffs, shared goal propagation
AgentManifest makes that selection explicit and portable. The spec sits above the harness layer — it doesn’t replace harnesses, it selects and configures them.
Three Directives Worth Examining
GUARDRAILS
GUARDRAILS strict-instructions, read-only-by-default, no-external-sends
Guardrails in AgentManifest are compiled into the harness configuration, not embedded in the prompt. The harness enforces them at the infrastructure level. This is the practical distinction between a behavioral instruction and a behavioral constraint.
IDENTITY
IDENTITY did:web:agents.example.com:purchasing-agent
SPENDING daily-cap=500.00, per-transaction-cap=100.00
IDENTITY assigns a cryptographic identity — immutable per manifest version, verifiable by external systems. Once identity is verifiable, it becomes the binding point for systems that require an accountable party on the other end of a transaction or access request.
Wallets and payment systems. An agent with a stable cryptographic identity can be issued a spending account scoped to that identity. SPENDING declares the limits; the wallet enforces them at infrastructure level. If something goes wrong, the audit trail is complete: which agent, which manifest version, which guardrails were active, what it spent and when.
OAuth and API credentials. Rather than embedding credentials in config or prompts, the harness can resolve access rights from the agent’s verified identity at runtime. An agent identity can be an OAuth client_id, a service account in Azure AD or AWS IAM, or a member of a permissioned data feed — scoped to that agent specifically, not a shared credential.
Inter-agent trust. In a multi-agent system, a coordinator can verify that the specialist it’s delegating to is genuinely running the manifest it claims — same spec version, same guardrails in force. This connects to the coordinator model described in the TSVC article: one coordinator, many specialists, each independently verifiable.
PROMPT_PROFILE and LOCALE
PROMPT_PROFILE claude-opus
LOCALE en-GB
The harness adapts prompt scaffolding to the selected model and language. The spec author doesn’t maintain model-specific variants or locale-specific rewrites. The harness handles that as an implementation detail.
agent-compose: Coordination Above the Single Agent
A single AgentManifest defines a single agent. agent-compose is the layer above — the analog to docker-compose for multi-agent systems. It references individual manifests, defines inter-agent interfaces, and declares the coordination topology.
Hierarchy
The most common pattern. A lead agent delegates to specialists; each specialist runs whatever harness suits its role.
topology: hierarchy
agents:
coordinator:
manifest: ./coordinator.agentmanifest
role: lead
researcher:
manifest: ./researcher.agentmanifest # FROM langgraph:latest
role: specialist
coder:
manifest: ./coder.agentmanifest # FROM claude-code:latest
role: specialist
delegation:
coordinator -> [researcher, coder]:
protocol: task-dispatch
The coordinator doesn’t need to know which harness each specialist uses. Harness heterogeneity is internal to the system.
Council
For high-stakes decisions, a council routes a proposal to a set of agents for independent evaluation before any action is taken. No single agent’s judgment is final.
topology: council
agents:
proposer:
manifest: ./agents/proposer.agentmanifest
council:
- manifest: ./agents/compliance-reviewer.agentmanifest
- manifest: ./agents/context-checker.agentmanifest
- manifest: ./agents/risk-assessor.agentmanifest
council_config:
trigger: action-type=financial OR confidence < 0.7
evaluation: independent
quorum: all
on_rejection: halt-and-alert
evaluation: independent matters — agents evaluate without seeing each other’s output first, preventing anchoring.
Consensus
A more flexible variant. Rather than unanimous approval, agents reach a decision through structured agreement with configurable thresholds.
topology: consensus
agents:
council:
- manifest: ./agents/reviewer-a.agentmanifest
weight: 1.0
- manifest: ./agents/reviewer-b.agentmanifest
weight: 1.0
- manifest: ./agents/senior-reviewer.agentmanifest
weight: 2.0
consensus_config:
method: weighted-majority # options: majority, supermajority, unanimity, weighted-majority
threshold: 0.6
on_no_consensus: hold-for-human
Useful for moderation decisions, borderline classification cases, or any workflow where structured disagreement should surface before acting. The conditions that trigger a council, the quorum required, and the fallback behavior are all declarable in the spec — not embedded in custom orchestration code.
When council members carry verifiable IDENTITY credentials, the audit trail for a decision includes the verified identity of each participating agent, the manifest version each was running, and the guardrails in force at the time.
Landscape
| Oracle Agent Spec | docker-agent | gitagent | AgentManifest | |
|---|---|---|---|---|
| Goal | Portability across runtimes | Declarative config, one runtime | Git-native definition, export anywhere | Role-appropriate harness per agent |
| Harness selection | Abstracted away | Fixed | Adapter-based | First-class (FROM) |
| Behavioral enforcement | Framework-dependent | Prompt-based | RULES.md + compliance config | Harness-compiled |
| Multi-agent | Single spec | Coordinator model | Inheritance + deps | agent-compose with topology declarations |
| Identity / payments | Not in scope | Not in scope | Not in scope | First-class directives |
| Format | YAML | YAML | File system structure | Dockerfile-like DSL |
| Status | Shipped | Shipped | Shipped | Design proposal / RFC |
On gitagent: it’s worth using today if your goal is git-native agent versioning and framework portability. AgentManifest is working on a different axis — not how to make the runtime invisible, but how to declare it explicitly. The two are potentially complementary: a gitagent repo could reference an AgentManifest to declare its harness requirements.
What This Is and Isn’t
AgentManifest is RFC v0.3. The spec is concrete enough to debate; no implementation exists yet. Validator tooling, a reference harness resolver, and a formal grammar are on the roadmap.
The spec is CC0. I’d genuinely welcome a working group or standards body taking it further — the goal was to get the idea into a form concrete enough to argue with.
Open Questions
A few things the spec doesn’t resolve yet, where input would be useful:
Harness resolver ecosystem. The spec works best if harness maintainers ship their own resolvers. That requires community buy-in that isn’t there yet. How do you bootstrap that?
Inter-agent protocol. agent-compose defines topology; it doesn’t yet commit to a wire protocol for agent-to-agent communication. Candidates on the table: A2A (Google’s agent communication protocol), MCP (Anthropic’s tool protocol, which is seeing increasing use for agent-to-agent calls), or plain HTTP with interfaces declared in the compose file. Each has different tradeoffs around standardisation, harness coupling, and implementation complexity.
Testing and simulation. For safety-critical agents — trading bots, autonomous purchasing agents — dry-run capability seems important. How do you test guardrail firing without live tool execution?
Cross-harness observability. When agents on different harnesses participate in a shared workflow, coherent distributed tracing is an open problem. The spec creates a clear seam where it needs to be solved via the IDENTITY directive; it doesn’t solve it.
Repo
-
MANIFEST.md— full spec, v0.3 -
examples/— AgentManifest files for six agent roles -
docs/design-rationale.md— why harness heterogeneity, not portability -
docs/agent-compose.md— topology patterns and multi-agent coordination -
docs/identity.md— identity model, wallet binding, inter-agent trust
If you’ve run agents across multiple roles in production and have thoughts on where this framing holds or breaks down — open an issue. The RFC is designed to be argued with.
AgentManifest was designed in collaboration with a persistent AI agent running on OpenClaw and through extended conversations with Claude AI (claude.ai). The spec, the repo, and this article are the output of that process — an example of the kind of work the system is designed to support.
Top comments (0)