DEV Community

MouseRider

AgentManifest: A Declarative Spec Where the Harness Is the First-Class Decision

RFC v0.3 — design proposal, not a shipping product. CC0 licensed. Feedback and critique welcome.
GitHub: MouseRider/agentmanifest-rfc


When you run AI agents across more than one role, the execution environment turns out to matter more than it first appears. The model gets most of the attention — benchmarks, leaderboards, capability comparisons — but the harness shapes runtime behavior in ways that model selection alone doesn’t account for.

A personal assistant, an ops monitor, a coding agent, a trading bot: these aren’t the same agent with different prompts. They need different memory models, different autonomy levels, different guardrail enforcement, different lifecycle behaviors. Current agent harnesses are mostly either finished platforms you adopt wholesale, or open-ended toolkits that reward deep specialisation. There’s no standardised, composable layer in between: a way to declare what an agent needs, select the right harness for its role, and assemble the configuration portably.

AgentManifest is a design proposal for that missing layer.

This is part of an ongoing series on building persistent AI agents. Article 1 covered TSVC — context isolation across topics. Article 2 covered agent epistemology — how an agent knows what it knows. AgentManifest grew out of the same body of work: a production personal assistant running on OpenClaw, and the questions that surface when you push a system like that into real daily use.


The Spec

The syntax is Dockerfile-like. FROM selects the harness, which is the primary design decision in any manifest.

# Personal Assistant
FROM openclaw:latest

MODEL claude-opus
ROLE personal-assistant

TOOLS browser, email, calendar, file-system, sub-agents
MEMORY persistent, cross-session
PERSONALITY ./soul.md

GUARDRAILS approval-for-external-sends, budget-cap-daily=5.00
AUTONOMY high
HEARTBEAT interval=30m, quiet-hours=23:00-08:00

CHANNELS telegram=in-out, email=in-out, twitter=out
SPENDING daily-cap=50.00, per-transaction-cap=20.00
IDENTITY did:web:agents.example.com:assistant

DEPLOY always-on
RESTART on-failure

# Ops Monitor
# Same harness. Completely different agent.
FROM openclaw:latest

MODEL claude-haiku
ROLE ops-monitor

TOOLS file-system, ssh, docker, http, alerting
MEMORY session-only

GUARDRAILS strict-instructions, no-generative-output, read-only-by-default
AUTONOMY medium
HEARTBEAT interval=5m

ALERT_CHANNEL telegram-ops-thread
ON_ERROR alert-and-retry, max-retries=3

DEPLOY always-on
RESOURCES memory=256m

Same base harness. Completely different agent. The spec makes the differences explicit, auditable, and portable — without requiring both to fit a single one-size-fits-all runtime.
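
To make the directive format concrete, here is a minimal sketch of how a resolver might read a manifest into a harness reference plus a set of directives. No implementation exists yet, so everything here is an assumption: the `parse_manifest` function, the `Manifest` type, and the parsing rules (one directive per line, `#` comments, comma-separated values, a mandatory FROM) are illustrative only and are not the RFC's grammar.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    harness: str                       # value of the FROM directive
    directives: dict = field(default_factory=dict)


def parse_manifest(text: str) -> Manifest:
    """Parse Dockerfile-like manifest text (hypothetical grammar, not the RFC's)."""
    harness = None
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        keyword, _, rest = line.partition(" ")
        if keyword == "FROM":
            harness = rest.strip()
        else:
            # Every other directive becomes a list of comma-separated values.
            directives[keyword] = [v.strip() for v in rest.split(",") if v.strip()]
    if harness is None:
        raise ValueError("manifest must declare a harness with FROM")
    return Manifest(harness=harness, directives=directives)


OPS_MONITOR = """\
# Ops Monitor
FROM openclaw:latest
MODEL claude-haiku
MEMORY session-only
GUARDRAILS strict-instructions, no-generative-output, read-only-by-default
HEARTBEAT interval=5m
"""

manifest = parse_manifest(OPS_MONITOR)
print(manifest.harness)
print(manifest.directives["GUARDRAILS"])
```

Keeping FROM separate from the other directives mirrors the framing of the spec: the harness is resolved first, and the remaining directives are then interpreted by whatever that harness supports.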

Swap the harness and the same directives target a different execution environment:

FROM langgraph:latest
# or
FROM claude-code:latest
# or
FROM crewai:latest

Why Harness Selection Belongs in the Spec

Model selection is reasonably well-served by existing tooling — benchmarks, leaderboards, capability comparisons are all mature. Harness selection is less well-served, and it has more influence over runtime behavior than the current tooling reflects.

Here’s a concrete distinction worth making explicit. Writing “always ask for approval before deleting files” in a system prompt is a soft constraint — the model follows it as part of its instruction-following behavior. A deterministic guardrail at the harness level enforces the same rule unconditionally, independent of context length or task complexity. Both are valid approaches; they’re not equivalent, and the choice between them is a meaningful design decision that currently lives in implementation rather than in the agent definition.
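
A small sketch may help show what "enforced unconditionally" means in practice. The tool names, the `dispatch` function, and the `REQUIRES_APPROVAL` set below are hypothetical and not taken from any real harness; the point is only that the check runs in code on every call, whatever the model was told or decided.

```python
class ApprovalRequired(Exception):
    """Raised when a tool call is blocked pending human approval."""


# Hypothetical tool implementations; a real harness would touch the file system.
def delete_file(path: str) -> str:
    return f"deleted {path}"


def read_file(path: str) -> str:
    return f"contents of {path}"


TOOLS = {"delete_file": delete_file, "read_file": read_file}
REQUIRES_APPROVAL = {"delete_file"}  # the rule lives in harness config, not the prompt


def dispatch(tool: str, arg: str, approved: bool = False) -> str:
    """Run a model-requested tool call, enforcing the approval rule first."""
    if tool in REQUIRES_APPROVAL and not approved:
        raise ApprovalRequired(f"{tool}({arg!r}) needs approval")
    return TOOLS[tool](arg)


print(dispatch("read_file", "notes.txt"))
try:
    dispatch("delete_file", "notes.txt")
except ApprovalRequired as exc:
    print("blocked:", exc)
print(dispatch("delete_file", "notes.txt", approved=True))
```

The soft-constraint version of the same rule would be a sentence in the system prompt, with no code path that can refuse the call.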

Different roles suit different harness configurations:

  • A coding agent fits Claude Code — git integration, sandboxed terminal, pre-commit guardrails in the infrastructure
  • A research pipeline fits LangGraph — graph-native execution, defined workflow shape, explicit checkpoints
  • A personal assistant fits OpenClaw — persistent memory, heartbeat behavior, cross-session continuity, sub-agent delegation (see the TSVC article for what running this in production actually looks like)
  • A team workflow fits CrewAI — role-based agent structure, structured task handoffs, shared goal propagation

AgentManifest makes that selection explicit and portable. The spec sits above the harness layer — it doesn’t replace harnesses, it selects and configures them.


Three Directives Worth Examining

GUARDRAILS

GUARDRAILS strict-instructions, read-only-by-default, no-external-sends

Guardrails in AgentManifest are compiled into the harness configuration, not embedded in the prompt. The harness enforces them at the infrastructure level. This is the practical distinction between a behavioral instruction and a behavioral constraint.
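
As a rough illustration of "compiled into the harness configuration", the sketch below turns a GUARDRAILS line into a flat settings dictionary. The `compile_guardrails` function, the `GUARDRAIL_RULES` table, and every setting name in it are assumptions made up for this example; a real harness resolver would define its own mapping.

```python
# Hypothetical mapping from GUARDRAILS tokens to harness settings.
GUARDRAIL_RULES = {
    "strict-instructions": {"allow_instruction_override": False},
    "read-only-by-default": {"default_fs_mode": "ro"},
    "no-external-sends": {"outbound_channels_enabled": False},
}


def compile_guardrails(directive: str) -> dict:
    """Turn 'GUARDRAILS a, b, c' into a flat harness config dict."""
    keyword, _, rest = directive.partition(" ")
    if keyword != "GUARDRAILS":
        raise ValueError(f"expected GUARDRAILS, got {keyword!r}")
    config = {}
    for token in (t.strip() for t in rest.split(",")):
        if token not in GUARDRAIL_RULES:
            # Fail closed: an unknown guardrail must not be silently dropped.
            raise ValueError(f"unknown guardrail: {token!r}")
        config.update(GUARDRAIL_RULES[token])
    return config


config = compile_guardrails(
    "GUARDRAILS strict-instructions, read-only-by-default, no-external-sends"
)
print(config)
```

Rejecting unknown tokens is a deliberate choice in this sketch: if a harness cannot enforce a declared guardrail, refusing to start seems safer than running without it.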

IDENTITY

IDENTITY did:web:agents.example.com:purchasing-agent
SPENDING daily-cap=500.00, per-transaction-cap=100.00

IDENTITY assigns a cryptographic identity — immutable per manifest version, verifiable by external systems. Once identity is verifiable, it becomes the binding point for systems that require an accountable party on the other end of a transaction or access request.

Wallets and payment systems. An agent with a stable cryptographic identity can be issued a spending account scoped to that identity. SPENDING declares the limits; the wallet enforces them at infrastructure level. If something goes wrong, the audit trail is complete: which agent, which manifest version, which guardrails were active, what it spent and when.
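
A minimal sketch of the enforcement side, assuming a wallet that checks both caps before approving a charge and records every decision. The `SpendingPolicy` class and its `authorize` method are hypothetical, and the daily reset is left out for brevity.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class SpendingPolicy:
    daily_cap: Decimal
    per_transaction_cap: Decimal
    spent_today: Decimal = Decimal("0")
    audit_log: list = field(default_factory=list)

    def authorize(self, agent_id: str, amount: Decimal) -> bool:
        """Approve a charge only if both caps hold; log every decision."""
        ok = (
            amount <= self.per_transaction_cap
            and self.spent_today + amount <= self.daily_cap
        )
        if ok:
            self.spent_today += amount
        self.audit_log.append((agent_id, str(amount), "approved" if ok else "denied"))
        return ok


AGENT = "did:web:agents.example.com:purchasing-agent"
policy = SpendingPolicy(
    daily_cap=Decimal("500.00"), per_transaction_cap=Decimal("100.00")
)

print(policy.authorize(AGENT, Decimal("80.00")))   # within both caps
print(policy.authorize(AGENT, Decimal("150.00")))  # exceeds per-transaction cap
print(policy.audit_log)
```

Denied requests are logged as well as approved ones, since the audit trail described above is only complete if refusals are visible too.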

OAuth and API credentials. Rather than embedding credentials in config or prompts, the harness can resolve access rights from the agent’s verified identity at runtime. An agent identity can be an OAuth client_id, a service account in Azure AD or AWS IAM, or a member of a permissioned data feed — scoped to that agent specifically, not a shared credential.

Inter-agent trust. In a multi-agent system, a coordinator can verify that the specialist it’s delegating to is genuinely running the manifest it claims — same spec version, same guardrails in force. This connects to the coordinator model described in the TSVC article: one coordinator, many specialists, each independently verifiable.
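
The RFC does not define how this verification would work, so the following is only one possible shape: the coordinator compares a digest reported by the specialist against the digest of the manifest it expects. The function names and the choice of SHA-256 over the raw manifest text are assumptions. A matching digest alone does not prove what the specialist is actually running; a real scheme would presumably need a signature tied to the agent's IDENTITY.

```python
import hashlib
import hmac


def manifest_digest(manifest_text: str) -> str:
    """SHA-256 over the exact manifest text (hypothetical canonical form)."""
    return hashlib.sha256(manifest_text.encode("utf-8")).hexdigest()


def matches_expected(claimed_digest: str, expected_manifest: str) -> bool:
    """Compare a specialist's claimed digest with the one the coordinator expects."""
    return hmac.compare_digest(claimed_digest, manifest_digest(expected_manifest))


EXPECTED = "FROM langgraph:latest\nGUARDRAILS read-only-by-default\n"
TAMPERED = "FROM langgraph:latest\n"  # guardrail removed

print(matches_expected(manifest_digest(EXPECTED), EXPECTED))
print(matches_expected(manifest_digest(TAMPERED), EXPECTED))
```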

PROMPT_PROFILE and LOCALE

PROMPT_PROFILE claude-opus
LOCALE en-GB

The harness adapts prompt scaffolding to the selected model and language. The spec author doesn’t maintain model-specific variants or locale-specific rewrites. The harness handles that as an implementation detail.


agent-compose: Coordination Above the Single Agent

A single AgentManifest defines a single agent. agent-compose is the layer above — the analog to docker-compose for multi-agent systems. It references individual manifests, defines inter-agent interfaces, and declares the coordination topology.

Hierarchy

The most common pattern. A lead agent delegates to specialists; each specialist runs whatever harness suits its role.

topology: hierarchy

agents:
  coordinator:
    manifest: ./coordinator.agentmanifest
    role: lead
  researcher:
    manifest: ./researcher.agentmanifest   # FROM langgraph:latest
    role: specialist
  coder:
    manifest: ./coder.agentmanifest        # FROM claude-code:latest
    role: specialist

delegation:
  coordinator -> [researcher, coder]:
    protocol: task-dispatch

The coordinator doesn’t need to know which harness each specialist uses. Harness heterogeneity is internal to the system.

Council

For high-stakes decisions, a council routes a proposal to a set of agents for independent evaluation before any action is taken. No single agent’s judgment is final.

topology: council

agents:
  proposer:
    manifest: ./agents/proposer.agentmanifest
  council:
    - manifest: ./agents/compliance-reviewer.agentmanifest
    - manifest: ./agents/context-checker.agentmanifest
    - manifest: ./agents/risk-assessor.agentmanifest

council_config:
  trigger: action-type=financial OR confidence < 0.7
  evaluation: independent
  quorum: all
  on_rejection: halt-and-alert

evaluation: independent matters — agents evaluate without seeing each other’s output first, preventing anchoring.
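
The council configuration above can be sketched as a small decision function. This assumes the trigger is a plain boolean over two proposal fields and that reviewers are callables returning approve or reject; the reviewer functions and the `amount` and `counterparty` fields are hypothetical stand-ins for the three council manifests.

```python
from typing import Callable

Proposal = dict
Reviewer = Callable[[Proposal], bool]


def needs_council(proposal: Proposal) -> bool:
    """Mirror of the trigger: action-type=financial OR confidence < 0.7."""
    return proposal["action_type"] == "financial" or proposal["confidence"] < 0.7


def run_council(proposal: Proposal, reviewers: list[Reviewer]) -> str:
    """Collect every vote independently, then apply quorum: all."""
    if not needs_council(proposal):
        return "proceed"
    # Each reviewer gets its own copy of the proposal and never sees other votes.
    votes = [review(dict(proposal)) for review in reviewers]
    return "proceed" if all(votes) else "halt-and-alert"


# Hypothetical reviewers standing in for the three council manifests.
def compliance(p: Proposal) -> bool:
    return p["amount"] <= 1000


def context(p: Proposal) -> bool:
    return p["counterparty"] != "unknown"


def risk(p: Proposal) -> bool:
    return p["confidence"] >= 0.5


proposal = {
    "action_type": "financial",
    "amount": 250,
    "counterparty": "acme",
    "confidence": 0.9,
}
print(run_council(proposal, [compliance, context, risk]))
```

All votes are gathered before any of them is inspected, which is the code-level equivalent of evaluation: independent.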

Consensus

A more flexible variant. Rather than unanimous approval, agents reach a decision through structured agreement with configurable thresholds.

topology: consensus

agents:
  council:
    - manifest: ./agents/reviewer-a.agentmanifest
      weight: 1.0
    - manifest: ./agents/reviewer-b.agentmanifest
      weight: 1.0
    - manifest: ./agents/senior-reviewer.agentmanifest
      weight: 2.0

consensus_config:
  method: weighted-majority   # options: majority, supermajority, unanimity, weighted-majority
  threshold: 0.6
  on_no_consensus: hold-for-human

Useful for moderation decisions, borderline classification cases, or any workflow where structured disagreement should surface before acting. The conditions that trigger a council, the quorum required, and the fallback behavior are all declarable in the spec — not embedded in custom orchestration code.
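
To see how the weights and threshold interact, here is a small worked sketch of weighted-majority. It assumes the threshold applies to the approving share of total weight and that falling short maps to on_no_consensus; the spec does not pin down either detail, and the function name is hypothetical.

```python
def weighted_majority(votes: list[tuple[float, bool]], threshold: float) -> str:
    """Return 'approve' when the approving share of total weight meets the threshold."""
    total = sum(weight for weight, _ in votes)
    if total <= 0:
        raise ValueError("total weight must be positive")
    approving = sum(weight for weight, approve in votes if approve)
    return "approve" if approving / total >= threshold else "hold-for-human"


# Weights from the compose file: reviewer-a=1.0, reviewer-b=1.0, senior-reviewer=2.0
senior_only = [(1.0, False), (1.0, False), (2.0, True)]      # 2.0 / 4.0 = 0.50
senior_plus_one = [(1.0, True), (1.0, False), (2.0, True)]   # 3.0 / 4.0 = 0.75

print(weighted_majority(senior_only, threshold=0.6))
print(weighted_majority(senior_plus_one, threshold=0.6))
```

With these weights and a threshold of 0.6, neither the senior reviewer alone nor the two other reviewers together can carry a decision; at least the senior reviewer plus one other is required.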

When council members carry verifiable IDENTITY credentials, the audit trail for a decision includes the verified identity of each participating agent, the manifest version each was running, and the guardrails in force at the time.


Landscape

| | Oracle Agent Spec | docker-agent | gitagent | AgentManifest |
| --- | --- | --- | --- | --- |
| Goal | Portability across runtimes | Declarative config, one runtime | Git-native definition, export anywhere | Role-appropriate harness per agent |
| Harness selection | Abstracted away | Fixed | Adapter-based | First-class (FROM) |
| Behavioral enforcement | Framework-dependent | Prompt-based | RULES.md + compliance config | Harness-compiled |
| Multi-agent | Single spec | Coordinator model | Inheritance + deps | agent-compose with topology declarations |
| Identity / payments | Not in scope | Not in scope | Not in scope | First-class directives |
| Format | YAML | YAML | File system structure | Dockerfile-like DSL |
| Status | Shipped | Shipped | Shipped | Design proposal / RFC |

On gitagent: it’s worth using today if your goal is git-native agent versioning and framework portability. AgentManifest is working on a different axis — not how to make the runtime invisible, but how to declare it explicitly. The two are potentially complementary: a gitagent repo could reference an AgentManifest to declare its harness requirements.


What This Is and Isn’t

AgentManifest is RFC v0.3. The spec is concrete enough to debate; no implementation exists yet. Validator tooling, a reference harness resolver, and a formal grammar are on the roadmap.

The spec is CC0. I’d genuinely welcome a working group or standards body taking it further — the goal was to get the idea into a form concrete enough to argue with.


Open Questions

A few things the spec doesn’t resolve yet, where input would be useful:

Harness resolver ecosystem. The spec works best if harness maintainers ship their own resolvers. That requires community buy-in that isn’t there yet. How do you bootstrap that?

Inter-agent protocol. agent-compose defines topology; it doesn’t yet commit to a wire protocol for agent-to-agent communication. Candidates on the table: A2A (Google’s agent communication protocol), MCP (Anthropic’s tool protocol, which is seeing increasing use for agent-to-agent calls), or plain HTTP with interfaces declared in the compose file. Each has different tradeoffs around standardisation, harness coupling, and implementation complexity.

Testing and simulation. For safety-critical agents — trading bots, autonomous purchasing agents — dry-run capability seems important. How do you test guardrail firing without live tool execution?

Cross-harness observability. When agents on different harnesses participate in a shared workflow, coherent distributed tracing is an open problem. Through the IDENTITY directive, the spec creates a clear seam where that problem needs to be solved; it doesn’t solve it.


Repo

If you’ve run agents across multiple roles in production and have thoughts on where this framing holds or breaks down — open an issue. The RFC is designed to be argued with.


AgentManifest was designed in collaboration with a persistent AI agent running on OpenClaw and through extended conversations with Claude AI (claude.ai). The spec, the repo, and this article are the output of that process — an example of the kind of work the system is designed to support.
