Jasper van Veen

agent-manifest.txt — a proposed web standard for AI agents (formerly agents.txt)

Editor's note (March 2026): This proposal has been renamed from agents.txt to agent-manifest.txt. The original name was chosen for its direct analogy to robots.txt, but the agents.txt namespace became crowded: an independent IETF Internet-Draft (draft-srijal-agents-policy-00) had already claimed that filename, and multiple community projects were independently using it for different purposes. The new name more accurately reflects the document's purpose — a rich capability manifest — while keeping a clean path toward formal standardisation. The GitHub repository, spec, and all references have been updated. The article text below preserves the original framing.


The web has robots.txt. It's been around since 1994, and it answers one question well: can you look at this?

AI agents don't just look. They book flights, submit forms, call APIs, authenticate as users, and transact on behalf of people. And there's no standard for any of it.

I've been thinking about this gap for a while, and last week I drafted a proposal: agent-manifest.txt (originally agents.txt).

The idea

Place a file at https://yourdomain.com/agent-manifest.txt. It tells agents what they can do, how to do it, and under what terms:

Site-Name: ExampleShop
Site-Description: Online marketplace for sustainable home goods.

Allow-Training: no
Allow-RAG: yes
Allow-Actions: no
Preferred-Interface: rest
API-Docs: https://api.exampleshop.com/openapi.json
MCP-Server: https://mcp.exampleshop.com

[Agent: *]
Allow: /products/*
Allow: /search
Disallow: /checkout

[Agent: verified-purchasing-agent]
Allow: /checkout
Auth-Required: yes
Auth-Method: oauth2
Allow-Actions: yes
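For concreteness, here is a minimal sketch of how an agent could parse this format. The two-level structure (global directives, then [Agent: name] sections) follows the example above; the function and dictionary names are illustrative choices, not part of the draft spec.

```python
# Minimal parser sketch for the agent-manifest.txt format shown above.
# The structure (global directives, then [Agent: name] sections) follows
# the example; field and function names here are illustrative, not normative.

def parse_manifest(text):
    manifest = {"global": {}, "agents": {}}
    current = manifest["global"]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[Agent:") and line.endswith("]"):
            # Start a new per-agent section, e.g. [Agent: *]
            name = line[len("[Agent:"):-1].strip()
            current = manifest["agents"].setdefault(name, {})
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            # Allow/Disallow can repeat, so collect them as lists.
            if key in ("Allow", "Disallow"):
                current.setdefault(key, []).append(value)
            else:
                current[key] = value
    return manifest
```

Feeding the example file above through this gives a dictionary with the global directives plus one entry per agent section, which is all an agent needs before deciding what to do next.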

Why would agents comply?

Two reasons:

Self-interest. When a site advertises an MCP server or REST API, a well-built agent wants to use it - it's faster and more reliable than scraping HTML. Compliance isn't a favour; it's rational.

Legal posture. A published, machine-readable policy makes ignoring it actionable: "you had a clearly stated standard and ignored it" substantially strengthens arguments under the CFAA and the Computer Misuse Act.

What it covers (that robots.txt does not)

Concern                  robots.txt   agent-manifest.txt
Crawl permissions        Yes          -
Action permissions       -            Yes
API / MCP discovery      -            Yes
Training / RAG consent   -            Yes
Agent identity tiers     -            Yes
Auth methods             -            Yes

They're complementary - sites should have both.
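To make the agent tiers concrete, here is a hedged sketch of how an agent might resolve whether a path is permitted. Two things here are assumptions, since the draft does not yet define them: a section naming the agent is checked before the [Agent: *] defaults, patterns use simple glob matching, and unmatched paths default to deny.

```python
from fnmatch import fnmatch

# Sketch of path-permission resolution against parsed [Agent: ...] sections.
# Precedence assumed here (not yet defined by the draft): rules from a
# section naming the agent are checked before the [Agent: *] defaults,
# and Disallow wins over Allow within a section.

def is_allowed(sections, agent_name, path):
    for name in (agent_name, "*"):
        rules = sections.get(name)
        if rules is None:
            continue
        if any(fnmatch(path, p) for p in rules.get("Disallow", [])):
            return False
        if any(fnmatch(path, p) for p in rules.get("Allow", [])):
            return True
    return False  # default-deny when no rule matches

# Sections mirroring the ExampleShop manifest above.
sections = {
    "*": {"Allow": ["/products/*", "/search"], "Disallow": ["/checkout"]},
    "verified-purchasing-agent": {"Allow": ["/checkout"]},
}
```

Under these assumptions, an anonymous agent is blocked from /checkout by the wildcard section, while the verified purchasing agent's own section grants it before the defaults are ever consulted.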

Status

Draft v0.3.0, published under CC BY 4.0. I'd love feedback, pushback, and contributions.

GitHub: https://github.com/jaspervanveen/agents-txt

Three open questions I'm genuinely unsure about:

  1. Agent identity verification - how do you prove an agent is who it claims to be?
  2. Should capabilities use a controlled vocabulary, or free-form strings?
  3. As MCP matures, how tightly should this integrate with it?

What do you think?

Top comments (2)

Global Chat

Really interesting proposal. I have been thinking about the agent discovery problem for a while and wanted to share some thoughts on your three open questions.

On agent identity verification: This is probably the hardest of the three. The approach I keep coming back to is a layered model -- a DID-based scheme where agents present verifiable credentials signed by their operator, combined with domain-level allowlists in the agents.txt itself. The [Agent: verified-purchasing-agent] syntax in your draft implicitly assumes some identity resolution mechanism exists, but the spec does not define it yet. Without that, the tiered permission model breaks down because any agent can claim any identity. Have you looked at the W3C Verifiable Credentials approach? It maps well to the operator-signs-for-agent pattern.

On controlled vocabulary vs free-form: I would strongly lean toward a controlled vocabulary for the core capability set (actions, auth methods, interface types) with an extension namespace for domain-specific additions. The reason is machine parseability -- if an agent encounters Allow-Actions: yes it knows what that means universally, but free-form capability strings require out-of-band knowledge to interpret. Something like a registry (similar to IANA media types) would let the vocabulary grow without breaking existing parsers. The IETF draft could define the initial set and the extension mechanism in the same document.
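A sketch of what that split could look like in a validator. Both the core sets and the "x-" extension prefix below are hypothetical illustrations, not anything the draft currently defines:

```python
# Hypothetical validator: a controlled core vocabulary plus an "x-"
# extension namespace. The core sets and the prefix convention are
# illustrative choices, not part of the draft spec.

CORE_DIRECTIVES = {
    "Allow-Training": {"yes", "no"},
    "Allow-RAG": {"yes", "no"},
    "Allow-Actions": {"yes", "no"},
    "Preferred-Interface": {"rest", "graphql", "mcp"},
    "Auth-Method": {"oauth2", "api-key", "mtls"},
}

def validate_directive(key, value):
    if key.lower().startswith("x-"):
        return True  # extension namespace: opaque to generic parsers
    allowed = CORE_DIRECTIVES.get(key)
    if allowed is None:
        return False  # unknown non-extension directive
    return value in allowed
```

A generic parser can then reject typos in core directives while passing vendor extensions through untouched, which is exactly the property a registry would formalise.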

On MCP integration depth: This is where it gets really interesting. Right now MCP handles the how of agent-tool interaction beautifully, but it has no answer for the where -- how does an agent discover which MCP servers exist for a given domain? agents.txt is perfectly positioned to be that discovery layer. I would keep the integration shallow at the spec level (just the MCP-Server: directive pointing to the endpoint) and let MCP handle everything from transport negotiation onward. Tight coupling would be risky given how fast MCP is evolving -- the spec just hit the 97M SDK downloads mark and the protocol surface is still shifting.

One additional thought: the comparison table with robots.txt is useful, but it might be worth explicitly positioning agents.txt as complementary in the spec itself. A Robots-Txt: /robots.txt cross-reference directive could help agents understand which file governs what, avoiding the "which file do I check first" ambiguity that will inevitably arise when both files exist on the same domain.

Jasper van Veen • Edited

Thanks! This is valuable feedback.

I've opened four GitHub issues to track these points properly:

  1. Agent identity / W3C VC + DID → github.com/jaspervanveen/agents-tx...
  2. Capability vocabulary + IANA-style registry → github.com/jaspervanveen/agents-tx...
  3. robots.txt cross-reference directive → github.com/jaspervanveen/agents-tx...
  4. Your implementation as related prior art → github.com/jaspervanveen/agents-tx...

The spec is now at v0.2.0 — your globalchatads/agents-txt implementation is listed in §8.2 Related Prior Art, and §11 Open Questions links directly to the GitHub issues.

Your MCP point resonates most with me: keep agents.txt as the discovery layer (where is the MCP endpoint?), let MCP own everything from transport negotiation onward. Tight coupling to a protocol with 97M SDK downloads and a still-shifting surface would be risky. Happy to continue in the GitHub issues.