
AI Domain Data Standard


Open, vendor-neutral authoritative domain data consumed by AI systems, search, and other automated agents

AI systems don’t know who your domain represents

AI assistants are increasingly the first layer between users and websites. People ask chatbots what a site is, who runs it, how to contact it, or whether it’s the “official” source for something.

Today, AI systems infer that information indirectly, from:

• partial crawls
• inconsistent metadata
• third-party aggregators
• heuristics that usually work, until they don’t

This leads to common failure modes:

• misattribution (the wrong org, product, or contact)
• conflating similarly named domains
• inferring identity from whatever page happened to be crawled

These aren’t ranking problems; they are domain assertion problems.

What’s missing

There is no simple, first-party, domain-level place where a domain can say:

“This domain represents X.”
“This is the official site.”
“This is how to contact us.”

Today we have:

• schema.org (page-level semantics)
• robots.txt (crawler policy)
• security.txt (security contact)
• ai.txt (usage policy)

But nothing that is:

• domain-level
• identity-focused
• machine-readable
• self-hosted
• boring and predictable

Introducing the AI Domain Data Standard (AIDD)

AIDD is a small, open specification for publishing domain-level identity assertions for AI systems and automated agents.

It’s a single JSON document hosted by the domain itself:
https://example.com/.well-known/domain-profile.json

Minimal example:

{
  "spec": "https://ai-domain-data.org/spec/v0.1",
  "name": "Example Corp",
  "description": "Open-source infrastructure for X",
  "website": "https://example.com",
  "contact": "https://example.com/contact"
}

That’s it.
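To make the shape of that record concrete, here is a minimal Python sketch of a sanity check a publisher might run before deploying the file. It is not the project's CLI or schema validator. It assumes the five fields in the minimal example are the required ones and that `website` should be an absolute URL; the published schema is the authority on both points. The name `check_profile` is hypothetical.

```python
from urllib.parse import urlparse

# Assumption: the five fields shown in the minimal example are treated as required.
REQUIRED_FIELDS = ("spec", "name", "description", "website", "contact")


def check_profile(record: dict) -> list:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    # Assumption: `website` should be an absolute http(s) URL.
    website = record.get("website")
    if isinstance(website, str) and website.strip():
        parsed = urlparse(website)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("website is not an absolute http(s) URL")
    return problems


example = {
    "spec": "https://ai-domain-data.org/spec/v0.1",
    "name": "Example Corp",
    "description": "Open-source infrastructure for X",
    "website": "https://example.com",
    "contact": "https://example.com/contact",
}
problems = check_profile(example)
```

For anything beyond a quick local check, the schema validation shipped with the spec is the better tool.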

Optional fields include:

• entity_type (aligned with schema.org types)
• logo
• embedded JSON-LD for interoperability

There’s also an optional DNS TXT fallback for resolvers that can’t fetch HTTPS.
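The lookup order described above (HTTPS first, DNS TXT as a fallback) could be sketched on the consumer side roughly as follows. This is an illustrative sketch, not the Resolver SDK. The function names are hypothetical, and because the TXT record format is not described here, the DNS step is left as a caller-supplied function rather than guessed at.

```python
import json
import urllib.request
from typing import Callable, Optional

WELL_KNOWN_PATH = "/.well-known/domain-profile.json"


def profile_url(domain: str) -> str:
    """Build the well-known URL for a bare domain name such as 'example.com'."""
    host = domain.strip().rstrip(".").lower()
    return f"https://{host}{WELL_KNOWN_PATH}"


def fetch_with_urllib(url: str, timeout: float = 5.0) -> str:
    """Default HTTPS fetcher using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def resolve_profile(
    domain: str,
    fetch_https: Callable[[str], str] = fetch_with_urllib,
    dns_fallback: Optional[Callable[[str], Optional[dict]]] = None,
) -> Optional[dict]:
    """Try the HTTPS document first, then the optional DNS fallback, else None."""
    try:
        parsed = json.loads(fetch_https(profile_url(domain)))
        if isinstance(parsed, dict):
            return parsed
    except (OSError, ValueError):
        # Network errors are OSError subclasses; bad JSON or bad encoding is ValueError.
        pass
    if dns_fallback is not None:
        # Hypothetical hook: the caller decides how to read and parse the TXT record.
        return dns_fallback(domain)
    return None
```

Passing the fetchers in as arguments keeps the lookup order testable without network access, for example `resolve_profile("example.com", fetch_https=my_fake_fetcher)`.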

What this is:

AIDD is:

• first-party and domain-controlled
• self-hosted
• vendor-neutral
• versioned and schema-validated
• composable with existing identity and trust systems

AIDD is not:

• an identity provider
• a verification or trust system
• a ranking signal
• a replacement for crawling

Think of it like security.txt, but for domain identity instead of security reporting.

Why this matters

Entity resolution literature is clear: identity inference degrades when signals are partial, indirect, or noisy.

AIDD doesn’t “fix AI,” but it gives AI systems a clean anchor signal for who a domain claims to represent. Consumers can:

• weigh it
• corroborate it
• or ignore it

The key point is attribution, not truth enforcement.
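As one example of what "corroborate it" could mean in practice, a consumer might check that the `website` field points back at the domain the record was fetched from before giving the claim any weight. The sketch below is an assumption about how a consumer could behave, not something the spec requires. The function names and the three labels it returns are hypothetical.

```python
from urllib.parse import urlparse


def website_matches_domain(record: dict, fetched_from: str) -> bool:
    """True if the record's `website` host is the fetched domain or its www. variant."""
    host = (urlparse(str(record.get("website", ""))).hostname or "").lower()
    domain = fetched_from.strip().rstrip(".").lower()
    return host == domain or host == "www." + domain


def assess_claim(record: dict, fetched_from: str) -> str:
    """Classify a self-asserted record as 'ignore', 'corroborated', or 'unverified'."""
    if not record.get("name"):
        return "ignore"  # nothing usable is being claimed
    if website_matches_domain(record, fetched_from):
        return "corroborated"  # self-consistent, still only a first-party claim
    return "unverified"  # keep the claim, but weigh it lower
```

Even a "corroborated" result here only says the record is internally consistent. It remains an assertion by the domain, which is the point of the distinction between attribution and truth enforcement.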

Tooling (so it’s not just a spec)

To keep this practical, there’s already tooling:

• CLI to init / validate / emit records
• Schema validation tests
• Resolver SDK
• Integrations:
  o Next.js: https://www.npmjs.com/package/@ai-domain-data/nextjs
  o WordPress: https://wordpress.org/plugins/ai-domain-data/
  o Jekyll: https://rubygems.org/gems/jekyll-ai-domain-data
• Online generator: https://ai-domain-data.org/generator/
• Online checker: https://ai-domain-data.org/checker/

Everything is open source and MIT licensed.

Repo:
https://github.com/ai-domain-data/spec
Spec:
https://ai-domain-data.org/spec/v0.1/

Who this is for

If you:

• run a site, project, or organization
• build crawlers, agents, or AI ingestion pipelines
• maintain CMS or hosting tooling
• care about clean web metadata

This might be useful.
If not, ignore it. It’s intentionally small.

What’s next

The current version is deliberately minimal. Future work may explore optional layers like:

• cryptographic signing
• registrar or registry signals
• higher-assurance identity assertions

But the core goal stays the same:
a simple, universal, domain-hosted declaration surface that anyone can publish.

Feedback extremely welcome. Adoption, even more so.
