Yesterday I ran a 5-pass AEO/SEO/GEO/AIO audit on the same site, fixed 64 surfaces in one sitting, and watched the composite probe score climb from 70 to 94. This is the dev-tactical playbook of what actually moved the needle, with the exact files and probes.
The premise: traditional SEO (links, meta tags, sitemaps) is necessary but no longer sufficient. Google's AI Overviews, ChatGPT, Perplexity, and Claude pull from a different surface area — `/.well-known/`, `llms.txt`, `agent-card.json`, `openapi.json`, and structured schema.org JSON-LD with `Speakable` + `QAPage` + `Service` types.
If your tool isn't shipping these, you're invisible to half the LLMs that ought to be citing you.
The 5-pass audit loop
I ran a single-day chain of:
- Probe — a checklist of "if I were an LLM scraping for an answer to X, what file would I open?" — across 7 categories: discovery, schema, content, well-known, structured-Q&A, citations, and identity.
- Score each category 0–100.
- Diff the lowest-scoring against the spec.
- Ship the fixes (mostly small JSON files + JSON-LD blocks + 308→200 redirect cleanups).
- Re-probe.
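The probe-and-score loop is easy to automate. A minimal sketch, assuming a hypothetical probe list and all-or-nothing scoring (the real checklist covers 7 categories and is far more granular); only the stdlib is used:

```python
import urllib.request

# Hypothetical (category, path) pairs an LLM crawler might open.
# Illustrative only — the real checklist is much longer.
PROBES = [
    ("discovery", "/sitemap.xml"),
    ("well-known", "/.well-known/llms.txt"),
    ("identity", "/.well-known/agent-card.json"),
]

def probe(base_url: str) -> dict:
    """Score each category 100 if its endpoint answers 200 directly
    (no redirect followed), else 0."""
    scores = {}
    for category, path in PROBES:
        req = urllib.request.Request(
            base_url + path, headers={"User-Agent": "GPTBot/1.0"}
        )
        try:
            with urllib.request.urlopen(req) as resp:
                # resp.url differs from the request URL if a redirect fired
                ok = resp.status == 200 and resp.url == base_url + path
        except Exception:
            ok = False
        scores[category] = 100 if ok else 0
    return scores

def composite(scores: dict) -> int:
    """Average the category scores into the composite 0-100."""
    return round(sum(scores.values()) / len(scores))
```

Diffing two `probe()` runs before and after a pass tells you exactly which category moved.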
Each pass took ~90 minutes. The composite went 70 → 81 → 89 → 92 → 94.
What actually moved the score
Pass 1 (70 → 81): the obvious gaps.
- `/sitemap.xml` listed 1,060 URLs, but 8% of them 404'd. Fix: regenerate from the build manifest, ban orphans.
- `/robots.txt` allowed everything; LLMs got noise. Fix: explicit `User-agent: GPTBot / ClaudeBot / PerplexityBot` allow blocks for the high-signal paths only.
- `Speakable` JSON-LD was missing on every Q&A page. Fix: add `cssSelector: ['h1', '.tldr']` to every answer page.
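The `Speakable` block is tiny. A sketch of what each answer page now carries — the selectors are the ones from the fix above; the page name and URL are placeholders, not the live site's:

```json
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "Example answer page",
  "url": "https://yourdomain.com/answers/example",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": ["h1", ".tldr"]
  }
}
```

Embed it in a `<script type="application/ld+json">` tag in the page head.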
Pass 2 (81 → 89): structured Q&A.
- Built 3 new `/answers/{slug}` pages with `QAPage` + `Question` + `acceptedAnswer` JSON-LD, evidence-anchored to a public dataset.
- Added `agent-card.json` to `/.well-known/`, describing every machine-readable endpoint.
- Expanded `openapi.json` from 4 paths to 21. LLMs read this and start citing your API examples in answers.
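A minimal `QAPage` sketch for one of those `/answers/{slug}` pages. The structure follows schema.org; the question text, answer text, and URL here are hypothetical stand-ins:

```json
{
  "@context": "https://schema.org",
  "@type": "QAPage",
  "mainEntity": {
    "@type": "Question",
    "name": "What is a GitHub deal-flow signal?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "A one-sentence answer, evidence-anchored to the public dataset.",
      "url": "https://yourdomain.com/answers/example#answer"
    }
  }
}
```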
Pass 3 (89 → 92): the well-known explosion.
- Shipped: `/.well-known/openapi.json`, `/.well-known/agent-card.json`, `/.well-known/agents.json`, `/.well-known/llms.txt`, `/.well-known/ai-policy.json`, `/.well-known/ai.txt`, `/.well-known/ai.json`, `/.well-known/sitemap.xml`, `/.well-known/security-policy.json`, `/.well-known/did-configuration.json`, `/.well-known/humans.txt`, and `/.well-known/freshness.json` (a `DataFeed` schema for "what changed this week").
- Pattern: every `.well-known` file should also have a root alias (`/agent-card.json` → 200, not 308). LLM crawlers don't follow redirects on machine-readable endpoints.
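The root-alias pattern is a rewrite, not a redirect — the alias path must itself serve a 200. A sketch assuming a Vercel-style deployment (`vercel.json`); nginx `location` aliases or framework rewrites achieve the same thing:

```json
{
  "rewrites": [
    { "source": "/agent-card.json", "destination": "/.well-known/agent-card.json" },
    { "source": "/llms.txt", "destination": "/.well-known/llms.txt" }
  ]
}
```

Had these been entries in `redirects` instead, the alias would answer 308 and machine crawlers would drop it.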
Pass 4 (92 → 94): glossary + FAQ + methodology as APIs.
- `/api/v1/glossary` (18 terms), `/api/v1/faq` (101 entries), `/api/v1/methodology` (`HowTo` schema, 6 steps). LLMs cite glossary endpoints when asked "what is X" — they treat your API as canonical for terms you coined.
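For a sense of shape, here is what a single glossary entry might look like as served — the field names and example term are assumptions for illustration, not the live schema:

```json
{
  "term": "deal-flow signal",
  "definition": "A one-sentence canonical definition, served as plain data.",
  "source": "https://yourdomain.com/api/v1/methodology",
  "updated": "2025-01-01"
}
```

The point is that it's a stable JSON resource an LLM can quote verbatim, not prose buried in a page.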
The smoking-gun probe
The single highest-signal probe is:
```shell
curl -A "GPTBot/1.0" https://yourdomain.com/.well-known/llms.txt
```
If this returns a 200 with directive-rich content (not a 308 redirect, not HTML, not a 404), and your llms.txt lists every QAPage + every API + every dataset, you are now in a tiny minority of sites. Most still don't have one.
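That pass/fail judgment can be automated too. A sketch of a classifier for the fetched response — "direct 200, plain text, directive-rich, not an HTML shell" — where the exact heuristics (content-type check, three-line minimum) are my assumptions, not a spec:

```python
def looks_directive_rich(status: int, content_type: str, body: str) -> bool:
    """True only if the response is a direct 200 serving plain-text
    directives, not a redirect, error page, or HTML shell."""
    if status != 200:
        return False  # 308/404 fail outright
    if "text/plain" not in content_type:
        return False
    if "<html" in body.lower():
        return False  # an HTML page masquerading as llms.txt
    # Require a few non-empty, non-comment directive lines.
    lines = [l for l in body.splitlines() if l.strip() and not l.startswith("#")]
    return len(lines) >= 3
```

Feed it the status, `Content-Type` header, and body from the curl probe above.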
Bonus probe — `site:yourdomain.com` in Google. If it returns 0 results despite all the schema, your noindex is wrong somewhere. We caught this in pass 4 — `/predicted/{week}/` was blocked by a stale `robots.txt` rule.
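You can catch that class of bug offline with Python's stdlib robots parser. A sketch: the `Disallow` line mirrors the stale rule we hit, and the concrete week value is an illustrative stand-in for `{week}`:

```python
from urllib.robotparser import RobotFileParser

# The stale rule from pass 4, reproduced inline for illustration.
STALE_ROBOTS = """\
User-agent: *
Disallow: /predicted/
"""

rp = RobotFileParser()
rp.parse(STALE_ROBOTS.splitlines())

# True means crawlers will never index these pages.
blocked = not rp.can_fetch("Googlebot", "https://yourdomain.com/predicted/2025-w01/")
```

Run the same check for every high-signal path against your live `robots.txt` as part of the probe pass.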
The cost
I'm a one-person side project. Total Claude Code time across all 5 passes: ~7.5 hours. Total new files: 22. Total edits: 64. Zero external dependencies, zero paid tools, zero outbound links.
For comparison: the equivalent agency engagement runs $15k–$30k for "AI search optimization" and ships maybe a third of this surface area.
The receipts
Everything is open. The site is signals.gitdealflow.com, the dataset is huggingface.co/datasets/gitdealflow/vc-deal-flow-signal, the methodology is signals.gitdealflow.com/research, the SSRN paper is at ssrn.com/abstract=6606558, and the MCP server that lets any LLM (Claude, Cursor, Cline, Goose) query the dataset live is at signals.gitdealflow.com/mcp — six tools, no auth, never paywalled.
If you run a SaaS with public data and want to audit your own surface, the probe checklist is in our `/llms-full.txt`. Steal it.
Building GitDealFlow — open-source GitHub-signal layer for early-stage VC. SSRN paper, free MCP server, dataset on Hugging Face.