
Chudi Nnorukam

Originally published at chudi.dev

Why DA Is Irrelevant for AI Citations (Data from 7 Site Audits)


Domain authority does not predict AI citations. Ahrefs has DA 92 and gets cited by AI platforms only 5% of the time. citability.dev launched with DA under 10 and achieved a 15% citation rate on day one. That 3x gap, between the most authoritative domain and a brand-new site with almost no backlinks, is not noise. It is the clearest possible signal that AI source selection runs on completely different rules than Google rankings.

I ran AI Visibility Readiness audits on 7 websites and tested each against ChatGPT, Perplexity, and Claude. The finding is consistent: DA has zero predictive value for whether AI will cite your URL. What predicts citations is content structure, freshness signals, and original data that AI cannot source elsewhere.

What Does Domain Authority Actually Measure?

Domain authority is a Moz metric that scores your backlink profile on a 1-to-100 logarithmic scale. More high-quality sites linking to you means a higher DA. Google uses backlink graphs as one major ranking signal, so DA became a widely used proxy for "how authoritative is this site?"

The problem is the assumption buried in that proxy: that what works for Google works for AI. It does not.

Google PageRank is a graph algorithm. Trustworthiness flows through backlink networks. A site vouched for by high-authority domains earns authority itself.

AI answer engines do not use link graphs at all. They select sources based on whether the content is extractable, verifiable, and attributable. None of those three properties have anything to do with who links to you.

Backlinks are social proof for a graph algorithm. AI needs structured, dated, original content. These are completely different inputs to completely different systems.

What Does Our Benchmark Data Show?

The table below shows DA scores, infrastructure readiness results from the AI Visibility Readiness Framework, and measured citation rates across ChatGPT, Perplexity, and Claude.

| Site | DA | Infrastructure | AI Visible | AI Cited |
| --- | --- | --- | --- | --- |
| reddit.com | 97 | Not ready | Untested | Untested |
| x.com | 96 | Not ready | Untested | Untested |
| medium.com | 95 | Not ready | Untested | Untested |
| ahrefs.com | 92 | Foundation-ready | 100% | 5% |
| semrush.com | 91 | Foundation-ready | Partial | Partial |
| chudi.dev | 28 | Foundation-strong | 25% | 0% |
| citability.dev | under 10 | Foundation-strong | 44% | 15% |

Sort by DA. No pattern emerges. The three highest-DA sites in the dataset failed infrastructure readiness entirely. The lowest-DA site has the highest citation rate.

citability.dev vs. chudi.dev is the most instructive comparison. chudi.dev has DA 28 with years of content and backlinks, yet 0% citation rate. citability.dev has DA under 10 and launched with a focused content structure and original benchmark data. The newer, lower-authority site outperformed on citations because it was built for AI extraction from the start.

Reddit, X, and Medium fail infrastructure checks for similar reasons. Reddit blocks AI crawlers in robots.txt. X serves content through JavaScript that most AI crawlers cannot execute. Medium routes content through a platform domain rather than author domains, fragmenting citation attribution. These are not problems backlinks can fix.
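The robots.txt failure mode is easy to reproduce and check locally with Python's stdlib `urllib.robotparser`. The rules below are illustrative of Reddit's block-AI-crawlers policy, not a copy of its actual file:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt in the style of Reddit's policy:
# deny AI crawlers everything, allow everyone else.
rules = """
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(rules)

# The AI crawler is shut out of every URL; a search crawler is not.
print(parser.can_fetch("GPTBot", "https://example.com/post/123"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/post/123"))  # True
```

Running the same check against a live site means pointing `set_url` at its real robots.txt and calling `read()` instead of `parse()`.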

How Do AI Platforms Select Sources?

There are two pathways through which AI cites a URL, and only one of them is influenced by your content decisions.

The first pathway is training data. AI models internalize billions of pages during training. Ahrefs is in that training data at massive scale. When you ask ChatGPT about SEO tools, it knows Ahrefs without fetching anything. That is why Ahrefs is 100% visible despite low citation rates: the AI already knows everything it needs to know about them. Training data visibility does not require infrastructure. It requires being large and old.

The second pathway is retrieval-augmented generation (RAG) and live fetching. When an AI platform needs to answer a question and its training data is insufficient or potentially stale, it fetches external sources. This is where infrastructure determines outcome.

For RAG citations, three factors drive selection. First, the content must be machine-readable: no JavaScript blocking, clear HTML structure, structured data markup. Second, the content must appear current: dateModified schema, recent publication dates, and references to recent data. Research from Semrush indicates that 95% of ChatGPT citations come from recently updated content. Third, the content must contain specific claims the AI cannot make from memory alone. Original data, proprietary benchmarks, and recent statistics create citation necessity.

One statistic captures the scale of this divergence: only 12% of URLs cited by LLMs appear in Google's top 10 results for the same queries. If you are optimizing purely for Google, you are optimizing for a system with only 12% overlap with AI citation behavior.

What Should You Build Instead of Backlinks?

The benchmark data points to three infrastructure investments that directly increase AI citation rates. None of them involve link acquisition.

Answer-first content structure. AI extraction systems scan pages for the first concise, factual statement they can use. If your answer is buried in paragraph 4 behind context-setting, the AI may not reach it, or may retrieve a weaker version of your claim.

The fix is mechanical: move the direct answer to the first 100 words. Use question-based H2 headings that match how users phrase queries to AI. Keep paragraphs under 40 words. Remove qualifying language from opening statements. The opening paragraph of this article is built on this principle. The claim is in sentence one. Every word after it supports and extends that claim.
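Because the fix is mechanical, it can be linted. A rough sketch of such a check, where the hedge phrases and word thresholds are my own illustrative choices rather than a published standard:

```python
import re

# Hypothetical openers that delay the answer instead of stating it.
HEDGES = ("in this article", "before we dive in", "it depends", "generally speaking")

def answer_first_issues(markdown_text, max_para_words=40):
    """Flag structural problems that hurt AI extraction: a hedging
    opener and paragraphs over the word budget. Headings are skipped."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", markdown_text)
                  if p.strip() and not p.strip().startswith("#")]
    issues = []
    if paragraphs and paragraphs[0].lower().startswith(HEDGES):
        issues.append("opening paragraph hedges instead of answering")
    for i, para in enumerate(paragraphs, start=1):
        count = len(para.split())
        if count > max_para_words:
            issues.append(f"paragraph {i}: {count} words (limit {max_para_words})")
    return issues

doc = ("## Does DA predict AI citations?\n\n"
       "No. DA has zero predictive value for AI citations.\n\n"
       + " ".join(["word"] * 55))
print(answer_first_issues(doc))  # ['paragraph 2: 55 words (limit 40)']
```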

For a complete guide to this technique, see the AEO guide.

dateModified schema with substantive updates. Pages with Article or TechArticle schema that includes a valid dateModified field receive roughly 1.8x more AI citations than pages without. But the signal only works when backed by real content changes. Updating the date without changing the content is a pattern AI platforms are learning to discount.

The safe approach: update content quarterly with at least 100 words of substantive new material, new statistics, or revised conclusions. Only update dateModified when the change is real. Fake freshness signals have a short shelf life and create downside risk on Google rankings.
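That policy can be encoded directly in a publishing pipeline. A sketch, assuming a JSON-LD Article object and the 100-word threshold above; the helper name and field values are hypothetical:

```python
def maybe_touch_date_modified(schema, old_body, new_body,
                              new_date, min_new_words=100):
    """Bump dateModified only when the revision adds substantive
    material, so the freshness signal reflects a real content change."""
    words_added = len(new_body.split()) - len(old_body.split())
    if words_added >= min_new_words:
        return {**schema, "dateModified": new_date}
    return schema

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Why DA Is Irrelevant for AI Citations",
    "datePublished": "2025-01-10",
    "dateModified": "2025-01-10",
}

old = "original text " * 50            # 100 words
typo_fix = "original test " * 50       # same length: no date bump
rewrite = old + "new material " * 60   # 120 new words: bump the date

print(maybe_touch_date_modified(article, old, typo_fix, "2025-04-10")["dateModified"])  # 2025-01-10
print(maybe_touch_date_modified(article, old, rewrite, "2025-04-10")["dateModified"])   # 2025-04-10
```

A word-count delta is a crude proxy for "substantive"; a real pipeline might also diff statistics or conclusions, but the gating principle is the same.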

Original data that creates citation necessity. AI has internalized most widely available information from training. When AI encounters a question where its training data runs out, it fetches. Original data forces fetching because the AI has no other source for it.

The table above is an example. The specific DA-versus-citation data from this 7-site audit exists only here. When AI references it, it must cite this source. That is the mechanism. Publish data no one else has published, and AI must come to you for it.

Pages with inline statistics receive 40% more AI citations on average. Benchmark tables, audit results, survey data, and comparison analyses all qualify. One piece of original research per month creates sustained citation opportunities that no backlink campaign can replicate.

Does SEO Still Matter?

Yes, with a precise qualification. Google AI Overviews show 76% overlap with traditional top 10 search results. If you want to appear in Google AI Overviews, traditional SEO still applies. High DA still helps with that specific product.

But for standalone AI platforms, primarily ChatGPT and Perplexity, the 12% divergence means Google optimization is largely orthogonal to AI citation. You need both strategies, and they require different optimization layers.

The good news: the infrastructure changes that improve AI citability also strengthen traditional SEO in parallel. Answer-first content improves featured snippet eligibility. Structured data enables rich results. Content freshness signals help for queries that trigger Google's freshness algorithm. The overlap is real, even if the primary ranking factors diverge.

The mistake is assuming that building backlinks alone will carry you into AI citations. It will not. The game has changed. A new site with DA under 10 and the right content structure outperforms DA 92 on AI citations. That is not an anomaly. It is the new default.

Where to Start

If you have been allocating budget to link building with the assumption it will help AI visibility, here is a more direct path.

Run a free infrastructure scan at citability.dev/assess. It checks 10 baseline signals in under 60 seconds: robots.txt, sitemap, structured data, answer-first content, freshness signals, and more. The scan tells you exactly where your site falls short and which fixes will have the highest impact.
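The scanner's internals are not public, but offline approximations of a few of those signals are straightforward to sketch. The check names and logic below are my guesses, not citability.dev's implementation:

```python
import re

def baseline_signals(html, robots_txt):
    """Crude string-level checks for a few AI-visibility signals."""
    robots = robots_txt.lower()
    return {
        "structured_data": "application/ld+json" in html,   # JSON-LD present
        "date_modified": '"dateModified"' in html,          # freshness field
        "question_h2": bool(re.search(r"<h2[^>]*>[^<]*\?</h2>", html)),
        "gptbot_rule": bool(re.search(r"user-agent:\s*gptbot", robots)),
    }

html = ('<h2>Does DA predict AI citations?</h2>'
        '<script type="application/ld+json">'
        '{"@type": "Article", "dateModified": "2025-01-10"}</script>')
robots = "User-agent: *\nAllow: /"

print(baseline_signals(html, robots))
# {'structured_data': True, 'date_modified': True,
#  'question_h2': True, 'gptbot_rule': False}
```

A production scanner would parse the HTML and robots.txt properly rather than pattern-match strings, but even this level of check catches the failures described in the benchmark table.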

Then read the full benchmark breakdown in I Audited 7 Websites for AI Citability, which walks through each site's specific failures and what was done to improve the results.

Domain authority was a useful shorthand for Google trustworthiness. It is not a shorthand for AI trustworthiness. The infrastructure that makes AI cite you is different, measurable, and largely within your control right now.
