
Nadia Mohamed

How I Built a Free AI-Powered Article Analyzer That Scores Content for AI Search Readiness

Most SEO tools tell you if your content ranks on Google. None of them tell you if ChatGPT can cite it.

I built the AEO Article Analyzer to solve that problem. It scores any article against 10 criteria that determine whether AI search engines — ChatGPT, Perplexity, Gemini, Google AI Overviews — can extract, parse, and cite your content.

This is the technical story of how I built it, what I learned, and why the 10 scoring criteria matter.


The problem

I work as an SEO engineer for SaaS companies. Part of my job involves auditing content for AI search readiness — what the industry calls AEO (Answer Engine Optimisation) or GEO (Generative Engine Optimisation).

The manual process was slow. I'd open an article, check for a definition block in the first paragraph, look at the heading hierarchy, check for FAQ sections, look for author credibility signals, and mentally score it. Then do it again for the next article. And the next.

I wanted a tool that could do this in seconds, consistently, across hundreds of articles. Nothing on the market did this — existing tools focus on traditional SEO signals (keyword density, backlinks, meta tags), not on what makes content citable by AI.

So I built it.


The tech stack

| Layer | Technology |
| --- | --- |
| Frontend | React 18 + TypeScript |
| Build | Vite 5 |
| Styling | Tailwind CSS + shadcn/ui |
| Backend | Supabase (auth, database, edge functions) |
| AI | Claude Sonnet via OpenRouter API |
| Icons | lucide-react |
| Fonts | DM Sans + Space Grotesk |

I chose this stack because I wanted something I could ship fast and iterate on. Supabase handles auth, database, and serverless functions in one platform. React + Tailwind is what I use for everything. The AI layer runs through OpenRouter so I can switch models without changing code.


How it works

The flow is straightforward:

  1. User pastes article text or enters a URL
  2. If URL: an edge function fetches the HTML and extracts the text content, preserving heading hierarchy as markdown. It also extracts any JSON-LD structured data (author, article schema, FAQ schema)
  3. The extracted text is sent to a Claude Sonnet edge function with a system prompt that evaluates 10 specific criteria
  4. Claude returns a structured JSON response with scores, pass/fail ratings, and specific suggestions per criterion
  5. The frontend renders a scorecard with an overall score (0-100), colour-coded criteria cards, and a prioritised action list
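The structured response in step 4 might look something like this — these interface and field names are illustrative, not the tool's actual schema:

```typescript
// Sketch of the scorecard data contract between the Claude edge function
// and the frontend (names are my own assumptions, not the real API).
interface CriterionResult {
  id: string;         // e.g. "question_h1"
  score: number;      // 0-10
  passed: boolean;
  suggestion: string; // specific fix for this criterion
}

interface AnalysisResult {
  overall: number;              // 0-100 weighted aggregate
  criteria: CriterionResult[];  // one entry per scoring criterion
  actions: string[];            // prioritised action list for the scorecard
}

const sample: AnalysisResult = {
  overall: 72,
  criteria: [
    { id: "question_h1", score: 8, passed: true, suggestion: "Keep the question format." },
  ],
  actions: ["Add a FAQ section with at least 5 questions."],
};
```

Returning a typed structure like this is what lets the frontend render colour-coded cards and a ranked action list without any extra parsing.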

For long articles (over 15,000 characters), I use smart truncation: the first 12,000 characters plus the last 5,000, with a separator marker. This preserves both the opening (where definition blocks and H1s live) and the closing (where FAQs and author bios typically appear) while staying within token limits.
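The truncation step can be sketched in a few lines — constants match the thresholds above; the function name is my own:

```typescript
// Smart truncation for long articles: keep the opening (definition blocks,
// H1) and the closing (FAQs, author bios), drop the middle.
const MAX_CHARS = 15_000;
const HEAD_CHARS = 12_000;
const TAIL_CHARS = 5_000;
const SEPARATOR = "\n\n[... middle of article truncated ...]\n\n";

function smartTruncate(text: string): string {
  if (text.length <= MAX_CHARS) return text;
  return text.slice(0, HEAD_CHARS) + SEPARATOR + text.slice(-TAIL_CHARS);
}
```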


The 10 scoring criteria

This is the core of the tool. Each criterion scores 0-10, and the overall score is a weighted aggregate out of 100. Here's what each one checks and why it matters for AI citation:
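With equal weights (an assumption — the post doesn't publish its actual weighting), the aggregation is just ten 0-10 scores summed to a 0-100 total:

```typescript
// Illustrative aggregation: ten criterion scores, 0-10 each,
// summed to an overall score out of 100 (equal weights assumed).
function overallScore(scores: number[]): number {
  if (scores.length !== 10) throw new Error("expected 10 criterion scores");
  return scores.reduce((sum, s) => sum + s, 0); // 10 criteria × 10 points = 100 max
}
```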

1. Question-based H1

Does the headline match the format of a prompt someone would type into ChatGPT? AI systems are trained on question-answer patterns. An H1 that reads like a question ("How does hreflang work for multilingual SaaS?") is more likely to be matched to an AI prompt than one that reads like a feature list ("Hreflang Implementation Guide 2026").

2. Direct answer in the first 200 words

AI systems extract answers from the opening of an article. If your first paragraph is a vague introduction, AI will skip to a competitor who gets to the point. The tool looks for a TL;DR block, a blockquote definition, a bold summary, or a callout — 40-80 words that directly answer the question in the H1.

3. Clear heading hierarchy

H1 → H2 → H3 in logical order. AI parses heading structure to understand content organisation. Broken hierarchies (H1 → H3, or multiple H1s) confuse the extraction process.
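A hierarchy check like this one can be expressed as a small function over the sequence of heading levels — a sketch of the idea, not the tool's actual implementation:

```typescript
// Given heading levels in document order (1 = H1, 2 = H2, ...), flag the
// two failure modes described above: multiple H1s and skipped levels.
function validHeadingHierarchy(levels: number[]): boolean {
  if (levels.filter((l) => l === 1).length !== 1) return false; // exactly one H1
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] > levels[i - 1] + 1) return false; // e.g. H1 -> H3 jump
  }
  return true;
}
```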

4. Modular sections (75-300 words each)

AI systems work best with self-contained sections that each make one point. Sections that are too long get truncated. Sections that are too short don't provide enough context. The sweet spot is 75-300 words per section.

5. FAQ section with 5+ questions

FAQ sections are gold for AI citation. They're structured as question-answer pairs — exactly the format AI is designed to surface. The tool checks for at least 5 questions with 40-80 word answers. It looks for bold text formatting (not just headings) since many FAQ implementations use bold questions.
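Detecting bold-formatted questions is straightforward with a regex — this is one plausible way to do it, assuming markdown input, not necessarily how the tool does it:

```typescript
// Count bold markdown questions of the form **Question text?** —
// the FAQ pattern where questions are bolded rather than headings.
function countBoldQuestions(markdown: string): number {
  const matches = markdown.match(/\*\*[^*\n]+\?\*\*/g);
  return matches ? matches.length : 0;
}
```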

6. Named sources and expert quotes

AI systems assess credibility through citations. Content that says "experts say" scores lower than content that says "Dr. Jane Smith, Head of Research at MIT, says..." The tool checks for real names with credentials.

7. Specific data and numbers

"Revenue increased significantly" vs "Revenue increased 340% from 1.56M to 6.89M." AI systems prefer specific, verifiable claims. The tool checks for concrete figures backing claims.

8. Original insight

Can AI generate this content itself? If yes, there's no reason to cite you. The tool checks for unique perspectives, first-hand experience, proprietary data, or contrarian viewpoints that couldn't come from a language model.

9. No AI slop

The tool penalises filler content, vague generalisations, and corporate jargon. Content that reads like it was generated by AI without human editing scores poorly because AI systems have no reason to cite more of what they can already produce.

10. Author credibility signals

Bio boxes, bylines, credentials, linked profiles. AI systems use author signals as a trust indicator — this maps directly to Google's E-E-A-T (Experience, Expertise, Authoritativeness, Trust) framework.


Usage limiting without a paywall

I wanted the tool to be genuinely free — no credit card, no trial period. But I also needed to prevent abuse and manage API costs.

The solution: IP-based rate limiting with 3 analyses per month. Server-side tracking using Supabase, keyed on IP address + month string (e.g., "2026-04"). No cookies, no client-side workarounds.

```
// usage_tracking table
{
  ip_address: string,
  month_key: string,  // e.g. "2026-04"
  count: number,
  last_used: timestamp
}
// Unique constraint on (ip_address, month_key)
```

Admins bypass the limit for testing. The counter resets automatically each month based on the month_key format.

This approach means: no signup required to use the tool (lowering friction), no payment gate (it's a lead magnet, not a product), and costs stay predictable.
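The gate itself reduces to two pure functions — the limit value and the month-key format are from this post, but the helper names are my own illustration (the real check runs server-side against the Supabase table):

```typescript
// Monthly IP-based rate limit: the counter "resets" automatically
// because a new month produces a new month_key.
const MONTHLY_LIMIT = 3;

function monthKey(date: Date): string {
  // e.g. "2026-04"
  return `${date.getUTCFullYear()}-${String(date.getUTCMonth() + 1).padStart(2, "0")}`;
}

function isAllowed(currentCount: number, isAdmin: boolean): boolean {
  return isAdmin || currentCount < MONTHLY_LIMIT;
}
```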


The URL fetch challenge

The hardest part wasn't the AI scoring — it was extracting clean text from arbitrary URLs.

The edge function fetches HTML with a standard User-Agent, then extracts text while preserving heading hierarchy as markdown. It also pulls JSON-LD structured data (author schema, article schema, FAQ schema) which feeds into the credibility scoring.
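The JSON-LD part can be done with a regex over the raw HTML — a hedged sketch, since the real edge function may well use a proper HTML parser:

```typescript
// Pull every <script type="application/ld+json"> block out of raw HTML
// and parse it, skipping malformed blocks instead of failing the fetch.
function extractJsonLd(html: string): unknown[] {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  for (const match of html.matchAll(re)) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // malformed JSON-LD: skip this block
    }
  }
  return blocks;
}
```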

The limitation: this only works for server-rendered HTML. JavaScript-rendered SPAs (React, Next.js with client-side rendering, Angular) return minimal content because the edge function doesn't execute JavaScript. For these sites, users paste the text directly.

This is actually a useful signal in itself — if your content requires JavaScript to render, AI crawlers might have the same problem extracting it.


What I learned building this

AI scoring needs guardrails. Early versions of the system prompt returned inconsistent scores. A "direct answer" score of 7 from one analysis might be a 4 on the same article the next time. I fixed this by making the scoring criteria extremely specific in the prompt: word count ranges, structural patterns to look for, and explicit examples of pass vs fail.
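Concretely, "extremely specific" means each criterion in the prompt carries word counts, patterns, and pass/fail examples. A hypothetical rubric entry (the wording is illustrative, not the tool's real prompt):

```typescript
// One criterion spec as it might appear in the system prompt payload.
// Everything here is an invented example of the level of specificity
// that made scoring consistent.
const directAnswerCriterion = {
  id: "direct_answer",
  check: "Do the first 200 words contain a 40-80 word passage that directly answers the H1 question?",
  pass_example: "A bolded 60-word TL;DR immediately after the H1.",
  fail_example: "Three paragraphs of background before any answer appears.",
  score_range: [0, 10],
};
```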

Users care about actions, not scores. The first version showed scores only. Users would see "6/10 on FAQ section" and not know what to do. Adding specific, prioritised action items ("Add a FAQ section with at least 5 questions. Each answer should be 40-80 words and start with a direct statement.") made the tool dramatically more useful.

3 free uses is the right number. Enough to analyse your homepage, your best blog post, and a competitor's page. That's enough to see the pattern and understand the value. Most users who hit the limit either share the tool (free distribution) or reach out about a full audit (pipeline).


What's next

I'm working on two additions:

  1. Batch analysis — upload a CSV of URLs and get a scored report for every page. This is the basis for the content inventory phase (Phase 5) of my 12-phase SEO & GEO audit.

  2. Competitor comparison — analyse your article alongside 2-3 competitor articles for the same query. Show exactly where you're stronger and weaker on each criterion.


Try it

The AEO Article Analyzer is free at nadiamohamed.me/ai-tools/aeo-article-analyzer/. No signup required for the first 3 analyses.

I also built two other free tools:

  • Keyword Clustering — group up to 1,000 keywords into semantic clusters
  • AI Tool Ideas Generator — enter your niche, get tool ideas with demand estimates

All three are at nadiamohamed.me/ai-tools/.

If you're building AI-powered SEO tools or working on GEO, I'd love to hear what you're working on. Find me on LinkedIn or at nadiamohamed.me.


I'm Nadia — an SEO engineer specialising in GEO and technical SEO for SaaS companies. I build the organic growth infrastructure myself — structured data, tracking dashboards, automation workflows — instead of handing clients a PDF of recommendations.
