Alma Mahler
I built an AI-powered SEO API in one afternoon and put it on RapidAPI

I've been wanting to ship a tiny, useful API on RapidAPI for a while, mostly as an excuse to play with Claude's structured output. I finally blocked off an afternoon, and the result is live: SEO Metadata Analyzer.

Point it at any URL and it returns the usual SEO stuff (title, description, OG tags), a readability score, and — the part I actually wanted to build — AI-generated keyword candidates grouped by intent (primary / secondary / long-tail).

Here's what the stack looks like, what it cost, and what I learned.

Why I built it

Every SEO tool I've used either wants $99/mo for a dashboard I don't need, or it scrapes a page and hands me a word-frequency list and calls that "keyword research." I wanted something in between: a boring REST endpoint that gives me actual keyword candidates I can pipe into whatever script I'm writing that day.

Once I realized Claude Haiku could do the keyword extraction reliably for under a cent a call, the whole thing became a one-afternoon project.

The stack

  • FastAPI — async handlers, Pydantic v2 response models, Depends for auth
  • Claude Haiku 4.5 via the Anthropic SDK — client.messages.parse() with a Pydantic schema so the keywords come back pre-validated
  • httpx + BeautifulSoup4 + lxml — fetch and parse
  • textstat — Flesch-Kincaid readability for English; a simple custom scorer for Japanese
  • Redis (with an in-memory fallback for local dev) — 24h response cache keyed by URL hash
  • Railway — deploy target, Redis addon, env vars
  • RapidAPI — monetization, rate limiting, auth via X-RapidAPI-Proxy-Secret

Architecture

```
                ┌──────────┐
   client ────▶ │ RapidAPI │ ──▶ proxy secret check
                └────┬─────┘
                     │
                     ▼
          ┌──────────────────┐
          │ FastAPI /analyze │
          └────────┬─────────┘
                   │
                   ▼
          ┌─────────────────┐
          │  Redis cache?   │──hit──▶ return cached JSON
          └────────┬────────┘
                   │ miss
                   ▼
          ┌───────────────────────┐
          │  httpx fetch + parse  │
          │      (BS4 + lxml)     │
          └───────────┬───────────┘
                      │
                      ▼
          ┌────────────────────────┐
          │ readability (textstat) │
          └───────────┬────────────┘
                      │
                      ▼
          ┌──────────────────────────────┐
          │ Claude Haiku keyword extract │
          │       (Pydantic schema)      │
          └──────────────┬───────────────┘
                         │
                         ▼
               store in cache, return
```
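The fetch-and-parse box above is httpx + BeautifulSoup in production. As a dependency-free illustration of the same metadata extraction, here's a sketch using only the stdlib's html.parser — a hypothetical stand-in, not the actual code:

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects <title>, the meta description, and og:* tags from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.og_tags: dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            a = dict(attrs)
            if a.get("name") == "description":
                self.description = a.get("content", "")
            elif a.get("property", "").startswith("og:"):
                self.og_tags[a["property"]] = a.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
```

BS4 + lxml buys you tolerance for the truly broken HTML you meet in the wild, which is why I didn't ship this version.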

The whole /analyze handler is ~30 lines. The interesting bit is that Claude returns structured JSON I never have to parse myself — messages.parse() with a KeywordList Pydantic model is the entire contract.
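That contract is just a Pydantic model — messages.parse() hands back an instance of it, so a bad bucket name fails loudly instead of leaking into the response. A sketch of the schema (field names mirror the response JSON below; the class names are my invention, not copied from the real code):

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Keyword(BaseModel):
    keyword: str
    relevance: float = Field(ge=0.0, le=1.0)  # Claude returns 0..1 scores
    type: Literal["primary", "secondary", "long-tail"]  # the three intent buckets


class KeywordList(BaseModel):
    keywords: list[Keyword]


# messages.parse() validates Claude's JSON against KeywordList before the handler
# ever sees it — a hallucinated bucket like "tertiary" raises a ValidationError
# instead of reaching the client.
```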

What a response looks like

Here's a real response for https://www.anthropic.com (trimmed for readability):

```json
{
  "url": "https://www.anthropic.com",
  "title": "Home \\ Anthropic",
  "description": "Anthropic is an AI safety and research company...",
  "og_tags": {
    "og:title": "Home",
    "og:description": "Anthropic is an AI safety and research company...",
    "og:image": "https://cdn.sanity.io/.../anthropic-social.jpg",
    "og:type": "website"
  },
  "language": "en",
  "readability": {
    "score": 42.1,
    "grade_level": "Grade 11.8",
    "method": "flesch-kincaid"
  },
  "keywords": [
    { "keyword": "Claude AI",               "relevance": 0.95, "type": "primary" },
    { "keyword": "Anthropic",               "relevance": 0.92, "type": "primary" },
    { "keyword": "AI safety",               "relevance": 0.88, "type": "primary" },
    { "keyword": "large language models",   "relevance": 0.82, "type": "secondary" },
    { "keyword": "responsible AI research", "relevance": 0.74, "type": "long-tail" },
    { "keyword": "enterprise AI solutions", "relevance": 0.70, "type": "long-tail" }
  ],
  "cached": false
}
```

The keyword buckets are what make this useful for me — primaries are the things you'd put in <title>, long-tails are the things you'd target with a blog post. That categorization is literally just a prompt instruction to Claude, and the Pydantic schema enforces it on the way out.
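The readability numbers in that response, by the way, come straight out of the standard Flesch formulas; textstat does the word/sentence/syllable counting, and the rest is just arithmetic:

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    # Higher = easier to read; anthropic.com's 42.1 lands in "college" territory.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    # US school grade level — the "Grade 11.8" in the response above.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

This is also why the Japanese scorer had to be custom: syllable-based formulas don't transfer to a language without syllable-per-word structure in the same sense.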

The cost math (being honest about it)

Every miss is one Claude Haiku call. I benchmarked a handful of pages and the average is roughly:

  • ~2K input tokens (system prompt + page text, after extraction)
  • ~400 output tokens (structured keywords)
  • Haiku 4.5 pricing: $1.00 / 1M input, $5.00 / 1M output
  • Cost per miss: ~$0.004 (plus a rounding pad for variance)

I set the internal cost estimate to $0.0075/request to give myself margin on longer pages. Cache hits cost $0. At 24h TTL and typical repeat access patterns, real-world average cost is well under half a cent per request.
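Spelled out, the per-miss math from those numbers:

```python
# Haiku 4.5 list pricing, USD per 1M tokens (from the bullets above)
INPUT_PER_MTOK = 1.00
OUTPUT_PER_MTOK = 5.00


def cost_per_miss(input_tokens: int = 2_000, output_tokens: int = 400) -> float:
    """Cost of one cache miss: one Claude Haiku call at the benchmarked token counts."""
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK
```

At the defaults that's $0.002 in + $0.002 out = $0.004 per miss, which is where the $0.0075 internal estimate's margin comes from.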

That's what made the pricing on RapidAPI work: even the BASIC tier (50 req/mo, free) can't put me in the red, and the paid tiers have healthy margin without being predatory.

Tradeoffs I made

  • No JavaScript rendering. It's plain httpx + BeautifulSoup. SPAs with empty <body> won't get useful keywords. For my use case (blog posts, marketing pages, docs) this is fine, and it keeps the response time under a second.
  • 24h cache, not configurable. One less query param, one less thing to explain in the RapidAPI docs. If you care about fresh data, the TTL is short enough.
  • English + Japanese only for readability. The keyword extraction works in any language Claude speaks, but the readability scorer is language-specific and I only wrote two.
  • No batch endpoint yet. If people actually use this I'll add one.
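For the curious, the cache layer is nothing clever. A sketch of the keying scheme — the names and the exact normalization are mine, not lifted from the real code:

```python
import hashlib

CACHE_TTL_SECONDS = 24 * 60 * 60  # fixed 24h, deliberately not configurable


def cache_key(url: str) -> str:
    # Light normalization so trivial variants (trailing slash, case) don't
    # each trigger a separate Claude call. Real URL paths are case-sensitive,
    # so this is a deliberate, lossy simplification.
    normalized = url.strip().rstrip("/").lower()
    return "seo:" + hashlib.sha256(normalized.encode()).hexdigest()
```

The handler does `SETEX key CACHE_TTL_SECONDS json` on a miss, and the in-memory fallback for local dev is just a dict with timestamps.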

What I'd do differently

The thing that ate the most time wasn't the code — it was RapidAPI's monetization UI. The form for setting up pricing plans is a React-heavy modal that fights you if you try to move quickly, and the "Proxy Secret" they auto-generate can't be overridden, so you have to sync their value into your backend env, not the other way around. Once I figured that out it was fine, but I wasted 20 minutes on it.

If I were doing this again I'd write the Railway deploy and the RapidAPI setup into a checklist before I touched the code.

Try it

If you build something with it I'd love to hear about it — especially if you wire the keyword output into a content brief generator or a competitive analysis script. That's the kind of downstream use I had in mind when I picked the three-bucket output shape.

Thanks for reading.
