Alma Mahler
I built an AI-powered SEO API in one afternoon and put it on RapidAPI

I've been wanting to ship a tiny, useful API on RapidAPI for a while, mostly as an excuse to play with Claude's structured output. I finally blocked off an afternoon, and the result is live: SEO Metadata Analyzer.

Point it at any URL and it returns the usual SEO stuff (title, description, OG tags), a readability score, and — the part I actually wanted to build — AI-generated keyword candidates grouped by intent (primary / secondary / long-tail).

Here's what the stack looks like, what it cost, and what I learned.

Why I built it

Every SEO tool I've used either wants $99/mo for a dashboard I don't need, or it scrapes a page and hands me a word-frequency list and calls that "keyword research." I wanted something in between: a boring REST endpoint that gives me actual keyword candidates I can pipe into whatever script I'm writing that day.

Once I realized Claude Haiku could do the keyword extraction reliably for under a cent a call, the whole thing became a one-afternoon project.

The stack

  • FastAPI — async handlers, Pydantic v2 response models, Depends for auth
  • Claude Haiku 4.5 via the Anthropic SDK — client.messages.parse() with a Pydantic schema so the keywords come back pre-validated
  • httpx + BeautifulSoup4 + lxml — fetch and parse
  • textstat — Flesch-Kincaid readability for English; a simple custom scorer for Japanese
  • Redis (with an in-memory fallback for local dev) — 24h response cache keyed by URL hash
  • Railway — deploy target, Redis addon, env vars
  • RapidAPI — monetization, rate limiting, auth via X-RapidAPI-Proxy-Secret

Architecture

```
                ┌──────────┐
   client ────▶ │ RapidAPI │ ──▶ proxy secret check
                └────┬─────┘
                     │
                     ▼
          ┌──────────────────┐
          │ FastAPI /analyze │
          └────────┬─────────┘
                   │
                   ▼
          ┌─────────────────┐
          │  Redis cache?   │──hit──▶ return cached JSON
          └────────┬────────┘
                   │ miss
                   ▼
          ┌───────────────────────┐
          │  httpx fetch + parse  │
          │      (BS4 + lxml)     │
          └───────────┬───────────┘
                      │
                      ▼
          ┌────────────────────────┐
          │ readability (textstat) │
          └───────────┬────────────┘
                      │
                      ▼
          ┌──────────────────────────────┐
          │ Claude Haiku keyword extract │
          │       (Pydantic schema)      │
          └──────────────┬───────────────┘
                         │
                         ▼
               store in cache, return
```
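The fetch-and-parse box above is httpx + BeautifulSoup in production. As a dependency-free illustration of the same metadata extraction, here's a sketch using only the stdlib's html.parser — a hypothetical stand-in, not the actual code:

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects <title>, the meta description, and og:* tags from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.og_tags: dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            a = dict(attrs)
            if a.get("name") == "description":
                self.description = a.get("content", "")
            elif a.get("property", "").startswith("og:"):
                self.og_tags[a["property"]] = a.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
```

BS4 + lxml buys you tolerance for the truly broken HTML you meet in the wild, which is why I didn't ship this version.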

The whole /analyze handler is ~30 lines. The interesting bit is that Claude returns structured JSON I never have to parse myself — messages.parse() with a KeywordList Pydantic model is the entire contract.
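That contract is just a Pydantic model — messages.parse() hands back an instance of it, so a bad bucket name fails loudly instead of leaking into the response. A sketch of the schema (field names mirror the response JSON below; the class names are my invention, not copied from the real code):

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Keyword(BaseModel):
    keyword: str
    relevance: float = Field(ge=0.0, le=1.0)  # Claude returns 0..1 scores
    type: Literal["primary", "secondary", "long-tail"]  # the three intent buckets


class KeywordList(BaseModel):
    keywords: list[Keyword]


# messages.parse() validates Claude's JSON against KeywordList before the handler
# ever sees it — a hallucinated bucket like "tertiary" raises a ValidationError
# instead of reaching the client.
```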

What a response looks like

Here's a real response for https://www.anthropic.com (trimmed for readability):

```json
{
  "url": "https://www.anthropic.com",
  "title": "Home \\ Anthropic",
  "description": "Anthropic is an AI safety and research company...",
  "og_tags": {
    "og:title": "Home",
    "og:description": "Anthropic is an AI safety and research company...",
    "og:image": "https://cdn.sanity.io/.../anthropic-social.jpg",
    "og:type": "website"
  },
  "language": "en",
  "readability": {
    "score": 42.1,
    "grade_level": "Grade 11.8",
    "method": "flesch-kincaid"
  },
  "keywords": [
    { "keyword": "Claude AI",               "relevance": 0.95, "type": "primary" },
    { "keyword": "Anthropic",               "relevance": 0.92, "type": "primary" },
    { "keyword": "AI safety",               "relevance": 0.88, "type": "primary" },
    { "keyword": "large language models",   "relevance": 0.82, "type": "secondary" },
    { "keyword": "responsible AI research", "relevance": 0.74, "type": "long-tail" },
    { "keyword": "enterprise AI solutions", "relevance": 0.70, "type": "long-tail" }
  ],
  "cached": false
}
```

The keyword buckets are what make this useful for me — primaries are the things you'd put in <title>, long-tails are the things you'd target with a blog post. That categorization is literally just a prompt instruction to Claude, and the Pydantic schema enforces it on the way out.
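The readability numbers in that response, by the way, come straight out of the standard Flesch formulas; textstat does the word/sentence/syllable counting, and the rest is just arithmetic:

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    # Higher = easier to read; anthropic.com's 42.1 lands in "college" territory.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    # US school grade level — the "Grade 11.8" in the response above.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

This is also why the Japanese scorer had to be custom: syllable-based formulas don't transfer to a language without syllable-per-word structure in the same sense.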

The cost math (being honest about it)

Every miss is one Claude Haiku call. I benchmarked a handful of pages and the average is roughly:

  • ~2K input tokens (system prompt + page text, after extraction)
  • ~400 output tokens (structured keywords)
  • Haiku 4.5 pricing: $1.00 / 1M input, $5.00 / 1M output
  • Cost per miss: ~$0.004 (plus a rounding pad for variance)

I set the internal cost estimate to $0.0075/request to give myself margin on longer pages. Cache hits cost $0. At 24h TTL and typical repeat access patterns, real-world average cost is well under half a cent per request.
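Spelled out, the per-miss math from those numbers:

```python
# Haiku 4.5 list pricing, USD per 1M tokens (from the bullets above)
INPUT_PER_MTOK = 1.00
OUTPUT_PER_MTOK = 5.00


def cost_per_miss(input_tokens: int = 2_000, output_tokens: int = 400) -> float:
    """Cost of one cache miss: one Claude Haiku call at the benchmarked token counts."""
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK
```

At the defaults that's $0.002 in + $0.002 out = $0.004 per miss, which is where the $0.0075 internal estimate's margin comes from.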

That's what made the pricing on RapidAPI work: even the BASIC tier (50 req/mo, free) can't put me in the red, and the paid tiers have healthy margin without being predatory.

Tradeoffs I made

  • No JavaScript rendering. It's plain httpx + BeautifulSoup. SPAs with empty <body> won't get useful keywords. For my use case (blog posts, marketing pages, docs) this is fine, and it keeps the response time under a second.
  • 24h cache, not configurable. One less query param, one less thing to explain in the RapidAPI docs. If you care about fresh data, the TTL is short enough.
  • English + Japanese only for readability. The keyword extraction works in any language Claude speaks, but the readability scorer is language-specific and I only wrote two.
  • No batch endpoint yet. If people actually use this I'll add one.
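For the curious, the cache layer is nothing clever. A sketch of the keying scheme — the names and the exact normalization are mine, not lifted from the real code:

```python
import hashlib

CACHE_TTL_SECONDS = 24 * 60 * 60  # fixed 24h, deliberately not configurable


def cache_key(url: str) -> str:
    # Light normalization so trivial variants (trailing slash, case) don't
    # each trigger a separate Claude call. Real URL paths are case-sensitive,
    # so this is a deliberate, lossy simplification.
    normalized = url.strip().rstrip("/").lower()
    return "seo:" + hashlib.sha256(normalized.encode()).hexdigest()
```

The handler does `SETEX key CACHE_TTL_SECONDS json` on a miss, and the in-memory fallback for local dev is just a dict with timestamps.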

What I'd do differently

The thing that ate the most time wasn't the code — it was RapidAPI's monetization UI. The form for setting up pricing plans is a React-heavy modal that fights you if you try to move quickly, and the "Proxy Secret" they auto-generate can't be overridden, so you have to sync their value into your backend env, not the other way around. Once I figured that out it was fine, but I wasted 20 minutes on it.

If I were doing this again I'd write the Railway deploy and the RapidAPI setup into a checklist before I touched the code.

Try it

If you build something with it I'd love to hear about it — especially if you wire the keyword output into a content brief generator or a competitive analysis script. That's the kind of downstream use I had in mind when I picked the three-bucket output shape.

Thanks for reading.
