Spent the past week running an experiment: can a programmatic SEO directory site survive Google's 2024 Helpful Content Update if every page is AI-generated?
Rather than pick one niche and bet the farm, I built three parallel sites with the same stack — different content categories, identical architecture, single content generation pipeline. They share a monorepo, deploy independently to Vercel, and refresh nightly via one GitHub Actions cron.
This post is the architecture write-up. I'll publish actual revenue/traffic numbers in a follow-up after 6 months.
## The three sites
- 🤖 Top AI Tools — ~500 open-source models from HuggingFace, with Claude-generated summaries, use cases, and FAQ
- 🎮 Find Games Like — "games like X" recommendations for indie titles on Steam, with AI-curated similarity reasoning and "avoid if" caveats
- 🛠 Open Alternative To — open-source replacements for ~80 popular SaaS products, refreshed daily from GitHub stars and last-pushed timestamps
All three are static-generated, all three rebuild nightly, all three share editorial logic from a single TypeScript package.
## Why three sites instead of one
Three reasons:
- Cheap insurance against niche choice failure. I don't know which niche Google will tolerate in 2026. Three uncorrelated bets > one big one.
- Shared ETL infrastructure. The cost of running sites #2 and #3 is mostly the marginal Claude API tokens (~$2/site/month); hosting and code are amortized across all three.
- A/B testing categories. AI-tools is the most saturated PSEO niche on Earth. SaaS-alternative goes head-to-head with alternativeto.net (DA80+, years of accumulated authority). Indie games is the underdog with the cleanest niche fit. After 6 months, the data will tell me which thesis was right.
## Stack at a glance
| Layer | Tool | Why |
|---|---|---|
| Site framework | Astro 5 (SSG) | 100% static output, no runtime cost |
| Styling | Tailwind v4 | Newer engine, faster builds, smaller CSS |
| Content gen | Claude Haiku 4.5 via Anthropic SDK | Cheap, fast, sufficient for directory copy |
| Data store | Turso (libSQL) | ETL state + idempotency tracking |
| Cron | GitHub Actions matrix job | Free, version-controlled, reliable |
| Hosting | Vercel Pro | Fast SSG deploys, image optimization |
| Monorepo | pnpm + Turborepo | Workspace-aware builds, cached output |
## Total monthly cost
| Item | Cost |
|---|---|
| Vercel Pro | $20 |
| Anthropic API (Haiku 4.5, daily refresh) | ~$5 |
| Turso (free tier 500MB) | $0 |
| GitHub Actions (under 2k min/mo free quota) | $0 |
| Domains (Vercel subdomains until validation) | $0 |
| Total | ~$25/month |
## Repo layout
```
seo-farm/
├── apps/
│   ├── ai-tools/             # topaitools.vercel.app
│   ├── indie-games/          # findgameslike.vercel.app
│   ├── oss-alternatives/     # openalternativeto.vercel.app
│   └── dashboard/            # internal status page
├── packages/
│   ├── shared/               # Anthropic client, DB schema, monetization helpers
│   └── publish/              # the script that posted this article
└── .github/workflows/
    └── refresh-content.yml   # nightly cron
```
The three sites are intentionally separate Vercel projects (different roots), but share `@seo-farm/shared` for the Claude client, libSQL helpers, AdSense + Amazon affiliate components, and structured-data builders. Each site has its own ETL — HuggingFace API for ai-tools, Steam Web API + RAWG for indie-games, GitHub repo discovery for oss-alternatives.
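For reference, the workspace wiring implied by this layout is just two globs. A hypothetical `pnpm-workspace.yaml` consistent with the tree above (not the actual repo's file):

```yaml
# Hypothetical pnpm-workspace.yaml matching the apps/ + packages/ layout
packages:
  - "apps/*"
  - "packages/*"
```

Turborepo discovers the workspaces from here and caches each app's build output, so an unchanged app doesn't rebuild.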
## Content generation pipeline
The cron runs daily at 02:00 UTC and processes one app at a time (matrix `max-parallel: 1` to stay below Anthropic burst limits):
```yaml
strategy:
  matrix:
    app: [ai-tools, indie-games, oss-alternatives]
  max-parallel: 1
env:
  ETL_LIMIT: "500"
  GENERATE_LIMIT: "300"
```
Per app, the pipeline is:

1. ETL stage — fetch source data (HuggingFace models / Steam games / GitHub repos), upsert into Turso
2. Detect missing content — find rows where `generated_at` is null or older than 30 days
3. Generate with Haiku 4.5 — batch ~5 entries per call, one prompt per content type
4. Cache & dedupe — write back to Turso with a new `generated_at` timestamp
5. Trigger build — only if content changed, push a commit and let Vercel rebuild
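Steps 2 and 4 reduce to a staleness check plus fixed-size batching. A minimal sketch under assumed names (`ToolRow`, `isStale`, and `batchOf` are illustrative, not the actual pipeline code):

```typescript
// Illustrative types/names -- the real pipeline's schema isn't published.
type ToolRow = { slug: string; generated_at: string | null };

const STALE_AFTER_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// A row needs regeneration if it has never been generated, or is >30 days old.
function isStale(row: ToolRow, now: Date = new Date()): boolean {
  if (row.generated_at === null) return true;
  return now.getTime() - new Date(row.generated_at).getTime() > STALE_AFTER_MS;
}

// Split stale rows into groups of ~5 so each group becomes one Claude call.
function batchOf<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch becomes one Haiku call; a successful response writes a fresh `generated_at` back to Turso, which is what makes the nightly run idempotent.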
The hard part is step 3: prompt design. Generic "summarize this tool" prompts produce slop. What worked:
- One prompt per content type (summary / use cases / FAQ / pros-cons), never a single mega-prompt
- Strict format constraints ("3-5 bullet points, max 12 words each, no marketing language")
- Source-grounded context — only use info from the provided model card; refuse to fabricate benchmarks
- An "avoid if" caveat for game recs — most game directories only gush. Find Games Like prompt explicitly asks Claude to be honest about limitations
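To make those constraints concrete, here's what one such prompt could look like. This is a hedged reconstruction from the rules above, not the actual prompt; `similarGamesPrompt` is an invented name:

```typescript
// Invented helper; illustrates the format-constraint + source-grounding rules.
function similarGamesPrompt(game: string, storeDescription: string): string {
  return [
    `You are writing a directory entry for players searching "games like ${game}".`,
    "Rules:",
    "- 3-5 bullet points, max 12 words each, no marketing language.",
    "- Use only facts from the store description below. Never fabricate features.",
    '- End with one honest "avoid if" line about who will not enjoy this game.',
    "",
    "Store description:",
    storeDescription,
  ].join("\n");
}
```

One prompt like this per content type keeps outputs auditable: if a claim isn't in the store description, the prompt gives the model no license to invent it.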
For Find Games Like, the "avoid if" produces lines like:
> Celeste — avoid if you're uncomfortable with themes of anxiety and panic attacks
>
> Hades — avoid if you dislike permadeath roguelikes or need linear story progression
That's one of the few moments AI summaries actually beat human-written game directories, which all default to marketing-speak.
## Ranking strategy (or: things Google may kill anyway)
I'm not delusional about this. Google's March 2024 update specifically targeted "scaled content abuse" and de-indexed thousands of programmatic SEO sites. The bet here:
- Source-grounded content — every detail page links to its canonical, authoritative source (HuggingFace model card, Steam store page, GitHub repo). The reader can verify in one click.
- Real utility — directory + comparison tables that genuinely save time vs reading 30 docs pages
- Honest framing — "AI-generated, here's the source" disclosed in footer, no hiding
- Per-page structured data — `SoftwareApplication`, `VideoGame`, and `Product` JSON-LD on every detail page
- Low quantity, daily freshness — 880 total pages, refreshed daily so star counts and last-modified dates stay current. Not 100k pages of stale garbage.
I genuinely don't know if any of this is enough. The whole point of the experiment is to find out. If two of three sites get deindexed by month 3, that itself is a useful data point.
## What's wired up
- AdSense site-wide (currently in review — sites are <2 weeks old)
- Amazon Associates with category-relevant search links per page (no fake recs, just `amzn.to` search-by-keyword links to actual related books/peripherals)
- GA4 per site with separate properties
- Newsletter via Beehiiv iframe ("Indie Discovery Weekly")
- Sitemaps + robots.txt + llms.txt generated at build time
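Build-time sitemap generation is only a few lines. A simplified sketch (function and parameter names are mine, not the repo's):

```typescript
// Emits a sitemaps.org-format sitemap from a list of page slugs.
function buildSitemap(origin: string, slugs: string[], lastmod: string): string {
  const entries = slugs
    .map((s) => `  <url><loc>${origin}/${s}</loc><lastmod>${lastmod}</lastmod></url>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</urlset>",
  ].join("\n");
}
```

Because the pipeline only rebuilds when content actually changed, `lastmod` tracks content freshness rather than build timestamps.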
## Milestones I'm watching
| Month | Question | Threshold |
|---|---|---|
| 1 | Did Google index the pages? | Search Console impressions >1k/day = yes |
| 3 | Is organic traffic growing? | >50% organic share of GA4 sessions = healthy |
| 6 | Is monetization viable? | AdSense approved, RPM measurable, decision: delete or double down |
| 12 | Is it sellable? | 3 consecutive months of profit > 0 = list on Empire Flippers / similar |
## Open questions for readers
I'd genuinely value feedback on:
- PSEO survivors of March 2024 — what categories are still ranking?
- Schema markup — am I missing anything obvious for directory sites?
- AI content disclosure — how transparent is too transparent? Does the footer "AI-generated" disclosure help or hurt?
- Niche durability — which of the three do you think survives 12 months? Place your bets.
Repo isn't public yet — I might open source it after the 6-month checkpoint depending on how the experiment goes. Happy to share specific snippets in the comments if anyone's curious about a particular piece (Astro content collection layout, the Claude prompts, the structured-data helper, the Vercel Pro deploy config).
Next update in 30 days with actual numbers, regardless of how ugly they look.