
Daniel Rozin

Posted on • Originally published at aversusb.net

Programmatic SEO at Scale: How We Built 3,200 Comparison Pages Without Sacrificing Quality

Comparison sites live or die by page count. A single "Bose vs Sony" page serves one intent. A library of 3,200 comparison pages serves every intent in your category — and ranks for the long tail that drives consistent, compounding traffic.

Here's exactly how we built 3,200 comparison pages at aversusb.net and SmartReview without sacrificing content quality or creating thin-content penalties.

The Core Tension: Volume vs. Quality

Google's helpful content guidance is explicit: pages that exist primarily to rank — rather than to help users — get suppressed. The graveyard of programmatic SEO failures is full of sites that generated 50,000 pages of templated content and got hit with a core update.

Our approach: generate at scale, but never generate below a quality floor.

That means every page must have:

  1. Accurate, up-to-date specs for both entities
  2. A genuine verdict (not "both are great, it depends")
  3. At least 3 structured comparison dimensions
  4. A FAQ section answering real questions buyers have

If we can't meet that bar for a given comparison pair, we don't publish the page.

Step 1: Keyword Discovery at Scale

We use DataForSEO's Labs API to identify comparison opportunities. Our discovery pipeline runs daily:

// Simplified discovery pipeline
const seeds = ['robot vacuum', 'espresso machine', 'running shoe', 'mattress', ...];

for (const seed of seeds) {
  const keywords = await dataforseo.keywordSuggestions({
    keyword: seed,
    filters: [['keyword_info.search_volume', '>', 100]],
    include_serp_info: true
  });

  const comparisons = keywords.filter(k => 
    /\bvs\.?\b|\bversus\b|\bor\b|\bcompare\b/.test(k.keyword)
  );

  await scoreAndStore(comparisons);
}

Scoring formula:

opportunityScore = 
  log10(volume) * 20 
  + (100 - difficulty) * 0.3 
  + min(cpc * 5, 25) 
  + (1 - competition) * 15

This weights high-intent, low-difficulty keywords. A keyword with 2,000 monthly searches, 25 KD, and $2.50 CPC scores higher than one with 10,000 searches, 75 KD, and $0.20 CPC. We optimize for winnable keywords, not just volume.
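The formula translates directly into code. A minimal sketch of the scorer, run against the two keywords from the worked example (the competition values are illustrative, since the example in the text omits them):

```javascript
// Opportunity score as defined above. `competition` is on a 0-1 scale.
function opportunityScore({ volume, difficulty, cpc, competition }) {
  return (
    Math.log10(volume) * 20 +
    (100 - difficulty) * 0.3 +
    Math.min(cpc * 5, 25) +
    (1 - competition) * 15
  );
}

// The two keywords from the worked example (competition assumed equal at 0.5):
const winnable = opportunityScore({ volume: 2000, difficulty: 25, cpc: 2.5, competition: 0.5 });
const highVolume = opportunityScore({ volume: 10000, difficulty: 75, cpc: 0.2, competition: 0.5 });
// winnable ≈ 108.5, highVolume ≈ 96.0 — the lower-volume keyword wins
```

The log on volume is what makes this work: a 5x volume advantage is worth ~14 points, easily erased by a difficulty or CPC gap.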

Step 2: Entity Extraction and Normalization

The hardest part of comparison site engineering isn't the pages — it's the entity layer underneath them. "AirPods Pro 2" and "Apple AirPods Pro (2nd Generation)" are the same product. Your database needs to know that.

We built an entity resolution pipeline using three signals:

Signal 1: Name normalization
Strip model number variants, clean parentheticals, normalize brand prefixes. "Apple AirPods Pro 2" → entity ID apple-airpods-pro-2.

Signal 2: Spec fingerprinting
Hash a weighted combination of specs (weight, dimensions, key performance metrics). Products with identical or near-identical fingerprints get flagged for manual review.

Signal 3: Retailer cross-referencing
Match ASINs (Amazon), UPCs, and model numbers across 6 affiliate networks. If two product names share an ASIN, they're the same product.

Our entity database now has ~4,200 unique products with clean canonical names, sourced specs, and retailer cross-references.
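Signal 1 can be sketched as a small normalizer. The alias rules below are illustrative, not our production ruleset:

```javascript
// Illustrative name normalizer for Signal 1. The alias map is an example,
// not the production ruleset.
const GENERATION_ALIASES = [
  // "(2nd Generation)" / "(2nd Gen)" -> "2"
  [/\((\d+)(?:st|nd|rd|th)\s+gen(?:eration)?\.?\)/i, '$1'],
];

function canonicalEntityId(name) {
  let s = name.toLowerCase();
  for (const [pattern, replacement] of GENERATION_ALIASES) {
    s = s.replace(pattern, replacement);
  }
  return s
    .replace(/[()]/g, ' ')        // drop leftover parentheticals
    .replace(/[^a-z0-9]+/g, '-')  // collapse non-alphanumerics to hyphens
    .replace(/^-+|-+$/g, '');     // trim stray hyphens
}

canonicalEntityId('Apple AirPods Pro 2');                // 'apple-airpods-pro-2'
canonicalEntityId('Apple AirPods Pro (2nd Generation)'); // 'apple-airpods-pro-2'
```

Normalization alone is never sufficient — that's why Signals 2 and 3 exist — but it collapses the easy 80% of duplicates before the expensive checks run.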

Step 3: Content Generation with Quality Gates

For each comparison pair, we run a two-stage generation process:

Stage 1: Data enrichment
Pull live data from:

  • Amazon product API (pricing, ratings, review count)
  • Retailer product pages via our scraper
  • RTINGS.com measurements (for AV/electronics)
  • User review aggregation (Reddit, Wirecutter, RTINGS community)

Stage 2: Structured generation
We pass enriched data to Claude with a strict schema prompt:

Given these spec sheets and review data for [Product A] and [Product B], 
generate a structured comparison with:
- shortAnswer (1 sentence, must declare a winner)
- keyDifferences (array of 3-5 specific, factual differences)
- verdict (2-3 sentences, must include specific use case recommendation)
- faqs (5 questions buyers actually ask, with direct answers)

DO NOT generate if:
- Spec data is incomplete
- Products are from different categories
- The comparison would be misleading

The shortAnswer constraint is the most important quality gate. If Claude can't declare a winner in one sentence based on the data, the comparison is either too close to call (we publish with a nuanced verdict) or missing data (we hold the page for enrichment).
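The gate itself is mechanical: generated JSON is rejected unless it satisfies the schema. A minimal validator (field names mirror the prompt above; the one-sentence check is an illustrative heuristic):

```javascript
// Reject generated comparisons that miss the quality floor.
// Field names mirror the prompt schema; the sentence heuristic is illustrative.
function passesQualityGate(c) {
  const isOneSentence = (s) =>
    typeof s === 'string' && (s.match(/[.!?](\s|$)/g) || []).length <= 1;

  return (
    isOneSentence(c.shortAnswer) &&
    Array.isArray(c.keyDifferences) &&
    c.keyDifferences.length >= 3 && c.keyDifferences.length <= 5 &&
    typeof c.verdict === 'string' && c.verdict.length > 0 &&
    Array.isArray(c.faqs) && c.faqs.length === 5
  );
}
```

A two-sentence shortAnswer like "Both are great. It depends." fails the gate, which is exactly the hedge-everything output we refuse to publish.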

Step 4: The Publishing Pipeline

Pages don't go live immediately after generation. They go through a three-stage queue:

Queue 1: Generated (unpublished)
AI-generated content sitting in our database, not yet live. We generate ahead of demand — our queue typically has 200-300 pages ready to publish.

Queue 2: Spot-checked
Every 10th page in a category gets a human review. We sample rather than review exhaustively, because exhaustive review doesn't scale — and sampling still catches systematic quality issues before they compound.

Queue 3: Published
Live pages. Each one has a lastVerified timestamp. Pages older than 90 days get flagged for re-enrichment — product specs change, prices shift, and review consensus evolves.
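The 90-day flag is a straightforward timestamp check at cron time. A sketch (our actual job also re-queues the page for enrichment):

```javascript
// Flag pages whose lastVerified timestamp is older than 90 days.
const STALE_AFTER_DAYS = 90;

function needsReenrichment(lastVerified, now = new Date()) {
  const ageDays = (now - new Date(lastVerified)) / (1000 * 60 * 60 * 24);
  return ageDays > STALE_AFTER_DAYS;
}

needsReenrichment('2025-01-01', new Date('2025-06-01')); // true  (151 days old)
needsReenrichment('2025-05-01', new Date('2025-06-01')); // false (31 days old)
```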

Step 5: Internal Linking Architecture

3,200 pages with no internal linking structure is a crawl budget disaster. We built a topical hub architecture:

Category hubs (e.g., /robot-vacuums/) link to:

  • All brand overview pages (/robot-vacuums/roborock/)
  • All head-to-head comparisons (/robot-vacuums/roborock-s8-vs-roomba-j9-plus/)
  • A buying guide (/robot-vacuums/buying-guide/)

Brand pages link to:

  • All comparisons featuring that brand
  • The category hub
  • Related category comparisons

Comparison pages link to:

  • The two brand pages
  • 3-5 related comparisons (same brands, adjacent categories)
  • The category hub

This creates a flat, crawlable structure where Google can reach any page within 3 clicks of the homepage. With 3,200 pages, every page ends up with 5-10 internal links pointing at it.
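The link set for any comparison page falls out of the entity layer. A sketch (function and field names here are illustrative, and `related` would come from a query for comparisons sharing a brand or category):

```javascript
// Illustrative: derive the internal link set for one comparison page.
function comparisonPageLinks(page, related) {
  return [
    `/${page.category}/`,                      // category hub
    `/${page.category}/${page.brandA}/`,       // brand page A
    `/${page.category}/${page.brandB}/`,       // brand page B
    ...related.slice(0, 5).map((r) => r.url),  // 3-5 related comparisons
  ];
}

const links = comparisonPageLinks(
  { category: 'robot-vacuums', brandA: 'roborock', brandB: 'irobot' },
  [{ url: '/robot-vacuums/roborock-s8-vs-roomba-j9-plus/' }]
);
// links[0] === '/robot-vacuums/'
```

Because the links derive from structured data rather than hand-curation, the linking graph stays consistent as pages are added or suppressed.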

What 3,200 Pages Actually Produces

At 6 months, our page library performance breaks down roughly as follows:

Tier       Pages   Monthly searches   RPPV (revenue per pageview)
Top 100    100     5,000+ each        $0.08
Long tail  3,100   100–1,000 each     $0.015

The long tail individually looks unimpressive. Collectively, 3,100 pages × 300 average monthly searches × 15% CTR × $0.015 RPPV = ~$2,100/month from pages that took seconds each to generate.

The top 100 pages drive disproportionate revenue — but they also took the most enrichment effort. The long tail pays for the infrastructure; the top 100 pages pay for growth.

The Quality Failure Mode to Avoid

The most common programmatic SEO failure we've seen isn't thin content — it's stale content.

A "Roborock S8 vs Roomba j9+" comparison published in 2024 that still shows 2024 pricing and doesn't mention the Roomba Combo Essential is worse than useless — it actively misleads buyers and damages trust.

Our 90-day re-enrichment cycle is non-negotiable. Pages that go stale get suppressed (noindex) until they're updated. We'd rather have 2,800 high-quality pages than 3,200 with 400 stale ones dragging down the domain's quality signal.

Tools We Use

  • DataForSEO: Keyword discovery, bulk difficulty scoring, SERP monitoring
  • Tavily: Real-time enrichment for specs, reviews, and pricing context
  • Next.js ISR: On-demand revalidation for live pages, 24-hour stale-while-revalidate
  • PostgreSQL + Redis: Entity database + comparison cache (7-day TTL)
  • Claude API: Generation with quality gates baked into the prompt schema

SmartReview and aversusb.net build structured product comparison tools. See our comparisons at aversusb.net.
