Industrial SEO at 100 Pages/Week: My n8n + Claude Code + RAG Stack

Stéphane Jambu — Wed, 27 May 2026 04:51:53 +0000

I run a French SEO agency from Siem Reap, Cambodia. We've shipped 1,300+ semantic content clusters for 650+ brands — typically at 50 to 100 pages per project per week.

That cadence is impossible with a traditional content team. It's also impossible with raw LLM generation: the output looks fine in isolation and rots when you read three pages in a row.

What works is a three-layer pipeline that treats content like a production line, not a creative process. Here's the actual stack.

The problem with "AI content" as people usually do it

Most "industrial AI content" implementations look like this:

keyword list → ChatGPT prompt → publish

It produces 50 pages in an hour. It also produces 50 pages that:

Repeat the same five intros across the whole cluster
Hallucinate stats no one can audit
Drift away from the actual cluster theme by page 30
Use no internal linking strategy
Read like LinkedIn boilerplate

The Google Content Warehouse API documentation that leaked in May 2024 supported what experienced SEOs already suspected: quality is scored at the site and cluster level, not just the page. Site-level attributes such as siteFocusScore, siteAuthority, and the compressed quality signals don't care that page 23 is well-written if pages 1 through 22 read like the same prompt with different keywords.

So the question isn't "how do I generate content fast?" It's "how do I generate content fast AND keep it coherent across the whole cluster?"

The three layers

┌────────────────────────────────────────────────────┐
│  Layer 1 — RAG knowledge base (per-client)         │
│  - Client brief, brand voice, product corpus       │
│  - Existing pages (what's already said)            │
│  - Topic graph (what each page must cover)         │
└────────────────────────────────────────────────────┘
                       ▼
┌────────────────────────────────────────────────────┐
│  Layer 2 — n8n orchestration                       │
│  - Pull next page brief from Google Sheet          │
│  - Inject RAG context + brief                      │
│  - Call LLM (Claude/DeepSeek depending on tier)    │
│  - Save draft to Sheet column                      │
│  - Trigger QA round                                │
└────────────────────────────────────────────────────┘
                       ▼
┌────────────────────────────────────────────────────┐
│  Layer 3 — Claude Code QA loop                     │
│  - Read draft + cluster context                    │
│  - Check coherence, internal links, brand voice    │
│  - Either approve or write structured feedback     │
│  - Loop until pass or human escalation             │
└────────────────────────────────────────────────────┘

The trick is that Layer 1 is what makes Layer 2's output not boring, and Layer 3 is what catches Layer 2's mistakes before a human ever reads them.

Layer 1: the RAG knowledge base

One per client. Indexed and re-indexed on every brief update. Stored as a local vector DB (we use qdrant for production, chromadb for prototyping) with three collections:

# Pseudo-structure (sanitized from production)
collections = {
    "client_brief": {
        "docs": ["positioning.md", "tone_of_voice.md", "products/*.md"],
        "chunk_size": 512,
    },
    "existing_pages": {
        "docs": ["site_pages/*.html"],  # cleaned + extracted
        "chunk_size": 1024,
    },
    "topic_graph": {
        "docs": ["cluster_map.json"],   # which page covers which subtopic
        "chunk_size": 256,
    },
}

The retrieval at generation time pulls top-k from each collection, with weights tuned per cluster. Typical pull for one page:

3 chunks from client_brief (voice + positioning)
5 chunks from existing_pages (so we don't repeat what's already said)
1 chunk from topic_graph (what THIS page must cover that others don't)

That last one is what kills cluster drift. Without it, page 30 will accidentally rewrite page 4.
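
A minimal sketch of that per-collection pull, assuming each collection is just a list of (chunk_text, embedding) pairs. The `store` dict, the `pull_context` helper, and the toy 2-dimensional vectors are illustrative stand-ins, not the production Qdrant setup, and the per-cluster weights are left out for brevity.

```python
import math

# Per-collection pull sizes, taken from the "typical pull" above.
PULL = {"client_brief": 3, "existing_pages": 5, "topic_graph": 1}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pull_context(query_vec, store, pull=PULL):
    """Return the top-k chunk texts per collection, keyed by collection name.

    `store` maps collection name -> list of (chunk_text, embedding) pairs.
    It is a hypothetical stand-in for a real vector DB client.
    """
    context = {}
    for name, k in pull.items():
        ranked = sorted(
            store.get(name, []),
            key=lambda item: cosine(query_vec, item[1]),
            reverse=True,
        )
        context[name] = [text for text, _ in ranked[:k]]
    return context


# Toy data: 2-dimensional "embeddings" purely for illustration.
store = {
    "client_brief": [("tone: direct", [1.0, 0.0]), ("pricing page", [0.0, 1.0])],
    "existing_pages": [("page 4 intro", [0.9, 0.1])],
    "topic_graph": [
        ("page 30: subtopic B", [0.8, 0.2]),
        ("page 4: subtopic A", [0.1, 0.9]),
    ],
}
context = pull_context([1.0, 0.0], store)
```

Keeping the three pulls separate, rather than running one merged top-9 query, means the single topic_graph chunk cannot be crowded out by higher-scoring chunks from the larger collections.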

Layer 2: the n8n workflow

n8n is the right tool because the loop has too many side effects to keep in a Python script: Google Sheets read/write, LLM API calls with retry logic, conditional branching on tier, webhook callbacks from Layer 3, Slack notifications when something stalls.

The core loop, simplified:

[Trigger: cron every 10min]
    │
    ▼
[Google Sheets: get next row WHERE status = "to_write"]
    │
    ▼
[HTTP: call RAG service, get context]
    │
    ▼
[Switch by tier]
    ├─ premium → Claude Sonnet
    ├─ standard → DeepSeek Pro
    └─ longtail → DeepSeek Flash
    │
    ▼
[LLM call with composed prompt]
    │
    ▼
[Google Sheets: update row with draft + status = "to_qa"]
    │
    ▼
[Webhook: trigger Layer 3]

The Switch node by tier is what makes the unit economics work. A premium page costs ~$0.40 in API spend; a long-tail page costs ~$0.02. You can't ship 100 pages/week on premium pricing for every page.
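
To make the arithmetic concrete, here is a rough sketch of the tier switch and its cost impact. The per-page figures are the ones quoted above, in cents; no figure is given for the standard tier, so it is left out of the cost table. The 10/90 page mix and the helper names are assumptions for illustration only.

```python
# Tier -> model label, as named in the workflow diagram.
MODEL_BY_TIER = {
    "premium": "Claude Sonnet",
    "standard": "DeepSeek Pro",
    "longtail": "DeepSeek Flash",
}

# Approximate API spend per page in US cents, from the figures quoted above.
# The standard tier has no quoted figure, so it is deliberately absent.
COST_CENTS = {"premium": 40, "longtail": 2}


def pick_model(tier):
    """Return the model label for a tier; unknown tiers raise ValueError."""
    try:
        return MODEL_BY_TIER[tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None


def weekly_cost_cents(mix):
    """Total API spend in cents for a {tier: page_count} mix."""
    return sum(COST_CENTS[tier] * pages for tier, pages in mix.items())


# 100 pages, all premium: 100 * 40 = 4000 cents ($40.00)
all_premium = weekly_cost_cents({"premium": 100})

# Hypothetical mix, 10 premium + 90 long-tail: 400 + 180 = 580 cents ($5.80)
mixed = weekly_cost_cents({"premium": 10, "longtail": 90})
```

The gap looks small for one project and one round of generation; it compounds across active projects and across QA-triggered regenerations.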

n8n's other quiet superpower: error workflows. Every node in the production graph has an error handler that writes to a "stuck" sheet with the error message and stack. A human reads that sheet once a day. Anything not in the sheet just worked.

Layer 3: the Claude Code QA loop

This is the layer most people don't have, and it's the one that decides whether the cluster is shippable or another low-quality generative blob.

I use Claude Code (the CLI, not the API directly) because the agentic loop is built in. The QA agent runs against each draft:

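# -p runs Claude Code non-interactively (print mode), so the call returns when the agent finishes
claude -p --model claude-sonnet-4-6 --dangerously-skip-permissions \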
  "Read the draft at $DRAFT_PATH. Read the cluster context at $CONTEXT_PATH.
   Run the checks defined in qa-rules.md.
   For each failed check, write structured feedback to $FEEDBACK_PATH.
   If all checks pass, write APPROVED to $STATUS_PATH and exit.
   If 3+ checks fail, write ESCALATE to $STATUS_PATH and exit."

The qa-rules.md file is the contract. It includes things like:

- Voice: matches the tone defined in client_brief/tone_of_voice.md
- Repetition: no intro paragraph that mirrors another page in this cluster
- Internal links: 3-5 contextual links to sibling pages in the cluster
- Claims: every statistic must trace to a source in client_brief/ or be removed
- Hooks: opening sentence must not be a generic platitude
- Tail: closing sentence must not be a CTA — that's the layout's job
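
Some of these rules are mechanical enough to pre-check in plain code before the agent spends tokens on them. The sketch below covers only the internal-links rule and is not part of the pipeline described here: it assumes drafts are HTML and that the list of sibling URLs is available from the cluster map. `LinkCollector` and `sibling_link_check` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def sibling_link_check(draft_html, sibling_urls, low=3, high=5):
    """Return (passed, count) for distinct links that point at sibling pages."""
    parser = LinkCollector()
    parser.feed(draft_html)
    count = len(set(parser.hrefs) & set(sibling_urls))
    return low <= count <= high, count


siblings = {"/guide/a", "/guide/b", "/guide/c", "/guide/d"}
draft = (
    '<p><a href="/guide/a">A</a> <a href="/guide/b">B</a> '
    '<a href="https://example.com">external</a></p>'
)
result = sibling_link_check(draft, siblings)  # two sibling links, below the 3-5 range
```

A check like this only counts links. Whether they are contextual is still a judgment call for the QA agent.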

When the agent writes structured feedback, n8n picks it up via webhook and routes the draft back to Layer 2 with the feedback injected into the next prompt. Three rounds, then human review.
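
The routing after each QA pass can be sketched as a small decision function plus a prompt builder. This is an illustration of the logic, not the actual n8n nodes. It assumes the status file is empty when the agent wrote feedback without approving or escalating, and the section labels in `compose_prompt` are made up.

```python
MAX_ROUNDS = 3  # "Three rounds, then human review."


def next_action(status, round_no, max_rounds=MAX_ROUNDS):
    """Decide what happens to a draft after a QA pass.

    status:   contents of the status file: "APPROVED", "ESCALATE", or ""
              (assumed empty when only feedback was written).
    round_no: 1-based number of the QA round that just finished.
    """
    if status == "APPROVED":
        return "publish"
    if status == "ESCALATE" or round_no >= max_rounds:
        return "human_review"
    return "regenerate"


def compose_prompt(brief, context, feedback=None):
    """Build the next generation prompt, appending QA feedback if present."""
    parts = [f"BRIEF:\n{brief}", f"CONTEXT:\n{context}"]
    if feedback:
        parts.append(f"QA FEEDBACK TO ADDRESS:\n{feedback}")
    return "\n\n".join(parts)


prompt = compose_prompt("page brief", "retrieved chunks", "intro mirrors page 4")
```

Capping the loop in code rather than in the prompt means a draft cannot bounce between Layer 2 and Layer 3 indefinitely if the agent keeps producing feedback.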

The pass rates: ~70% of drafts pass on round 1, ~25% on round 2, and the remaining ~5% need human eyes. That last 5% is where the real attention goes.

What I'd do differently if starting over

A few things that cost us months to learn:

1. Don't index the entire client site into the RAG store on day 1. Index the brief + 10 hand-picked pages first. When generation starts producing "voice drift," then add more pages. Indexing too early means the retrieval pulls in stale content as "the voice."

2. Don't put cluster theme detection in the LLM. Encode it in the topic graph as structured metadata. The LLM is bad at remembering "this page is about X, not Y" across 50 turns; the topic graph never forgets.

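3. Add a fingerprint check on intros. Cheap: compute a similarity-preserving fingerprint (SimHash, for example) of the first 50 words of every published page in the cluster. New drafts get compared against those fingerprints. If the Hamming distance to any of them is below a threshold, regenerate the intro. An ordinary cryptographic hash won't work here, because changing one word flips about half the bits. This single check killed the repetition problem in one afternoon.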

4. Don't trust the LLM's self-evaluation in the same call. Two-call evaluation (generate, then a fresh model instance evaluates) catches things a single call misses. Same model, fresh context, no anchoring bias.

5. Keep a kill switch on every workflow. A single Sheet cell named pipeline.enabled. If it's FALSE, every n8n trigger short-circuits. You will need it.
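
The intro fingerprint check from point 3 might look something like the sketch below. It uses SimHash over word 3-grams, which is one common way to get fingerprints whose Hamming distance tracks text similarity. The shingle size, the 64-bit width, and the threshold of 10 bits are illustrative assumptions to tune, not values from production.

```python
import hashlib


def simhash(text, n_words=50, bits=64):
    """SimHash of the first n_words words, built from word 3-gram shingles."""
    words = text.lower().split()[:n_words]
    shingles = [" ".join(words[i:i + 3]) for i in range(max(1, len(words) - 2))]
    weights = [0] * bits
    for shingle in shingles:
        digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=bits // 8).digest()
        value = int.from_bytes(digest, "big")
        # Each shingle votes +1 or -1 on every bit position.
        for bit in range(bits):
            weights[bit] += 1 if (value >> bit) & 1 else -1
    # The fingerprint keeps the bits where the positive votes won.
    return sum(1 << bit for bit in range(bits) if weights[bit] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def too_similar(draft_intro, published_intros, threshold=10):
    """True if the draft is within `threshold` bits of any published intro."""
    fp = simhash(draft_intro)
    return any(hamming(fp, simhash(p)) < threshold for p in published_intros)


published = ["In today's fast-moving market, choosing the right boiler matters."]
needs_regen = too_similar(published[0], published)  # identical intro: distance 0
```

In practice the fingerprints of published pages would be computed once and stored alongside the cluster map, so each new draft costs one hash and a handful of XORs.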

Honest trade-offs

This pipeline is not "AI writing the content." It's a content production line where:

The brief is human (a real SEO strategist's work)
The cluster map is human (someone decided what each page covers)
The voice corpus is human (a real client brand)
The QA contract is human (the rules we ship by)
The LLM does the typing

You can't skip those four human inputs and expect this to work. The pipeline ships 100 pages/week because the upstream work is done. Without it, you ship 100 pages of slop.

The other trade-off: this is a system, not a tool. It needs maintenance — RAG re-indexing as the client's catalog evolves, prompt tuning as Claude/DeepSeek versions change, kill switch monitoring. Budget roughly 1 senior engineer day per week per active project.

What's next

I'm publishing the topic graph schema, the QA rules template, and a sanitized version of the n8n workflow as a separate repo. Drop a comment if you'd find that useful.

The same pipeline, with a different QA layer that checks LLM citability instead of cluster coherence, is what we use for GEO (Generative Engine Optimization). That's the next post in this series.

Stéphane Jambu — SEO engineer building topical authority at scale. 1,300+ semantic clusters for 650+ brands. Speaking on industrial SEO at stephane-jambu.com.

DEV Community: Stéphane Jambu