スシロー

Posted on Jul 2 • Originally published at qiita.com

[Real-World Test] How I Built a Semi-Automated Affiliate Article Pipeline with Python + Claude API and Cut Per-Article W

#python #claude

By the end of this article, you'll have a Python pipeline running locally that handles "topic ideation → Markdown draft generation → automatic affiliate link injection → duplicate check → file output" in a single command. No cloud infrastructure is required, just an Anthropic API key for Claude (usage is billed per token; see the cost breakdown later in this article). I've been running this across 104 rotating niches for three weeks, and the measured per-article workload dropped from 90 minutes to 7 minutes. That said, this is not a "sleep while money prints itself" story. The real subject is how I nearly got banned the moment I went fully automatic.

Why I Dropped "Full-Auto Posting" for "Semi-Auto" (The 3-Day Spam Story)

Let me start with an honest failure. Originally I automated everything — generation through platform posting — inside a while True loop. The result: a temporary posting restriction from one platform after just 3 days. Two causes:

I was churning out articles that swapped only the title while the body structure stayed nearly identical. (Template smell is machine-detectable.)
Posting intervals were perfectly regular (cron, every hour on the dot) — zero human-like variation.
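
The first cause can be caught locally before anything is published. Below is a minimal sketch, assuming "body structure" can be approximated by the sequence of H2 and deeper headings. The function names and the 0.8 threshold are illustrative assumptions, not what any platform actually uses:

```python
import re
from difflib import SequenceMatcher


def heading_skeleton(body: str) -> list[str]:
    """Return the H2-H6 headings of a draft, lowercased, in order (the H1 title is ignored)."""
    return [m.group(1).strip().lower()
            for m in re.finditer(r"^#{2,6}\s+(.+)$", body, re.M)]


def structure_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] describing how similar two drafts' heading sequences are."""
    return SequenceMatcher(None, heading_skeleton(a), heading_skeleton(b)).ratio()


def looks_templated(new_body: str, previous_bodies: list[str], threshold: float = 0.8) -> bool:
    """True if the new draft's outline nearly matches any earlier draft (threshold is a guess)."""
    return any(structure_similarity(new_body, old) >= threshold for old in previous_bodies)
```

Two drafts with no headings at all compare as identical here, so this only makes sense for drafts that follow the "3+ H2 headings" shape used later in the article.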

The lesson: generation is automated, but publishing is gated by a human. Have the LLM score its own output, write only passing drafts to drafts/, and press the final publish button yourself. That alone greatly reduces the ban risk while preserving the 7-minute-per-article speed. Every piece of code below is built on this principle.
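
The "only passing drafts reach drafts/" rule can be sketched as a small gate. This assumes a score from 0 to 100 has already been produced by the scoring step and that a draft is a dict with "title" and "body" keys. `save_if_passing` and the slug logic are hypothetical names for illustration, and the threshold of 70 is taken from the scoring prompt described in the cost section:

```python
import re
from pathlib import Path
from typing import Optional


def save_if_passing(draft: dict, score: int, drafts_dir: str = "drafts",
                    threshold: int = 70) -> Optional[Path]:
    """Write the draft to drafts_dir only if its score reaches the threshold."""
    if score < threshold:
        return None  # Failing drafts never reach disk
    # Build a filesystem-safe filename from the title (ASCII only in this sketch)
    slug = re.sub(r"[^a-z0-9]+", "-", draft["title"].lower()).strip("-") or "untitled"
    out_dir = Path(drafts_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{slug}.md"
    path.write_text(draft["body"], encoding="utf-8")
    return path
```

Nothing here publishes anything. The function stops at a file on disk, which is the point where the human takes over.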

Pipeline Overview: Just Python + Claude API + Local JSON

The stack is surprisingly minimal:

Python 3.11 (standard library + the anthropic SDK only)
Claude API (claude-sonnet-4-6 for drafts, claude-haiku-4-5 for scoring to cut costs)
A local niches.json (manages posting history and all 104 niches)

No database, no cloud storage. State lives entirely in local JSON — plenty for a solo project. Install the dependency first:

pip install anthropic==0.40.0
$env:ANTHROPIC_API_KEY = "sk-ant-..."  # PowerShell
export ANTHROPIC_API_KEY="sk-ant-..."  # bash / zsh equivalent

Code ①: A Draft Generator That Forces Claude to Stay On-Topic and Verifies It

Alongside the failure at the top, I also ran into title-body drift: "Write an article about Laravel" somehow produces a Docker tutorial about 20% of the time. My fix: pass the topic in once, generate the draft, then verify in code that the topic keywords actually appear in the body, and discard the draft if they don't.

import os
import re
import json
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def generate_draft(topic: str, must_keywords: list[str]) -> dict | None:
    """Generate a draft locked to topic. Returns None if keywords are missing from the body."""
    prompt = f"""You are a technical blog editor. Write an article covering ONLY this topic: "{topic}".
Do not drift to other subjects. Output Markdown body only, 1200+ characters, 3+ H2 headings.
Naturally use all of the following terms in the body: {', '.join(must_keywords)}"""

    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=3000,
        messages=[{"role": "user", "content": prompt}],
    )
    body = resp.content[0].text

    # --- Drift check: required keywords must appear in the body ---
    missing = [kw for kw in must_keywords if kw.lower() not in body.lower()]
    if missing:
        print(f"[Discarded] Topic drift detected — missing={missing}")
        return None

    # Match an H1 only ("# Title"); requiring whitespace after "#" keeps "## Heading" from matching
    title = re.search(r"^#\s+(.+)$", body, re.M)
    return {
        "topic": topic,
        "title": title.group(1).strip() if title else topic,
        "body": body,
        "chars": len(body),
    }

if __name__ == "__main__":
    d = generate_draft(
        "Killing N+1 with whereHas in Laravel Eloquent",
        must_keywords=["Eloquent", "whereHas", "N+1"],
    )
    print(json.dumps({"ok": bool(d), "chars": d["chars"] if d else 0}, ensure_ascii=False))

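Before I added the missing check, rework (rewriting drifted drafts) happened roughly 30% of the time. Afterwards: nearly zero. Those four lines, built around a single list comprehension, did more work than anything else in the pipeline. That's the honest measurement. Keep in mind that the check is a case-insensitive substring match, so it confirms the terms appear in the body, not that the article is actually about them.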
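
Because drifted drafts are discarded, the caller needs some way to try again without falling back into an unbounded loop. Here is a minimal sketch of a capped retry, assuming the generator is passed in as a callable with the same shape as generate_draft. `generate_with_retry` and the default of 3 attempts are hypothetical choices, not part of the original pipeline:

```python
from typing import Callable, Optional


def generate_with_retry(
    generate: Callable[[str, list[str]], Optional[dict]],
    topic: str,
    must_keywords: list[str],
    max_attempts: int = 3,
) -> Optional[dict]:
    """Call generate until it returns a draft, giving up after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        draft = generate(topic, must_keywords)
        if draft is not None:
            return draft
        print(f"[Retry] attempt {attempt}/{max_attempts} discarded")
    return None  # Skip this topic rather than loop forever
```

In use this would be called as generate_with_retry(generate_draft, topic, must_keywords). Each retry is another paid API call, so the cap also bounds the cost of a stubborn topic.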

Code ②: Auto-Injecting Real Affiliate Links via Placeholders (Fabricated URL Prevention Included)

Next up: affiliate link injection — the revenue lifeline. And another failure. If you ask an LLM to "insert affiliate links," it will happily fabricate URLs (inventing A8.net IDs that don't exist). The iron rule: never let the LLM construct links; inject them from a code-side master list. Tell the LLM to write placeholders like [[AFF:php_book]] and nothing more.

import re  # needed when this block is run on its own

# Your own master list: only real links (e.g., from A8.net) go here
AFF_LINKS = {
    "php_book": "https://px.a8.net/svt/ejp?a8mat=XXXX",      # PHP/Laravel tech book
    "saas_db":  "https://px.a8.net/svt/ejp?a8mat=YYYY",      # Developer-focused SaaS
    "nisa":     "https://px.a8.net/svt/ejp?a8mat=ZZZZ",      # Investment account signup
}

def inject_affiliate(body: str) -> tuple[str, int]:
    """Replace [[AFF:key]] with real Markdown links. Silently drops unknown keys."""
    injected = 0
    def repl(m: re.Match) -> str:
        nonlocal injected
        key = m.group(1)
        url = AFF_LINKS.get(key)
        if not url:
            return ""  # Fabricated key — remove silently to prevent accidents
        injected += 1
        labels = {"php_book": "📕 Laravel in Practice (PR)",
                  "saas_db": "⚡ Try This Developer SaaS Free (PR)",
                  "nisa": "💰 Open an Investment Account (PR)"}
        return f"\n\n> [{labels.get(key, 'Learn More')}]({url})\n"
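    # Accept any characters in the key so fabricated keys with digits or capitals
    # are also caught and dropped instead of leaking into the published text
    new_body = re.sub(r"\[\[AFF:([^\[\]]*)\]\]", repl, body)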
    return new_body, injected

body = "A good book is the fastest path to PHP mastery. [[AFF:php_book]] For investing, see [[AFF:nisa]]"
out, n = inject_affiliate(body)
print(f"Injected {n} link(s)")
print(out)

The key design choice: unknown keys are silently dropped rather than raising an exception. If the pipeline halts on every unknown key, volume production falls apart. And always append a disclosure such as (PR). Japan's stealth marketing regulations require that advertising be clearly identifiable as advertising, and the label is also the minimum bar for reader trust. I published one article without it, got called out in the comments, and felt my stomach drop.
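
Dropping unknown keys silently keeps the pipeline moving, but it also hides how often the model invents keys. A small audit run before injection can surface them without halting anything. This is a sketch only: `audit_placeholders` and `PLACEHOLDER` are hypothetical names, and the key set is passed in rather than read from AFF_LINKS so the block runs on its own:

```python
import re

# Same [[AFF:key]] shape as above, accepting any characters in the key
PLACEHOLDER = re.compile(r"\[\[AFF:([^\[\]]*)\]\]")


def audit_placeholders(body: str, known_keys: set[str]) -> tuple[list[str], list[str]]:
    """Split the placeholder keys found in body into (known, unknown), in order of appearance."""
    found = PLACEHOLDER.findall(body)
    known = [k for k in found if k in known_keys]
    unknown = [k for k in found if k not in known_keys]
    return known, unknown
```

Calling audit_placeholders(body, set(AFF_LINKS)) and printing the unknown list gives a running record of fabricated keys while the injection step stays exception-free.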

Local State Management: 7-Day Topic Lock via niches.json

Duplicates are the biggest enemy of bulk production. I manage 104 niches in rotation with a 7-day lock on any topic already written. The implementation is a local JSON file that records posted_at; at startup, the script selects one candidate that is both "7+ days old" and "not yet written today." A few lines of Python datetime — that's it. No more accidentally re-publishing last week's article. The urge to spin up a database is real, but resisting it and staying with a single JSON file is what keeps solo development fast.
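
The article does not show the lock code, so the following is only a sketch of how it could look. It assumes niches.json holds a list of objects with a "name" and a "posted_at" field (an ISO 8601 string without timezone, or null if never posted). That schema and every function name here are assumptions:

```python
import json
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

LOCK = timedelta(days=7)


def _last_posted(niche: dict) -> datetime:
    """Parse posted_at; niches never posted sort as the oldest possible."""
    ts = niche.get("posted_at")
    return datetime.fromisoformat(ts) if ts else datetime.min


def pick_niche(niches: list[dict], now: datetime) -> Optional[dict]:
    """Return the unlocked niche that has waited longest, or None if all are locked."""
    eligible = [n for n in niches if now - _last_posted(n) >= LOCK]
    return min(eligible, key=_last_posted) if eligible else None


def mark_posted(niche: dict, now: datetime) -> None:
    """Record the posting time so the 7-day lock starts from now."""
    niche["posted_at"] = now.isoformat(timespec="seconds")


def load_niches(path: str = "niches.json") -> list[dict]:
    return json.loads(Path(path).read_text(encoding="utf-8"))


def save_niches(niches: list[dict], path: str = "niches.json") -> None:
    Path(path).write_text(json.dumps(niches, ensure_ascii=False, indent=2), encoding="utf-8")
```

Picking the niche that has waited longest, rather than the first unlocked one in file order, keeps the rotation moving through the whole list instead of cycling among the first few entries.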

Cost Breakdown: ~¥8–15 Per Article, 40% Savings by Routing Scoring to Haiku

The money question. Generating ~1,500 characters plus headings with claude-sonnet-4-6 costs roughly ¥8–15 per article in combined input/output tokens. The big lever: splitting quality scoring off to claude-haiku-4-5. A prompt like "score this draft out of 100 and explain if it's below 70" is well within Haiku's capability, and moving just that step from Sonnet to Haiku cut API costs by roughly 40%. Smart model for generation, cheap model for inspection — the ROI on that split is high. Note: these are my own measurements and vary with token count and time period; always verify against your own dashboard.
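
The scoring step itself can be kept small. The sketch below assumes the scoring model is asked to start its reply with a "SCORE: <number>" line, which is my own convention and not something from the original prompt. The model call is injected as a hypothetical `ask` callable (prompt text in, reply text out) so the parsing logic can be tested without network access:

```python
import re
from typing import Callable, Optional

SCORE_PROMPT = (
    "Score this draft out of 100 and explain if it's below 70.\n"
    "Start your reply with a line of the form 'SCORE: <number>'.\n\n"
)


def parse_score(reply: str) -> Optional[int]:
    """Extract the integer after 'SCORE:'; None if it is absent or outside 0-100."""
    m = re.search(r"SCORE:\s*(\d{1,3})\b", reply)
    if not m:
        return None
    score = int(m.group(1))
    return score if 0 <= score <= 100 else None


def score_draft(body: str, ask: Callable[[str], str]) -> Optional[int]:
    """Send the draft to the scoring model through ask and parse the score from its reply."""
    return parse_score(ask(SCORE_PROMPT + body))
```

In the real pipeline, `ask` would wrap a client.messages.create call like the one in Code ①, with the model set to claude-haiku-4-5. A reply that cannot be parsed returns None, which is safest to treat as a failing score.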

Honest Revenue Reality: Automation Does Not Solve Traffic

The most important part. This pipeline makes article production more than 10× faster (90 minutes down to 7), but it does not guarantee revenue. Over three weeks, my production speed has been great, while affiliate revenue is still minimal. The reason is obvious: automation only speeds up the writing step. The real bottleneck is always traffic. That's why I've shifted strategy from "my own blog" to "native posting where readers already exist (Qiita, Zenn, etc.)," using the pipeline's speed as the engine.

To summarize: the Python + Claude semi-auto pipeline is genuinely effective as a tool that gets an article done in 7 minutes without sacrificing quality. But wiring it all the way to auto-publishing is an accident waiting to happen. Let the machine generate; let the human decide whether to publish and how to drive traffic — that's the conclusion I reached after nearly getting banned in three days. It's absolutely worth building. Start by copy-pasting Code ① and ② above and running one article through your own niche.

🛠 Related Links (Author's Projects)

For those who want to run Claude / GitHub Actions-based development automation like this in their own environment right away:

AI development automation kit & prompt collection (copy-paste-ready configs and real CLAUDE.md examples) → https://itsuya.gumroad.com/l/agentrules260619
Free tool collection for instantly solving dev errors — DevToolBox → https://1280itsuya.github.io/devtools/

※ Links to the author's own products and site (includes promotional content).

If you found this useful: I packaged 50 copy-paste AI debugging prompts + drop-in Claude Code config templates (CLAUDE.md, settings.json, MCP) into a small kit.
Launch deal: code START50 = 50% off → 50 AI Debugging Prompts + Claude Code Config Pack (about $6, 50% off applied)
New: my 10-chapter ebook Practical Claude Code — automation & unattended operation (about $9, 50% off applied)

DEV Community