I Built an AI Content Machine with OpenClaw and MCP — Here's My Actual Config and Where It Broke

Michael K

I'm a solo founder running an AI agency in Berlin. No team. Just me, some cron jobs, and a lot of duct tape.

This post is the technical breakdown of how I automated my entire content operation: 150+ blog posts in 4 languages, 23 cron jobs, and an SEO monitoring system that, on a recent run, caught keyword stuffing on 24 pages while I was still making coffee.

Not a thought piece. Actual code, actual configs, actual failures. If you're building with AI agents, maybe this saves you a few weeks.


TL;DR:

  • Built a fully automated content pipeline using OpenClaw (agent runner) + MCP (tool protocol) + Claude (reasoning) + Convex (DB/memory)
  • 23 cron jobs handle everything from overnight SEO audits to social engagement to visual content generation
  • Total cost: ~€300/month. The failures taught me more than the wins.

Scope & Honesty Check

Before the big numbers: here's what's automated vs what isn't.

Automated: blog drafting, translation (4 languages), hero image generation, publishing, social distribution, SEO monitoring, NLP entity analysis, GSC indexing, engagement scheduling, lead research, infographic/carousel creation.

Still manual: topic selection (I pick from 3 AI-proposed options each morning), quality review before publish (~10-15 min per post), strategic decisions, client work, complex problem-solving, and fixing the 30% of infographics that come out broken.

How I measure success: organic traffic from blog posts, LinkedIn follower growth, lead pipeline volume, and SEO scores (NLP entity salience, GEO compliance). Not post count.

Safety boundaries: the agent can't publish without my review, can't send emails to leads, can't modify production code, and can't spend money without approval. Dead letter queue catches silent failures. Every write operation has a verify step.


Why I Built This (The Honest Version)

Three months ago, I was drowning. Content marketing works — I know this, the data is clear. But I was spending 4 hours a day writing, editing, translating, posting. By noon, I'd published maybe one thing. And I still had actual client work to do.

The math didn't math.

I tried the obvious: ChatGPT, Jasper, Copy.ai. The results were... fine? Generic. Every output sounded like every other company using the same tools.

Then I realized I was using AI wrong. I was using AI tools when I needed AI agents.

Tools need you to orchestrate. Agents orchestrate themselves. That's the difference that changed everything for me.

The Architecture

Here's what actually runs:

┌──────────────────────────────────────────────┐
│  Mac Mini (always-on)                        │
│                                              │
│  OpenClaw ─── Agent runner + cron scheduler  │
│     │         (open source, runs locally)    │
│     │         github.com/openclaw/openclaw   │
│     │                                        │
│     ├── Claude API ── Opus 4.6 (complex)     │
│     │                 Sonnet 4.6 (routine)   │
│     │                                        │
│     ├── MCP Server ── 143 tools (Vercel)     │
│     │                 HTTP transport         │
│     │                 API key auth           │
│     │                                        │
│     └── Convex ────── CMS + blog posts       │
│                       Cortex memory system   │
│                       Real-time subscriptions│
└──────────────────────────────────────────────┘

Why OpenClaw over alternatives? I evaluated LangGraph, CrewAI, and plain Python scripts. LangGraph was too graph-heavy for what are mostly linear pipelines. CrewAI felt too opinionated about multi-agent patterns I didn't need yet. Plain scripts worked until I needed cron scheduling, session memory, tool orchestration, and multi-channel messaging — at which point I was rebuilding an agent framework anyway. OpenClaw gave me all of that with a config file and a SKILL.md.

Why MCP? I needed Claude to call 143 different functions. MCP is Anthropic's protocol for exactly this — typed tool definitions over HTTP (or stdio). One server, one auth header, all tools available. The alternative was 143 separate function definitions in every prompt.

The MCP Server (What It Actually Looks Like)

The "143 tools" in the architecture diagram above is a summary. Here's what a real tool registration looks like:

// Example: publish_to_social tool definition
{
  name: "publish_to_social",
  description: "Publish content to social media platforms via Typefully API",
  inputSchema: {
    type: "object",
    properties: {
      platforms: { type: "array", items: { type: "string", enum: ["x", "linkedin", "threads"] } },
      text: { type: "string", maxLength: 280 },  // X limit
      imageUrl: { type: "string", format: "uri" },
      account: { enum: ["company", "personal"] },
      scheduleAt: { type: "string", format: "date-time" }
    },
    required: ["platforms", "text"]
  },
  // Returns: { draftId, platform, status, url? }
}

Error handling is typed too — every tool returns { success: boolean, error?: string, code?: "RATE_LIMITED" | "AUTH_FAILED" | "NOT_FOUND" }. The dead letter queue keys on these error codes for retry logic.

Transport is HTTP (/api endpoint on Vercel), auth via X-API-Key header, no stdio needed since the server is remote.
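To make that concrete, here's a minimal sketch of what a wrapper like mcp-query.py does under the hood: a JSON-RPC 2.0 `tools/call` request over plain HTTP. The endpoint URL is a placeholder and the helper names are mine, not part of MCP.

```python
import json
import urllib.request

MCP_ENDPOINT = "https://example.vercel.app/api"  # placeholder, not my real URL

def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build the JSON-RPC 2.0 envelope MCP uses for tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

def call_tool(name: str, arguments: dict, api_key: str) -> dict:
    """POST one tool call to the remote MCP server, return the parsed reply."""
    body = json.dumps(build_tool_call(name, arguments)).encode()
    req = urllib.request.Request(
        MCP_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The same payload shape works for any of the 143 tools; only `name` and `arguments` change.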

The Content Pipeline (Step by Step)

When I publish a blog post, here's what OpenClaw actually does:

Step 1: Research & Keywords

python3 ~/clawd/scripts/mcp-query.py research_topic topic="AI agents for enterprise"
python3 ~/clawd/scripts/mcp-query.py generate_keywords topic="..." locale="en"

Research tool aggregates from web search + knowledge base. Keyword tool suggests primary/secondary keywords with volume estimates via GSC data + heuristics (not exact — but directionally useful for content planning).

Step 2: Draft in 4 Languages

python3 ~/clawd/scripts/mcp-query.py create_blog_post \
  locale="de" \
  title="..." \
  content="..." \
  translationGroupId="abc123"

One brief → four versions. German, English, French, Italian. Not translated — each written natively for the locale. The translationGroupId links them for hreflang.
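At render time, the translationGroupId becomes hreflang alternates. This is an illustrative sketch, not my actual template code; the field names are assumptions:

```python
def hreflang_tags(posts: list[dict], group_id: str, base_url: str) -> list[str]:
    """Emit <link rel="alternate"> tags for all locale versions in one
    translation group, plus an x-default pointing at the English version."""
    group = sorted(
        (p for p in posts if p["translationGroupId"] == group_id),
        key=lambda p: p["locale"],
    )
    tags = [
        f'<link rel="alternate" hreflang="{p["locale"]}" '
        f'href="{base_url}/blog/{p["locale"]}/{p["slug"]}" />'
        for p in group
    ]
    default = next((p for p in group if p["locale"] == "en"), None)
    if default:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" '
            f'href="{base_url}/blog/en/{default["slug"]}" />'
        )
    return tags
```

Every version links to every other version; hreflang only works when the links are bidirectional.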

Step 3: Hero Image

python3 ~/clawd/scripts/mcp-query.py generate_hero_image \
  prompt="Abstract visualization of AI agents..." \
  style="modern, clean, tech aesthetic"

Blocking step. No image → no publish. OpenClaw enforces this. I added this rule after publishing 20 posts with broken image links. Embarrassing.

Step 4: Publish & Verify

# Publish
python3 ~/clawd/scripts/mcp-query.py publish_blog_post id="..."

# Verify (this is important)
curl -I https://contextstudios.ai/blog/de/article-slug
# Must return 200 before proceeding

The verify step caught a bug where the publish succeeded but the CDN cache hadn't invalidated. Posts were "live" but returning 404. Took me two weeks to figure out why traffic wasn't matching publish volume.

Step 5: Social Distribution

python3 ~/clawd/scripts/mcp-query.py publish_to_social \
  platforms='["x", "linkedin"]' \
  text="..." \
  imageUrl="..."

Each platform has different limits and formats — the MCP tool handles that. X enforces 280 chars server-side. LinkedIn gets a longer structured post.

Step 6: GSC Indexing

python3 ~/clawd/scripts/mcp-query.py submit_to_gsc url="..."

Tell Google to crawl immediately instead of waiting for natural discovery.

Total time: 8-12 minutes. No human in the loop until review.

Guardrails & Safety

Dev.to readers will rightfully ask: what stops this from publishing garbage?

Pre-publish review: I review every blog post before it goes live. ~10-15 minutes. The agent proposes, I approve.

Automated quality gates:

  • Content quality scan (4 AM daily): checks accuracy, tone, depth, usability, freshness. Posts scoring C or below get flagged.
  • NLP entity salience check: verifies the primary topic entity is prominent enough for Google NLP. If below threshold, the pipeline rewrites before publish.
  • Keyword stuffing detector: flags pages with unnatural density (learned this the hard way — see Disasters below).

Failure handling:

  • Dead letter queue: every failed MCP call gets caught, logged, retried with exponential backoff.
  • Idempotent operations: retries can't create duplicates. upsert over create.
  • Post-publish verification: curl -I every URL, confirm 200. If not → don't post to social.

Rollback: every blog post update saves the original to a backup directory first. CMS page updates go through a wrapper script that blocks if the new version has fewer sections than the current one (this saved me after one catastrophic bulk edit destroyed 6 pages).
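The core of that wrapper fits in a few lines. A sketch with illustrative names (my real version shells out through MCP; this shows the guard logic):

```python
import json
import pathlib
import time

BACKUP_DIR = pathlib.Path("backups")  # illustrative location

def guarded_update(current: dict, proposed: dict) -> dict:
    """Back up the current page, then refuse any update that shrinks it.

    A bulk edit that silently drops sections is exactly the failure mode
    this guards against; a real restructure needs a manual override.
    """
    BACKUP_DIR.mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    (BACKUP_DIR / f'{current["slug"]}-{stamp}.json').write_text(
        json.dumps(current, indent=2)
    )
    if len(proposed.get("sections", [])) < len(current.get("sections", [])):
        raise ValueError(
            f'blocked: {current["slug"]} would go from '
            f'{len(current["sections"])} to '
            f'{len(proposed.get("sections", []))} sections'
        )
    return proposed
```

The backup happens before the check, so even a blocked update leaves a restore point.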

The Cron Jobs (23 of Them)

Here's the categorized overview. Full YAML config is in the appendix.

Overnight (02:00-06:00): Output audit → deep intel sweep → memory maintenance → content quality scan → SEO/GEO audit → auto-healer. Six jobs, fully autonomous. By morning, everything from yesterday is verified and today's intel is ready.

Morning (07:00-10:00): Blog topic proposals → GSC indexing → morning briefing → ecosystem scan → EU engagement. The briefing at 08:30 is the one message I read to know the state of everything.

Midday (12:00-14:00): Infographic/carousel creation → lead gen → breaking news scan.

Afternoon/Evening (15:00-23:00): Reddit engagement → US morning peak (biggest round) → US afternoon + daily self-assessment → nightly improvement scanner.

Weekly: Meta-learning review (how well are the self-learning systems actually learning?), NLP entity audit (week-over-week), relay log rotation, analytics review.

Infrastructure: Telegram relay health check every 4 hours.

📋 Full cron config (all 23 jobs)

# === Overnight Pipeline (runs while I sleep) ===
# Note: cron schedules are UTC; comments show local CET time
- name: daily-output-audit
  schedule: "0 1 * * *"       # 02:00 CET
  task: "Verify all social posts, blog posts, cron health from yesterday"

- name: overnight-deep-intel
  schedule: "0 2 * * *"       # 03:00 CET
  task: "Multi-layer scan: X accounts, web news, GitHub releases, YouTube channels"

- name: cortex-wake-critical
  schedule: "0 3 * * *"       # 04:00 CET
  task: "Prevent important cognitive memories from decaying below threshold"

- name: content-quality-scan
  schedule: "0 3 * * *"       # 04:00 CET
  task: "5-dimension quality audit: accuracy, tone, depth, usability, freshness"

- name: daily-seo-geo-audit
  schedule: "0 4 * * *"       # 05:00 CET
  task: "Full SEO + GEO + NLP entity salience + schema + hreflang audit"

- name: seo-auto-healer
  schedule: "45 4 * * *"      # 05:45 CET
  task: "Read audit manifest, auto-fix Tier 1 issues, propose Tier 3 to me"

# === Morning Pipeline ===
- name: blog-topic-proposals
  schedule: "0 6 * * *"       # 07:00 CET
  task: "Read overnight intel, propose 3 topics via Telegram buttons, spawn autopilot"

- name: gsc-bulk-indexing
  schedule: "0 6 * * *"       # 07:00 CET
  task: "Submit new blog URLs to Google Search Console"

- name: morning-briefing
  schedule: "30 7 * * *"      # 08:30 CET
  task: "Comprehensive dashboard: intel, metrics, health, scanner ideas"

- name: daily-deep-scan
  schedule: "0 8 * * *"       # 09:00 CET
  task: "OpenClaw ecosystem scan: skills, security, competitive intel, brand mentions"

- name: eu-engagement
  schedule: "0 9 * * *"       # 10:00 CET
  task: "X + LinkedIn engagement targeting EU companies and accounts"

# === Midday ===
- name: infographic-carousel
  schedule: "0 11 * * *"      # 12:00 CET
  task: "Create infographic (yesterday's blog) + carousel (2 days ago)"

- name: linkedin-outreach
  schedule: "0 12 * * *"      # 13:00 CET
  task: "Self-learning lead gen: intent signals, engagement bridges, competitor intel"

- name: midday-news-scan
  schedule: "30 12 * * *"     # 13:30 CET
  task: "Quick breaking news check, update daily intel file"

# === Afternoon/Evening (US market hours) ===
- name: reddit-daily-engagement
  schedule: "0 14 * * *"      # 15:00 CET
  task: "2-3 authentic comments on r/ClaudeAI, r/LocalLLaMA, etc."

- name: us-morning-engagement
  schedule: "0 15 * * *"      # 16:00 CET (10 AM ET)
  task: "Biggest round: X + LinkedIn targeting US accounts, both company + personal"

- name: us-afternoon-engagement
  schedule: "0 19 * * *"      # 20:00 CET (2 PM ET)
  task: "Final round + daily self-assessment + pattern learning"

- name: nightly-scanner
  schedule: "0 22 * * *"      # 23:00 CET
  task: "Find improvement ideas for the setup, auto-apply safe changes"

# === Weekly ===
- name: weekly-meta-learning
  schedule: "0 0 * * 0"       # Sun 01:00
  task: "Review self-learning effectiveness, prune stale rules"

- name: nlp-entity-weekly-sweep
  schedule: "0 2 * * 0"       # Sun 03:00
  task: "Full NLP entity salience audit, week-over-week comparison"

- name: relay-log-rotation
  schedule: "0 4 * * 0"       # Sun 05:00
  task: "Archive old Telegram relay logs"

- name: weekly-analytics-review
  schedule: "0 9 * * 0"       # Sun 10:00
  task: "Comprehensive X + LinkedIn + blog performance review"

# === Infrastructure ===
- name: telegram-relay-health
  schedule: "0 */4 * * *"     # Every 4 hours
  task: "Verify Telegram message relay is working, check dead letter queue"

The Numbers

Metric                              Value
------                              -----
Blog posts published                150+
Languages                           4 (DE, EN, FR, IT)
CMS landing pages                   200+
Automated cron jobs                 23
MCP tools                           143
Avg. NLP entity salience (before)   0.164
Avg. NLP entity salience (after)    0.177
Pages audited daily                 258

On a recent run: keyword stuffing detected on 24 pages at 4 AM, fixed by 10 AM. I found out over breakfast.

Where It Broke (The Important Part)

Let me tell you about my disasters. This is the stuff nobody writes about.

Disaster 1: The Keyword Stuffing Incident

I gave the content agent aggressive keyword targets. "Optimize heavily for these 5 keywords."

The result? Pages that read like SEO spam from 2010. Keyword density through the roof. Sentences that made no grammatical sense. I had to rewrite 40+ pages manually.

The irony of using AI agents to create worse content wasn't lost on me.

What I changed: Keywords go in the brief, not post-processing. Write naturally → check density → adjust if needed. Order matters.

Disaster 2: Silent Failures

MCP tools fail. Network issues, rate limits, API changes. For three weeks, failed operations just... disappeared. No retry. No log. Nothing.

I only noticed when LinkedIn engagement dropped to zero. The publishing tool had been failing silently for weeks. OpenClaw was calling it, getting errors, and moving on.

What I built: Dead letter queue. Every failed operation gets caught, logged, and retried with exponential backoff. Now it's a daily cron job. Should have been there from day one.
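The queue logic itself is small. What matters is the error-code classification and the capped exponential backoff. A minimal in-memory sketch (my real queue persists to Convex):

```python
import time

RETRYABLE = {"RATE_LIMITED"}          # transient: back off and retry
FATAL = {"AUTH_FAILED", "NOT_FOUND"}  # retrying won't help: alert a human
MAX_ATTEMPTS = 5

def backoff_delay(attempt: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Exponential backoff: 30s, 60s, 120s, ... capped at one hour."""
    return min(base * (2 ** attempt), cap)

def handle_failure(entry: dict) -> str:
    """Decide what happens to one failed MCP call: retry, alert, or park it."""
    code = entry.get("code")
    if code in FATAL:
        return "alert"
    if code not in RETRYABLE or entry["attempt"] >= MAX_ATTEMPTS:
        return "dead"  # keep in the queue for manual review
    entry["attempt"] += 1
    entry["retry_at"] = time.time() + backoff_delay(entry["attempt"])
    return "retry"
```

The split matters: retrying an AUTH_FAILED forever is exactly the kind of silent spinning that hid the LinkedIn outage.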

Disaster 3: Optimizing the Wrong Metric

For one terrible month, I optimized for volume. 10 posts a day! 20 posts a day! Look at all this content!

Traffic didn't scale. Engagement dropped. Quality tanked.

Two well-researched posts consistently beat ten thin ones. Every time.

What I learned: Measure outcomes (traffic, engagement, conversions), not outputs (post count). This one took me too long to figure out.

Disaster 4: Visual Content Generation Is Still Broken

I automated infographic and carousel creation. 19 templates, automatic export, multi-platform publishing. Sounds great on paper.

Reality: footers get cut off. Half the slide is unused whitespace. Sizing doesn't match platform specs. The carousel PDF metadata is wrong so LinkedIn shows "document-1740384.pdf" instead of a title. Icons loaded from CDNs break in headless export. Every platform wants different dimensions.

I've rewritten the export pipeline three times. It's still not where I want it. The templates work maybe 70% of the time — the other 30% need manual fixes before I can post them.

What I'm learning: Visual content generation is a fundamentally harder problem than text. Text is forgiving — a slightly awkward sentence still communicates. A badly cropped infographic looks broken. This one's still work in progress.

Disaster 5: The Memory Problem

Agents forget everything between sessions. I'd have the same conversation about brand voice five times. The agent would make the same mistake repeatedly.

Super frustrating. Also obvious in hindsight.

What I built: Cortex, a cognitive memory system with separate stores — sensory (24h buffer), episodic (events), semantic (facts/decisions), procedural (how-to), prospective (future intentions). A daily decay cron applies forgetting curves.

Now the agent remembers decisions. Runs on Convex. Minimal extra cost.

The Cortex Memory System (Because Agents Are Goldfish)

This is the part I'm most excited about. Here's how it works:

# Store a decision
cortex_remember(
  store="semantic",
  category="decision", 
  title="LinkedIn posting frequency",
  content="3x/day optimal. More causes engagement drop.",
  tags=["social", "linkedin"],
  source="conversation"
)

# Recall relevant context
cortex_recall(
  query="linkedin engagement strategy",
  store="semantic",
  limit=5
)

Architecture:

  • Sensory store — 24h buffer, auto-expires
  • Episodic store — Event memory with timestamps
  • Semantic store — Facts, decisions, lessons
  • Procedural store — How-to knowledge
  • Prospective store — Future intentions

The decay cron is important. Old, rarely-accessed memories fade. Important ones strengthen through retrieval. It's not perfect, but it's way better than starting fresh every session.
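If you want to replicate it: the decay is a plain forgetting curve, and retrieval bumps strength back up. The numbers here are illustrative defaults, not my tuned values:

```python
def decayed_strength(strength: float, days_since_access: float,
                     half_life_days: float = 14.0) -> float:
    """Forgetting curve: strength halves every half_life_days of neglect."""
    return strength * 0.5 ** (days_since_access / half_life_days)

def on_retrieval(strength: float, boost: float = 0.2) -> float:
    """Retrieval strengthens a memory (capped at 1.0) and resets its age."""
    return min(1.0, strength + boost)

def nightly_sweep(memories: list[dict], floor: float = 0.05) -> list[dict]:
    """Apply decay to every memory and drop anything below the floor."""
    kept = []
    for m in memories:
        m["strength"] = decayed_strength(m["strength"], m["days_since_access"])
        if m["strength"] >= floor:
            kept.append(m)
    return kept
```

The half-life is the only knob that really matters: too short and the agent forgets decisions, too long and stale rules never die.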

MCP Tool Patterns That Actually Work

After 143 tools, some patterns emerged:

Pattern 1: Idempotent Everything

# Safe to retry
python3 ~/clawd/scripts/mcp-query.py upsert_blog_post \
  slug="ai-agents-guide" \
  locale="en" \
  content="..."

If the dead letter queue retries, it shouldn't create duplicates or corrupt data.
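Upsert means the write is keyed, not appended: the same call twice yields one row. A sketch of the contract (my real store is Convex; a dict stands in here):

```python
def upsert_blog_post(store: dict, slug: str, locale: str, **fields) -> dict:
    """Insert-or-update keyed on (slug, locale): blind retries are harmless."""
    key = (slug, locale)
    merged = {**store.get(key, {}), "slug": slug, "locale": locale, **fields}
    store[key] = merged
    return merged
```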

Pattern 2: Always Verify

# Publish returns success
python3 ~/clawd/scripts/mcp-query.py publish_blog_post id="..."

# But always verify
python3 ~/clawd/scripts/mcp-query.py get_blog_post id="..."
# Check status === 'published'

MCP tools can return success while downstream systems fail. Trust but verify.

Pattern 3: Chunk Large Operations

# Don't audit all 258 pages in one call — page through in batches of 50
offset=0
while pages=$(python3 ~/clawd/scripts/mcp-query.py list_pages --limit 50 --offset "$offset") && [ -n "$pages" ]; do
  for page in $pages; do
    python3 ~/clawd/scripts/mcp-query.py audit_page slug="$page" || echo "audit failed: $page" >&2
  done
  offset=$((offset + 50))
done

Single failure shouldn't abort everything.

The Economics (Real Numbers)

Development: ~3 months of focused work, built alongside client projects.

Monthly running costs:

  • OpenClaw: €0 (open source, runs locally)
  • Claude API: ~€200/month (Opus + Sonnet mix)
  • Vercel (MCP server + hosting): ~€40/month
  • Convex (DB + memory): ~€25/month
  • Image generation: ~€10/month
  • Other (domains, monitoring): ~€25/month

Total: ~€300/month

What I'd pay for equivalent:

  • 2 content writers: €8,000+/month
  • Content agency: €3,000-10,000/month

The math works. But the upfront investment is real. This isn't a weekend project.

The AEO Stack (Early Experiment)

I built a set of machine-readable files for AI crawlers and assistants:

  • llms.txt — site description in a format LLM crawlers can parse (emerging convention, not a standard yet)
  • llms-full.txt — comprehensive content index
  • .well-known/mcp.json — public tool definitions (non-sensitive subset)
  • .well-known/brand-facts.json — structured brand information

curl https://contextstudios.ai/llms.txt
curl https://contextstudios.ai/.well-known/mcp.json

Who consumes these? GPTBot, ClaudeBot, PerplexityBot all crawl llms.txt if it exists. The mcp.json is more speculative — aimed at a future where AI assistants can discover and call services. Measurable effect so far: hard to isolate, but AI-referred traffic is growing. Early days.

Caution: don't expose internal tool definitions or auth patterns in public-facing files. My mcp.json only describes the public, read-only subset.

What You Can Copy Today

Three actionable things you can do this week:

  1. Build one MCP tool that wraps your most tedious API call. Start with something you do 5x a day manually. Connect it to Claude via OpenClaw or any MCP client.

  2. Add a dead letter queue to any automated pipeline you already have. Failed operations → log → retry with backoff. You'll be shocked what's been failing silently.

  3. Create llms.txt for your site/project. It takes 10 minutes and makes your content visible to AI crawlers that are already visiting.
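For that third one, the emerging convention is an H1, a one-line summary, then linked sections. An illustrative skeleton (names and URLs are placeholders, not my actual file):

```markdown
# Your Project Name

> One sentence on what the site is and who it is for.

## Blog

- [Post title](https://example.com/blog/post-slug): one-line summary of the post

## About

- [Services](https://example.com/services): what you offer
```

Serve it as plain text at /llms.txt and you're done.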

What I'd Do Differently

Start with monitoring. I built 50 tools before proper logging. Debugging was hell.

Build the dead letter queue first. Assume failures. Handle them.

Define success metrics early. "More content" isn't a metric.

Test MCP tools in isolation. Complex chains hide bugs.

Document as you go. Future me thanks past me for every comment.

How to Start

If you want to build something similar:

  1. Set up OpenClaw — start with the examples
  2. Paste this article into OpenClaw as context
  3. Build one MCP tool — something simple
  4. Add complexity gradually — don't build 143 tools on day one

The mistake: trying to automate everything at once. Start with one workflow. Get it solid. Expand.

OpenClaw is open source. MCP is documented by Anthropic. Convex has a generous free tier. You can build this.

I'm happy to chat about what worked and what didn't. Not pitching anything — just a solo founder in Berlin who spent three months on this and wants someone else to avoid my mistakes.

Context Studios — contextstudios.ai


Appendix: MCP Tool Categories

For reference, here's how I organize the 143 tools:

Blog (26 tools): create, update, delete, publish, unpublish, list, get, search, translate, bulk operations, metadata management

CMS (19 tools): page CRUD, section management, navigation, templates, localization

Social (21 tools): Twitter/X posting, LinkedIn posting, scheduling, engagement tracking, analytics

SEO (10 tools): NLP analysis, keyword research, density check, GSC submission, sitemap generation

Video (20 tools): script generation, TTS, avatar creation, scene generation, assembly

Images (11 tools): hero generation, social cards, optimization, bulk processing

Cortex (8 tools): remember, recall, forget, decay, consolidate, stats, checkpoint, wake

Research (4 tools): topic research, competitor analysis, trend detection, source aggregation

Other (24 tools): notifications, webhooks, logging, analytics, admin functions

Each tool is a separate function with typed inputs/outputs. MCP handles the protocol. OpenClaw handles the orchestration. Claude handles the reasoning.


Solo founder. 23 cron jobs. 150+ posts. What works and what breaks.

Context Studios — contextstudios.ai
