The Premise
What if an AI system could market itself, track its own costs, learn from its engagement data, and sell products — all running autonomously on a cheap VPS?
That's what I built with APEX. It's been running for a week. Here are the real numbers, the technical decisions, and what I got wrong.
The Stack
- VPS: DigitalOcean Basic ($48/month), Ubuntu 24.04
- Agent framework: OpenClaw (open source)
- LLM: Anthropic Claude Sonnet 4.6 via API
- Web search: Gemini provider (free tier)
- Memory: SQLite with Gemini embeddings (3072 dimensions)
- Social: X API (pay-per-use tier) with OAuth 1.0a
- Payments: Stripe
- Monitoring: Discord webhooks (5 channels)
Total daily cost: $2.12
The Architecture
APEX runs 7 autonomous cron jobs daily. Each job is an isolated OpenClaw session with a specific mission:
| Time | Job | Purpose | Model |
|------|-----|---------|-------|
| 6 AM | research-scan | Web news scan | Haiku |
| 8 AM | engage-mentions | Reply to X mentions | Sonnet |
| 10 AM | daily-post | Original tweet | Sonnet |
| 12 PM | daily-tweets | 2 tweets (hot take + question) | Sonnet |
| 4 PM | engage-afternoon | Engagement + build-in-public | Sonnet |
| 8 PM | reddit-and-costs | Reddit drafts + cost check | Sonnet |
| 11 PM | daily-pnl | P&L summary + memory update | Sonnet |
The system also runs a weekly thread every Monday — a 5-7 tweet thread on the best-performing topic of the week.
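Concretely, the schedule can be sketched as a crontab. The `openclaw` command, its subcommand, and all flags except `--thinking off` (which comes up in the cost section below) are illustrative assumptions, not the tool's documented interface:

```cron
# Illustrative crontab (server local time); CLI syntax is an assumption.
# Three of the seven jobs shown, each launching a fresh isolated session.
0 6  * * *  openclaw run research-scan --model haiku  --thinking off
0 10 * * *  openclaw run daily-post    --model sonnet --thinking off
0 23 * * *  openclaw run daily-pnl     --model sonnet --thinking off
```

Because each entry starts its own clean session, no conversation history carries over from one job to the next.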
The Cost Optimization Journey
This is where things get interesting. My first version burned $7.60/day in LLM costs alone. After a week of optimization, I got it to $0.24/day — a 97% reduction.
Problem 1: Bootstrap Bloat
OpenClaw loads workspace files (SOUL.md, AGENTS.md, USER.md, etc.) on every API call. These files define the agent's identity, rules, and context. My initial setup had ~12KB of bootstrap files.
Every API call. 12KB. That adds up fast.
Fix: Ruthlessly compressed every bootstrap file. SOUL.md went from a detailed personality essay to a tight 2,655-byte operational identity. AGENTS.md became 840 bytes. Total bootstrap: 2,335 bytes (80% reduction).
The key insight: the agent doesn't need to know everything on every call. It needs to know who it is and what it's doing right now. Put the rest in searchable memory.
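To see why this matters, here's a back-of-envelope calculation. The bytes-per-token ratio, per-token price, and call count below are my assumptions for illustration, not figures from APEX's logs:

```python
# Back-of-envelope cost of re-sending bootstrap files on every API call.
# Assumptions: ~4 bytes per token, $3 per million input tokens, and
# ~40 API calls per day across the cron jobs.
BYTES_PER_TOKEN = 4
PRICE_PER_MTOK = 3.00
CALLS_PER_DAY = 40

def daily_bootstrap_cost(bootstrap_bytes: int) -> float:
    tokens = bootstrap_bytes / BYTES_PER_TOKEN
    return tokens * CALLS_PER_DAY * PRICE_PER_MTOK / 1_000_000

before = daily_bootstrap_cost(12_000)  # ~12KB of bootstrap: ≈ $0.36/day
after = daily_bootstrap_cost(2_335)    # compressed total: ≈ $0.07/day
```

Pennies per day either way, but this overhead multiplies across every tool call and retry in every session, which is how it compounds into real money.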
Problem 2: Invisible Reasoning Tokens
Sonnet's extended thinking feature generates chain-of-thought tokens you never see but still pay for. On cron jobs that just need to execute a task, this is waste.
Fix: --thinking off on every cron job. Saves 30-50% per session.
Problem 3: Context Accumulation
Shared sessions between cron jobs meant conversation history piled up. Each subsequent job in a session started with more tokens already consumed.
Fix: Isolated sessions per job. Each cron runs in its own clean session. No token inheritance.
Problem 4: Wrong Model for the Job
Running Sonnet for a simple web search scan is like using a sports car for grocery runs.
Fix: Haiku for scanning and simple tasks (~10x cheaper), Sonnet only for jobs that need quality output.
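The routing itself can be a lookup table. The model identifiers and job names below are illustrative stand-ins, not Anthropic's actual model IDs:

```python
# Sketch of per-job model routing; identifiers are illustrative.
CHEAP = "haiku"     # ~10x cheaper: scans, checks, summaries
QUALITY = "sonnet"  # public-facing writing

JOB_MODELS = {
    "research-scan": CHEAP,
    "daily-post": QUALITY,
    "daily-pnl": QUALITY,
}

def model_for(job: str) -> str:
    # Default to the cheap model so a newly added job
    # can't silently burn budget on the expensive one.
    return JOB_MODELS.get(job, CHEAP)
```

Defaulting to the cheap model inverts the usual failure mode: forgetting to configure a job costs quality, not money.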
Problem 5: No Budget Guardrails
A single runaway job could blow the daily budget.
Fix: cost-control.json with per-module daily caps. The system checks these before executing.
```json
{
  "system_daily_cap": 3.00,
  "system_monthly_cap": 60.00,
  "modules": {
    "content_pipeline": { "daily_cap": 0.80 },
    "social_engagement": { "daily_cap": 0.50 },
    "business_intel": { "daily_cap": 0.10 }
  }
}
```
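The check itself can be tiny. The helper name and the in-memory spend tracking below are my sketch, not APEX's actual code; only the JSON fields come from the config above:

```python
import json

# Illustrative pre-execution guard over cost-control.json.
def within_budget(config, module_spend, system_spend, module):
    """True if both the module's cap and the system-wide cap allow a run."""
    module_cap = config["modules"][module]["daily_cap"]
    if module_spend.get(module, 0.0) >= module_cap:
        return False  # this module already hit its daily cap
    return system_spend < config["system_daily_cap"]

config = json.loads("""{
  "system_daily_cap": 3.00,
  "modules": { "content_pipeline": { "daily_cap": 0.80 } }
}""")
```

A cron job would call this before doing any LLM work and exit quietly on False, so a runaway module stops itself rather than draining the whole day's budget.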
The Memory Architecture
This is the part I'm most proud of. APEX has a 5-layer memory system:
1. Bootstrap files — loaded every API call, kept under 3KB
2. Daily logs — each cron job appends structured results to memory/YYYY-MM-DD.md
3. MEMORY.md — curated long-term insights, self-updated by the daily P&L job
4. Semantic search — Gemini embeddings indexed in SQLite (18+ chunks)
5. Pre-compaction flush — saves context before sessions compact (prevents memory loss)
The daily P&L job acts as the "memory curator" — it reads the day's logs, extracts key insights, and updates MEMORY.md. Over time, the system builds a growing knowledge base of what works and what doesn't.
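The retrieval layer (layer 4) needs nothing fancier than cosine similarity over vectors stored in SQLite. This sketch uses tiny stand-in vectors instead of real 3072-dimensional Gemini embeddings, and the schema is my assumption:

```python
import json
import math
import sqlite3

# Embeddings stored as JSON text in SQLite, ranked by cosine similarity.
# Real system: Gemini embeddings (3072-dim); here: toy 3-dim vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memory (chunk TEXT, embedding TEXT)")
chunks = [
    ("links in main tweet reduced reach", [0.9, 0.1, 0.0]),
    ("haiku is fine for research scans", [0.1, 0.8, 0.2]),
]
db.executemany("INSERT INTO memory VALUES (?, ?)",
               [(c, json.dumps(v)) for c, v in chunks])

def search(query_vec, top_k=1):
    # Brute-force scan: fine at this scale (tens of chunks, not millions).
    rows = db.execute("SELECT chunk, embedding FROM memory").fetchall()
    scored = [(cosine(query_vec, json.loads(e)), c) for c, e in rows]
    return [c for _, c in sorted(scored, reverse=True)[:top_k]]
```

At 18+ chunks a brute-force scan is instant; a vector index only earns its complexity orders of magnitude later.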
What I Got Wrong
- **Spending 4 Days Building, 1 Day Distributing.** Classic builder mistake. The system worked beautifully by Day 3, but with 3 followers on X, nobody saw it. The split should have been 50/50.
- **Broadcasting Instead of Engaging.** Original tweets to 3 followers are shouting into the void. The X algorithm in 2026 rewards replies 150x more than likes; I needed strategic replies to larger accounts from day one.
- **Optimizing Costs Before Revenue.** I spent hours getting LLM costs from $7.60 to $0.24. That felt productive, but $7.36/day in savings is irrelevant when revenue is $0. Distribution should have been the priority.
- **Underestimating X API Limitations.** The pay-per-use API tier blocks cold replies (403 error). Quote tweets work as a workaround, but direct replies to non-followers aren't possible programmatically on this tier.
Key Technical Lessons
- Dollar signs in shell commands get interpolated: always escape them in xpost commands.
- A $(date) written into a cron entry through an interpolating shell expands at creation time, not run time. Tell the agent to determine the date itself.
- The Anthropic API throws intermittent overload errors. Don't build retry logic; let the next cron cycle handle it.
- X suppresses tweets with links (30-50% reach reduction). Put links in replies, not the main tweet.
- Memory search with Gemini embeddings is free and surprisingly effective for retrieval.
Current Status
- Revenue: $0 (products live on Stripe at $49 and $99)
- Daily burn: $2.12
- Tweets posted: 30+
- Cron jobs: 7 running autonomously
- Followers: growing slowly
The system works. The product exists. The gap is distribution — getting the right people to see it. That's what Week 2 is about.
What's Next
- Strategic reply cron jobs (4x/day targeting big accounts)
- Email capture funnel (free resource → nurture → paid product)
- Automated product delivery via Stripe webhooks
- Content SEO (you're reading the first piece)
- ClawHub publication (when GitHub account is 14 days old)