Two months into self-hosting my AI agent, I opened my Anthropic dashboard and saw a number I didn't love. $47 for the month. Not catastrophic, but way more than it needed to be.
After a week of tweaking, I got that down to ~$15/month — same quality for daily tasks, same channels, same skills. Here's exactly what I changed.
(If you're new here: Part 1 covers setting up OpenClaw, and Part 2 covers the Skills I use daily.)
Why Was I Overpaying?
The default setup uses one model for everything. I had Claude Sonnet 4 handling every message — including "thanks", "ok", and "what time is it?". That's like using a sports car to go get milk.
Here's what the models actually cost (as of early 2026):
| Model | Input | Output | Best For |
|---|---|---|---|
| Claude Haiku 4.5 | $1/1M tokens | $5/1M tokens | Quick tasks, greetings, lookups |
| Claude Sonnet 4 | $3/1M tokens | $15/1M tokens | Balanced daily use |
| Claude Opus 4.6 | $5/1M tokens | $25/1M tokens | Deep research, analysis |
Haiku is 5x cheaper than Opus and 3x cheaper than Sonnet, on both input and output tokens. For "what's the weather?" or "remind me to check the deployment", Haiku handles it just fine.
Prices from Anthropic's pricing page. Check there for the latest rates.
Fix #1: Use Different Models for Different Channels
This was the single biggest win. OpenClaw lets you assign models per channel using modelByChannel in your config:
// ~/.openclaw/openclaw.json
{
  "channels": {
    "modelByChannel": {
      "whatsapp": {
        "default": "anthropic/claude-haiku-4-5"
      },
      "telegram": {
        "default": "anthropic/claude-haiku-4-5"
      },
      "discord": {
        "default": "anthropic/claude-sonnet-4"
      },
      "slack": {
        "default": "anthropic/claude-sonnet-4"
      }
    }
  }
}
My logic: WhatsApp and Telegram are mostly quick personal messages — Haiku is perfect. Discord and Slack are where I do actual work stuff (code review, debugging), so those get Sonnet.
This one change shifted about 50% of my traffic to Haiku. Immediate cost drop.
Want smarter routing? The community has built auto-routers like iblai-openclaw-router that analyze message complexity and route to Haiku/Sonnet/Opus automatically. I haven't tried it yet, but the concept is solid — route "hi" to Haiku and "analyze this architecture" to Opus, per message.
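To make the per-message idea concrete, here's a minimal sketch in Python. It is not the actual logic of iblai-openclaw-router or of OpenClaw itself: the keyword lists, the 60-word threshold, the `pick_model` function, and the Opus model ID are all assumptions made up for illustration.

```python
# Hypothetical complexity-based router; not the logic of any real router project.
HAIKU = "anthropic/claude-haiku-4-5"
SONNET = "anthropic/claude-sonnet-4"
OPUS = "anthropic/claude-opus-4-6"  # assumed ID, following the naming pattern above

# Assumed keywords hinting that a message needs deeper reasoning.
HEAVY_HINTS = ("analyze", "architecture", "research", "compare")
# Assumed keywords hinting at everyday work tasks.
WORK_HINTS = ("debug", "review", "refactor", "stack trace")


def pick_model(message: str) -> str:
    """Return a model ID based on crude keyword and length heuristics."""
    text = message.lower()
    if any(hint in text for hint in HEAVY_HINTS):
        return OPUS
    if any(hint in text for hint in WORK_HINTS) or len(text.split()) > 60:
        return SONNET
    return HAIKU


if __name__ == "__main__":
    for msg in ("hi", "can you review this function?", "analyze this architecture"):
        print(f"{msg!r} -> {pick_model(msg)}")
```

A real router would likely use something smarter than keyword matching, but even a crude rule like this shows the shape of the trade-off: anything misclassified as "simple" gets a cheaper, weaker answer.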
Fix #2: Limit Conversation History
This one surprised me. By default, OpenClaw sends your recent conversation history with every request so that the model has context. But that means every "yes" reply also pays for the tokens of all the previous messages sent along with it.
OpenClaw lets you cap this with historyLimit (and dmHistoryLimit for direct messages):
{
  "messages": {
    "groupChat": {
      "historyLimit": 10
    }
  },
  "channels": {
    "whatsapp": {
      "dmHistoryLimit": 8
    },
    "telegram": {
      "dmHistoryLimit": 8
    },
    "discord": {
      "historyLimit": 15
    }
  }
}
I keep Discord higher (15) because work conversations need more context. WhatsApp and Telegram get 8 — plenty for casual back-and-forth.
Before this change, I was sending ~10,000 input tokens per request with all the history. After: ~4,000. That alone cut my input tokens per request by about 60%.
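Here's a toy sketch of why the cap matters. It is not OpenClaw's implementation: `cap_history` and `estimate_tokens` are hypothetical helpers, and the 4-characters-per-token rule is only a rough assumption, not a real tokenizer.

```python
# Sketch of history capping. The ~4-characters-per-token estimate is a rough
# assumption for illustration, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token."""
    return max(1, len(text) // 4)


def cap_history(messages: list[str], limit: int) -> list[str]:
    """Keep only the most recent `limit` messages."""
    # Guard against limit == 0, because messages[-0:] would return everything.
    return messages[-limit:] if limit > 0 else []


if __name__ == "__main__":
    history = [f"message {i}: " + "x" * 400 for i in range(25)]
    for limit in (25, 15, 8):
        kept = cap_history(history, limit)
        tokens = sum(estimate_tokens(m) for m in kept)
        print(f"limit={limit:>2}: {len(kept)} messages, ~{tokens} input tokens")
```

Since every request resends the kept history, input tokens scale roughly linearly with the limit, which is why a smaller cap shows up so directly on the bill.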
Fix #3: Use Anthropic's Prompt Caching
This one's not an OpenClaw setting — it's an Anthropic API feature. If your agent sends the same system prompt and conversation prefix with every request (which it does), prompt caching avoids reprocessing those tokens.
The savings are real:
| Operation | Cost |
|---|---|
| Normal input | $3/1M tokens (Sonnet) |
| Cache write (first request) | $3.75/1M tokens (1.25x) |
| Cache read (subsequent) | $0.30/1M tokens (0.1x) |
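That's a 90% discount on repeated context. If your system prompt is 1,000 tokens and you send 50 messages, you pay the 1.25x write price once and 10% of the normal price for the other 49, as long as each request arrives before the cache entry expires (five minutes after its last use by default).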
OpenClaw supports this if your API provider has caching enabled. Check your Anthropic dashboard — if you see "cache read tokens" in your usage, it's already working.
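To sanity-check the savings, here's a small Python sketch using the Sonnet rates from the table. `prefix_cost` is a hypothetical helper, and it assumes every request after the first is a cache hit, which is optimistic if your messages are spread far apart.

```python
# Rough cost of a repeated prompt prefix, using the Sonnet rates from the table.
# Assumes every request after the first is a cache hit, which is optimistic.

INPUT_PER_MTOK = 3.00        # normal input, USD per 1M tokens
CACHE_WRITE_PER_MTOK = 3.75  # first request (1.25x)
CACHE_READ_PER_MTOK = 0.30   # subsequent requests (0.1x)


def prefix_cost(prefix_tokens: int, requests: int, cached: bool) -> float:
    """Dollar cost of sending the same prefix `requests` times."""
    if requests <= 0:
        return 0.0
    if not cached:
        return prefix_tokens * requests * INPUT_PER_MTOK / 1_000_000
    write = prefix_tokens * CACHE_WRITE_PER_MTOK / 1_000_000
    reads = prefix_tokens * (requests - 1) * CACHE_READ_PER_MTOK / 1_000_000
    return write + reads


if __name__ == "__main__":
    plain = prefix_cost(1_000, 50, cached=False)
    cached = prefix_cost(1_000, 50, cached=True)
    print(f"without caching: ${plain:.4f}")
    print(f"with caching:    ${cached:.4f}")
    print(f"saved:           {1 - cached / plain:.0%}")
```

For the 1,000-token, 50-message case the overall saving lands a bit under 90%, because the first request pays the write premium.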
Fix #4: Pick a Cheaper Default Model
Sounds obvious, but I was overthinking it. I switched my default from Sonnet to Haiku 4.5 for most channels, and honestly? For 80% of my daily interactions, I can't tell the difference.
Haiku handles:
- Quick Q&A and lookups ✅
- Smart home commands ✅
- Simple reminders and scheduling ✅
- Casual conversation ✅
Where I notice the difference: complex code review, long-form writing, and nuanced analysis. For those, I keep Sonnet (or Opus) on my work channels.
The mental shift: default cheap, upgrade where it matters — not the other way around.
Fix #5: Monitor and Set Limits
You can't optimize what you don't measure. OpenClaw has a built-in stats command:
openclaw stats

Today's Usage:
  Input tokens: 45,230
  Output tokens: 12,450
  Estimated cost: $0.23

This month:
  Total tokens: 1,234,567
  Estimated cost: $8.45
I check this weekly. It helps me spot when something's off — like that time a Discord channel was generating way more traffic than I expected.
Also set up usage limits on your Anthropic account directly. The Anthropic Console lets you configure monthly spend caps so you never get a surprise bill. I set mine at $25/month — if I ever hit it, something's wrong.
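If you want a quick "am I on pace?" check between weekly looks, something like this sketch works. It is a hypothetical helper, not an OpenClaw or Anthropic feature: the month-to-date figure is typed in by hand, and the linear extrapolation assumes your usage is roughly even across the month.

```python
# Hypothetical spend check. The month-to-date figure is entered by hand;
# nothing here reads from OpenClaw or the Anthropic Console.
import calendar
from datetime import date


def projected_month_cost(cost_so_far: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the full month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return cost_so_far / today.day * days_in_month


def over_budget(cost_so_far: float, today: date, cap: float) -> bool:
    """True if the current pace would exceed the monthly cap."""
    return projected_month_cost(cost_so_far, today) > cap


if __name__ == "__main__":
    today = date(2026, 1, 12)  # example date
    spent = 8.45               # month-to-date figure from the stats output
    print(f"projected: ${projected_month_cost(spent, today):.2f}")
    print("over $25 cap" if over_budget(spent, today, 25.0) else "within $25 cap")
```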
Quick Math: What Should You Expect to Pay?
Here's a rough formula:
Monthly Cost = Daily Messages × (Avg Input Tokens × Input Price per Token + Avg Output Tokens × Output Price per Token) × 30
For 50 messages/day with Claude Sonnet 4 (avg 800 input + 400 output tokens):
- Input: 50 × 800 × ($3/1M) × 30 = $3.60
- Output: 50 × 400 × ($15/1M) × 30 = $9.00
- Total: ~$12.60/month
With half your traffic on Haiku 4.5 instead:
- Haiku portion (25 msgs): 25 × 800 × ($1/1M) × 30 + 25 × 400 × ($5/1M) × 30 = $0.60 + $1.50 = $2.10
- Sonnet portion (25 msgs): 25 × 800 × ($3/1M) × 30 + 25 × 400 × ($15/1M) × 30 = $1.80 + $4.50 = $6.30
- Total: ~$8.40/month
Add prompt caching on top and you're looking at even less.
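If you'd rather plug in your own numbers, the same arithmetic fits in a few lines of Python. This is just a sketch: `monthly_cost` and the `PRICES` table are my own names, and the prices are the ones from the table earlier in the post, so update them if rates change.

```python
# Reproduces the estimates above. Prices are USD per 1M tokens, from the table.
PRICES = {
    "haiku": (1.00, 5.00),    # (input, output)
    "sonnet": (3.00, 15.00),
}


def monthly_cost(model, msgs_per_day, in_tokens=800, out_tokens=400, days=30):
    """Estimated monthly cost in dollars for one model's share of traffic."""
    in_price, out_price = PRICES[model]
    per_message = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return msgs_per_day * per_message * days


if __name__ == "__main__":
    all_sonnet = monthly_cost("sonnet", 50)
    split = monthly_cost("haiku", 25) + monthly_cost("sonnet", 25)
    print(f"all Sonnet:  ${all_sonnet:.2f}")  # all Sonnet:  $12.60
    print(f"50/50 split: ${split:.2f}")       # 50/50 split: $8.40
```

Note that 800 input tokens per message is a fairly lean assumption. If your history limit is high, your real input average will be larger, so treat these as lower bounds.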
Three Setups to Get You Started
Tight budget (~$5-10/month):
- Default model: Haiku 4.5 for all channels
- History limit: 5 messages
- Prompt caching: enabled
- Good for: personal use, casual messaging
Balanced (~$15-25/month):
- Haiku 4.5 for WhatsApp/Telegram, Sonnet for Discord/Slack
- History limit: 8-15 messages depending on channel
- Prompt caching: enabled
- Good for: daily driver with some work use
Quality-first (~$30-50/month):
- Sonnet for most channels, Opus for specific work channels
- History limit: 20+ messages
- Good for: heavy work use, code review, research
More details and full config examples on open-claw.me/blog/openclaw-model-selection-cost.
TL;DR
- Route models by channel — Haiku for casual, Sonnet for work. Biggest single win.
- Limit conversation history — 8-15 messages is enough for most chats.
- Use prompt caching — 90% off repeated context, basically free.
- Default to a cheaper model — upgrade where it matters, not everywhere.
- Monitor weekly — catch surprises early, set spend caps.
These five changes took my bill from $47 to ~$15. The agent works just as well for daily tasks — it just uses the right tool for the job now.
Config details may vary by OpenClaw version — check docs.openclaw.ai for your setup.
What's your monthly AI agent bill looking like? I'm curious if anyone's running leaner — or if you've found optimizations I missed. Drop a comment.