DEV Community

GDS K S


I tried OpenClaw, the 'free' AI agent. It cost me $500.

OpenClaw has 124k stars on GitHub. Every other AI-adjacent account on X is hyping it. "Open source. Free. The future."

I spent a weekend setting it up. The software really is free. The thing it does isn't.

This is what every tutorial skips.

What I thought I was getting

You've seen the demos. OpenClaw, which used to be Clawdbot, which used to be MoltBot, looks like a personal AI assistant living inside your messaging apps. It writes code while you sleep. It checks your servers and pings you on Telegram. It opens PRs. It browses the web. MIT licensed. Install and go.

What nobody shows you is the meter running underneath all of it.

What OpenClaw actually costs to run

The free stack:

  • OpenClaw itself: $0
  • Cloudflare Tunnel: $0
  • LiteLLM proxy: $0

The not-free stack:

  • A VPS to host it on: $23 to $70 a month
  • The model behind it: $50 to $500+ a month, easily

The model is where it gets interesting. Per million tokens, roughly:

Model               Input   Output   Verdict
GPT-4o-mini         $0.15   $0.60    Useless for agents (more below)
GPT-4o              $2.50   $10      Minimum bar for "actually works"
Claude Sonnet 4.5   $3      $15      Good middle ground
Claude Opus 4.5     $5      $25      What every demo is secretly running

What people are actually spending

Federico Viticci at MacStories burned 180 million tokens his first month. About $3,600.

A guy on X had a runaway cron loop one night and woke up to a $200 charge.

The viral "MoltMaxxing" post going around says $200/month is the floor if you want it to feel like the demos.

None of these are worst-case. They're median.

Why OpenClaw cron jobs eat tokens

This is where the cost stops being theoretical.

ChatGPT or Claude in normal chat is one round trip. You ask, it answers, you close the tab. Cost is bounded by how much you talk.

OpenClaw with a heartbeat or cron is different. It wakes up on a schedule, pulls its system prompt and recent context, reasons about whatever it's monitoring, maybe pings you, goes back to sleep. Then again. And again. Forever.

A single check is small. Call it 3,700 tokens between system prompt, recalled context, reasoning, and reply. Run it every five minutes, that's 288 checks a day, around 32 million tokens a month. At GPT-4o rates: $128/month for one cron job.

Now stack them. Email watcher. Slack watcher. GitHub poller. Server health. Stock alerts. Each one ticking. Each one billing.
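
The arithmetic above, as a sketch. The 3,700-token figure is from my own logs; the 80/20 input/output split is an assumption I'm using to get a blended rate, so treat the exact dollar amounts as illustrative:

```typescript
// Back-of-envelope cost of one cron-driven check at GPT-4o rates
const tokensPerCheck = 3_700;                  // system prompt + context + reasoning + reply
const checksPerDay = (24 * 60) / 5;            // every five minutes -> 288
const tokensPerMonth = tokensPerCheck * checksPerDay * 30; // ~32M

// GPT-4o: $2.50/M input, $10/M output; assume ~80% of tokens are input
const blendedRatePerMillion = 0.8 * 2.5 + 0.2 * 10; // $4.00/M blended
const costPerJob = (tokensPerMonth / 1e6) * blendedRatePerMillion;

console.log(Math.round(costPerJob));      // ~$128/month for one job
console.log(Math.round(costPerJob * 5));  // five watchers stacked: ~$639
```

Shorten the interval or fatten the context window and the number climbs linearly with both.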

(Image: OpenClaw cost reality vs hype comparison)

GPT-4o-mini broke everything

GPT-4o-mini is 17x cheaper than GPT-4o. Obvious move.

It was fine for chat. It was fine for "is the server up."

Then I asked it to fix a TypeScript build. The errors:

error TS2304: Cannot find name 'HeadersInit'
error TS2749: 'TaskStatus' refers to a value, but is being used as a type

These are sixty-second fixes. HeadersInit is a DOM type, doesn't exist in Node, swap it for Record<string, string>. TaskStatus is a const object, the type you want is TaskStatusType derived from it.
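
Both fixes, sketched. The shape of TaskStatus here is my guess at the pattern involved, not the actual code from my repo:

```typescript
// TS2304: HeadersInit is a DOM lib type; in plain Node code, use a plain record
type RequestHeaders = Record<string, string>;

const headers: RequestHeaders = { "Content-Type": "application/json" };

// TS2749: TaskStatus is a const object (a value), so derive a type from it
// instead of using the value name in a type position
const TaskStatus = {
  Pending: "pending",
  Done: "done",
} as const;

type TaskStatusType = (typeof TaskStatus)[keyof typeof TaskStatus]; // "pending" | "done"

const status: TaskStatusType = TaskStatus.Pending;
```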

Mini reported the errors back to me and gave up. I burned five failed runs before swapping in GPT-4o, which fixed both in one shot. The cheap model was more expensive.

The "use a smaller model for cost" advice doesn't survive contact with the actual reasoning agents need to do. Either the model is good enough or it isn't, and "good enough" right now means GPT-4o or Sonnet at minimum.

The MoltMaxxing claim

The post I keep seeing referenced says anything below Opus 4.5 isn't a slightly worse Moltbot, it's 40 to 95% of one. Which sounds fine until you realize the missing 5 to 60% is the part that does work unattended.

Their rough ranking:

  • Qwen3 30B local: not usable for agents
  • Kimi K2.5: 85-90%, still needs hand-holding
  • GPT-5.2-codex: 95%, "procedurally correct" but not always the right call
  • Claude Opus 4.5: works as advertised

Their conclusion is uncomfortable but probably honest: MoltMaxxing only works if you're OpusMaxxing.

What $200 a month actually buys

Same money, different shapes:

  • ChatGPT Pro: unlimited GPT-5 chat. No agents.
  • Claude Max: 20x Pro usage. Rate limits still bite.
  • OpenClaw + Opus on the API: full autonomy. Can spike to $500 if you're not careful.
  • OpenClaw + GPT-4o: ~$130 and roughly 90% of the way there.
  • OpenClaw + Mini: $30 fancy chatbot. Not actually an agent.

If you want to talk to an LLM, the flat-fee chat plans win. If you want it to do things in the background, you're on the pay-per-token rails.

Can you run OpenClaw locally?

Every comments section has this guy. The math doesn't pan out.

A 16GB Mac Mini runs Qwen 7B or Llama 8B - GPT-3.5 tier, barely useful for agents. A 24GB Mac Mini runs 14B models, below GPT-4o. A 64GB Mac Studio Ultra ($4k) runs 30B to 70B at maybe 80% of GPT-4o. A 128GB one ($6k) gets closer, with high latency, and still not Opus.

Plus electricity, around $10-15/month running 24/7.

Breakeven is around 200 hours of actual use a month. Below that, cloud wins. Above that, local might pencil out, assuming you don't need the top tier. Which, per everything above, you usually do.
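
Here's roughly where my 200-hour figure comes from. The amortization window and the per-hour cloud burn rate are assumptions; plug in your own:

```typescript
// Breakeven sketch: cloud API vs. local hardware
const hardwareCost = 4_000;        // 64GB Mac Studio, from above
const amortizeMonths = 24;         // assume a two-year write-off
const electricity = 12;            // $/month, midpoint of the $10-15 range

const localMonthly = hardwareCost / amortizeMonths + electricity; // ~$179/month

const cloudPerActiveHour = 0.9;    // assumed GPT-4o-class agent burn rate, $/hour
const breakevenHours = localMonthly / cloudPerActiveHour; // ~200 hours/month
```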

OpenRouter, briefly

OpenRouter routes you through one API to multiple providers. It doesn't mark prices up. It also doesn't mark them down. You're still paying API rates.

Useful for swapping models without swapping clients. Not a cost solution.

My actual OpenClaw setup

After all the experimentation:

infra:
  vps: $23/month   # Azure, roughly 8 hrs/day
  cloudflare: $0

model:
  provider: Azure OpenAI
  model: GPT-4o
  tokens: 50-80M / month
  cost: $80-120 / month

channels:
  telegram, discord, github webhooks: $0

total: ~$100-150 / month

For me, fine. It does work I'd otherwise spend hours on.

For most people who ask me about it, probably not fine, unless they go in knowing the floor is $100 a month and that the cheap-model cope doesn't really work.

Where this goes

The MoltMaxxing post calls it: there's going to be an "agent tier" from Anthropic and OpenAI, probably $300-500 a month, with rate limits designed for always-on automation instead of pay-per-token. Anthropic's already partway there with Max. The metered model isn't going to survive contact with mainstream agent use. Nobody wants a $200 surprise from a runaway loop.

Until then I've paused aggressive agent work and gone back to shipping a mobile app. I'll come back when the bill is bounded.

What I'd tell you in a DM

The software is real. The team shipped something good. None of this is OpenClaw's fault.

But the demos you're seeing are running Opus. When you try the same thing on the budget model nobody mentions out loud, the magic stops. You get a chatbot with a cron job attached.

The model is the product. The agent framework is packaging.


Have you run OpenClaw on a different stack? Curious what your bill looks like.

@thegdsks

Top comments (4)

Thang Hoang Nguyen Manh

Exactly, the API provider costs are quite high. But there's a workaround: you can use Codex or Gemini CLI with OpenClaw by just signing up for the $20 plan. You hit the nail on the head, though—the price of maintaining a 'personal assistant' might exceed the cost of hiring a real person

Alex

You did it Just wrong.

Hetzner Cloud VPS: 7€/Month
Claude Code Pro: 18€/Month

Use claude setup-token
It's Just that simple.

Cheers

 
Opposer

You wrote this with AI didn't you
