Last night at 2am I checked my OpenAI dashboard.
$140. Just in API fees. Just for running my personal AI agent.
That felt insane. I own the hardware. Why am I paying a monthly bill forever just to run something on my own machine?
So I built AirClaw.
What it does:
AirClaw bridges OpenClaw (personal AI agent for WhatsApp/Telegram/Discord) to a local LLM running on your own GPU via AirLLM. Instead of every message costing you API money, it runs completely locally. Forever free.
```bash
pip install airclaw && airclaw install
```
That's literally it. It auto-detects your OpenClaw config, backs it up, and patches it to point at localhost instead of OpenAI. Then you start the local server and restart OpenClaw.
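A rough sketch of what that patch step amounts to. The config path, key names, and port here are my assumptions for illustration, not AirClaw's documented behavior:

```python
# Hypothetical sketch of the install step: back up the config, then
# repoint the API base at the local server. The file location and key
# names are assumptions, not taken from AirClaw's source.
import json
import shutil
from pathlib import Path

def patch_config(config_path: Path,
                 local_url: str = "http://localhost:8000/v1") -> dict:
    """Back up the config file, then rewrite it to target localhost."""
    backup = config_path.with_name(config_path.name + ".bak")
    shutil.copy(config_path, backup)           # keep a restorable copy
    cfg = json.loads(config_path.read_text())
    cfg["api_base"] = local_url                # assumed key name
    cfg.pop("api_key", None)                   # no key needed locally
    config_path.write_text(json.dumps(cfg, indent=2))
    return cfg
```

The backup is what makes the patch safe to undo: restoring the `.bak` file puts you back on the hosted API.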
The tech:
AirLLM uses layer-by-layer inference: instead of loading the whole 70B model into VRAM at once, it streams one layer at a time, so you only need about 4GB of VRAM regardless of model size. The trade-off is speed, but a 7B model runs fast enough for real-time chat.
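The idea is simple enough to show with a toy model. This is pure-Python illustration of the concept, not AirLLM's actual code: only one layer's weights are resident in memory at any moment.

```python
# Toy illustration of layer-by-layer inference: each layer's weights
# are loaded from disk, applied, and freed before the next is loaded.
# This mirrors the concept behind AirLLM, not its implementation.
import math
import pickle

def save_layers(weights, prefix):
    """Write each layer's weight matrix to its own file."""
    paths = []
    for i, w in enumerate(weights):
        path = f"{prefix}_layer{i}.pkl"
        with open(path, "wb") as f:
            pickle.dump(w, f)
        paths.append(path)
    return paths

def matvec(w, x):
    """Apply one layer: matrix-vector product plus tanh activation."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x)))
            for row in w]

def run_streaming(x, layer_paths):
    """Run the model while holding only one layer in memory at a time."""
    for path in layer_paths:
        with open(path, "rb") as f:
            w = pickle.load(f)   # only this layer is resident
        x = matvec(w, x)
        del w                    # free it before loading the next
    return x
```

Peak memory is one layer instead of the whole model, which is why model size stops mattering for VRAM; the cost is a disk (or CPU-RAM) round trip per layer per token.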
RabbitLLM (a newer fork) adds support for Qwen2.5, DeepSeek, and Phi-3.
Models supported:
- Mistral 7B — default, 4GB GPU, fast
- Llama 3 8B — 6GB GPU, better reasoning
- Qwen 2.5 7B — multilingual, 4GB
- DeepSeek 7B — great for coding
- Phi-3 mini — fastest, any hardware
- Llama 70B — 4GB GPU, slow but insane quality
What happened when I posted it:
Posted on Reddit yesterday. Woke up to 36,000 views, 73 upvotes, 22 GitHub stars in 15 hours.
Turns out a lot of people have this problem.
Try it:
```bash
pip install airclaw
airclaw install
airclaw start
```

Then restart OpenClaw.
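If you want to sanity-check the local server before restarting OpenClaw, a request like this works against any OpenAI-compatible endpoint. The port and model name are my guesses at defaults, not documented values:

```python
# Smoke test for a local OpenAI-compatible endpoint. The base URL and
# model name are assumptions about AirClaw's defaults.
import json
import urllib.request

def build_chat_request(prompt: str,
                       base: str = "http://localhost:8000/v1",
                       model: str = "mistral-7b") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the local server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(prompt: str) -> str:
    """Send the request and pull the reply text out of the response."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

If `chat("say hi")` comes back with text, the patched OpenClaw config should work too, since it speaks the same protocol.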
GitHub: github.com/nickzsche21/airclaw
If you try it, drop a comment; I'm especially curious about results on Macs and older hardware.