I was paying for three different cloud AI subscriptions and still didn't have an assistant that could actually do things — read my inbox, run on a schedule, and keep my data on my own network. So I moved the whole thing onto a small box that sits on my shelf and runs around the clock. Here's the setup, the trade-offs, and the numbers.
The problem with cloud AI for "assistant" work
Chat UIs are great for one-off questions. They fall apart the moment you want:
- Always-on automation — "every morning, check X and message me"
- Real device/inbox access without handing a third party your credentials
- Predictable cost instead of metered tokens
- Privacy — data that never leaves your LAN
That's an infrastructure problem, not a prompt problem. You need a machine that's always on, sips power, and runs an agent runtime.
The hardware
I landed on an NVIDIA Jetson Orin Nano class device:
- 67 TOPS of on-device inference
- ~15 W under load (cheap to run 24/7)
- Runs local models + browser automation + messaging integrations
You can absolutely DIY this. I went with a pre-built ClawBox (€549, plug-and-play) because I didn't want to spend a weekend on JetPack, thermals, and power modes — but the software stack below is open source and works on your own Jetson or Pi too.
If you're weighing build-vs-buy, I wrote up the math here: ClawBox vs Mac Mini vs DIY.
The software: OpenClaw
The runtime is OpenClaw — an open-source AI gateway that connects a model to Telegram/WhatsApp/Discord, gives it tools (browser, files, shell, MCP servers), and survives restarts so scheduled jobs keep running.
Minimal mental model:
[ Telegram/WhatsApp ] -> [ OpenClaw gateway ] -> [ local model + tools ]
                                  |
                         runs 24/7 on the box
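If it helps to see that flow as code, here is a minimal Python sketch of the same shape. It is not OpenClaw's API: `Gateway`, `register_tool`, `handle`, and `toy_model` are hypothetical names I made up for illustration, and the "model" is just a keyword check standing in for real inference.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical sketch of the message -> gateway -> model + tools flow.
# None of these names come from OpenClaw; they only illustrate the shape.

Tool = Callable[[str], str]


@dataclass
class Gateway:
    model: Callable[[str], str]  # returns the name of the tool to use
    tools: Dict[str, Tool] = field(default_factory=dict)

    def register_tool(self, name: str, tool: Tool) -> None:
        self.tools[name] = tool

    def handle(self, message: str) -> str:
        """Take an incoming chat message and return the reply text."""
        tool_name = self.model(message)
        tool = self.tools.get(tool_name)
        if tool is None:
            return f"no tool for: {message}"
        return tool(message)


def toy_model(message: str) -> str:
    # Stand-in for a local model: routes on a keyword instead of inference.
    return "files" if "file" in message.lower() else "browser"


gateway = Gateway(model=toy_model)
gateway.register_tool("files", lambda m: "listed files")
gateway.register_tool("browser", lambda m: "opened page")

print(gateway.handle("show me the file list"))  # listed files
print(gateway.handle("check the dashboard"))    # opened page
```

The point of the shape is that the messaging side and the tools never talk to each other directly. Everything goes through one process, and that process is the thing that has to stay up.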
Setup requirements and tiers (minimum / recommended / production) are documented here: OpenClaw hardware requirements.
What it actually does for me
Three jobs earned their keep in week one:
- Inbox triage — sorts mail, flags the 3 things that need me, drafts replies to the routine ones.
- Scheduled checks — a morning cron that logs into a dashboard, pulls numbers, and DMs me a summary.
- A support bot — answers common product questions from a knowledge file.
All of it runs locally. Credentials live on the box, on my network — not in someone's cloud.
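For the scheduled checks, the classic cron line for "every day at 07:00" is `0 7 * * *`. If you'd rather schedule from inside a long-running process, the sketch below shows one way to work out the next run time in Python. The function names are hypothetical and this is not how OpenClaw schedules jobs internally; it also uses naive local datetimes, so time zones and DST are out of scope.

```python
from datetime import datetime, time, timedelta


def next_run(now: datetime, at: time) -> datetime:
    """Return the next moment a daily job scheduled for `at` should fire."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate


def seconds_until(now: datetime, at: time) -> float:
    """How long to sleep before the next run."""
    return (next_run(now, at) - now).total_seconds()


# Usage: a box that boots at 06:30 still hits the 07:00 job the same morning.
print(next_run(datetime(2026, 1, 5, 6, 30), time(7, 0)))  # 2026-01-05 07:00:00
print(seconds_until(datetime(2026, 1, 5, 6, 30), time(7, 0)))  # 1800.0
```

Computing the next run from the wall clock, rather than sleeping a fixed 24 hours, means a restart doesn't shift the schedule.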
The trade-offs (honest version)
- Local models ≠ frontier models. For heavy reasoning you'll still reach for a hosted model via API. The win is where the orchestration and your data live.
- You own the uptime. It's a box in your house. Mine's been fine, but it's on you.
- Setup time if you DIY. That's the whole reason pre-built exists.
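The local-versus-hosted split from the first point can be written down as a small routing rule. This is one possible policy, sketched under assumptions: the `Task` fields, the `choose_backend` name, and the rule that private data always stays local are all hypothetical, not something OpenClaw ships or a cutoff I've measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    heavy_reasoning: bool       # assumption: the caller labels this up front
    touches_private_data: bool  # e.g. inbox contents or stored credentials


def choose_backend(task: Task) -> str:
    """Pick "local" or "hosted" for a task under the assumed policy."""
    # Assumed rule: anything touching private data stays on the box,
    # even if a hosted model would reason about it better.
    if task.touches_private_data:
        return "local"
    return "hosted" if task.heavy_reasoning else "local"


print(choose_backend(Task("inbox triage", False, True)))       # local
print(choose_backend(Task("long research report", True, False)))  # hosted
```

Whatever rule you pick, keeping it in one function makes the privacy trade-off explicit instead of scattering it across prompts.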
Cost, roughly
| | Cloud subs (mine, before) | Local box |
|---|---|---|
| Up-front | €0 | ~€549 one-time |
| Monthly | ~€60 across 3 services | ~€2 electricity + optional API |
| Data location | their cloud | my LAN |
Break-even was under a year for me, and I stopped renting three things that each did 10% of what I wanted.
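The break-even arithmetic is simple enough to check yourself. The sketch below uses the figures from the table; the electricity price of €0.20/kWh is my assumption for illustration, so swap in your own tariff, and note that optional API spend is left out of the first number.

```python
def break_even_months(upfront_eur: float, old_monthly_eur: float,
                      new_monthly_eur: float) -> float:
    """Months until the one-time cost is covered by monthly savings."""
    savings = old_monthly_eur - new_monthly_eur
    if savings <= 0:
        raise ValueError("no monthly savings, so the box never pays for itself")
    return upfront_eur / savings


def monthly_electricity_eur(watts: float, eur_per_kwh: float,
                            hours: float = 24 * 30) -> float:
    """Cost of running a constant load for a 30-day month."""
    return watts / 1000 * hours * eur_per_kwh


# Figures from the table: €549 up-front, ~€60/month before, ~€2/month after.
print(f"{break_even_months(549, 60, 2):.1f} months")  # 9.5 months

# 15 W around the clock at an assumed €0.20/kWh.
print(f"€{monthly_electricity_eur(15, 0.20):.2f} per month")  # €2.16 per month

# Same box with a hypothetical €10/month of hosted API calls on top.
print(f"{break_even_months(549, 60, 2 + 10):.1f} months")  # 11.4 months
```

Every euro of monthly API spend eats into the savings, so the break-even point moves out as the hosted fallback gets used more.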
Try it
- Hardware + the build-vs-buy breakdown: clawbox.tech
- Full review with benchmarks (67 TOPS, ~15 tok/s): ClawBox Review 2026
- OpenClaw is open source — run it on whatever Jetson/Pi you already have.
If you've built a similar always-on local setup, I'd love to hear what jobs you handed to it first. That choice matters more than the hardware.
Top comments (1)
This matches my experience. The real unlock isn't raw model quality, it's that an always-on box can actually do things on a schedule. A cloud chat window can't triage my inbox at 7am or sit behind a login running a browser task.
Two questions: which local model are you running on the 8GB for the routine work, and what tokens/sec are you seeing on the Orin Nano? And where does your local->cloud cutoff land in practice: do you fall back to a frontier model with your own key for the heavy asks, or push a bigger quant locally?