AI agents are getting scary good at ops work.
Claude Code, Codex, OpenHands — they can SSH into a server, read logs, restart containers, fix configs. It works.
But I still won't give an agent an SSH key to my home server.
A shell is too sharp. It can inspect anything. It can delete anything. It can rm -rf the wrong path very fast and very confidently.
I want the agent to help operate my server, not be the operator.
So I've been building HomeButler in the opposite direction: not "AI runs my server," but "AI talks to my server through a narrow, structured interface."
The shape of the tool
HomeButler is a single Go binary. No daemon. No database. No always-on web service.
You can copy it onto a server, run it from cron, pipe it into a script, open the web dashboard, or hook it into Claude Desktop via MCP. Same binary, same core, different surfaces.
┌──────────────────────────────────────────────────┐
│  Layer 3: Chat Interface                         │
│  Telegram · Slack · Discord · Terminal · Browser │
└──────────────────────┬───────────────────────────┘
                       │
┌──────────────────────▼───────────────────────────┐
│  Layer 2: AI Agent                               │
│  Claude · LangChain · n8n · OpenClaw             │
└──────────────────────┬───────────────────────────┘
                       │ CLI exec or MCP (stdio)
┌──────────────────────▼───────────────────────────┐
│  Layer 1: Tool (homebutler)  ← YOU ARE HERE      │
│                                                  │
│  CLI · MCP · Web → same internal/ core           │
│  system · docker · ports · backup · watch        │
└──────────────────────────────────────────────────┘
The agent gets explicit, JSON-returning operations. It can't reach outside this surface. The blast radius of a wrong call is bounded.
I sleep better.
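To make the "narrow surface" idea concrete, here is a minimal Go sketch of an allowlist dispatcher. This is not HomeButler's actual code: the operation names, the Result envelope, and dispatch are all hypothetical, and the handlers return canned data. It only illustrates the shape, where an agent can ask for named operations and anything else is refused.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result is a hypothetical JSON envelope returned for every operation.
type Result struct {
	Op    string `json:"op"`
	OK    bool   `json:"ok"`
	Data  any    `json:"data,omitempty"`
	Error string `json:"error,omitempty"`
}

// handlers is the entire surface an agent can reach.
// The operation names and canned payloads are illustrative only.
var handlers = map[string]func() any{
	"system.status": func() any { return map[string]any{"cpu_percent": 5.0} },
	"docker.list":   func() any { return []string{"nginx"} },
}

// dispatch runs an allowlisted operation and refuses everything else.
// Nothing here ever reaches a shell, so a wrong call can only fail.
func dispatch(op string) Result {
	h, ok := handlers[op]
	if !ok {
		return Result{Op: op, OK: false, Error: "unknown operation"}
	}
	return Result{Op: op, OK: true, Data: h()}
}

func main() {
	// The second request is what a confused agent might try; it is rejected.
	for _, op := range []string{"system.status", "rm -rf /"} {
		out, _ := json.Marshal(dispatch(op))
		fmt.Println(string(out))
	}
}
```

The design choice is that the map is the security boundary: adding a capability means adding a reviewed handler, not widening a shell.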
Three things HomeButler does that I actually care about
There's a long feature list in the README. Three of them are why I keep working on this.
1. report — what changed, not just what is
Most homelab dashboards show state. Beautiful graphs, lots of dots.
But my home server doesn't fail because I lack data. It fails because I don't notice the right change at the right time.
🏠 Homebutler Report — mac-mini
── Current Status ──
CPU: 5.0% · Memory: 8.3/16.0 GB · Disk: 4%
Containers: 1 running, 1 stopped · Public ports: 5
── Needs Attention ──
⚠️ 1 container(s) stopped
── Notable Changes ──
No significant changes since last report.
── Suggested Actions ──
→ Address items in 'Needs attention' above.
The first run takes a baseline. Every run after that diffs against the last snapshot and tells you what moved. Pass --json to pipe the output into agents.
What I want it to answer is just three things:
- What looks wrong?
- What changed?
- What should I check next?
Not "here's a graph, you figure it out."
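The baseline-then-diff idea behind report can be sketched in a few lines of Go. This is an illustration under assumptions, not the real implementation: the Snapshot fields, the diff function, and the diskDelta threshold are hypothetical, and the actual report tracks more than disk and container state.

```go
package main

import (
	"fmt"
	"sort"
)

// Snapshot is a hypothetical, simplified view of server state.
type Snapshot struct {
	DiskPercent int
	Containers  map[string]string // name -> state, e.g. "running"
}

// diff lists what changed between the previous and current snapshot.
// diskDelta is an assumed threshold (in percentage points) below which
// disk movement is treated as noise.
func diff(prev, cur Snapshot, diskDelta int) []string {
	var changes []string
	if d := cur.DiskPercent - prev.DiskPercent; d >= diskDelta || d <= -diskDelta {
		changes = append(changes, fmt.Sprintf("disk usage %d%% -> %d%%", prev.DiskPercent, cur.DiskPercent))
	}
	for name, state := range cur.Containers {
		old, seen := prev.Containers[name]
		switch {
		case !seen:
			changes = append(changes, fmt.Sprintf("container %s appeared (%s)", name, state))
		case old != state:
			changes = append(changes, fmt.Sprintf("container %s: %s -> %s", name, old, state))
		}
	}
	for name := range prev.Containers {
		if _, ok := cur.Containers[name]; !ok {
			changes = append(changes, fmt.Sprintf("container %s disappeared", name))
		}
	}
	sort.Strings(changes) // map iteration order is random; keep output stable
	return changes
}

func main() {
	prev := Snapshot{DiskPercent: 4, Containers: map[string]string{"nginx": "running"}}
	cur := Snapshot{DiskPercent: 4, Containers: map[string]string{"nginx": "exited"}}
	changes := diff(prev, cur, 5)
	if len(changes) == 0 {
		fmt.Println("No significant changes since last report.")
	}
	for _, c := range changes {
		fmt.Println(c)
	}
}
```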
2. watch — the 3 AM container death problem
Your container crashed at 3 AM. Why? By morning, the logs are gone, the container is back up, and you have no idea what happened.
watch catches the moment it dies, saves the pre-death and post-restart logs, and analyzes the cause:
[03:14:22] INCIDENT: nginx (incident nginx-20260410-031422-7a2124)
Crash: OOM — process killed by SIGKILL (oom, confidence: high)
⚠ FLAPPING: acute (3 restarts in short window)
It uses exit codes (137 = SIGKILL, most often the OOM killer; 139 = segfault; 143 = SIGTERM) plus log pattern matching (panic:, Out of memory, FATAL) to categorize the crash. Flapping detection flags processes stuck in restart loops: acute (3+ restarts in 10 min) or chronic (5+ in 24h).
Supports Docker (real-time event stream), systemd, and PM2.
homebutler watch add nginx # interactive: pick docker / systemd / pm2
homebutler watch start # foreground monitoring
homebutler watch history # list past incidents
homebutler watch show <incident-id> # full crash report
This is the feature I built for myself first. Every homelabber has had a 3 AM container death they couldn't explain.
3. backup drill — "having a backup" vs "being able to restore"
These are different things. I learned this the hard way.
backup drill boots your backup in an isolated Docker network, runs an HTTP health check against the booted app, and tears it all down. Like a fire drill for your data.
🔍 Backup Drill — uptime-kuma
📦 Backup: backup_2026-04-04_1711.tar.gz (18.6 MB)
🔐 Integrity: ✅ tar valid (8 files)
🚀 Boot: ✅ container started in 0s
🌐 Health: ✅ HTTP 200 on port 58574
⏱️ Total: 2s
✅ DRILL PASSED
Zero risk to your running services — completely isolated environment, random port, separate network. If the drill fails, your backup is theater. Better to find out today than on the day you actually need it.
Why this shape, not another dashboard
Portainer, Netdata, CasaOS, Uptime Kuma — these are all great. I run several of them. They solve real problems.
But on a small home server, my bottleneck isn't observability. It's me, the operator, with five tabs open trying to remember what "normal" looked like yesterday.
A small structured CLI fits that gap better than another tab:
- Cron-friendly: homebutler report --json | jq, at 8 AM with my coffee
- Script-friendly: exit codes that mean something, JSON everywhere
- Agent-friendly: MCP server built in, works with Claude Desktop / ChatGPT / Cursor out of the box
- Air-gap friendly: single binary, no daemon, no phone-home
The dashboard is also there (homebutler serve for a go:embed web UI), but it's opt-in. The default mode is "command you run, answer you get, exit."
The bigger bet
I think the next interesting interface for ops isn't a prettier dashboard. It's the conversation:
"Is anything weird with the server?"
(Claude calls homebutler report via MCP, reads the JSON, summarizes)
"Yeah, Vaultwarden has been restarting every ~2 hours since yesterday. Want me to pull the crash logs?"
For that to be safe, the agent needs a narrow tool, not a shell. That's what HomeButler is becoming.
Try it
brew install Higangssh/homebutler/homebutler
Or:
curl -fsSL https://raw.githubusercontent.com/Higangssh/homebutler/main/install.sh | sh
Then:
homebutler init # interactive setup
homebutler report # baseline + summary
homebutler watch tui # terminal dashboard
If you run a homelab, I'd love to hear: what's the one thing you wish someone would just tell you about your server every morning? That's the next thing I want report to answer.