SangheeSon

I still don't want to give Claude SSH access, so I built a doctor for my homelab

AI as interpreter rather than operator

A few weeks ago I wrote about why I don't want to give Claude SSH access to my home server.

It's not that AI agents are useless. It's the opposite. They got good enough that handing one a shell started to feel reckless.

A shell isn't really an interface. It's a weapon with tab completion. It can read anything, delete anything, restart anything, and confidently run the wrong command on the wrong machine at 2am.

So I've been building HomeButler in the other direction:

Not "AI runs my server."
"AI asks my server safe, structured questions."

The thing dashboards don't do

The first version of HomeButler was mostly about visibility:

homebutler status
homebutler docker list
homebutler ports
homebutler report

It worked. But after running it on my own homelab for a while, I noticed something.

Raw status isn't enough.

Most of the time I don't actually want to see every metric. I want to know four things:

  • Is something wrong?
  • Is it urgent?
  • What changed?
  • What should I check next?

Dashboards don't really answer those. They show you CPU, memory, disk, containers, ports, uptime, graphs, colors, tables — and then quietly hand the interpretation back to you.

That's fine when I'm sitting at my desk. It's much less useful when I'm checking my phone half-awake, wondering if something is quietly on fire.

For a small homelab I don't need a mini NOC. I need a calm answer.

So I added a new command:

homebutler doctor

Output looks like this:

🩺 HomeButler Doctor — mac-mini

✅ CPU looks normal
✅ Memory looks normal
⚠️ Disk usage is high: 91%
⚠️ 1 container is stopped
⚠️ Public listener found on 0.0.0.0:8080
⚠️ Latest backup is older than 7 days

Suggested next steps:
→ homebutler docker list
→ homebutler ports
→ homebutler backup list

Not "here is everything." More like "here is what deserves your attention."

Why this matters more once an AI is in the loop

This shape becomes a lot more interesting when an agent is involved.

Imagine giving an agent SSH and asking:

"Is my server okay?"

Now the agent has to decide what to run. Probably something like:

df -h
free -m
docker ps
docker logs ...
ss -tulpn
systemctl status ...

That can work. But the agent is now exploring my box through a general-purpose shell. It can run too much, see too much, or run the right command on the wrong host. The blast radius is "whatever the shell can do," which is everything.

With HomeButler, the agent gets a much smaller surface:

homebutler doctor --json

Structured output. Read-only. Bounded scope.

The agent doesn't need to be the operator. It can be the interpreter. That distinction is the whole point of the project for me.
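
A minimal sketch of the interpreter role, for illustration only. The JSON shape below (a host field plus a list of checks with name, status, and message) is an assumption, not HomeButler's documented schema, and summarize is a hypothetical helper. It shows how little an agent-side consumer needs once the tool has already decided what is worth reporting:

```python
import json

# Hypothetical payload: the real `homebutler doctor --json` schema may differ.
SAMPLE = """
{
  "host": "mac-mini",
  "checks": [
    {"name": "cpu", "status": "ok", "message": "CPU looks normal"},
    {"name": "disk", "status": "warn", "message": "Disk usage is high: 91%"},
    {"name": "containers", "status": "warn", "message": "1 container is stopped"}
  ]
}
"""


def summarize(raw: str) -> str:
    """Turn a doctor-style JSON document into a short plain-language summary."""
    report = json.loads(raw)
    # Keep only the checks that are not healthy; everything else is noise.
    warnings = [c["message"] for c in report["checks"] if c["status"] != "ok"]
    if not warnings:
        return f"{report['host']}: nothing needs attention."
    joined = "; ".join(warnings)
    return f"{report['host']}: {len(warnings)} item(s) need attention: {joined}"


print(summarize(SAMPLE))
```

The consumer never touches the host. It only reads a document that the tool chose to produce.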

What doctor actually checks

homebutler doctor is intentionally boring. That's the feature.

It checks the kinds of things I usually only notice once they've already become a problem:

  • high CPU, memory, or disk usage
  • stopped containers
  • ports bound to public interfaces (such as 0.0.0.0)
  • missing or stale backups
  • notification readiness
  • whether report has a baseline for change detection
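
To make the shape of these checks concrete, here is a rough sketch of two of them. It is not HomeButler's implementation: the Finding type, the function names, and the limits in LIMITS are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    ok: bool
    message: str


# Illustrative limits only; not HomeButler's actual thresholds.
LIMITS = {"cpu": 90.0, "memory": 85.0, "disk": 90.0}


def check_usage(name: str, percent: float) -> Finding:
    """Flag a resource whose usage is at or above its limit."""
    if percent >= LIMITS[name]:
        return Finding(False, f"{name} usage is high: {percent:.0f}%")
    return Finding(True, f"{name} looks normal")


def check_listener(address: str) -> Finding:
    """Flag a listener bound to every interface rather than a specific one."""
    host = address.rsplit(":", 1)[0]
    if host in ("0.0.0.0", "[::]", "*"):
        return Finding(False, f"Public listener found on {address}")
    return Finding(True, f"{address} is not a wildcard bind")


for finding in (check_usage("disk", 91), check_listener("0.0.0.0:8080")):
    print("OK  " if finding.ok else "WARN", finding.message)
```

Each check returns a verdict plus a sentence, which is what lets the output read as "what deserves attention" instead of a table of numbers.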

There's a strict mode for cron and CI:

homebutler doctor --strict

And because everything is machine-readable:

homebutler doctor --json

An agent can consume the result directly. No scraping terminal output, no parsing colored text, no guessing how df -h formats its output on a particular distro.

Report vs Doctor

After using both for a while, I've started thinking about HomeButler as having two distinct kinds of answers.

report answers: "What changed since last time?"

homebutler report

It saves snapshots, compares the current state with the previous one, and summarizes notable changes.
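
The snapshot-and-compare idea can be sketched in a few lines. This is an assumption about the general technique, not a description of how report stores or diffs its data. The snapshot keys and the diff_snapshots helper are hypothetical.

```python
def diff_snapshots(previous: dict, current: dict) -> list[str]:
    """List every key whose value differs between two flat snapshots."""
    changes = []
    # Union of keys so that added and removed entries show up too.
    for key in sorted(previous.keys() | current.keys()):
        before, after = previous.get(key), current.get(key)
        if before != after:
            changes.append(f"{key}: {before} -> {after}")
    return changes


last_run = {"running_containers": 4, "open_ports": 3}
this_run = {"running_containers": 3, "open_ports": 3}
print(diff_snapshots(last_run, this_run))
```

Only the differences survive, so an unchanged system produces an empty list instead of a wall of metrics.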

doctor answers: "What looks risky right now?"

homebutler doctor

One looks backward. One looks at the current risk. Together they're much more useful than raw metrics, and honestly more useful than most of the dashboards I've run.

The pattern I keep coming back to

The longer I work on this, the more I think the framing matters more than the features:

AI agents don't need more power by default. They need better tools.

A shell gives an agent maximum power and maximum ambiguity. A narrow tool gives it less power but more meaning. For homelab ops, that tradeoff feels right to me. I don't want an agent freely roaming my server. I want it to ask specific, bounded questions:

homebutler doctor --json
homebutler report --json
homebutler inventory scan --json
homebutler backup drill uptime-kuma --json

And then explain the result in plain language.

That's a very different security model from "here is SSH, good luck."
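
One way to picture that model is a gate in front of the agent that accepts only a fixed set of commands. The sketch below is hypothetical and is not part of HomeButler: the ALLOWED set simply reuses the four commands listed above, and vet is an illustrative name.

```python
import shlex

# The only commands the agent may request, stored as exact argument tuples.
ALLOWED = {
    ("homebutler", "doctor", "--json"),
    ("homebutler", "report", "--json"),
    ("homebutler", "inventory", "scan", "--json"),
    ("homebutler", "backup", "drill", "uptime-kuma", "--json"),
}


def vet(command: str) -> list[str]:
    """Return the argument list if the command is allowed, else raise."""
    argv = tuple(shlex.split(command))
    if argv not in ALLOWED:
        raise PermissionError(f"not on the allowlist: {command!r}")
    return list(argv)


print(vet("homebutler doctor --json"))
```

Because the match is on the whole argument tuple, a request such as "homebutler doctor --json; rm -rf /" is rejected outright. The returned list can be handed to subprocess.run without a shell, so nothing in it is ever interpreted as shell syntax.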

Try it

HomeButler is a single Go binary. No daemon, no database, no always-on service.

brew tap Higangssh/homebutler
brew install homebutler

Or:

curl -fsSL https://raw.githubusercontent.com/Higangssh/homebutler/main/install.sh | sh

Then:

homebutler init
homebutler doctor
homebutler report

Repo: https://github.com/Higangssh/homebutler


I'm still building this around one idea: the future of AI-assisted ops shouldn't be "give the agent a shell." It should be "give the agent a tool that says what it means."

Genuine question for anyone running a homelab: what's the last thing in your setup that broke without warning? I'm collecting ideas for what doctor should check next.

Top comments (13)

Mykola Kondratiuk

ran into the same reluctance - giving an AI agent root over my homelab felt like handing it to an intern on day one. the doctor pattern is cleaner: define the safe surface explicitly, the agent stays in bounds

SangheeSon

Thanks, "intern on day one" is exactly the feeling. I just wanted a way to manage my server safely — letting an AI touch it through SSH felt way too risky. So instead of giving it a shell, I gave it a fixed set of commands that only read things or do safe ops. The agent stays in bounds because the bounds are the only thing it can see.

Mykola Kondratiuk

that constraint-by-design approach is cleaner than permission scoping - the agent can't exceed scope because the scope is its entire API surface. the edge I keep hitting is safe vs unsafe ops isn't always clear at build time. you decide 'service restart is safe' until it cascades. do you revisit the allowed command list after incidents or keep it static?

Klaudia Grzondziel

Interesting idea! Only today I had a heated discussion with my colleagues about security considerations around using AI and how much you should allow it. I think many people are still not aware of the risks it can bring. Will make sure to read your previous article.

Good luck with your project, it looks impressive! 👏🏻

SangheeSon

Thanks Klaudia! Glad the timing lined up with your discussion 😄 Hope the previous article is useful too!

Andy Stewart

Spot on! Handing a blind weapon like SSH to an AI agent is a disaster waiting to happen.
Local ops demand strict boundary control and determinism. Using Go to build structured, read-only APIs lets the AI act as the 'interpreter' rather than the 'operator'—this is true Local-First architecture. Great work!

Shahzaib

The "interpreter vs operator" framing is gold. That's exactly the trade-off I've been struggling with for my own homelab. A shell is a weapon with tab completion, as you put it, and I'm starting to think the right path isn't "better SSH security" but "bounded, read-only tools" like what you built here.

homebutler doctor --json is a beautiful interface for an agent. It's a perfect example of giving the AI less power but more meaning.

Are you planning to add any sort of "change detection" beyond the report command? Like alerting when a specific metric crosses a threshold without needing a full dashboard?

SangheeSon • Edited

Glad that framing landed. And yeah, change detection is basically the direction I’m pushing toward — not just a small feature on top.

Threshold alerts already exist today: homebutler alerts --watch polls CPU, memory, and disk, then sends Telegram/webhook notifications when limits are crossed. Defaults are 90/85/90, so it works without a dashboard.

report also keeps snapshots and diffs against the previous run, so you get changes like “Running containers: 4 → 3” or “Disk /: +2.3 GB since last report.” And watch handles Docker events, pre-crash logs, and restart flapping.

The harder part is drift detection. A container restarting five times is easy to flag. But memory slowly growing 1% every day for two weeks? That’s the thing I want to surface without pulling in a full TSDB. I’m thinking about something lightweight on top of existing snapshots — open to ideas.

Demi AI

Quite interesting!

SangheeSon

Thank you! 🙏

Harjot Singh

"I don't want to give Claude SSH access" is a healthy instinct, and building a read-only doctor instead is the right pattern. The principle generalizes: give the agent rich observability and zero (or tightly scoped) mutation authority. Let it diagnose all day; make a human or a constrained, audited action path do the actual changing.

This is the same trust-boundary idea that makes agents safe anywhere - the danger isn't the model reasoning about your homelab, it's the model with a shell and sudo acting on a wrong conclusion. A doctor that reports findings keeps the failure mode at "bad advice you can ignore" instead of "rm -rf on prod." Smart build. Curious if you gave it any guarded remediation actions, or kept it strictly diagnostic?
