I don’t really trust “private AI” that still runs on someone else’s hardware.
Every vendor has some version of: “Your data is safe, we don’t train on it, trust us.” But at the end of the day, you’re still piping sensitive work into a black box you don’t control, on infrastructure you can’t see, under policies that can change whenever it’s convenient.
So I started designing what I actually wanted:
A sealed box, on my own hardware, where AI works for me instead of on me.
That turned into the Sealed Box AI Runbook – a full write-up on how I run a local-only AI stack with a worker model, a watchdog model, local RAG, and agents, all behind my own guardrails.
GitHub repo: https://github.com/jtarkington77/sealed-box-ai-runbook
What “Sealed Box AI” means here
This isn’t “install one app and call it a day.” It’s an architecture and a set of habits:
- Worker model – the main model that answers questions, writes code, drafts reports, etc.
- Watchdog model – a second model that reads summaries of what the worker is doing and scores it for risky behavior, policy violations, or weird patterns.
- Local RAG – a retrieval layer (Qdrant in my case) that only indexes content I explicitly feed it.
- Agents – tightly scoped tools (internet research, intel sync, lab actions) that the worker can call, but only in specific ways.
- Strict boundaries – clear lanes between “things the model can see,” “things the model can touch,” and “things that never leave this box.”
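To make the worker/watchdog split concrete, here's a minimal sketch of the idea. None of these names come from the repo; the schema and rules are hypothetical stand-ins, and the real watchdog is a second model rather than hard-coded rules. The point is just the shape: the worker emits a structured summary of each run, and a separate component scores it.

```python
from dataclasses import dataclass, field

# Hypothetical run summary the worker emits after each session.
# The runbook's actual schema may differ; this is just the shape of the idea.
@dataclass
class RunSummary:
    prompt_topic: str
    tools_used: list = field(default_factory=list)
    outbound_requests: int = 0
    files_touched: list = field(default_factory=list)

def watchdog_score(summary: RunSummary) -> tuple[int, list]:
    """Score a run 0-10 for risk; higher means more suspicious.
    A rule-based stand-in for the second model reading summaries."""
    score, flags = 0, []
    if "shell" in summary.tools_used:
        score += 3
        flags.append("worker invoked shell tool")
    if summary.outbound_requests > 5:
        score += 2
        flags.append("unusual volume of outbound requests")
    for path in summary.files_touched:
        if path.startswith("/etc") or ".ssh" in path:
            score += 4
            flags.append(f"sensitive path touched: {path}")
    return min(score, 10), flags

run = RunSummary("log triage", tools_used=["retrieval", "shell"],
                 files_touched=["/etc/passwd"])
score, flags = watchdog_score(run)
print(score, flags)  # 7, with two flags raised
```

Because the watchdog only ever sees summaries, not raw conversations, it stays in its own lane: it can flag a spiky run without itself becoming another thing that reads your sensitive data.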
The goal isn’t “maximum complexity.” It’s:
Use powerful models, but own the stack and the blast radius.
Who this is for
If you’re:
- Running a homelab or small environment and want AI without handing everything to a cloud vendor
- Doing blue-team / security work and don’t want incident data living in random SaaS logs
- Building tools where privacy, provenance, and control actually matter
…this runbook is written for you.
It’s not a sales deck. It’s “here’s how I actually wire this up at home.”
Architecture at a glance
The stack is meant to be understandable even if you’re not an ML engineer.
High-level flow:
- You → Open WebUI (or your UI of choice)
- Open WebUI → Worker model
- Worker can:
- Call local tools/agents (research, scripts, retrieval)
- Read from local RAG (Qdrant) for your own notes, docs, logs
- Each run generates a summary + metadata (what tools were used, what it tried to do, etc.)
- The watchdog model reads those summaries and:
- Flags risky behavior or policy violations
- Scores runs, so you can spot “spiky” or odd sessions
- Everything lives behind your own network controls:
- Reverse proxy / zero-trust edge if you expose anything
- No direct inbound to the models
- Clear separation between “inside the box” and “outside traffic”
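The retrieval step in that flow can be shown in miniature. The runbook uses Qdrant with real embeddings; the toy below substitutes a bag-of-words cosine similarity (all names hypothetical) so the core property fits in a few self-contained lines: only content you explicitly index is ever retrievable.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'. A real stack would run an
    embedding model and store the vectors in Qdrant instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class LocalIndex:
    """The 'sealed' property lives here: nothing is searchable
    unless you explicitly chose to add it."""
    def __init__(self):
        self.docs = []

    def add(self, doc: str):
        self.docs.append((doc, embed(doc)))

    def search(self, query: str, k: int = 1):
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]

idx = LocalIndex()
idx.add("incident notes: phishing email hit finance on tuesday")
idx.add("homelab wiki: proxmox backup schedule runs nightly")
print(idx.search("what happened with the phishing incident"))
```

Swap `embed` for a real embedding model and `LocalIndex` for a Qdrant collection and the flow is the same: the worker queries the index, gets back only what you fed it, and never reaches outside the box for context.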
Think of it as combining:
- Self-hosted LLM stack
- Minimal SIEM-style visibility
- Old-school “story of the system” runbook
Hardware tiers: reality, not fantasy
The runbook doesn’t assume you own a data center. I break things down by VRAM tiers and what you trade off at each level:
- 12–16 GB VRAM – Bare minimum: smaller models, fewer concurrent agents, more careful prompt design.
- 16–24 GB VRAM – Comfortable for a primary box: better 7B/8B models, more tools, more headroom.
- 24+ GB VRAM – Where it gets fun: multiple agents, stronger models, more experimentation without everything falling over.
The idea is: you can start on what you have now, and grow into the bigger build as you go.
What the runbook actually gives you
The repo isn’t just “here’s an idea.” It’s a practical guide:
- Conceptual design – how worker + watchdog + RAG + agents fit together
- Model selection notes – what I’m using and why, and what you can swap
- Network and host layout – how I separate concerns and keep the blast radius small
- Operational habits – how to think about logging, summaries, and watching your own AI stack
- Build-sheet style notes – so you can adapt it to your own hardware instead of copying blindly
If you’ve ever wanted to move from:
“I hope my vendor’s ‘private AI’ story holds up”
to
“I know exactly where this data lives and what these models can touch”
…that’s the gap this runbook is trying to close.
Grab the full runbook
If any of this resonates, the full write-up lives here:
Sealed Box AI Runbook
https://github.com/jtarkington77/sealed-box-ai-runbook
I’ll keep iterating on it as I test new models, refine the watchdog, and tighten the guardrails. Feedback, arguments, and “you’re missing a huge threat” comments are all welcome.
More of my work & tools:
Portfolio: https://jtarkington-portfolio.netlify.app
GitHub: https://github.com/jtarkington77