
GaltRanch

Originally published at astrolexis.space

Why I Built My Own AI: The Case for Self-Hosted Domain Agents (Kulvex)

Originally published on the AstroLexis blog. Cross-posted here for the community.

In 2024 I got tired of paying OpenAI to know everything about my house, my conversations, my calendar, my code, and my family. So I built Kulvex AI, a self-hosted AI platform that runs an 80-billion-parameter model on a pair of consumer GPUs in my office, exposes 17 domain agents over a private API, and handles everything from Zigbee lights to messaging across eight platforms. Here's why I built it, what it actually does, and where the trade-offs land in 2026.

The premise that broke for me

For about eighteen months I lived inside the OpenAI / Anthropic ecosystem like everyone else. Then a few things piled up at once:

  • OpenAI deprecated three model versions in a single quarter. Prompts that had been stable for nine months suddenly produced different outputs.
  • I was paying close to $400/month across personal + work API usage.
  • A friend got rate-limited mid-presentation by Anthropic.
  • I started building accessibility software for my dad and clinical software for my kid's therapist. For both, "send the user's voice to OpenAI" was a non-starter. Once I'd done the on-device work for those, the "we have to use cloud" thinking for everything else stopped making sense.

The breaking point was simple: I wanted an AI that would still work in five years exactly the way it works today, on hardware I own, that I could point at any task without asking anyone for permission. That product didn't exist. So I built it.

What Kulvex actually is

Three pieces:

  1. The server — runs on your own hardware (a workstation with at least one 24GB GPU, ideally two). Hosts a quantized 80B-parameter model via llama.cpp, exposes a Socket.IO API for clients, and orchestrates a set of domain agents.
  2. The iOS client (in App Review as of writing) — talks to your Kulvex server over a private endpoint. Apple Foundation Models on-device for low-latency tasks, falls back to the server for anything requiring the 80B model.
  3. Kulvex Code — terminal IDE with 15+ plugins. Sibling product to KCode, but for general dev work instead of security audit.

The 17 domain agents

"AI assistant" is too vague. What Kulvex actually does is split work across specialized agents:

🏠 Home Automation · 📱 Messaging Hub · 📅 Calendar · 📧 Email · 📰 News Curation · 🎵 Music · 🌤️ Weather · 📷 Cameras (Hikvision) · 💡 Lights (Zigbee + Tuya) · 🔌 Energy/Solar · 🍔 Food · 🚗 Vehicle · 📚 Research · 💰 Finances · 📝 Notes · 🛠️ Code · 🧬 Self-Evolution

Each agent has a narrow job, its own toolset, and its own context budget. When the user says "turn off the kitchen lights and tell me what's on the calendar tomorrow," the platform routes to two agents in parallel. Neither agent needs to know about the other.
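
To make the parallel routing concrete, here is a minimal sketch in Python. It is not Kulvex's actual router: the keyword table, the two handler functions, and their return strings are all hypothetical stand-ins, and a real router would likely use the model itself to pick agents.

```python
import asyncio

# Hypothetical agent handlers. The real Kulvex agents are not shown in this post.
async def lights_agent(request: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a Zigbee or Tuya call
    return "lights: kitchen off"

async def calendar_agent(request: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a calendar lookup
    return "calendar: 2 events tomorrow"

# Assumption for illustration: route by keyword.
ROUTES = {
    "lights": lights_agent,
    "calendar": calendar_agent,
}

def select_agents(request: str):
    """Return every agent whose keyword appears in the request."""
    text = request.lower()
    return [agent for keyword, agent in ROUTES.items() if keyword in text]

async def dispatch(request: str) -> list[str]:
    """Run all matching agents concurrently; each sees only the request."""
    agents = select_agents(request)
    return await asyncio.gather(*(agent(request) for agent in agents))

if __name__ == "__main__":
    print(asyncio.run(dispatch(
        "turn off the kitchen lights and tell me what's on the calendar tomorrow"
    )))
```

The two handlers never call each other, which is the property that keeps each agent independently replaceable.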

This design made Kulvex tractable for me as a solo developer. A general-purpose "do everything" assistant is hard to make good. Seventeen narrow agents, each independently replaceable, are much easier to ship.

The piece I'm proudest of: self-evolution

One of the 17 agents is called Self-Evolution. Its job is to read Kulvex's own codebase, identify bugs or improvements, write the fix, run the test suite, and, if everything passes, propose the change for commit and deploy.

Three guardrails make this safe:

  1. Sandboxed worktree. Changes apply to a clone, tested in isolation, only merged if CI passes.
  2. I review every commit. Agent opens PRs against my GitHub; nothing lands without human approval. No write access to main.
  3. Scope bounded by directive files. Agent reads OWNER-DIRECTIVES.md at the repo root, which tells it what it's allowed to modify (typos, dead code, small refactors, dependency upgrades) and what it's NOT (auth, payments, model selection logic, API surfaces).
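
As a rough illustration of the third guardrail, a scope check could look like the sketch below. The post does not show the format of OWNER-DIRECTIVES.md, so the path-based deny list, the directory names, and both function names are assumptions made for this example only.

```python
from fnmatch import fnmatchcase

# Hypothetical deny list. These directory names are invented for illustration;
# they mirror the protected areas named above (auth, payments, model selection, API).
PROTECTED_GLOBS = ["auth/*", "payments/*", "model_selection/*", "api/*"]

def out_of_scope(changed_paths):
    """Return the changed paths that touch a protected area."""
    return [
        path for path in changed_paths
        if any(fnmatchcase(path, glob) for glob in PROTECTED_GLOBS)
    ]

def may_open_pr(changed_paths, tests_passed):
    """A change is eligible for a PR only if tests pass and nothing protected changed."""
    return tests_passed and not out_of_scope(changed_paths)

if __name__ == "__main__":
    print(may_open_pr(["docs/readme.md"], tests_passed=True))
    print(out_of_scope(["docs/readme.md", "auth/login.py"]))
```

Even when this check passes, the result is only a pull request; the human review in guardrail 2 still decides whether anything merges.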

In four months the agent opened ~340 PRs, of which ~280 were merged. That offloaded a meaningful chunk of the maintenance on the 200K-line codebase.

The hardware reality

Self-hosted AI in 2026 means buying GPUs:

  • RTX 5090 (32 GB VRAM) — primary inference. Runs the 80B model in 4-bit quantization.
  • RTX 4090 (24 GB VRAM) — secondary. Whisper-large + vision.
  • 96 GB DDR5 system RAM for context cache.
  • ~$5,500 total. Three-year amortization ≈ $153/month, less than half of the ~$400/month I was paying for cloud API usage.

The math gets better the more you use it. Hardware costs the same whether I run 10 queries or 10,000. After breakeven, marginal cost approaches zero.
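
The arithmetic behind that claim, as a small sketch. It uses only the figures from this post (~$5,500 of hardware, 36 months, ~$400/month of prior API spend) and deliberately ignores electricity and resale value, which the post does not quantify.

```python
HARDWARE_COST_USD = 5500         # approximate total from the list above
AMORTIZATION_MONTHS = 36         # three years
PREVIOUS_API_SPEND_USD = 400     # approximate monthly cloud spend mentioned earlier

# Straight-line amortization: the same monthly figure regardless of usage.
monthly_cost = HARDWARE_COST_USD / AMORTIZATION_MONTHS

# Months until the hardware has cost less than the cloud spend it replaced.
breakeven_months = HARDWARE_COST_USD / PREVIOUS_API_SPEND_USD

def cost_per_query(queries_per_month: int) -> float:
    """Amortized hardware cost per query; falls as usage rises."""
    return monthly_cost / queries_per_month

if __name__ == "__main__":
    print(f"monthly: ${monthly_cost:.2f}")
    print(f"breakeven: {breakeven_months:.2f} months")
    for n in (10, 10_000):
        print(f"{n} queries/month: ${cost_per_query(n):.4f} each")
```

Under these assumptions the hardware pays for itself in a little over a year, and every additional query after that only lowers the per-query figure.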

For users who don't want to run their own GPU, Kulvex has a hosted "Home" tier where we run the server on shared infrastructure — still no cloud LLM vendor in the loop, model weights pinned, but you're renting compute instead of buying.

What I didn't have to give up

The 80B model on the RTX 5090 produces output that, for about 95% of my tasks (code review, drafting, knowledge retrieval, agent orchestration), is roughly comparable to GPT-4-class models. The frontier (Claude Opus 4.x, GPT-5) still beats it on hard reasoning. For those cases Kulvex's orchestrator can route to Claude or GPT if the user opts in. I rarely turn that on. Privacy + cost + reliability beat marginal capability for my workload.
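
The opt-in rule can be sketched in a few lines. This is an illustration only: the `Task` type, the `hard_reasoning` flag, and the backend labels are hypothetical, and the post does not describe how Kulvex decides that a task needs a frontier model.

```python
from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    hard_reasoning: bool  # hypothetical flag; the real classification is not described

def pick_backend(task: Task, cloud_opt_in: bool) -> str:
    """Default to the local model; use a cloud model only with explicit opt-in."""
    if task.hard_reasoning and cloud_opt_in:
        return "cloud"
    return "local"

if __name__ == "__main__":
    proof = Task("prove this lemma", hard_reasoning=True)
    print(pick_backend(proof, cloud_opt_in=False))
    print(pick_backend(proof, cloud_opt_in=True))
```

The point of the default is that nothing leaves the machine unless the user has explicitly allowed it.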

What I did have to give up

Honest list:

  • Setup is harder. Kulvex server takes 30-60 minutes to install if you're comfortable with Docker and CUDA. Not great for non-technical users. Hosted tier exists for this.
  • Cold starts. Idle GPU + first query = 5-10s while model loads back into VRAM. Subsequent queries sub-second.
  • Image generation. Not bundled. Need DALL-E quality? Fall back to a hosted service. ComfyUI integration on roadmap.
  • Internet research. Local model has no live web access (by design). Agents can send searches to SearXNG or Perplexity. Not as smooth as ChatGPT browsing.

For casual AI users, cloud is still better. Kulvex makes sense once your usage is heavy enough that the privacy + cost + reliability balance flips.

Pricing

  • Starter — free, hosted, capped.
  • Home — $19/month hosted, unmetered personal use.
  • Pro — $49/month hosted + on-prem option.
  • Enterprise — custom, on-prem with SLA.

The on-prem tier never phones home except for license validation.

The honest part

  • iOS client in App Review right now (build 4.3.6). Apple rejected previous builds for AI consent flow; resolved.
  • Real users on Home and Pro exist but small numbers.
  • Platform mature, product/GTM still finding shape. Biggest gap: install-and-onboard for non-technical users.
  • I use Kulvex personally every day. That's the proof. Whether it's a business — same as KCode, we'll know in two quarters.

Who this is for

  • People who already self-host (Home Assistant, Nextcloud, Jellyfin homelabbers).
  • Privacy/compliance constraints (healthcare, legal, defense, GDPR).
  • Heavy AI users — $50+/month in API spend.
  • People who believe intelligence-as-software should be ownable, not rented.

— Bruno Galtranch, founder, AstroLexis LLC. If you self-host or are considering it: contact@astrolexis.space.
