What I built
Atlarix — a ~400MB AI agent workstation that sits beside your IDE (VS Code, IntelliJ, Vim) instead of replacing it. A native harness built around the open-weight frontier: DeepSeek, Qwen, Kimi, MiniMax. BYOK is still there for everything else, but these four are the point, not an afterthought.
Built solo under NorahLabs, in Nairobi, Kenya.
The problem
I was running open-weight models for actual agentic work: multi-file edits, terminal commands, codebase exploration. Every tool I tried was built around Claude or GPT, with my models bolted on as a BYOK option. Context-window assumptions tuned for a different model. System prompts and tool-calling shaped around closed-model behavior. Retrieval that either dumps the whole repo into context or leans on a cloud vector DB that an offline machine can't reach.
The models are frontier-class now. The tooling around them isn't.
The approach
Blueprint — structural retrieval, no embeddings
- Universal Ctags symbols + ast-grep edges, backed by SQLite FTS5
- grep results reranked by structural relevance
- each result annotated with its enclosing function/class
- no vector DB, constant memory at any repo size
The thesis: for code, lexical + structural retrieval plus the model's own reasoning beats a vector index. Claude Code and opencode, as far as I can tell, also ship without an embedding index. On my own large multi-repo workspace, a "find the signup code" query dropped from ~63K to ~26K tokens for the turn, with exact file:line citations. (That's my own workspace, not a published benchmark. A reproducible eval is what I'm building next.)
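To make the "annotate and rerank" idea concrete, here is a minimal sketch, not Atlarix's actual code. It assumes symbol spans (the kind of data a Ctags-style index can provide) are already loaded in memory. The `SymbolSpan` and `GrepHit` shapes and the scoring weights are hypothetical, chosen only for illustration.

```typescript
// Hypothetical shapes: the real Blueprint schema is not described in this post.
interface SymbolSpan {
  name: string;
  kind: "function" | "class" | "method";
  file: string;
  startLine: number; // inclusive
  endLine: number;   // inclusive
}

interface GrepHit {
  file: string;
  line: number;
  text: string;
}

interface AnnotatedHit extends GrepHit {
  enclosing: SymbolSpan | null;
  score: number;
}

// Find the innermost symbol whose line range contains the hit.
function enclosingSymbol(hit: GrepHit, symbols: SymbolSpan[]): SymbolSpan | null {
  let best: SymbolSpan | null = null;
  for (const s of symbols) {
    if (s.file !== hit.file) continue;
    if (hit.line < s.startLine || hit.line > s.endLine) continue;
    if (best === null || s.endLine - s.startLine < best.endLine - best.startLine) {
      best = s;
    }
  }
  return best;
}

// Toy scoring (illustrative weights): being inside any symbol helps, landing on
// the definition line helps more, and a symbol name matching a query term helps most.
function rerank(hits: GrepHit[], symbols: SymbolSpan[], query: string): AnnotatedHit[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const annotated = hits.map((hit) => {
    const enclosing = enclosingSymbol(hit, symbols);
    let score = 0;
    if (enclosing) {
      score += 1;
      if (hit.line === enclosing.startLine) score += 2;
      const name = enclosing.name.toLowerCase();
      if (terms.some((t) => name.includes(t))) score += 3;
    }
    return { ...hit, enclosing, score };
  });
  return annotated.sort((a, b) => b.score - a.score);
}
```

Each result carries its enclosing function or class, so the model sees "this match is inside `signupUser`" rather than a bare line of text.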
Verified edit loop
- write → re-read from disk → compare to intended
- zero tokens on the happy path
- blocks "task complete" if an edit didn't actually land
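The loop above can be sketched roughly as follows. This is an illustration under my own assumptions, not the shipped implementation: `writeVerified` and `canMarkComplete` are hypothetical names, and the comparison is a plain string equality check. The check runs locally, which is why the happy path costs no model tokens.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type EditResult =
  | { ok: true }
  | { ok: false; reason: string };

// Write the intended content, then re-read it from disk and compare.
function writeVerified(path: string, intended: string): EditResult {
  try {
    writeFileSync(path, intended, "utf8");
    const onDisk = readFileSync(path, "utf8");
    return onDisk === intended
      ? { ok: true }
      : { ok: false, reason: `content mismatch after writing ${path}` };
  } catch (err) {
    return { ok: false, reason: err instanceof Error ? err.message : String(err) };
  }
}

// Gate: the task may only be reported complete if every edit verified.
function canMarkComplete(results: EditResult[]): boolean {
  return results.every((r) => r.ok);
}

// Usage: one edit that lands, one that cannot (its parent directory is missing).
const dir = mkdtempSync(join(tmpdir(), "verified-edit-"));
const okResult = writeVerified(join(dir, "a.txt"), "hello\n");
const badResult = writeVerified(join(dir, "missing", "b.txt"), "x");
console.log(canMarkComplete([okResult]));            // true
console.log(canMarkComplete([okResult, badResult])); // false
```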
Live model catalog
- managed model IDs fetched from a hosted config at startup
- new drops from the four labs appear automatically
- swapping a model is a config change, not an app rebuild
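One way a startup catalog fetch like this could work, sketched under assumptions: the `ModelEntry` fields, the bundled fallback list, and the function names are all hypothetical, since the post does not describe the hosted config's format. The idea is to validate whatever comes back and fall back to a bundled list when the machine is offline or the payload is unusable.

```typescript
// Hypothetical catalog entry shape, for illustration only.
interface ModelEntry {
  id: string;
  lab: string;
  contextWindow: number;
}

function isModelEntry(value: unknown): value is ModelEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const id = v["id"];
  const lab = v["lab"];
  const contextWindow = v["contextWindow"];
  return (
    typeof id === "string" && id.length > 0 &&
    typeof lab === "string" &&
    typeof contextWindow === "number" && contextWindow > 0
  );
}

// Keep the valid entries from the remote payload; if none survive,
// use the list bundled with the app.
function resolveCatalog(remote: unknown, bundled: ModelEntry[]): ModelEntry[] {
  if (!Array.isArray(remote)) return bundled;
  const valid = remote.filter(isModelEntry);
  return valid.length > 0 ? valid : bundled;
}

// Startup path: try the hosted config, fall back when offline or malformed.
async function loadCatalog(url: string, bundled: ModelEntry[]): Promise<ModelEntry[]> {
  try {
    const res = await fetch(url);
    if (!res.ok) return bundled;
    return resolveCatalog(await res.json(), bundled);
  } catch {
    return bundled; // network error or invalid JSON
  }
}
```

Keeping `resolveCatalog` separate from the network call means the validation logic can be tested without a server, and a bad config push degrades to the bundled list instead of an empty model picker.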
Per-OS sandboxing
- macOS: Seatbelt
- Linux: bubblewrap
- Windows: AppContainer
- every file write + command approval-gated, per-hunk diff review
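As a rough picture of what per-OS wrapping can look like, here is a sketch that builds the sandboxed command line for each platform. It is not Atlarix's actual policy: the Seatbelt profile, the choice of bubblewrap flags, and the `sandbox-helper.exe` name with its arguments are assumptions for illustration (the post only says a Rust helper handles the Windows sandbox). `sandbox-exec` is deprecated by Apple but still present on current macOS.

```typescript
type Platform = "darwin" | "linux" | "win32";

interface SandboxedCommand {
  program: string;
  args: string[];
}

// Build a command that runs `command` with writes confined to `workspace`.
function wrapInSandbox(platform: Platform, workspace: string, command: string[]): SandboxedCommand {
  switch (platform) {
    case "darwin": {
      // Illustrative Seatbelt profile: deny all file writes, then re-allow
      // writes under the workspace directory.
      const profile = [
        "(version 1)",
        "(allow default)",
        "(deny file-write*)",
        `(allow file-write* (subpath ${JSON.stringify(workspace)}))`,
      ].join("\n");
      return { program: "sandbox-exec", args: ["-p", profile, ...command] };
    }
    case "linux":
      // bubblewrap: read-only view of the filesystem, writable workspace.
      return {
        program: "bwrap",
        args: [
          "--ro-bind", "/", "/",
          "--bind", workspace, workspace,
          "--dev", "/dev",
          "--proc", "/proc",
          "--die-with-parent",
          "--chdir", workspace,
          ...command,
        ],
      };
    case "win32":
      // Hypothetical helper binary and flags standing in for an AppContainer launcher.
      return { program: "sandbox-helper.exe", args: ["--workspace", workspace, "--", ...command] };
  }
}
```

The returned `program` and `args` would then be handed to a process spawner only after the user approves the command.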
Tech stack
- Electron + React + TypeScript
- SQLite FTS5 for retrieval
- Universal Ctags + ast-grep for structural indexing
- Rust helper for the Windows sandbox
- Node.js 24+
Where it's at, honestly
v13.9.0, shipped and working. I'm one developer, so I'll be straight about the early edges: no published head-to-head benchmark yet, macOS is notarized but Windows builds are currently unsigned (signing's coming), and the retrieval numbers above are from my own use, not a controlled eval. The honest pitch is "the first workstation built around open-weight models instead of just accepting them — here's exactly how," not "this beats everything."
Feedback wanted
If you're running open-weight models for agentic work — what's your current setup, and where does it break? That's the feedback that actually shapes this.
🌐 atlarix.dev