Most AI chat apps are either web-only — so data leaves the machine — or Electron-based, with a heavy Chromium bundle. KathaGPT is a different approach: a desktop app built for fully local use, a small footprint, and a backend that stays on-device. The stack is Rust + Tauri + llama.cpp.
What KathaGPT does
- Download and run Llama, Mistral, Qwen, DeepSeek from inside the app — no terminal, no Ollama install
- Chat offline once a model is downloaded
- Optionally connect cloud providers (OpenRouter, OpenAI, Anthropic, Gemini, Perplexity) with bring-your-own-key
- Keep chats, keys, and settings on disk — no account, no telemetry
Website: https://santoshpremi.github.io/KathaGPT/
Repo: https://github.com/santoshpremi/KathaGPT (MIT)
The old Node.js server is gone. One Rust core handles the API, storage, streaming, and the desktop shell.
1. Axum embedded in Tauri
The HTTP API runs inside the Tauri process on 127.0.0.1:17890 — loopback only, not exposed to the LAN.
In development, Vite serves the React UI on :5173 and proxies /api/local to the Rust server. In production, the built dist/ folder is loaded by the WebView and the API stays in-process.
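To make the loopback-only point concrete, here is a minimal std-only sketch of binding a listener to 127.0.0.1. The `bind_loopback` helper and the `API_PORT` constant are illustrative names, not taken from the KathaGPT source; in the real app a listener like this would presumably be handed to the Axum server.

```rust
use std::io;
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Port the article gives for the embedded API (constant name is hypothetical).
pub const API_PORT: u16 = 17890;

/// Bind a TCP listener on the loopback interface only.
/// Using 127.0.0.1 instead of 0.0.0.0 means other machines on the
/// LAN cannot connect; only processes on this host can.
pub fn bind_loopback(port: u16) -> io::Result<TcpListener> {
    let addr = SocketAddr::from((Ipv4Addr::LOCALHOST, port));
    TcpListener::bind(addr)
}
```

Calling `bind_loopback(API_PORT)` fails with an `io::Error` if the port is already taken, which is the case a second app instance would hit.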
Why this matters:
- No separate daemon to manage
- No Electron-style Chromium bundle
- Native window + system tray from Tauri
- Smaller installers and lower RAM use than Electron apps
2. llama.cpp as a sidecar
The llama-server binary (~15 MB) is not shipped inside the installer. On first local model use, KathaGPT:
- Downloads the correct llama-server build from llama.cpp GitHub Releases
- Extracts it to the app data directory
- Reuses the cached binary on later launches
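The download-once, reuse-later flow above can be sketched roughly as follows. This is not the project's actual code: the `bin/` subdirectory, the function names, and the `fetch` callback (standing in for the real download-and-extract step) are all assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical cache location: <app_data>/bin/llama-server[.exe].
pub fn sidecar_path(app_data: &Path) -> PathBuf {
    let name = if cfg!(windows) { "llama-server.exe" } else { "llama-server" };
    app_data.join("bin").join(name)
}

/// Return the cached binary if it exists; otherwise call `fetch`
/// (the caller's download + extract step) and then return the path.
pub fn ensure_sidecar<F>(app_data: &Path, fetch: F) -> io::Result<PathBuf>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let path = sidecar_path(app_data);
    if path.is_file() {
        return Ok(path); // cache hit: later launches skip the download
    }
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fetch(&path)?;
    if !path.is_file() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            "sidecar missing after fetch",
        ));
    }
    Ok(path)
}
```

Passing the download step in as a closure keeps the caching logic testable without touching the network.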
The sidecar exposes an OpenAI-compatible API at 127.0.0.1:11435. That means local and cloud models share the same streaming code path in stream.rs — no duplicate inference logic.
Model catalog lives in Rust (model_catalog.rs): 18 curated GGUF models with HuggingFace URLs, RAM requirements, and quant levels. Download progress streams over SSE so the UI can show a real progress bar.
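As a rough illustration of what a catalog entry and a progress event might look like, here is a std-only sketch. The struct fields, the function names, and the JSON shape of the progress payload are assumptions, not the contents of model_catalog.rs; only the `data: ...` line followed by a blank line is the standard Server-Sent Events framing.

```rust
/// Illustrative catalog entry; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct CatalogEntry {
    pub id: &'static str,
    pub url: &'static str,
    pub min_ram_gb: u32,
    pub quant: &'static str,
}

/// Keep only the models whose RAM requirement fits the machine.
pub fn runnable_on(catalog: &[CatalogEntry], ram_gb: u32) -> Vec<&CatalogEntry> {
    catalog.iter().filter(|m| m.min_ram_gb <= ram_gb).collect()
}

/// Format one download-progress update as an SSE frame:
/// a single `data:` line terminated by a blank line.
pub fn progress_frame(downloaded: u64, total: u64) -> String {
    // Guard against division by zero when the total size is unknown.
    let percent = if total == 0 {
        0
    } else {
        downloaded.saturating_mul(100) / total
    };
    format!(
        "data: {{\"downloaded\":{downloaded},\"total\":{total},\"percent\":{percent}}}\n\n"
    )
}
```

On the UI side, each frame carries enough to update a progress bar without any extra state.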
3. Unified streaming for local + cloud
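```rust
match resolve_model_route(pool, model).await? {
    // Local models: start the llama.cpp sidecar if needed, then stream from it.
    ModelRoute::Local { model } => {
        sidecar::ensure_running(&model).await?;
        stream_openai_compatible(
            "http://127.0.0.1:11435/v1/chat/completions",
            "local", // sits in the API-key position; the sidecar is on loopback
            &model,
            options,
        ).await?
    }
    // Cloud models via OpenRouter: same function, different URL and key.
    ModelRoute::OpenRouter { slug } => {
        stream_openai_compatible(
            "https://openrouter.ai/api/v1/chat/completions",
            &key, // `key` is defined outside this excerpt
            &slug,
            options,
        ).await?
    }
    // Direct: OpenAI, Anthropic, Gemini, Perplexity ...
}
```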
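Because both routes speak the OpenAI-compatible streaming format, the response arrives as Server-Sent Events: `data:` lines carrying JSON chunks, ending with a `data: [DONE]` sentinel. The sketch below shows how one shared function could classify those lines. It is not the contents of stream.rs; `SseLine` and `classify_sse_line` are hypothetical names, and JSON parsing of each chunk is left out.

```rust
/// One line of an OpenAI-style streaming response, classified.
#[derive(Debug, PartialEq)]
pub enum SseLine<'a> {
    /// A `data:` line carrying a JSON chunk (left unparsed here).
    Chunk(&'a str),
    /// The `data: [DONE]` sentinel that ends the stream.
    Done,
    /// Blank separators, comments, and any other SSE fields.
    Ignored,
}

/// Classify a single line from the response body.
pub fn classify_sse_line(line: &str) -> SseLine<'_> {
    // Strip the trailing line terminator, if any.
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    match line.strip_prefix("data:") {
        Some(rest) => {
            let payload = rest.trim_start();
            if payload == "[DONE]" {
                SseLine::Done
            } else {
                SseLine::Chunk(payload)
            }
        }
        None => SseLine::Ignored,
    }
}
```

A caller would loop over incoming lines, forward each `Chunk` payload to a JSON parser, and stop at `Done`, regardless of whether the bytes came from the sidecar or a cloud provider.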

