Hey everyone! This week I officially started building Quantamind in public.
If you haven't seen my previous updates, Quantamind is basically Postman for local AI models. I got tired of testing prompts in clunky, heavy chat UIs, so I'm building a lightweight desktop app dedicated strictly to local AI development.
Here is a quick look at how Week 1 went!
What got done this week:
Bootstrapped the Tauri + Rust stack: I deliberately chose this over Electron. We're currently sitting at an 80MB idle RAM footprint instead of 600MB+, which is crucial if you want to run this alongside your IDE.
Built the custom streaming parser: Wrote a custom Rust implementation to handle Ollama's NDJSON streaming protocol so we can track honest performance metrics.
Nailed the hot-reload loop: Got the core architecture working for Vite-style fast feedback when you tweak your prompts.
What I learned:
Ollama's streaming edge cases are tricky! When debugging Ollama's streaming protocol, I learned that while it sends NDJSON, the data you read off the socket doesn't always end with the trailing newline, especially when TCP fragmentation splits a JSON object across reads. Buffering those partial lines correctly in Rust so the stream doesn't break was a fun (and slightly frustrating) learning curve.
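For anyone curious what that buffering looks like, here is a minimal sketch of the idea, not the actual Quantamind parser. The `NdjsonBuffer` type and its `push`/`finish` methods are names I made up for illustration. It assumes you feed it raw byte chunks exactly as they come off the socket, and it only hands back complete lines; parsing each line as JSON (for example with `serde_json`) would happen afterwards.

```rust
/// Hypothetical helper: accumulates raw bytes and yields only complete NDJSON lines.
#[derive(Default)]
pub struct NdjsonBuffer {
    // Bytes received so far that do not yet end in '\n'.
    pending: Vec<u8>,
}

impl NdjsonBuffer {
    pub fn new() -> Self {
        Self::default()
    }

    /// Feed one chunk as read from the socket.
    /// Returns every complete line that is now available.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.pending.extend_from_slice(chunk);
        let mut lines = Vec::new();

        // Searching on bytes (not chars) keeps this safe even if a chunk
        // boundary falls in the middle of a multi-byte UTF-8 character.
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            // Remove the line, including its '\n', from the buffer.
            let line: Vec<u8> = self.pending.drain(..=pos).collect();
            let text = String::from_utf8_lossy(&line[..pos]);
            let text = text.trim(); // also drops a stray '\r'
            if !text.is_empty() {
                lines.push(text.to_string());
            }
        }
        lines
    }

    /// Call once the stream has ended.
    /// Returns a final record that arrived without a trailing newline, if any.
    pub fn finish(&mut self) -> Option<String> {
        let rest = std::mem::take(&mut self.pending);
        let text = String::from_utf8_lossy(&rest);
        let text = text.trim();
        if text.is_empty() {
            None
        } else {
            Some(text.to_string())
        }
    }
}
```

The key design choice is that nothing gets parsed until a newline shows up, so a JSON object split across two reads is simply held in `pending` until the rest arrives. Calling `finish` at end of stream covers the case where the last object never gets its newline.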
What's next for Week 2:
Building out the structured YAML editor (ditching the standard chat thread UI).
Hooking up the real-time metrics UI for Time to First Token (TTFT) and tokens/sec.
Polishing the core loop to get ready for our v0.1 launch in a few weeks!
I'll be posting weekly updates as I build this out. If you want to follow along with the code or try it out when it drops, I'd love your support!
Star + watch the repo here: GitHub
Let me know if you guys have ever wrestled with NDJSON streams in Rust. I would love to hear how you handled it!