A few days ago, I built a local virtual assistant in Python (JARVIS) using Ollama, PyQt6, and Piper TTS. It was a fun MVP, but as I started using it daily, the architectural flaws became obvious.
Hardcoding tool calls into the main loop made the file massive, and storing conversation history in flat .json files wasn't scalable.
I just released the v1.2.0 update, focusing purely on technical debt and scalability. Here is a breakdown of the refactoring process:
1. The Database Migration (SQLite)
Previously, JARVIS saved facts (like user preferences) and chat history as JSON. I ripped that out and replaced it with a jarvis_memory.db SQLite database. Now, when the LLM triggers the guardar_recuerdo (save memory) tool, it executes a clean SQL INSERT.
Note for dev newbies: if you use SQLite in a desktop app, shut down with sys.exit(0) instead of os._exit(). os._exit() terminates the process immediately, skipping Python's cleanup (atexit handlers, connection finalizers), which can leave the WAL (Write-Ahead Logging) file un-checkpointed and corrupt your database!
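For context, the memory tool boils down to something like the sketch below. The table and column names here are my own illustrations, not necessarily what the repo uses:

```python
import sqlite3

def guardar_recuerdo(db_path: str, fact: str) -> None:
    """Persist a fact the LLM asked to remember (schema is illustrative)."""
    con = sqlite3.connect(db_path)
    try:
        con.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer
        con.execute(
            "CREATE TABLE IF NOT EXISTS recuerdos ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "fact TEXT NOT NULL, "
            "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
        )
        con.execute("INSERT INTO recuerdos (fact) VALUES (?)", (fact,))
        con.commit()
    finally:
        con.close()  # an explicit close lets SQLite checkpoint the WAL cleanly
```

Closing the connection in a finally block (or a context manager) is what makes the sys.exit() vs os._exit() distinction matter: cleanup code like this only runs on a graceful shutdown.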
2. Decoupling with a Plugin Manager
I wanted the assistant to control PC apps, smart lights, or Discord, but I didn't want a 2000-line main file. I created a plugin_manager.py. It scans a /plugins directory on startup, dynamically reads the __doc__ strings and functions of any Python script inside, and injects them as available JSON tools into the LLM's system prompt.
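A minimal version of that discovery step might look like this. I'm assuming an OpenAI/Ollama-style tool schema and treating every parameter as a string for simplicity; the real plugin_manager.py may differ:

```python
import importlib.util
import inspect
from pathlib import Path

def discover_plugins(plugins_dir: str = "plugins") -> list[dict]:
    """Scan a plugins directory and build JSON tool specs from each module."""
    tools = []
    for script in Path(plugins_dir).glob("*.py"):
        # Load the script as a module without it being on sys.path
        spec = importlib.util.spec_from_file_location(script.stem, script)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for name, fn in inspect.getmembers(module, inspect.isfunction):
            # Skip private helpers and anything the plugin merely imported
            if name.startswith("_") or fn.__module__ != script.stem:
                continue
            tools.append({
                "type": "function",
                "function": {
                    "name": name,
                    "description": fn.__doc__ or "",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            p: {"type": "string"}
                            for p in inspect.signature(fn).parameters
                        },
                    },
                },
            })
    return tools
```

The resulting list can be serialized straight into the system prompt (or passed as a tools array), so dropping a new .py file into /plugins is all it takes to teach the assistant a new skill.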
3. Fixing TTS Crashes with RegEx
Local TTS models like Piper are incredibly fast and private, but they are fragile. If JARVIS searched Wikipedia and the LLM returned text containing IPA phonetics (e.g., /ˈalbɐt ˈaɪnʃtaɪn/), Piper would crash with Error 0.
The fix? I built a RegEx preprocessing pipeline that runs right before the audio synthesis to strip brackets, emojis, and phonetic strings, ensuring the TTS worker never chokes on weird characters.
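A stripped-down version of that pipeline could look like the following. The exact patterns are my own guesses at what "strip brackets, emojis, and phonetics" means, not the repo's regexes:

```python
import re

# Piper chokes on IPA, emoji, and bracketed noise; strip them before synthesis.
_IPA_SLASHES = re.compile(r"/[^/\n]*[\u02c8\u02cc\u02d0\u0250\u026a\u0283\u0259][^/\n]*/")  # /.../ spans with IPA marks
_BRACKETS = re.compile(r"[\[\(\{][^\]\)\}]*[\]\)\}]")       # [citations], (asides)
_NON_SPEECH = re.compile(r"[^\w\s.,;:!?'\"-]")              # emoji and stray symbols
_SPACES = re.compile(r"\s{2,}")

def sanitize_for_tts(text: str) -> str:
    """Reduce LLM output to plain speakable text before handing it to TTS."""
    text = _IPA_SLASHES.sub("", text)
    text = _BRACKETS.sub("", text)
    text = _NON_SPEECH.sub(" ", text)
    return _SPACES.sub(" ", text).strip()
```

Running the sanitizer as the last step before the audio worker means every code path that produces speech gets the same protection, instead of patching each tool's output individually.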
4. Reactive UI Improvements
Since the UI is a borderless PyQt6 window, text alignment was getting wonky on long responses. I implemented proper WordWrap and added a VisualizadorAudio class: a custom QWidget that uses math.sin to generate a smooth, responsive cyan equalizer that animates only when the TTS daemon is actively speaking.
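The math behind a visualizer like that can be sketched without any Qt code. The function below (the name and constants are mine, not the repo's) computes the bar heights for one animation frame; in the real widget, a QTimer would advance the phase and a paintEvent would draw the bars:

```python
import math

def waveform_heights(n_bars: int, phase: float, speaking: bool,
                     max_height: float = 40.0) -> list[float]:
    """Bar heights for one frame of a sine-based equalizer.

    `phase` advances on each timer tick (e.g. a QTimer at ~30 FPS);
    when the TTS daemon is silent, the bars collapse to a flat line.
    """
    if not speaking:
        return [0.0] * n_bars
    heights = []
    for i in range(n_bars):
        # Two out-of-phase sines give an organic, non-repeating look.
        wave = (math.sin(phase + i * 0.5) * 0.6
                + math.sin(phase * 1.7 + i * 0.3) * 0.4)
        heights.append(abs(wave) * max_height)
    return heights
```

Gating on the speaking flag is what makes the animation feel "reactive": the widget repaints on every tick, but the waveform only exists while audio is actually playing.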
The Result
The codebase is now clean, modular, and much faster. If you are building local LLM wrappers or Python desktop apps, feel free to poke around the code and see how the UI and Tool Calling interact.
🔗 Check out the code on GitHub: https://github.com/Jm7997/JARVIS
If you have ideas for cool plugins I could add next, drop a comment!
Thanks for reading! The transition to the plugin architecture was definitely the most fun part of this refactor.
If anyone here has deep experience optimizing PyQt6 for even lower resource consumption (or avoiding the Windows DWM quirks with transparent windows), I'd love to hear your thoughts.
Also, if you end up cloning the repo and writing a custom plugin for your own smart home or PC setup, please feel free to share it or open a PR. I’d love to start building a small library of community plugins!