Over the past few days, OpenBlob has changed a lot.
Not just visually — but fundamentally.
This is a proper progress update on where things are heading 👇
🧠 Quick recap
OpenBlob is a local-first desktop AI companion that:
- lives on your desktop
- understands your context
- can see your screen (via vision models)
- reacts in real-time
- executes actions directly on your system
👉 Repo: https://github.com/southy404/openblob
🔧 Rebuilding the core (this was the big one)
The biggest update isn’t something you see. It’s how everything works underneath. OpenBlob now has a much cleaner and more scalable structure:
Core pipeline
input (voice / text / screen)
→ intent detection
→ command router
→ execution (local first)
→ AI fallback if needed
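The pipeline above can be sketched roughly like this. This is a hypothetical illustration, not OpenBlob's actual code: names like `detect_intent`, `CommandRouter`, and `ai_fallback` are assumptions, and real intent detection would use a model rather than keyword matching.

```python
class CommandRouter:
    """Maps detected intents to locally executable handlers."""

    def __init__(self):
        self.handlers = {}

    def register(self, intent, handler):
        self.handlers[intent] = handler

    def dispatch(self, intent, payload):
        handler = self.handlers.get(intent)
        if handler is None:
            return None  # no local handler -> fall back to the AI
        return handler(payload)


def detect_intent(text):
    # Placeholder: a real system would classify with a model.
    if text.lower().startswith("open "):
        return "open_app", text[5:]
    return "chat", text


def ai_fallback(payload):
    return f"(AI) responding to: {payload}"


router = CommandRouter()
router.register("open_app", lambda app: f"launching {app}")


def handle(text):
    """input -> intent detection -> command router -> execution -> AI fallback."""
    intent, payload = detect_intent(text)
    result = router.dispatch(intent, payload)
    return result if result is not None else ai_fallback(payload)
```

The key property is that local execution is tried first and the AI only sees input that no local handler claimed.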
What changed
- Clear separation of responsibilities
- Proper command routing system
- Modular capabilities instead of chaos
- Easier to extend without breaking everything
This turns OpenBlob into something bigger than a chatbot: a runtime layer for your desktop.
🧩 Open-source friendly structure
One goal became very clear: this needs to be hackable. So the architecture is moving towards a module system like this:
📁 modules/
↳ 📁 discord/
↳ 📁 spotify/
↳ 📁 browser/
↳ 📁 system/
Each module:
- exposes commands
- runs locally
- can be extended independently
This makes it much easier to:
- build plugins
- integrate APIs
- experiment without touching the core
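One way a module contract like this could look, sketched in Python. The `Module` protocol, `commands()` method, and `SpotifyModule` names are assumptions for illustration, not the project's actual plugin API.

```python
from typing import Callable, Dict, Protocol


class Module(Protocol):
    """Assumed contract: each module exposes named commands that run locally."""
    name: str

    def commands(self) -> Dict[str, Callable[[str], str]]: ...


class SpotifyModule:
    name = "spotify"

    def commands(self):
        # Namespaced command keys keep modules independent of each other.
        return {"spotify.play": lambda track: f"playing {track}"}


def load_modules(modules):
    """Collect every command from every module into one flat registry."""
    registry = {}
    for mod in modules:
        registry.update(mod.commands())
    return registry


registry = load_modules([SpotifyModule()])
```

Because the core only sees the registry, a new module (Discord, browser, system) can be dropped into `modules/` without touching anything else.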
🎨 New UI (cleaner, faster, more alive)
The UI got a big upgrade:
- Floating bubble interface
- Glassmorphism style
- Smoother, more organic animations
- Faster interaction
Interaction now feels like:
- CTRL + SPACE → instant open
- Global voice toggle
- Minimal friction
Less “tool”. More presence.
💬 NEW: Just Chatting mode
Sometimes you don’t want commands. You just want to talk. So OpenBlob now has a Just Chatting mode:
- Pure conversation with your AI companion
- No command routing
- No execution layer
- Just dialogue
This is important because: the companion shouldn’t only do things — it should also be there.
Use cases:
- Thinking out loud
- Asking questions
- Casual conversation
- Testing personality / tone
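The mode switch boils down to bypassing the router entirely. A minimal sketch, assuming a `mode` flag on the entry point (the function names here are illustrative, not OpenBlob's):

```python
def chat_model(text):
    # Stand-in for the conversation model.
    return f"companion: thinking about '{text}'"


def route_and_execute(text):
    # Stand-in for the normal intent -> router -> execution pipeline.
    return f"executed: {text}"


def respond(text, mode="assistant"):
    if mode == "chat":
        return chat_model(text)      # pure dialogue, no execution layer
    return route_and_execute(text)   # normal command pipeline
```

In "chat" mode nothing can touch the system, which is exactly what makes it safe to just talk.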
🖼 Screenshot assistant (more usable now)
The screen pipeline is getting more solid:
screenshot
→ OCR
→ context extraction
→ reasoning
→ answer
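Stubbed out, the pipeline looks like this. The `ocr`, `extract_context`, and `reason` functions are placeholders (real OCR would call something like Tesseract); only the data flow matches the steps above.

```python
def ocr(image_bytes):
    # Placeholder: real OCR would run on the screenshot bytes.
    return "Error: null pointer at line 42"


def extract_context(text):
    # Classify what kind of content is on screen.
    kind = "error" if text.startswith("Error") else "other"
    return {"kind": kind, "snippet": text}


def reason(context):
    if context["kind"] == "error":
        return f"Looks like a crash: {context['snippet']}"
    return "Nothing notable on screen."


def answer_from_screenshot(image_bytes):
    """screenshot -> OCR -> context extraction -> reasoning -> answer."""
    return reason(extract_context(ocr(image_bytes)))
```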
Already useful for:
- Debugging
- UI understanding
- Games
- Quick research
Still improving — but getting reliable.
🎙️ NEW: real-time transcript system
This is one of the biggest new additions. OpenBlob can now:
- Listen to system audio
- Listen to microphone input
- Generate live transcripts
- Store structured sessions
Pipeline
audio (system / mic)
→ transcription
→ segmented timeline
→ structured session
→ saved as text
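A structured session could be modeled like this. The field names (`source`, `segments`, `start`/`end` offsets) are assumptions about what "structured" means here, not the actual saved format.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float  # seconds from session start
    end: float
    text: str


@dataclass
class Session:
    source: str                 # "mic" or "system"
    segments: list = field(default_factory=list)

    def add(self, start, end, text):
        """Append one transcribed chunk to the segmented timeline."""
        self.segments.append(Segment(start, end, text))

    def to_text(self):
        """Flatten the session into the saved plain-text form."""
        return "\n".join(f"[{s.start:.0f}s] {s.text}" for s in self.segments)


session = Session(source="system")
session.add(0.0, 4.0, "Welcome to the standup.")
session.add(4.0, 9.0, "First topic: the transcript overlay.")
```

Keeping per-segment timestamps (rather than one blob of text) is what makes the later steps (summaries, search, real-time help) possible.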
What it already works for
- Meetings (Meet, Zoom, etc.)
- YouTube / podcasts
- Lectures
- General audio capture
🧪 Current prototype
- Live text appearing in real-time
- Segmented transcript blocks
- Session tracking
- Simple overlay UI
It’s still early. But it works.
🔮 Where transcripts are going
This is not just speech-to-text. Next steps:
📝 Meeting assistant
- Summaries
- Key points
- Action items
🧠 Memory layer
- Link transcripts to context
- Searchable history
⚡ Real-time help
- Explain while listening
- Highlight important info
- Suggest responses
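For the searchable-history piece, even a naive sketch shows the shape. This assumes sessions saved as plain text keyed by name; a real memory layer would use embeddings rather than keyword matching.

```python
def search_transcripts(sessions, query):
    """Naive keyword search over saved session text, line by line."""
    hits = []
    for name, text in sessions.items():
        for line in text.splitlines():
            if query.lower() in line.lower():
                hits.append((name, line))
    return hits


# Hypothetical saved sessions.
sessions = {
    "standup-mon": "[0s] Discussed the overlay UI\n[30s] Action: fix OCR",
    "standup-tue": "[0s] Shipped the transcript overlay",
}
```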
⚡ Philosophy (still the same)
- Local-first
- Context > Prompt
- System-level AI
- Playful + useful
🧪 Current state
- Still experimental
- Still buggy sometimes
- Evolving very fast
But now there's much better structure, a clearer direction, and it's easier to contribute.
🤝 If you want to join
Now is actually a great time. You can:
- Build modules (Discord, Spotify, browser, etc.)
- Improve transcription
- Design UI
- Experiment with AI
👉 Join here: https://github.com/southy404/openblob
💡 Final thought
I’m starting to believe the future of AI is not a chat window in a browser.
But something that lives on your system, understands your context, and can both act and talk.
OpenBlob is slowly getting there.
