DEV Community

southy404

OpenBlob is evolving: better architecture, modern UI, and real-time transcripts

Over the past few days, OpenBlob has changed a lot.

Not just visually, but fundamentally.

This is a proper progress update on where things are heading 👇


🧠 Quick recap

OpenBlob is a local-first desktop AI companion that:

  • lives on your desktop
  • understands your context
  • can see your screen (via vision models)
  • reacts in real-time
  • executes actions directly on your system

👉 Repo: https://github.com/southy404/openblob


🔧 Rebuilding the core (this was the big one)

The biggest update isn’t something you see. It’s how everything works underneath. OpenBlob now has a much cleaner and more scalable structure:

Core pipeline

input (voice / text / screen)
→ intent detection
→ command router
→ execution (local first)
→ AI fallback if needed
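
To make the flow above concrete, here is a minimal sketch of a local-first router with an AI fallback. It is not OpenBlob's actual code: `Intent`, `detect_intent`, `CommandRouter`, and the toy "open <app>" rule are hypothetical names and assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Intent:
    name: str       # hypothetical intent label, e.g. "open_app"
    argument: str   # the rest of the user's input


def detect_intent(text: str) -> Optional[Intent]:
    """Toy rule-based detection (an assumption): 'open <something>' is a command."""
    words = text.strip().split(maxsplit=1)
    if len(words) == 2 and words[0].lower() == "open":
        return Intent(name="open_app", argument=words[1])
    return None


class CommandRouter:
    """Routes recognized intents to local handlers, everything else to the AI."""

    def __init__(self, ai_fallback: Callable[[str], str]) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}
        self._ai_fallback = ai_fallback

    def register(self, intent_name: str, handler: Callable[[str], str]) -> None:
        self._handlers[intent_name] = handler

    def handle(self, text: str) -> str:
        intent = detect_intent(text)
        if intent is not None and intent.name in self._handlers:
            # Local first: a registered handler wins over the model.
            return self._handlers[intent.name](intent.argument)
        # No local handler matched, so fall back to the AI.
        return self._ai_fallback(text)


# Stand-in callables; a real setup would launch an app or query a model here.
router = CommandRouter(ai_fallback=lambda text: f"[AI] {text}")
router.register("open_app", lambda app: f"[local] opening {app}")

print(router.handle("open Spotify"))
print(router.handle("what is on my screen?"))
```

The point of this shape is that the model is only consulted when no local handler claims the input, which keeps common actions fast and offline.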

What changed

  • Clear separation of responsibilities
  • Proper command routing system
  • Modular capabilities instead of chaos
  • Easier to extend without breaking everything

This turns OpenBlob into something bigger than a chatbot: a runtime layer for your desktop.


🧩 Open-source friendly structure

One goal became very clear: this needs to be hackable. So the architecture is moving towards a module system like this:

📁 modules/
↳ 📁 discord/
↳ 📁 spotify/
↳ 📁 browser/
↳ 📁 system/

Each module:

  • exposes commands
  • runs locally
  • can be extended independently

This makes it much easier to:

  • build plugins
  • integrate APIs
  • experiment without touching the core
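
As a rough illustration of "modules expose commands", the sketch below registers commands on a module and loads them into a core without the core knowing anything module-specific. `Module`, `Core`, and the `spotify.play` command are hypothetical, and the handler is a placeholder that does not call any real Spotify API.

```python
from typing import Callable, Dict

Command = Callable[[str], str]


class Module:
    """A hypothetical module: a named bundle of commands that run locally."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.commands: Dict[str, Command] = {}

    def command(self, name: str) -> Callable[[Command], Command]:
        """Decorator that exposes a function as a command of this module."""
        def decorator(func: Command) -> Command:
            self.commands[name] = func
            return func
        return decorator


class Core:
    """Knows only about qualified command names, not about module internals."""

    def __init__(self) -> None:
        self._commands: Dict[str, Command] = {}

    def load(self, module: Module) -> None:
        for name, func in module.commands.items():
            self._commands[f"{module.name}.{name}"] = func

    def run(self, qualified_name: str, argument: str = "") -> str:
        return self._commands[qualified_name](argument)


spotify = Module("spotify")


@spotify.command("play")
def play(track: str) -> str:
    # Placeholder body: a real module would talk to the service here.
    return f"playing {track}"


core = Core()
core.load(spotify)
print(core.run("spotify.play", "lofi beats"))
```

Adding a new integration in this sketch means writing a new `Module` and calling `core.load`, with no changes to `Core` itself.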

🎨 New UI (cleaner, faster, more alive)

The UI got a big upgrade:

  • Floating bubble interface
  • Glassmorphism style
  • Smoother, more organic animations
  • Faster interaction

Interaction now feels like:

  • CTRL + SPACE → instant open
  • Global voice toggle
  • Minimal friction

Less “tool”. More presence.


💬 NEW: Just Chatting mode

Sometimes you don’t want commands. You just want to talk. So OpenBlob now has a Just Chatting mode:

  • Pure conversation with your AI companion
  • No command routing
  • No execution layer
  • Just dialogue

This matters because the companion shouldn’t only do things. It should also be there.
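
One way to picture the mode switch is a single check that bypasses the router entirely. This is a sketch under assumptions, not the project's implementation: `Mode`, `respond`, and the two stand-in callables are hypothetical.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    ASSISTANT = "assistant"
    JUST_CHATTING = "just_chatting"


def respond(
    text: str,
    mode: Mode,
    route_command: Callable[[str], str],
    chat: Callable[[str], str],
) -> str:
    """In Just Chatting mode the router is never called, so nothing executes."""
    if mode is Mode.JUST_CHATTING:
        return chat(text)
    return route_command(text)


executed: List[str] = []  # records what the stand-in router was asked to run


def fake_router(text: str) -> str:
    executed.append(text)
    return f"executed: {text}"


def fake_chat(text: str) -> str:
    return f"chat: {text}"


# Even an input that looks like a command stays pure dialogue in this mode.
print(respond("open Spotify", Mode.JUST_CHATTING, fake_router, fake_chat))
print(executed)
```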

Use cases:

  • Thinking out loud
  • Asking questions
  • Casual conversation
  • Testing personality / tone

🖼 Screenshot assistant (more usable now)

The screen pipeline is getting more solid:

screenshot
→ OCR
→ context extraction
→ reasoning
→ answer
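
The stages above can be sketched as plain functions with the OCR and reasoning steps injected as callables. Everything here is an assumption for illustration: `extract_context`, `answer_from_screenshot`, the keyword filter, and the fake OCR and reasoning stand-ins are hypothetical, and no real OCR engine or vision model is called.

```python
from typing import Callable, List, Sequence


def extract_context(
    ocr_text: str,
    keywords: Sequence[str] = ("error", "exception", "failed"),
) -> List[str]:
    """Keep only the OCR lines that mention one of the keywords."""
    lines = [line.strip() for line in ocr_text.splitlines()]
    return [line for line in lines if any(k in line.lower() for k in keywords)]


def answer_from_screenshot(
    image: bytes,
    ocr: Callable[[bytes], str],
    reason: Callable[[List[str]], str],
) -> str:
    """screenshot -> OCR -> context extraction -> reasoning -> answer."""
    text = ocr(image)
    context = extract_context(text)
    if not context:
        return "Nothing notable found on screen."
    return reason(context)


def fake_ocr(image: bytes) -> str:
    # Stand-in for a real OCR engine; returns canned text.
    return "File  Edit  View\nTypeError: x is not a function\nLine 12, Col 4"


def fake_reason(context: List[str]) -> str:
    # Stand-in for a model call; just echoes the extracted lines.
    return "Possible issue: " + "; ".join(context)


print(answer_from_screenshot(b"", fake_ocr, fake_reason))
```

Filtering before the reasoning step keeps the prompt small, which matters when the model runs locally.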

Already useful for:

  • Debugging
  • UI understanding
  • Games
  • Quick research

Still improving, but getting reliable.


🎙️ NEW: real-time transcript system

This is one of the biggest new additions. OpenBlob can now:

  • Listen to system audio
  • Listen to microphone input
  • Generate live transcripts
  • Store structured sessions

Pipeline

audio (system / mic)
→ transcription
→ segmented timeline
→ structured session
→ saved as text
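
A minimal sketch of the last three stages, assuming transcription already yields timestamped text chunks. `Segment`, `Session`, the 2-second merge gap, and the output format are hypothetical choices for illustration, not OpenBlob's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    start: float  # seconds from session start
    end: float
    source: str   # "system" or "mic"
    text: str


@dataclass
class Session:
    title: str
    segments: List[Segment] = field(default_factory=list)

    def add_chunk(self, start: float, end: float, source: str, text: str,
                  max_gap: float = 2.0) -> None:
        """Merge a chunk into the previous segment if it follows closely
        from the same source; otherwise start a new segment."""
        last = self.segments[-1] if self.segments else None
        if last and last.source == source and start - last.end <= max_gap:
            last.end = end
            last.text = f"{last.text} {text}"
        else:
            self.segments.append(Segment(start, end, source, text))

    def to_text(self) -> str:
        """Render the session as plain text, one timestamped line per segment."""
        lines = [f"# {self.title}"]
        for seg in self.segments:
            minutes, seconds = divmod(int(seg.start), 60)
            lines.append(f"[{minutes:02d}:{seconds:02d}] ({seg.source}) {seg.text}")
        return "\n".join(lines)


session = Session("Weekly sync")
session.add_chunk(0.0, 1.5, "system", "Welcome everyone.")
session.add_chunk(1.8, 4.0, "system", "Let's get started.")
session.add_chunk(65.0, 67.0, "mic", "Sounds good.")
print(session.to_text())
```

Writing `session.to_text()` to a file would cover the "saved as text" step, and the structured `segments` list stays available for later features such as summaries or search.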

What it already works for

  • Meetings (Meet, Zoom, etc.)
  • YouTube / podcasts
  • Lectures
  • General audio capture

🧪 Current prototype

  • Live text appearing in real-time
  • Segmented transcript blocks
  • Session tracking
  • Simple overlay UI

It’s still early. But it works.


🔮 Where transcripts are going

This is not just speech-to-text. Next steps:

📝 Meeting assistant

  • Summaries
  • Key points
  • Action items

🧠 Memory layer

  • Link transcripts to context
  • Searchable history

⚡ Real-time help

  • Explain while listening
  • Highlight important info
  • Suggest responses

⚡ Philosophy (still the same)

  • Local-first
  • Context > Prompt
  • System-level AI
  • Playful + useful

🧪 Current state

  • Still experimental
  • Still buggy sometimes
  • Evolving very fast

But now there is a much better structure, a clearer direction, and an easier path for contributors.


🤝 If you want to join

Now is actually a great time. You can:

  • Build modules (Discord, Spotify, browser, etc.)
  • Improve transcription
  • Design UI
  • Experiment with AI

👉 Join here: https://github.com/southy404/openblob


💡 Final thought

I’m starting to believe the future of AI is not a chat window in a browser.

But something that lives on your system, understands your context, and can both act and talk.

OpenBlob is slowly getting there.

Top comments (1)

Benjamin Nguyen

great explanation