DEV Community

AR.BHARADWAJ

Building VAKH in Beta: Bugs, Learning, and Local Speech-to-Text

A few months ago, I had a simple idea.

"What if I could stop fighting my keyboard and just talk to my computer?"

That idea eventually became VAKH (Sanskrit for "Speech"), a native Windows application that listens to your voice and types directly into any application in real time.

The concept sounded straightforward.

Build a speech-to-text application.

Connect AI.

Add a nice UI.

Ship it.

Simple.

Or at least that's what I thought.

What is VAKH?

VAKH is an AI-powered Windows dictation tool designed to make interaction with a computer feel more natural.

Instead of typing every thought manually, users can activate the application, speak naturally, and have text appear directly inside applications such as VS Code, Chrome, Slack, Word, Notepad, and more.

The application is built using modern open-source technologies, including:

  • Rust
  • Tauri
  • Whisper (OpenAI's speech recognition model)
  • WebRTC Voice Activity Detection (VAD)
  • SQLite

The goal was not just speech recognition.

The goal was to create an intelligent layer between human thought and computer input.

The Exciting Beginning

Like many developers today, I used AI extensively throughout development.

I had:

  • A clear vision of the product
  • Feature breakdowns and planning documents
  • Architectural ideas
  • Powerful AI models
  • Modern AI development tools

Initially, progress felt incredibly fast.

Features appeared quickly.

UI screens came together smoothly.

Complex logic could be generated in minutes.

For a while, it felt like building software had become almost effortless.

Then the real engineering work started.

When Things Started Breaking

As VAKH became more complex, so did the challenges.

Individual features worked well on their own.

The problems started when everything had to work together.

Audio capture.

Voice activity detection.

Speech recognition.

Window focus management.

Keyboard injection.

Real-time updates.

State management.

Each component behaved correctly in isolation.

Getting all of them to cooperate consistently was a completely different challenge.
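
To make the integration problem concrete, here is a minimal sketch of how stages like these might be wired together with threads and bounded channels. It is not VAKH's actual code. Every name in it (`Frame`, `is_speech`, `transcribe`, `run_pipeline`) is hypothetical, the voice check is a plain energy threshold standing in for WebRTC VAD, and the transcriber is a stub standing in for Whisper.

```rust
use std::sync::mpsc;
use std::thread;

/// One chunk of mono audio samples (hypothetical frame type).
type Frame = Vec<i16>;

/// Stand-in for voice activity detection: a simple RMS energy threshold,
/// not the WebRTC VAD mentioned in the post.
fn is_speech(frame: &Frame, threshold: f64) -> bool {
    if frame.is_empty() {
        return false;
    }
    let mean_square =
        frame.iter().map(|&s| (s as f64).powi(2)).sum::<f64>() / frame.len() as f64;
    mean_square.sqrt() > threshold
}

/// Stand-in for speech recognition: a real build would call Whisper here.
fn transcribe(frame: &Frame) -> String {
    format!("<{} samples>", frame.len())
}

/// Runs capture -> VAD -> transcription as three threads joined by channels.
fn run_pipeline(frames: Vec<Frame>, threshold: f64) -> Vec<String> {
    // Bounded channels: a slow stage blocks the stage before it
    // instead of letting an unbounded queue (and latency) build up.
    let (audio_tx, audio_rx) = mpsc::sync_channel::<Frame>(8);
    let (speech_tx, speech_rx) = mpsc::sync_channel::<Frame>(8);
    let (text_tx, text_rx) = mpsc::sync_channel::<String>(8);

    // Stage 1: "capture" replays pre-recorded frames.
    let capture = thread::spawn(move || {
        for frame in frames {
            if audio_tx.send(frame).is_err() {
                break; // downstream stage has shut down
            }
        }
    });

    // Stage 2: drop silent frames, forward the rest.
    let vad = thread::spawn(move || {
        for frame in audio_rx {
            if is_speech(&frame, threshold) && speech_tx.send(frame).is_err() {
                break;
            }
        }
    });

    // Stage 3: turn speech frames into text.
    let stt = thread::spawn(move || {
        for frame in speech_rx {
            if text_tx.send(transcribe(&frame)).is_err() {
                break;
            }
        }
    });

    // Collect until every sender upstream has been dropped.
    let output: Vec<String> = text_rx.iter().collect();
    capture.join().unwrap();
    vad.join().unwrap();
    stt.join().unwrap();
    output
}

fn main() {
    let frames = vec![vec![0i16; 160], vec![1000i16; 160], vec![2000i16; 80]];
    for line in run_pipeline(frames, 500.0) {
        println!("{line}");
    }
}
```

Even in a toy like this, each function is easy to test alone, while shutdown order, queue sizes, and blocking behaviour only show up once the stages run together.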

There were days when:

  • The UI looked perfect.
  • Audio was being captured correctly.
  • Whisper was processing voice input.
  • Logs showed everything was running.

And yet the application still refused to behave properly.

Random pauses appeared.

Transcriptions lagged.

Features that worked yesterday suddenly stopped working today.

The application constantly reminded me that building software is not about individual features.

It's about building reliable systems.

The Biggest Lesson From This Project

One thing became very clear during development.

AI is advancing at an incredible pace.

The reasoning capabilities of modern models are impressive.

The speed at which they can generate code is remarkable.

But when building a real application with multiple moving parts, AI alone is not enough.

Even with:

  • Detailed planning
  • Clear requirements
  • Strong technical understanding
  • Powerful AI tools
  • Advanced language models

There were still moments where the project got stuck.

Features broke.

Integrations failed.

Architectural decisions had to be reconsidered.

Debugging became unavoidable.

At those moments, human intervention became the most important factor.

Not because AI was failing.

But because software engineering is more than writing code.

It is understanding systems.

It is making trade-offs.

It is identifying bottlenecks.

It is connecting components.

It is knowing when a generated solution fits the architecture and when it doesn't.

The biggest realization from VAKH was this:

AI accelerates development, but engineering judgment still builds successful products.

AI helped me move faster.

Human reasoning helped me move forward.

Why This Project Was Worth Building

Beyond the final application, VAKH taught me lessons that tutorials rarely cover.

I gained practical experience with:

  • Real-time systems
  • Native desktop application development
  • AI integration
  • Software architecture
  • Performance optimization
  • Debugging distributed workflows
  • Product design and usability

Most importantly, it showed me how software behaves when multiple technologies must operate together continuously.

Those lessons are difficult to learn without actually building something.

Try VAKH

The project is now available publicly:

Project Website: https://arbharadwaj.github.io/Vakh/

I would love feedback from developers, engineers, AI enthusiasts, and curious users.

If you decide to try it, please let me know:

  • What worked well?
  • What felt confusing?
  • Which features would you improve?
  • What new ideas would you like to see implemented?
  • What problems did you encounter?
  • How would you approach the architecture differently?

I'm especially interested in hearing from developers who have worked on:

  • Speech recognition systems
  • Desktop applications
  • AI-powered tools
  • Real-time processing systems

Sometimes a single suggestion can unlock the next major improvement.

What's Next?

VAKH started as an experiment.

It became a learning experience.

Today, it's a working application.

Tomorrow, it will hopefully become something even better.

There are still features to build, bugs to fix, workflows to optimize, and ideas to explore.

That's what makes software engineering exciting.

If you test VAKH, share your thoughts, ideas, criticism, or feature requests.

Every piece of feedback helps shape the next version.

And who knows?

Your suggestion might become the next feature.

Happy building 🚀
