I Built a $49 Voice-to-Text App That Never Touches the Cloud

#ai #privacy #linux #productivity

Why I Built Brethof Voice Pro

I got tired of two things:

Voice-to-text tools that only work well in English
Good STT solutions that either cost $700 (Dragon) or charge a monthly fee to send your voice to the cloud

So I built something different.

What It Does

Brethof Voice Pro is a desktop app (Windows + Linux) that converts speech to text using AI, entirely on your device. Press Ctrl+D, speak, and the text appears wherever your cursor is.

Transcription makes zero cloud calls. Models download once on first launch; after that, the app works fully offline.
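
For readers curious what "download once, then offline" can look like in practice, here is a minimal sketch of the general pattern. It is not the app's actual code: the `ensure_model` helper, the `fetch` callback, and the `.part` suffix are all hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import Callable


def ensure_model(path: Path, fetch: Callable[[Path], None]) -> Path:
    """Return a local model file, calling `fetch` only when it is missing.

    `fetch` is a hypothetical downloader that writes the model to the
    path it is given. Nothing here touches the network by itself.
    """
    if path.exists():
        return path  # already cached, so no download is needed

    path.parent.mkdir(parents=True, exist_ok=True)

    # Download to a temporary name first, then rename, so an interrupted
    # download is never mistaken for a complete model on the next launch.
    partial = path.with_name(path.name + ".part")
    fetch(partial)
    partial.replace(path)
    return path
```

On every launch after the first, the `path.exists()` check succeeds and the downloader is never invoked, which is what makes offline use possible.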

The Tech Stack

Qwen3-ASR engine — 1.84% word error rate across 10 languages (arXiv 2601.21337)
GGUF models via llama.cpp — 6 quantization tiers from 1 GB to 3.2 GB
Vulkan GPU acceleration on NVIDIA, AMD, and Intel hardware, with CPU-only machines also supported
DeepFilter noise reduction
36 languages with auto-detection
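
If the word error rate figure above is unfamiliar: WER is conventionally the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below shows that standard definition. It is an illustration only, not the evaluation code behind the cited number; real benchmarks usually normalize case and punctuation first, and splitting on whitespace is a simplification that does not suit languages written without spaces.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word was dropped (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A WER of 1.84% therefore means roughly two word-level mistakes per hundred reference words.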

Why It Matters

If you speak Thai, Polish, Arabic, Vietnamese, Korean, or any of dozens of other languages, there has been no good voice-to-text option for you. Dragon doesn't support most languages. Google's API charges per minute and requires an internet connection. Whisper's accuracy on non-English languages is mediocre.

Qwen3-ASR changes this. It delivers state-of-the-art accuracy across 36 languages while running locally on consumer hardware.