SpeakShift: Fully Local Whisper.cpp + NLLB Translation + FFmpeg Media Converter
Hi DEV Community!
Like many of you, I spend a lot of time working with audio and video: podcasts, meetings, lectures, interviews, and content creation. I wanted a fast, private, and fully offline workflow, but kept jumping between different tools for media conversion, transcription, and translation.
So I built SpeakShift, a clean, focused desktop application that brings everything together using battle-tested local technologies.
Core Technologies
- Whisper.cpp: Blazing-fast local speech-to-text (supports tiny, base, small, medium, large-v3, and large-v3-turbo models)
- NLLB: Neural machine translation for high-quality multilingual translation
- FFmpeg: Powerful backend for media conversion and processing
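For readers curious how the FFmpeg and Whisper.cpp pieces typically fit together, here is a minimal sketch of the conversion step. It is not SpeakShift's actual code: the function name `build_wav_command` and the file names are hypothetical, and it assumes `ffmpeg` is on your PATH. The target format (16 kHz, mono, 16-bit PCM WAV) is the one whisper.cpp's example tools have traditionally expected.

```python
def build_wav_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argument list that extracts 16 kHz mono 16-bit PCM audio.

    Hypothetical helper for illustration only; it constructs the command
    but does not run it.
    """
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", src,            # input media file (audio or video)
        "-vn",                # drop any video stream
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        dst,
    ]
```

To actually run the conversion, you could pass the list to `subprocess.run(build_wav_command("talk.mp4", "talk.wav"), check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.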
What SpeakShift Does
- Convert media files (video to audio, format changes, trimming, etc.)
- Transcribe audio/video locally with high accuracy
- Translate transcripts
- Organize your files and transcripts in a clean library
- Export in multiple formats (TXT, SRT, JSON, etc.)
- Speaker diarization (up to 4 speakers in Pro version)
Everything runs 100% locally: no cloud, no API keys, no data leaving your machine.
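As an illustration of the SRT export mentioned in the list above, here is a small sketch of turning timed transcript segments into SubRip text. This is an assumption about how such an export could work, not SpeakShift's implementation: the `(start_seconds, end_seconds, text)` tuple layout and the names `srt_timestamp` and `to_srt` are made up for the example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as SRT cues, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Each cue ends with a newline; joining with "\n" adds the blank
    # separator line that the SRT format requires between cues.
    return "\n".join(blocks)
```

For example, `to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")])` produces two numbered cues separated by a blank line, which most video players and subtitle editors can load directly.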
Platforms & Performance
- Windows
- macOS (Apple Silicon optimized)
- Linux
It works great even on modest hardware, and flies on machines with decent CPUs or GPUs.
Pricing (Transparent & Fair)
- Free version: Fully functional for media conversion + basic transcription. This will stay free forever.
- Pro (one-time purchase): Unlocks batch processing, speaker diarization, advanced translation, priority support, and future features.
Try the Free Version:
https://usefulthings.gumroad.com/l/bzris
Full Product Page:
https://usefulthings.gumroad.com/l/speakshift
Why I Built This
The LocalLLM community has shown that many developers and power users want control, privacy, and speed without monthly subscriptions. SpeakShift was built with that mindset.
I'd love to hear your thoughts:
- What Whisper model do you use most?
- What features are missing for your workflow?
- Any pain points with current local transcription tools?
Feedback, bug reports, and feature requests are more than welcome.
If you work with audio, content, research, or local AI tools, give it a try and let me know how it goes.




