FluentCap: A Free Desktop App for Real-time Transcription & Translation
Hi dev community!
I'm excited to share FluentCap - a desktop application I've been building to bring real-time transcription and translation to everyone. The core philosophy is simple: language should never be a barrier to understanding, learning, or connecting with others.
Check it out: https://fluentcap.live
Download: https://fluentcap.live/download
Who Is FluentCap For?
FluentCap is designed for anyone who works with audio in a different language:
| Use Case | How FluentCap Helps |
|---|---|
| Language Learners | Watch K-dramas, anime, foreign films with real-time bilingual subtitles |
| Remote Workers | Participate confidently in international meetings |
| Deaf & Hard of Hearing | Access real-time captions for any audio on your computer |
| Researchers & Students | Transcribe lectures, podcasts, interviews instantly |
| Content Creators | Generate transcripts for videos without expensive services |
Accessibility: Built for the Deaf & Hard of Hearing Community
One of FluentCap's core missions is accessibility. According to the WHO, about 430 million people worldwide live with disabling hearing loss, and nearly 2.5 billion people are projected to have some degree of hearing loss by 2050.
How FluentCap helps:
- Real-time captions for ANY audio - not just specific apps like Zoom or Meet
- Works with system audio - Netflix, YouTube, podcasts, lectures, video calls
- Bilingual display - see both original language and translation simultaneously
- No middleman - your audio goes straight from your machine to the speech-to-text provider you choose, and sessions are stored locally
Research shows real-time transcription increases engagement for deaf students by 70-85%, and 98.6% of students find captions helpful for focus and comprehension.
FluentCap aims to make captioning accessible to everyone at a fraction of the cost of traditional services (professional captioners can run $150+/hour). With FluentCap's BYOK model, automated real-time captions cost just $0.15-0.40/hour.
The BYOK (Bring Your Own Key) Philosophy
Unlike subscription apps that charge $15-30/month, FluentCap uses a transparent BYOK model:
- FluentCap is free forever - no ads, no subscriptions
- You connect directly to speech-to-text providers using your own API keys
- Providers offer free credits or a free monthly allowance to get started, in most cases worth hundreds of hours:
| Provider | Free Credits | ~Free Hours |
|---|---|---|
| Deepgram | $200 | ~750 hours |
| AssemblyAI | $50 | ~140 hours |
| Gladia | 10 hrs/month | 10 hours every month (resets monthly) |
| Shunya | $100 | ~300 hours |
When free credits run out, you pay providers directly at wholesale rates: $0.15-0.40/hour (60-80% cheaper than subscription apps).
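To make the numbers above easier to check, here is a small arithmetic sketch. It only divides the credits by the approximate free hours from the table to get an implied hourly rate, then multiplies a rate by monthly usage. The figures are copied from this post, and the type and function names (`FreeTier`, `impliedHourlyRate`, `monthlyCost`) are made up for illustration; they are not part of FluentCap.

```typescript
// Illustrative arithmetic only: figures come from the table above,
// and every name here is hypothetical.
interface FreeTier {
  provider: string;
  creditsUsd: number;
  approxFreeHours: number;
}

const freeTiers: FreeTier[] = [
  { provider: "Deepgram", creditsUsd: 200, approxFreeHours: 750 },
  { provider: "AssemblyAI", creditsUsd: 50, approxFreeHours: 140 },
  { provider: "Shunya", creditsUsd: 100, approxFreeHours: 300 },
];

// Effective price per transcribed hour implied by credits / free hours.
function impliedHourlyRate(tier: FreeTier): number {
  return tier.creditsUsd / tier.approxFreeHours;
}

// Pay-as-you-go cost for one month of usage at a given hourly rate.
function monthlyCost(hoursPerMonth: number, ratePerHourUsd: number): number {
  return hoursPerMonth * ratePerHourUsd;
}

for (const tier of freeTiers) {
  console.log(`${tier.provider}: ~$${impliedHourlyRate(tier).toFixed(2)}/hour`);
}
console.log(`20 h/month at $0.40/h: $${monthlyCost(20, 0.4).toFixed(2)}`);
```

All three implied rates land inside the $0.15-0.40/hour range. How much you save against a subscription depends on usage: at $0.40/hour, pay-as-you-go reaches the cost of a $15/month subscription at 37.5 hours per month.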
Feature Highlights
1. Audio Recording & Synchronized Playback
One of the newest features! FluentCap can:
- Record audio during transcription sessions (MP3 format)
- Merge recording bursts into a single continuous file
- Play back with full synchronization to the transcript
The 5-Pillar Interaction Model:
- Automatic Recording - Audio captured as MP3 in real-time
- Integrated Playback - Play/Pause, Seek, Download controls
- Click-to-Seek - Click any transcript line and audio jumps there
- Jump Navigation - Next/Previous highlight buttons
- Bidirectional Sync - Audio playback auto-highlights & scrolls transcript
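Merging bursts into one file means a transcript line's wall-clock time no longer equals its position in the audio, so click-to-seek needs a mapping between the two. Below is a minimal sketch of one way to do that, assuming the merged file is a plain concatenation of the bursts with pauses dropped. The `Burst` shape and `toMergedOffsetMs` are hypothetical names, not FluentCap's actual code.

```typescript
// Hypothetical shape: one entry per recording burst, in chronological order.
interface Burst {
  startedAtMs: number; // wall-clock time the burst began
  durationMs: number; // length of audio captured in this burst
}

// Convert a wall-clock timestamp to an offset (ms) in the concatenated audio.
// A timestamp that falls in a pause between bursts snaps to the start of the
// next burst; a timestamp past the last burst clamps to the total duration.
function toMergedOffsetMs(bursts: Burst[], wallClockMs: number): number {
  let elapsed = 0; // audio accumulated from earlier bursts
  for (const burst of bursts) {
    if (wallClockMs < burst.startedAtMs) return elapsed;
    const within = wallClockMs - burst.startedAtMs;
    if (within < burst.durationMs) return elapsed + within;
    elapsed += burst.durationMs;
  }
  return elapsed;
}
```

With a 10-second burst starting at 0 and a 5-second burst starting at the 60-second mark, a line stamped at 62 seconds maps to 12 seconds into the merged file.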
2. Highlight & Annotation System
FluentCap isn't just for passive viewing - it's an active learning tool:
- Select any text - Instant highlight via context menu
- Live Auto-Highlight Mode - Highlights apply immediately during recording (no menu clicks needed!)
- Easy removal - Hovering over any highlight reveals a remove button
- Cross-session Gallery - View all highlights from all sessions in one sidebar tab
- Click-to-Navigate - Jump directly from highlight card to the exact position in transcript + audio
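For a sense of what a cross-session gallery needs to store, here is a rough data-model sketch. Each highlight keeps enough information to jump back to its transcript line and audio position. The field names and `buildGallery` are assumptions for illustration, not FluentCap's actual schema.

```typescript
// Hypothetical data model for highlights and sessions.
interface Highlight {
  segmentIndex: number; // transcript line the highlight belongs to
  audioOffsetMs: number; // where to seek when the card is clicked
  text: string;
  createdAtMs: number;
}

interface Session {
  id: string;
  title: string;
  highlights: Highlight[];
}

// One gallery card: the highlight plus the session it came from.
interface GalleryCard {
  sessionId: string;
  sessionTitle: string;
  highlight: Highlight;
}

// Flatten highlights from all sessions into one list, newest first.
function buildGallery(sessions: Session[]): GalleryCard[] {
  const cards: GalleryCard[] = [];
  for (const session of sessions) {
    for (const highlight of session.highlights) {
      cards.push({ sessionId: session.id, sessionTitle: session.title, highlight });
    }
  }
  cards.sort((a, b) => b.highlight.createdAtMs - a.highlight.createdAtMs);
  return cards;
}
```

Clicking a card would then open `sessionId`, scroll to `segmentIndex`, and seek the audio to `audioOffsetMs`.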
3. Bidirectional Sync
When audio plays:
- The current segment is highlighted with a subtle flash animation
- Both source & translated lists auto-scroll to keep the active segment centered
- Clicking any transcript line seeks audio without changing play/pause state
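The sync behavior above comes down to two lookups: playback time to segment (for highlighting and scrolling) and segment to playback time (for click-to-seek). Here is a minimal sketch, assuming each segment carries start and end offsets into the audio and that segments are sorted and non-overlapping. The `Segment` shape and both function names are hypothetical.

```typescript
// Hypothetical transcript segment with offsets into the recorded audio.
interface Segment {
  startMs: number;
  endMs: number; // exclusive
  text: string;
}

// Binary search for the segment containing timeMs.
// Returns its index, or -1 when timeMs falls in a gap or outside the transcript.
function findActiveSegment(segments: Segment[], timeMs: number): number {
  let lo = 0;
  let hi = segments.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const seg = segments[mid];
    if (timeMs < seg.startMs) hi = mid - 1;
    else if (timeMs >= seg.endMs) lo = mid + 1;
    else return mid;
  }
  return -1;
}

// Click-to-seek: the audio offset to jump to when a transcript line is clicked.
function seekTargetMs(segments: Segment[], index: number): number {
  return segments[index].startMs;
}
```

In an Electron renderer, assigning to an audio element's `currentTime` (in seconds) moves the playhead without changing whether it is paused, which lines up with the "seeks without changing play/pause state" behavior. A binary search keeps the lookup cheap even when it runs on every time update of a long session.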
4. Multiple Layout Modes
| Mode | Description | Best For |
|---|---|---|
| Movie Mode | Large text, transparent overlay | Watching films |
| Meeting Notes | Visible sidebar, high opacity | Work calls |
| Side-by-Side | Vertical comparison view | Language learning |
5. Multi-Source Audio Capture
| Audio Source | Use Case |
|---|---|
| System Audio | Netflix, YouTube, Podcasts |
| Microphone | Dictation, Note-taking |
| Both | Video calls (hear everyone) |
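The "Both" mode implies combining two streams into one before transcription. As a rough illustration of the idea only, the sketch below sums two mono PCM buffers sample by sample and clamps the result. It assumes both inputs share the same sample rate and use floating-point samples in [-1, 1]; `mixMono` is a hypothetical helper, and the app may well do this step differently (for example inside FFmpeg).

```typescript
// Mix two mono PCM buffers (samples in [-1, 1]) by summing sample by sample
// and clamping to the valid range. The shorter input is padded with silence.
function mixMono(a: Float32Array, b: Float32Array): Float32Array {
  const out = new Float32Array(Math.max(a.length, b.length));
  for (let i = 0; i < out.length; i++) {
    const sum = (a[i] ?? 0) + (b[i] ?? 0); // out-of-range reads count as silence
    out[i] = Math.max(-1, Math.min(1, sum));
  }
  return out;
}
```

Clamping avoids wrap-around distortion when both sources are loud at once; a real mixer might scale the inputs instead of hard-clipping.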
6. Customizable Themes
FluentCap offers multiple themes to match your preference and reduce eye strain during long sessions.
Privacy-First Architecture
- Local storage only - all sessions saved on your machine
- Direct API calls - audio goes straight to providers, no intermediate servers
- No tracking - your data stays yours
Technical Stack
For the devs interested in the architecture:
- Framework: Electron + React + TypeScript
- Audio Processing: FFmpeg (via `ffmpeg-static`)
- Storage: JSON-based local persistence
- Virtualization: `react-window` for smooth transcript scrolling
- STT Providers: Deepgram, AssemblyAI, Gladia, Shunya
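For anyone unfamiliar with list virtualization, the core idea is to render only the rows that are on screen (plus a small buffer) instead of the whole transcript. The sketch below shows that calculation for fixed-height rows. It is a generic illustration of the technique, not `react-window`'s implementation or API, and `visibleRange` and its parameters are hypothetical names.

```typescript
interface VisibleRange {
  first: number; // index of the first row to render
  last: number; // index of the last row to render (inclusive)
}

// Compute which fixed-height rows need rendering for the current scroll position.
// `overscan` adds a few extra rows above and below to avoid blank flashes while scrolling.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan = 2,
): VisibleRange {
  if (rowCount === 0) return { first: 0, last: -1 }; // empty range
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const lastVisible = Math.floor((scrollTop + viewportHeight - 1) / rowHeight);
  return {
    first: Math.max(0, firstVisible - overscan),
    last: Math.min(rowCount - 1, lastVisible + overscan),
  };
}
```

For a 10,000-line transcript with 20px rows in a 300px viewport, this renders 19 rows at a time rather than 10,000, which is why long sessions stay smooth.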
Thank You to Our Providers
FluentCap exists because of the amazing speech-to-text providers who make their technology accessible:
- Deepgram - Lightning-fast Nova-3 model
- AssemblyAI - Excellent accuracy and features
- Gladia - Generous monthly free tier
- Shunya - Budget-friendly rates
When your free credits run out, please consider supporting them. Their pricing is incredibly fair - just $0.15-0.40 per hour - and they deserve recognition for democratizing this technology.
Try It Out
Website: https://fluentcap.live
Download: https://fluentcap.live/download
Blog: https://fluentcap.live/blog
Available for macOS, Windows, and Linux.
Looking for Feedback
I'd love to hear from the community:
- What features would make this more useful for your workflow?
- Any accessibility improvements you'd suggest?
- Feedback on the BYOK model?
Drop a comment below or reach out through the website. Built to bring good things to the world.
— FluentCap Team
A world without language barriers.







