DEV Community

MrD

I Built a Free Desktop App for Real-time Transcription & Translation - Here's Everything It Can Do

FluentCap: A Free Desktop App for Real-time Transcription & Translation

Hi dev community!

I'm excited to share FluentCap - a desktop application I've been building to bring real-time transcription and translation to everyone. The core philosophy is simple: language should never be a barrier to understanding, learning, or connecting with others.

Check it out: https://fluentcap.live

Download: https://fluentcap.live/download

[Screenshot: Live Transcript]


Who Is FluentCap For?

FluentCap is designed for anyone who works with audio in a different language:

| Use Case | How FluentCap Helps |
| --- | --- |
| Language Learners | Watch K-dramas, anime, and foreign films with real-time bilingual subtitles |
| Remote Workers | Participate confidently in international meetings |
| Deaf & Hard of Hearing | Access real-time captions for any audio on your computer |
| Researchers & Students | Transcribe lectures, podcasts, and interviews instantly |
| Content Creators | Generate transcripts for videos without expensive services |

Accessibility: Built for the Deaf & Hard of Hearing Community

One of FluentCap's core missions is accessibility. According to the WHO, more than 430 million people worldwide have disabling hearing loss, and nearly 2.5 billion people are projected to have some degree of hearing loss by 2050.

How FluentCap helps:

  • Real-time captions for ANY audio - not just specific apps like Zoom or Meet
  • Works with system audio - Netflix, YouTube, podcasts, lectures, video calls
  • Bilingual display - see both original language and translation simultaneously
  • Privacy-first - sessions are stored on your machine, and audio goes directly to the provider you choose with no intermediate server

Some studies report that real-time transcription increases engagement for deaf students by 70-85%, and that 98.6% of students find captions helpful for focus and comprehension.

FluentCap aims to make real-time captioning accessible to everyone at a fraction of the cost of traditional services (professional human captioners can run $150+/hour). With FluentCap's BYOK model, automated captioning costs about $0.15-0.40 per hour of audio.


The BYOK (Bring Your Own Key) Philosophy

Unlike subscription apps that charge $15-30/month, FluentCap uses a transparent BYOK model:

  • FluentCap is free forever - no ads, no subscriptions
  • You connect directly to speech-to-text providers using your own API keys
  • Providers offer hundreds of hours free to get started:

| Provider | Free Credits | ~Free Hours |
| --- | --- | --- |
| Deepgram | $200 | ~750 hours |
| AssemblyAI | $50 | ~140 hours |
| Gladia | 10 hours/month | 10 hours every month (resets monthly) |
| Shunya | $100 | ~300 hours |

When free credits run out, you pay providers directly at their pay-as-you-go rates, roughly $0.15-0.40 per hour of audio. For light to moderate use, that works out 60-80% cheaper than a $15-30/month subscription.
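
To make the arithmetic concrete, here is a small sketch that derives the hourly rate implied by each free-credit row and compares pay-as-you-go cost with a flat subscription. The dollar figures come from this post; the function names and the 20-hours-per-month usage level are assumptions chosen for illustration.

```typescript
// Figures from this post. Gladia is omitted because its free tier is hours-based.
const freeCredits = [
  { provider: "Deepgram", creditUsd: 200, approxFreeHours: 750 },
  { provider: "AssemblyAI", creditUsd: 50, approxFreeHours: 140 },
  { provider: "Shunya", creditUsd: 100, approxFreeHours: 300 },
];

// Hourly rate implied by "credit / free hours", in USD per hour.
function impliedHourlyRate(creditUsd: number, approxFreeHours: number): number {
  return creditUsd / approxFreeHours;
}

// Pay-as-you-go cost for one month at a given usage level.
function monthlyCost(hoursPerMonth: number, usdPerHour: number): number {
  return hoursPerMonth * usdPerHour;
}

// Fraction saved versus a flat subscription (negative means BYOK costs more).
function savingsVsSubscription(byokUsd: number, subscriptionUsd: number): number {
  return 1 - byokUsd / subscriptionUsd;
}

for (const { provider, creditUsd, approxFreeHours } of freeCredits) {
  const rate = impliedHourlyRate(creditUsd, approxFreeHours);
  console.log(`${provider}: ~$${rate.toFixed(2)}/hour`);
}

// Hypothetical usage: 20 hours per month at $0.30/hour versus a $15/month plan.
const byokUsd = monthlyCost(20, 0.3);
const saved = savingsVsSubscription(byokUsd, 15);
console.log(`BYOK: $${byokUsd.toFixed(2)}/month, ${(saved * 100).toFixed(0)}% below a $15 plan`);
```

The break-even point is the subscription price divided by the hourly rate, so at $0.30/hour a $15/month plan breaks even at 50 hours of audio per month. Heavier users would pay more with BYOK than with that plan.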

[Screenshot: Pick Provider]


Feature Highlights

1. Audio Recording & Synchronized Playback

One of the newest features! FluentCap can:

  • Record audio during transcription sessions (MP3 format)
  • Merge recording bursts into a single continuous file
  • Play back with full synchronization to the transcript
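
One common way to merge same-format audio files with FFmpeg is the concat demuxer, which reads a list file and copies the streams without re-encoding. The sketch below builds that list file and the argument vector. Whether FluentCap merges bursts this way is not stated here, and the function names and paths are hypothetical.

```typescript
// Build the list-file body expected by FFmpeg's concat demuxer:
// one `file '<path>'` line per input, in playback order.
// A single quote inside a path is written as '\'' per FFmpeg's quoting rules.
function buildConcatList(burstPaths: string[]): string {
  return (
    burstPaths
      .map((p) => `file '${p.replace(/'/g, "'\\''")}'`)
      .join("\n") + "\n"
  );
}

// Arguments for: ffmpeg -f concat -safe 0 -i <list> -c copy <output>
// `-safe 0` allows absolute paths in the list; `-c copy` avoids re-encoding,
// which assumes every burst was recorded with the same MP3 settings.
function buildConcatArgs(listPath: string, outputPath: string): string[] {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", outputPath];
}

// Hypothetical burst files from one session.
const listBody = buildConcatList(["/tmp/session/burst-001.mp3", "/tmp/session/burst-002.mp3"]);
console.log(listBody);
console.log(buildConcatArgs("/tmp/session/bursts.txt", "/tmp/session/full.mp3").join(" "));
```

In an Electron main process, the list body could be written to a temporary file and the arguments passed to `child_process.spawn` together with the binary path exported by `ffmpeg-static`.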

The 5-Pillar Interaction Model:

  1. Automatic Recording - Audio captured as MP3 in real-time
  2. Integrated Playback - Play/Pause, Seek, Download controls
  3. Click-to-Seek - Click any transcript line and audio jumps there
  4. Jump Navigation - Next/Previous highlight buttons
  5. Bidirectional Sync - Audio playback auto-highlights & scrolls transcript

[Screenshot: Audio Sync Playback]


2. Highlight & Annotation System

FluentCap isn't just for passive viewing - it's an active learning tool:

  • Select any text - Instant highlight via context menu

[Screenshot: Highlight Context Menu]

  • Live Auto-Highlight Mode - Highlights apply immediately during recording (no menu clicks needed!)

  • Easy removal - Hovering over any highlight reveals a remove button

[Screenshot: Highlight Remove]

  • Cross-session Gallery - View all highlights from all sessions in one sidebar tab
  • Click-to-Navigate - Jump directly from highlight card to the exact position in transcript + audio
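
A cross-session gallery with click-to-navigate mostly comes down to storing enough position data on each highlight. The following is a minimal sketch of one possible data shape. All field and function names are hypothetical and are not FluentCap's actual schema.

```typescript
// Hypothetical shapes: field names are illustrative only.
interface Highlight {
  id: string;
  sessionId: string;
  segmentIndex: number; // transcript line the highlight belongs to
  text: string;
  audioOffsetSec: number; // playback position to seek to on click
  createdAt: number; // epoch milliseconds
}

interface Session {
  id: string;
  highlights: Highlight[];
}

// Gather highlights from every session into one list, newest first.
function collectHighlights(sessions: Session[]): Highlight[] {
  const all: Highlight[] = [];
  for (const session of sessions) {
    all.push(...session.highlights);
  }
  return all.sort((a, b) => b.createdAt - a.createdAt);
}

// Where a click on a highlight card should take the user:
// the session to open, the line to scroll to, and the audio position.
function navigationTarget(h: Highlight): {
  sessionId: string;
  segmentIndex: number;
  seekToSec: number;
} {
  return {
    sessionId: h.sessionId,
    segmentIndex: h.segmentIndex,
    seekToSec: h.audioOffsetSec,
  };
}
```

Keeping the segment index and the audio offset on the highlight itself means the gallery can navigate without loading every session's full transcript first.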

[Screenshot: Highlight Collection]


3. Bidirectional Sync

When audio plays:

  1. The current segment is highlighted with a subtle flash animation
  2. Both source & translated lists auto-scroll to keep the active segment centered
  3. Clicking any transcript line seeks audio without changing play/pause state
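
Both directions of the sync can be sketched with timestamped segments: playback time maps to a segment via binary search, and a click maps a segment back to a playback time. This is an illustrative sketch, not FluentCap's code. The `Segment` and `Seekable` shapes are assumptions, although `currentTime` mirrors the property of the same name on the browser's `HTMLMediaElement`.

```typescript
interface Segment {
  startSec: number; // segments are assumed sorted by start time
  text: string;
}

// Anything with a writable playback position, e.g. an HTMLAudioElement.
interface Seekable {
  currentTime: number;
}

// Audio -> transcript: index of the last segment that starts at or before `t`,
// or -1 if playback is still before the first segment.
function findActiveSegment(segments: Segment[], t: number): number {
  let lo = 0;
  let hi = segments.length - 1;
  let active = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (segments[mid].startSec <= t) {
      active = mid; // candidate; keep looking to the right
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return active;
}

// Transcript -> audio: move the playhead only. Play/pause state is untouched
// because nothing here calls play() or pause().
function seekToSegment(player: Seekable, segments: Segment[], index: number): void {
  player.currentTime = segments[index].startSec;
}
```

A `timeupdate` handler could call `findActiveSegment` with the current playback time and scroll the matching row into view. The binary search keeps that lookup cheap even for transcripts with thousands of lines.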

4. Multiple Layout Modes

| Mode | Description | Best For |
| --- | --- | --- |
| Movie Mode | Large text, transparent overlay | Watching films |
| Meeting Notes | Visible sidebar, high opacity | Work calls |
| Side-by-Side | Vertical comparison view | Language learning |

5. Multi-Source Audio Capture

| Audio Source | Use Case |
| --- | --- |
| System Audio | Netflix, YouTube, Podcasts |
| Microphone | Dictation, Note-taking |
| Both | Video calls (hear everyone) |
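
For the "Both" source, the two streams have to be combined into one before transcription. A minimal sketch of one way to do that is to sum the samples and clamp the result, assuming both inputs are mono float PCM in the range [-1, 1] at the same sample rate. How FluentCap actually mixes its sources is not described here, and `mixPcm` is a hypothetical name.

```typescript
// Mix system audio and microphone audio sample by sample.
// Assumes mono Float32 PCM in [-1, 1] at the same sample rate.
function mixPcm(system: Float32Array, mic: Float32Array, micGain = 1.0): Float32Array {
  const length = Math.max(system.length, mic.length);
  const out = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    const s = i < system.length ? system[i] : 0; // pad the shorter input with silence
    const m = i < mic.length ? mic[i] * micGain : 0;
    out[i] = Math.max(-1, Math.min(1, s + m)); // clamp to avoid overflow past full scale
  }
  return out;
}
```

Hard clamping can distort loud passages, so a real mixer might scale each input down or apply a limiter instead.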

[Screenshot: Settings - Audio Source & Provider]


6. Customizable Themes

FluentCap offers multiple themes to match your preference and reduce eye strain during long sessions.

[Screenshot: Settings - Theme]


Privacy-First Architecture

  • Local storage only - all sessions saved on your machine
  • Direct API calls - audio goes straight to providers, no intermediate servers
  • No tracking - your data stays yours

Technical Stack

For the devs interested in the architecture:

  • Framework: Electron + React + TypeScript
  • Audio Processing: FFmpeg (via ffmpeg-static)
  • Storage: JSON-based local persistence
  • Virtualization: react-window for smooth transcript scrolling
  • STT Providers: Deepgram, AssemblyAI, Gladia, Shunya
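
As an illustration of JSON-based local persistence, the sketch below serializes a session and validates it on load. The `StoredSession` shape and the `schemaVersion` field are assumptions for this example, not FluentCap's real file format.

```typescript
// Hypothetical on-disk shape for one session.
interface StoredSession {
  schemaVersion: 1;
  id: string;
  createdAt: string; // ISO 8601 timestamp
  segments: { startSec: number; text: string; translation?: string }[];
}

// Pretty-printed JSON keeps session files readable and diffable.
function serializeSession(session: StoredSession): string {
  return JSON.stringify(session, null, 2);
}

// Parse and check the fields the app relies on before trusting the file.
function parseSession(json: string): StoredSession {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Session file is not a JSON object");
  }
  const candidate = data as Partial<StoredSession>;
  if (
    candidate.schemaVersion !== 1 ||
    typeof candidate.id !== "string" ||
    !Array.isArray(candidate.segments)
  ) {
    throw new Error("Unsupported or malformed session file");
  }
  return candidate as StoredSession;
}
```

In an Electron app, files like this are typically written under the directory returned by `app.getPath("userData")`. A version field makes it possible to migrate older files when the format changes.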

Thank You to Our Providers

FluentCap exists because of the amazing speech-to-text providers who make their technology accessible:

  • Deepgram - Lightning-fast Nova-3 model
  • AssemblyAI - Excellent accuracy and features
  • Gladia - Generous monthly free tier
  • Shunya - Budget-friendly rates

When your free credits run out, please consider supporting them. Their pricing is very fair, at just $0.15-0.40 per hour, and they deserve recognition for making this technology widely accessible.


Try It Out

Website: https://fluentcap.live

Download: https://fluentcap.live/download

Blog: https://fluentcap.live/blog

Available for macOS, Windows, and Linux.


Looking for Feedback

I'd love to hear from the community:

  • What features would make this more useful for your workflow?
  • Any accessibility improvements you'd suggest?
  • Feedback on the BYOK model?

Drop a comment below or reach out through the website. Built to bring good things to the world.


— FluentCap Team

A world without language barriers.
