<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: imper</title>
    <description>The latest articles on DEV Community by imper (@imper_7cde72b79d2529291ec).</description>
    <link>https://dev.to/imper_7cde72b79d2529291ec</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3812341%2F4a3e352b-d42f-4efe-bb2c-23e8703e87bb.jpg</url>
      <title>DEV Community: imper</title>
      <link>https://dev.to/imper_7cde72b79d2529291ec</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/imper_7cde72b79d2529291ec"/>
    <language>en</language>
    <item>
      <title>Building an offline subtitle extractor with whisper.cpp and Electron</title>
      <dc:creator>imper</dc:creator>
      <pubDate>Sun, 08 Mar 2026 02:37:46 +0000</pubDate>
      <link>https://dev.to/imper_7cde72b79d2529291ec/building-an-offline-subtitle-extractor-with-whispercpp-and-electron-44k2</link>
      <guid>https://dev.to/imper_7cde72b79d2529291ec/building-an-offline-subtitle-extractor-with-whispercpp-and-electron-44k2</guid>
      <description>&lt;p&gt;I watch a lot of foreign language content - anime, K-dramas, tech talks - and getting subtitles was always a pain. Upload to a random website, hit the daily limit, try another one, or install Python and figure out whisper's CLI.&lt;/p&gt;

&lt;p&gt;So over the past few months I've been building a desktop app that handles the whole pipeline locally.&lt;/p&gt;

&lt;h2&gt;What it does&lt;/h2&gt;

&lt;p&gt;You drop a video file in, pick a whisper model size, and it spits out an SRT subtitle file. Optionally you can translate the subtitles using one of several engines.&lt;/p&gt;

&lt;p&gt;The speech-to-text runs via &lt;strong&gt;whisper.cpp&lt;/strong&gt; so everything stays on your machine. No uploads, no API calls for the transcription part. If you have an NVIDIA GPU it automatically uses CUDA, otherwise it falls back to CPU - this was one of the trickier parts to get right.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bixenk40upft6zx7vbk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bixenk40upft6zx7vbk.png" alt="App Screenshot" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Tech decisions&lt;/h2&gt;

&lt;p&gt;I went with &lt;strong&gt;Electron + Node.js&lt;/strong&gt; because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cross-platform (though I'm mainly targeting Windows right now)&lt;/li&gt;
&lt;li&gt;Easy to bundle whisper.cpp binaries and ffmpeg&lt;/li&gt;
&lt;li&gt;The UI is just HTML/CSS/JS so iteration is fast&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For whisper.cpp integration, the app spawns it as a child process with the right flags depending on whether CUDA is available. Model files (GGML format) auto-download on first run into a local &lt;code&gt;_models/&lt;/code&gt; folder.&lt;/p&gt;

&lt;h2&gt;Translation engines&lt;/h2&gt;

&lt;p&gt;Translation is optional. Currently supported:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;MyMemory&lt;/strong&gt; - free, no API key, ~50K chars/day&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DeepL&lt;/strong&gt; - free tier 500K chars/month, needs API key&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenAI GPT&lt;/strong&gt; - paid, good quality for nuanced translations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Gemini&lt;/strong&gt; - Google's API, generous free tier&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The app chunks subtitle text and sends it in batches to avoid rate limits. Each engine has its own quirks with language codes, so there's a mapping layer that normalizes them per engine.&lt;/p&gt;
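
&lt;p&gt;The batching step can be sketched like this. The app batches by a per-request character budget; this simplified version batches by line count just to show the shape of the idea:&lt;/p&gt;

```javascript
// Split subtitle lines into fixed-size batches for the translation API.
function chunkLines(lines, perBatch) {
  const chunks = [];
  let rest = lines.slice();
  while (rest.length) {
    chunks.push(rest.slice(0, perBatch)); // take the next batch
    rest = rest.slice(perBatch);          // drop what we took
  }
  return chunks;
}
```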

&lt;h2&gt;v1.4.0 changes&lt;/h2&gt;

&lt;p&gt;Just pushed the latest update which adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic GPU/CPU fallback detection&lt;/li&gt;
&lt;li&gt;Bundled ffprobe-static (no more separate ffmpeg install)&lt;/li&gt;
&lt;li&gt;Better DeepL language code mapping&lt;/li&gt;
&lt;/ul&gt;
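
&lt;p&gt;On the DeepL mapping: DeepL requires regional target codes for some languages (e.g. &lt;code&gt;EN-US&lt;/code&gt;, &lt;code&gt;PT-BR&lt;/code&gt; instead of plain &lt;code&gt;en&lt;/code&gt;, &lt;code&gt;pt&lt;/code&gt;). A hedged sketch of such a mapping layer, with an illustrative table rather than the app's full one:&lt;/p&gt;

```javascript
// Map generic ISO 639-1 codes to DeepL target_lang values.
const DEEPL_TARGETS = {
  en: 'EN-US', // DeepL wants a regional variant for English
  pt: 'PT-BR', // and for Portuguese
  ja: 'JA',
  ko: 'KO',
  zh: 'ZH',
};

function toDeepLTarget(isoCode) {
  const mapped = DEEPL_TARGETS[isoCode.toLowerCase()];
  if (mapped) {
    return mapped;
  }
  return isoCode.toUpperCase(); // pass through codes DeepL accepts as-is
}
```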

&lt;h2&gt;Try it out&lt;/h2&gt;

&lt;p&gt;It's packaged as a portable .exe - no install needed, just extract and run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Blue-B/WhisperSubTranslate" rel="noopener noreferrer"&gt;WhisperSubTranslate&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;License&lt;/strong&gt;: GPL-3.0&lt;/p&gt;

&lt;p&gt;If you're working on something similar or have suggestions, I'd love to hear about it.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>javascript</category>
      <category>electron</category>
      <category>whisper</category>
    </item>
  </channel>
</rss>
