<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: creasac</title>
    <description>The latest articles on DEV Community by creasac (@creasac).</description>
    <link>https://dev.to/creasac</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3797996%2F48723c78-a039-46fb-a91b-871023fb70d0.png</url>
      <title>DEV Community: creasac</title>
      <link>https://dev.to/creasac</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/creasac"/>
    <language>en</language>
    <item>
      <title>Built Shruti: A Minimal System-Wide Speech-to-Text Tool for Linux</title>
      <dc:creator>creasac</dc:creator>
      <pubDate>Mon, 02 Mar 2026 07:52:50 +0000</pubDate>
      <link>https://dev.to/creasac/built-shruti-a-minimal-system-wide-speech-to-text-tool-for-linux-47f2</link>
      <guid>https://dev.to/creasac/built-shruti-a-minimal-system-wide-speech-to-text-tool-for-linux-47f2</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/weekend-2026-02-28"&gt;DEV Weekend Challenge: Community&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;The Community&lt;/h2&gt;

&lt;p&gt;I built this for Linux users who do a lot of typing and want fast transcription directly in their existing workflow.&lt;/p&gt;

&lt;p&gt;That includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;developers writing docs, comments, and messages&lt;/li&gt;
&lt;li&gt;students taking quick notes&lt;/li&gt;
&lt;li&gt;writers drafting ideas&lt;/li&gt;
&lt;li&gt;multilingual users who need reliable transcription across languages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was simple: no heavy app, no complicated flow, just speak and continue typing.&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;I built &lt;strong&gt;Shruti&lt;/strong&gt;, a minimal system-wide speech-to-text utility for Linux X11.&lt;/p&gt;

&lt;p&gt;Workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Place the cursor in any text field&lt;/li&gt;
&lt;li&gt;Press the hotkey once to start recording&lt;/li&gt;
&lt;li&gt;Press the same hotkey again to stop&lt;/li&gt;
&lt;li&gt;Shruti transcribes the audio and types the text at the cursor&lt;/li&gt;
&lt;li&gt;Press &lt;code&gt;Esc&lt;/code&gt; at any time to cancel the current recording&lt;/li&gt;
&lt;/ol&gt;
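&lt;p&gt;The toggle flow above can be modeled as a small state machine. This is a minimal sketch with hypothetical names (&lt;code&gt;State&lt;/code&gt;, &lt;code&gt;ToggleRecorder&lt;/code&gt;), not Shruti's actual code, and it assumes &lt;code&gt;Esc&lt;/code&gt; only has an effect while a recording is in progress.&lt;/p&gt;

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()


class ToggleRecorder:
    """Hypothetical model of the hotkey flow; not Shruti's actual code."""

    def __init__(self):
        self.state = State.IDLE
        self.typed = []  # text that would be typed at the cursor

    def on_hotkey(self):
        if self.state is State.IDLE:
            # First press: start recording.
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            # Second press: stop and hand the audio to transcription.
            self.state = State.TRANSCRIBING

    def on_escape(self):
        # Assumption: Esc discards a recording in progress and types nothing.
        if self.state is State.RECORDING:
            self.state = State.IDLE

    def on_transcript(self, text):
        # Transcription finished: record the typed text and return to idle.
        if self.state is State.TRANSCRIBING:
            self.typed.append(text)
            self.state = State.IDLE
```

&lt;p&gt;Keeping the transitions in one place makes it easy to see that a cancelled recording never reaches the transcription step.&lt;/p&gt;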

&lt;h3&gt;Why it’s useful&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Works across apps where typing is possible&lt;/li&gt;
&lt;li&gt;Uses the user’s own Gemini API key&lt;/li&gt;
&lt;li&gt;High-quality transcription across languages&lt;/li&gt;
&lt;li&gt;Minimal visual HUD while recording/transcribing&lt;/li&gt;
&lt;li&gt;No always-on background process while idle&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Demo&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/creasac/shruti" rel="noopener noreferrer"&gt;https://github.com/creasac/shruti&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Demo video: &lt;a href="https://youtu.be/UYwDzhuUQPQ" rel="noopener noreferrer"&gt;https://youtu.be/UYwDzhuUQPQ&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Code&lt;/h2&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/creasac" rel="noopener noreferrer"&gt;
        creasac
      &lt;/a&gt; / &lt;a href="https://github.com/creasac/shruti" rel="noopener noreferrer"&gt;
        shruti
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      minimal linux (x11) speech-to-text utility
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;shruti&lt;/h1&gt;

&lt;/div&gt;
&lt;p&gt;Minimal desktop speech-to-text using Gemini.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Install&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;curl -fsSL https://raw.githubusercontent.com/creasac/shruti/main/bootstrap.sh &lt;span class="pl-k"&gt;|&lt;/span&gt; bash&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;What setup asks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Gemini API key (hidden input)&lt;/li&gt;
&lt;li&gt;Preferred hotkey (default &lt;code&gt;Ctrl+Space&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hotkey behavior after setup:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Press hotkey once: start recording&lt;/li&gt;
&lt;li&gt;Press hotkey again: stop and transcribe&lt;/li&gt;
&lt;li&gt;Press &lt;code&gt;Esc&lt;/code&gt;: cancel current recording&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Nothing runs in the background while idle.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Configuration&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;Files:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;~/.config/shruti/config.toml&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;~/.config/shruti/credentials.toml&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;API key location:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Stored only in &lt;code&gt;~/.config/shruti/credentials.toml&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To remove your key:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;rm -f &lt;span class="pl-k"&gt;~&lt;/span&gt;/.config/shruti/credentials.toml&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;To remove all Shruti config data:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;rm -rf &lt;span class="pl-k"&gt;~&lt;/span&gt;/.config/shruti&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Editable config fields:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;model&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;hotkey&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;max_record_seconds&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sample_rate&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;channels&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;prompt&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
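&lt;p&gt;As an illustration, a config file with these fields could be parsed and type-checked as below. This is a sketch under assumptions: the field names come from the list above, the model name and default hotkey appear elsewhere in this post, and every other value, the hotkey string format, and the helper &lt;code&gt;parse_config&lt;/code&gt; are hypothetical rather than Shruti's documented defaults.&lt;/p&gt;

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters need the tomli backport
    import tomli as tomllib

# Hypothetical file contents: only the field names are taken from the list
# above; the numeric values and the prompt are illustrative guesses.
EXAMPLE_CONFIG = """
model = "gemini-2.5-flash-lite"
hotkey = "Ctrl+Space"
max_record_seconds = 120
sample_rate = 16000
channels = 1
prompt = "Transcribe the audio verbatim."
"""

# Assumed types for each editable field.
EXPECTED_TYPES = {
    "model": str,
    "hotkey": str,
    "max_record_seconds": int,
    "sample_rate": int,
    "channels": int,
    "prompt": str,
}


def parse_config(text):
    """Parse TOML text and check that each known field has the expected type."""
    data = tomllib.loads(text)
    for key, expected in EXPECTED_TYPES.items():
        if key in data and not isinstance(data[key], expected):
            raise TypeError(f"{key} should be {expected.__name__}")
    return data
```

&lt;p&gt;Checking types at load time means a typo such as &lt;code&gt;channels = "two"&lt;/code&gt; fails immediately instead of surfacing later during recording.&lt;/p&gt;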
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Commands&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;shruti setup
shruti doctor --verbose
shruti oneshot&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Limitations&lt;/h2&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Linux X11 only (for security, Wayland restricts global hotkeys and input injection)&lt;/li&gt;
&lt;/ul&gt;
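&lt;p&gt;To check which display server a session uses before installing, the environment can be inspected. The sketch below is a best-effort heuristic with a hypothetical helper name (&lt;code&gt;session_type&lt;/code&gt;); it is not taken from Shruti and may differ from what &lt;code&gt;shruti doctor&lt;/code&gt; does.&lt;/p&gt;

```python
import os


def session_type(env=None):
    """Best-effort guess of the display server from environment variables."""
    env = os.environ if env is None else env
    declared = env.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("x11", "wayland"):
        return declared
    # Fall back to the display variables when the session type is unset.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"
```

&lt;p&gt;Calling &lt;code&gt;session_type()&lt;/code&gt; with no arguments reads the current process environment; passing a dictionary makes the function easy to test.&lt;/p&gt;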
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;License&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;MIT. See &lt;a href="https://github.com/creasac/shruti/LICENSE" rel="noopener noreferrer"&gt;LICENSE&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/creasac/shruti" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;





&lt;h2&gt;How I Built It&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Python 3.11+&lt;/strong&gt; for fast iteration and portability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini API&lt;/strong&gt; (&lt;code&gt;gemini-2.5-flash-lite&lt;/code&gt;) for transcription&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sounddevice + numpy&lt;/strong&gt; for microphone capture and waveform data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;xdotool&lt;/strong&gt; for text insertion at the active cursor location (X11)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tkinter&lt;/strong&gt; for a compact overlay HUD&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TOML config&lt;/strong&gt; in &lt;code&gt;~/.config/shruti/&lt;/code&gt; for user-controlled settings and API key storage&lt;/li&gt;
&lt;/ul&gt;
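&lt;p&gt;Two of the pieces above, packaging captured audio and typing the result, can be sketched with the standard library alone. The helper names (&lt;code&gt;pcm16_to_wav&lt;/code&gt;, &lt;code&gt;xdotool_type_argv&lt;/code&gt;) are hypothetical, and the WAV container, the 16 kHz rate, and the chosen xdotool flags are assumptions for illustration, not Shruti's confirmed behavior. Plain integers stand in for the numpy samples so the sketch runs without a microphone.&lt;/p&gt;

```python
import io
import wave


def pcm16_to_wav(samples, sample_rate=16000, channels=1):
    """Wrap signed 16-bit PCM samples in an in-memory WAV container."""
    # WAV stores 16-bit samples as little-endian signed integers.
    frames = b"".join(s.to_bytes(2, "little", signed=True) for s in samples)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return buf.getvalue()


def xdotool_type_argv(text):
    """Build (but do not run) an xdotool command that types text."""
    # --clearmodifiers releases held modifier keys (such as the hotkey's Ctrl)
    # while typing; --delay 0 removes the pause between keystrokes.
    return ["xdotool", "type", "--clearmodifiers", "--delay", "0", text]
```

&lt;p&gt;On an X11 session with xdotool installed, passing the list to &lt;code&gt;subprocess.run(xdotool_type_argv("hello"), check=True)&lt;/code&gt; would type the text into the focused window.&lt;/p&gt;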

&lt;h3&gt;Design choices&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Keep the product intentionally narrow: one job, done well&lt;/li&gt;
&lt;li&gt;Hotkey-first interaction for speed&lt;/li&gt;
&lt;li&gt;Minimal UI and minimal config surface&lt;/li&gt;
&lt;li&gt;Users bring their own Gemini API key, so usage and billing stay fully under their control&lt;/li&gt;
&lt;li&gt;Cloud-based transcription gives wider language coverage, keeps local compute light, and allows easy model upgrades&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What’s Next&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Better packaging/distribution options&lt;/li&gt;
&lt;li&gt;More desktop-environment setup helpers&lt;/li&gt;
&lt;li&gt;Explore a Wayland-compatible path where feasible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;Built by @creasac.&lt;/code&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>weekendchallenge</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
