<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Secret_Agent_007</title>
    <description>The latest articles on DEV Community by Secret_Agent_007 (@secret_agent_007).</description>
    <link>https://dev.to/secret_agent_007</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3948142%2Faaab78e3-2e84-4eba-b415-fdecd7d64d84.png</url>
      <title>DEV Community: Secret_Agent_007</title>
      <link>https://dev.to/secret_agent_007</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/secret_agent_007"/>
    <language>en</language>
    <item>
      <title>I got so tired of typing that I built a voice input tool. Here's what changed.</title>
      <dc:creator>Secret_Agent_007</dc:creator>
      <pubDate>Sat, 23 May 2026 19:13:29 +0000</pubDate>
      <link>https://dev.to/secret_agent_007/i-got-so-tired-of-typing-that-i-built-a-voice-input-tool-heres-what-changed-19k3</link>
      <guid>https://dev.to/secret_agent_007/i-got-so-tired-of-typing-that-i-built-a-voice-input-tool-heres-what-changed-19k3</guid>
      <description>&lt;p&gt;I'm a developer. I write a lot. Messages, docs, comments, emails, AI prompts — all day long.&lt;br&gt;
At some point I did the math:&lt;/p&gt;

&lt;p&gt;Average typing speed: ~60 words per minute&lt;br&gt;
Average speaking speed: ~150 words per minute&lt;/p&gt;

&lt;p&gt;That's 2.5x the output for the same amount of thinking.&lt;br&gt;
And yet there I was, pecking at my keyboard like it was 1995.&lt;br&gt;
So I built something to fix it.&lt;/p&gt;
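
&lt;p&gt;The arithmetic, spelled out as a tiny sketch. The two rates are the ones quoted above; the 1,000-word document is only an illustrative assumption.&lt;/p&gt;

```python
# Rates taken from the post; the 1,000-word document is an illustrative assumption.
TYPING_WPM = 60
SPEAKING_WPM = 150


def minutes_needed(words, words_per_minute):
    """Return how many minutes it takes to produce `words` at a steady rate."""
    return words / words_per_minute


speedup = SPEAKING_WPM / TYPING_WPM
typing_minutes = minutes_needed(1000, TYPING_WPM)
speaking_minutes = minutes_needed(1000, SPEAKING_WPM)

print(f"speedup: {speedup}x")
print(f"typing: {typing_minutes:.1f} min, speaking: {speaking_minutes:.1f} min")
```

&lt;p&gt;These are raw production rates only. They ignore the editing pass that dictated text usually needs.&lt;/p&gt;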

&lt;p&gt;The problem with existing voice tools&lt;/p&gt;

&lt;p&gt;I didn't want to build this. I went looking for something ready-made first.&lt;br&gt;
What I needed:&lt;/p&gt;

&lt;p&gt;Text appears wherever my cursor is — any app, any window&lt;br&gt;
Works offline — no sending audio to some server&lt;br&gt;
Free, no subscription&lt;br&gt;
No button rituals — just talk, text appears&lt;/p&gt;

&lt;p&gt;What I found: cloud-only tools, paid subscriptions, apps that only worked in their own tiny window (great, very useful, love copying text manually).&lt;br&gt;
So: built it myself. Classic developer move.&lt;/p&gt;

&lt;p&gt;VoxBee: voice input that actually fits into your workflow&lt;/p&gt;

&lt;p&gt;It's a free, open-source Windows app. Runs locally on whisper.cpp. No internet needed after setup. No data leaves your machine.&lt;br&gt;
The key feature: auto mode.&lt;br&gt;
You open any text field. You start talking. Text appears. You stop talking. It stops.&lt;br&gt;
No buttons. No "click to record." No switching apps.&lt;br&gt;
The workflow difference is real — once you dictate a few documents or long messages this way, going back to typing feels genuinely slow. Your thoughts move faster than your fingers. Now your output can too.&lt;/p&gt;

&lt;p&gt;The productivity gains that surprised me&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;AI prompts get better&lt;br&gt;
When I type prompts for ChatGPT or Claude, I keep them short because typing is effort. When I dictate, I give full context, more detail, better instructions. The responses got noticeably better just because I stopped being lazy with my prompts.&lt;/li&gt;
&lt;li&gt;"Okay" sends the message&lt;br&gt;
I added a voice command: say "okay" → hits Enter. Sounds trivial. In practice, when you're dictating 20+ messages a day, never reaching for the keyboard is a genuine quality-of-life upgrade.&lt;br&gt;
Other commands:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;save → Ctrl+S&lt;br&gt;
undo → Ctrl+Z&lt;br&gt;
click → mouse click&lt;br&gt;
up / down / left / right → moves cursor&lt;/p&gt;

&lt;p&gt;You can add custom ones too. I have a command that opens my task manager. Another that saves and closes a file. Whatever you repeat 10x a day — automate it with your voice.&lt;/p&gt;
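
&lt;p&gt;One way a command table like this could be wired up, as a minimal sketch and not VoxBee's actual code. The function name &lt;code&gt;route_utterance&lt;/code&gt; and the action strings are hypothetical placeholders. The spoken words mirror the commands listed above.&lt;/p&gt;

```python
# Hypothetical command table: spoken word mapped to an action name.
# The spoken words mirror the commands listed in the post; the action
# strings are placeholders, not VoxBee's real identifiers.
COMMANDS = {
    "okay": "key:enter",
    "save": "key:ctrl+s",
    "undo": "key:ctrl+z",
    "click": "mouse:click",
    "up": "key:up",
    "down": "key:down",
    "left": "key:left",
    "right": "key:right",
}


def route_utterance(text, commands=COMMANDS):
    """Classify a transcribed utterance as a command or as dictated text.

    An utterance counts as a command only when the whole phrase, after
    trimming whitespace and surrounding punctuation, matches a command word.
    That keeps "okay, let's go" from pressing Enter mid-sentence.
    """
    normalized = text.strip().strip(".,!?").lower()
    if normalized in commands:
        return ("command", commands[normalized])
    return ("text", text)


# A custom command is just one more entry in the table.
custom = dict(COMMANDS)
custom["tasks"] = "launch:task-manager"  # hypothetical custom action

print(route_utterance("Okay."))
print(route_utterance("okay, let's go"))
print(route_utterance("tasks", custom))
```

&lt;p&gt;Matching the whole utterance rather than any word inside it is a design choice made for this sketch. It trades some convenience for fewer accidental key presses.&lt;/p&gt;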

&lt;ol start="3"&gt;
&lt;li&gt;Notes and drafts flow differently&lt;br&gt;
When I type, I edit as I go. Delete, rephrase, backspace. It's slow and it fragments thinking.&lt;br&gt;
When I dictate, I just... talk. The draft comes out messier but faster and often more natural. Then one editing pass. Total time: shorter.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Heads up: it's not magic&lt;/p&gt;

&lt;p&gt;In auto mode, background noise can sometimes trigger transcription. Fan noise, street sounds, a door slamming: Whisper might try to transcribe any of them. Noise suppression helps a lot.&lt;br&gt;
Also: it's a beta. Tested on my machine. Works great for me, might have quirks on yours. Open source, so if something's off — the code is there.&lt;/p&gt;
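
&lt;p&gt;To show why short noises can slip through and how a simple gate reduces that, here is a rough sketch of an energy-based filter. It is an assumption about how such a gate might work, not a description of VoxBee or whisper.cpp. The names &lt;code&gt;rms&lt;/code&gt; and &lt;code&gt;speech_frames&lt;/code&gt; and the values for &lt;code&gt;threshold&lt;/code&gt; and &lt;code&gt;min_run&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples in the range -1.0..1.0."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_frames(frames, threshold=0.02, min_run=3):
    """Return indexes of frames that belong to a sustained loud run.

    A frame is kept only when it sits inside a run of at least `min_run`
    consecutive frames whose RMS is at or above `threshold`. A short burst
    such as a door slam fails the run-length test and is dropped.
    """
    kept = []
    run = []
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            run.append(i)
        else:
            if len(run) >= min_run:
                kept.extend(run)
            run = []
    # Flush a loud run that lasts until the end of the input.
    if len(run) >= min_run:
        kept.extend(run)
    return kept


quiet = [0.0] * 160
loud = [0.1] * 160
# One isolated loud frame (index 1), then a sustained run (indexes 3 to 5).
frames = [quiet, loud, quiet, loud, loud, loud, quiet]
print(speech_frames(frames))
```

&lt;p&gt;A gate like this only filters short bursts. Steady noise above the threshold, such as a loud fan, would still pass, which is where real noise suppression earns its keep.&lt;/p&gt;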

&lt;p&gt;Performance&lt;/p&gt;

&lt;p&gt;NVIDIA GPU (CUDA): essentially instant. You finish a sentence, the text is already there.&lt;br&gt;
CPU only: 1-2 second delay with the base model. Totally fine for most use cases.&lt;br&gt;
AMD: separate Vulkan build in progress, not ready yet.&lt;/p&gt;

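&lt;p&gt;Models (start with base)&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Model&lt;/th&gt;&lt;th&gt;Size&lt;/th&gt;&lt;th&gt;Notes&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;tiny&lt;/td&gt;&lt;td&gt;75 MB&lt;/td&gt;&lt;td&gt;Fast, rougher accuracy&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;base&lt;/td&gt;&lt;td&gt;148 MB&lt;/td&gt;&lt;td&gt;Best starting point&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;small&lt;/td&gt;&lt;td&gt;465 MB&lt;/td&gt;&lt;td&gt;Better accuracy&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;large-v3-turbo&lt;/td&gt;&lt;td&gt;1.6 GB&lt;/td&gt;&lt;td&gt;Best if you have a GPU&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;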

&lt;p&gt;Who this actually helps&lt;/p&gt;

&lt;p&gt;Great fit if you:&lt;/p&gt;

&lt;p&gt;Write long-form content (articles, docs, reports)&lt;br&gt;
Send a lot of messages or emails daily&lt;br&gt;
Use AI tools heavily and write detailed prompts&lt;br&gt;
Have wrist/hand pain from typing&lt;br&gt;
Think faster than you type (most people do)&lt;/p&gt;

&lt;p&gt;Probably not worth it if you write 50 words a day and are happy with that.&lt;/p&gt;

&lt;p&gt;Setup (one time, 3 steps)&lt;/p&gt;

&lt;p&gt;Download installer → Releases&lt;br&gt;
Grab model ggml-base.bin → Hugging Face&lt;br&gt;
Get engine whisper-blas-bin-x64.zip → whisper.cpp releases&lt;/p&gt;

&lt;p&gt;Put the models/, cpu/, gpu/ folders next to the installer. Done.&lt;br&gt;
Requirements: Windows 10/11, any mic, about 500 MB of disk space with the base model (the larger models need more).&lt;/p&gt;
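
&lt;p&gt;If the app does not start, a quick check of the folder layout can rule out a misplaced download. The sketch below assumes only what the setup steps say: three folders named &lt;code&gt;models&lt;/code&gt;, &lt;code&gt;cpu&lt;/code&gt; and &lt;code&gt;gpu&lt;/code&gt; next to the installer. The helper name &lt;code&gt;missing_folders&lt;/code&gt; is hypothetical, and the contents of each folder are not checked.&lt;/p&gt;

```python
from pathlib import Path

# Folder names come from the post; what goes inside each one is not
# specified there, so this check only looks at the top level.
EXPECTED_FOLDERS = ("models", "cpu", "gpu")


def missing_folders(install_dir, expected=EXPECTED_FOLDERS):
    """Return the expected folder names that are absent from install_dir."""
    root = Path(install_dir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Checks the current directory; pass another path to check elsewhere.
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder layout looks complete.")
```

&lt;p&gt;Run it from the directory that holds the installer. An empty result means all three folders are present.&lt;/p&gt;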

&lt;p&gt;Repo: github.com/boris-agent007/Voxbee&lt;br&gt;
If you try it, let me know how it goes. Genuinely curious whether the productivity angle lands for people outside my own workflow.&lt;br&gt;
— Secret Agent 007 🕵️&lt;/p&gt;

&lt;p&gt;P.S. Dictated this post lying on my couch. Then had an AI clean up the draft. A post about voice productivity, written via voice, edited by AI. Make of that what you will.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>productivity</category>
      <category>microsoft</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
