Jordan Autrey

Posted on Mar 18

Why I Built a Local-First Voice-to-Code Tool (And What 18 Days of $0 Revenue Taught Me)

#buildinpublic #devtools #a11y #ai

I'm going to be honest with you: I'm writing this post from $0 MRR on day 18 of building Voco V2.

Not because I don't have a product. The product works. The architecture is solid. The latency is sub-300ms. Everything runs local.

I'm at $0 because I spent 18 days building instead of selling. This post is me fixing that.

The Problem Nobody Talks About

Most AI coding assistants (GitHub Copilot, Continue, Kiro, Cursor) expect you to type your prompts. You're using an AI to write code faster, but you're still bottlenecked by a keyboard.

For most developers, that's a minor friction. For developers with RSI, carpal tunnel, or limited mobility, it's a wall.

And even if your hands are fine today, the average software engineer types an estimated 5,000-10,000 keystrokes per hour. Over a career, that adds up, and repetitive strain injury (RSI) is a well-documented occupational risk for anyone who types all day.

What Exists Today

The voice coding space isn't empty. There are real tools doing real work:

Talon Voice — incredible for full hands-free computer control. Built by Ryan Hileman after he developed severe hand pain. But it has a steep learning curve and requires significant setup.
Wispr Flow — fast, polished dictation. But it's cloud-based (your audio leaves your machine) and built for prose, not code.
SuperWhisper — solid local transcription. But it's a general-purpose tool, not optimized for code syntax.
Claude Code Voice Mode — just launched, push-to-talk in terminal. Cool, but limited to Claude's ecosystem.

Each of these is good at what it does. None of them is built specifically for speaking code into existence with local-first privacy as a non-negotiable.

What We Built

Voco V2 is voice-to-code. Not voice-to-text-that-you-paste-into-code. Voice to working code.

The key specs:

Sub-300ms latency — speak and see code appear, not "speak, wait 2 seconds, see code appear"
Fully local — your audio never leaves your machine. No API keys, no cloud calls, no telemetry. The model runs on your hardware.
Zero-trust architecture — we literally can't see your data because it never reaches us
Built for code — understands syntax, function signatures, variable names. Not optimized for writing emails.
Zero config — no grammar files to write, no training period, no custom commands to memorize

The Architecture Decision That Changed Everything

Early on, we had to make a choice: cloud or local?

Cloud is easier. You get access to massive models, the transcription quality is higher out of the box, and you don't have to worry about hardware requirements.

We chose local anyway. Here's why:

Privacy isn't optional for developers. You're speaking your code out loud. That includes function names, business logic, API endpoints, proprietary algorithms. Sending that to a cloud service is a non-starter for any serious engineering team.
Latency compounds. Even 500ms of network round-trip feels wrong when you're in flow. Sub-300ms local processing means voice coding feels like typing, not like dictating.
No dependency = no risk. Cloud services go down. APIs get deprecated. Pricing changes. Your local tool works on an airplane.
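To make the latency argument concrete, here's a rough back-of-the-envelope sketch of a latency budget. The stage names and per-stage timings are hypothetical numbers picked for illustration, not measurements of Voco. Only the 300ms target and the 500ms network round-trip come from the points above.

```python
from dataclasses import dataclass

BUDGET_MS = 300  # the sub-300ms target mentioned in this post


@dataclass
class Pipeline:
    name: str
    stages_ms: dict[str, float]  # stage name -> time spent in that stage (ms)

    def total_ms(self) -> float:
        # End-to-end latency is the sum of every stage on the critical path.
        return sum(self.stages_ms.values())


def within_budget(pipeline: Pipeline, budget_ms: float = BUDGET_MS) -> bool:
    return pipeline.total_ms() < budget_ms


# Hypothetical stage timings, for illustration only.
local = Pipeline("local", {"capture": 30, "transcribe": 180, "format": 40})
cloud = Pipeline(
    "cloud",
    {"capture": 30, "network_round_trip": 500, "transcribe": 120, "format": 40},
)

for p in (local, cloud):
    status = "within" if within_budget(p) else "over"
    print(f"{p.name}: {p.total_ms():.0f} ms ({status} the {BUDGET_MS} ms budget)")
```

Under these assumed numbers, the cloud pipeline blows the budget even with faster transcription, because the round-trip alone is larger than the whole target. Swap in your own measurements to see where your setup lands.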

What 18 Days of $0 Taught Me

Here's the uncomfortable truth I'm sharing publicly because I think other builders need to hear it:

I built a working product, solid infrastructure, automated pipelines, monitoring dashboards, a content engine, and a campaign framework — and generated exactly zero revenue.

Why? Because I never talked to a single potential customer.

I told myself I was "building the foundation" and "getting ready to launch." In reality, I was hiding behind code because shipping to production is comfortable and selling is not.

The infrastructure isn't the product. The sale is the product.

So here I am. Day 18. Talking to people.

The Ask

I'm looking for 10 founding members for Voco V2.

$39/month — founding rate, locked forever. That means if we raise prices later (and we will), yours stays at $39.

What you get:

Full access to Voco V2
Direct access to me (the founder)
Priority feature requests — your voice literally shapes the roadmap
A seat at the table of something we're building from scratch

Who this is for:

Developers dealing with RSI, carpal tunnel, or any repetitive strain
Anyone who codes on mobile and wants a faster input method
Developers who want to try voice-first coding without a week of setup
Privacy-conscious builders who won't send their code to the cloud

Who this is NOT for:

People who want a general dictation tool (use Wispr Flow)
People who need full computer control by voice (use Talon)
People who are happy with their current coding speed and input method

Try It

Website: itsvoco.com

I'm not hiding behind a waitlist. The product exists. You can use it today.

If you have questions, drop them in the comments. I'll answer every single one.

Building in public means showing the messy parts too. Day 18, $0 MRR, and I'm just now doing what I should have done on day 1 — talking to people who might actually want this.

If you're an indie hacker reading this and you recognize the pattern... stop building. Start selling.
