Ilbets


From Brain Dump to Markdown: Structure Ideas as You Speak

Written by Speech To Markdown

Voice input is faster than typing, but speed alone isn't enough. The real challenge is structure: it's surprisingly hard to organise your thoughts on the fly and say something coherent. AI assistants like Gemini have adopted voice input, yet much of what gets transcribed is unstructured noise that takes longer to clean up than the time it saved.

That's the problem I set out to solve for myself.

Why I Built This

I wanted a way to speak freely — braindump style — and have an AI turn it into clean, structured Markdown in real time. No editing, no typing, no context switching. Just think out loud and get a document back.

The result is stmd, a Speech-to-Markdown tool built into TaskSquad.

How It Works

When you start recording, stmd transcribes your speech locally using Whisper models (downloaded on the fly — I use the large variant). The transcript is buffered, aggregated, and sent chunk by chunk to a model of your choice. You can pause recording at any point to think before continuing.
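The buffering step can be pictured with a minimal sketch. This is not stmd's actual code: the `ChunkBuffer` class, the `send` callback, and the `min_words` threshold are all hypothetical names chosen for illustration, and the real tool may split chunks on pauses or time rather than word count.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChunkBuffer:
    """Collects transcript fragments and forwards them in larger chunks."""

    send: Callable[[str], None]  # hypothetical callback that hands a chunk to the model
    min_words: int = 40          # assumed threshold, not a real stmd setting
    _parts: List[str] = field(default_factory=list)
    _words: int = 0

    def add(self, fragment: str) -> None:
        """Buffer one transcribed fragment; flush once enough words have piled up."""
        fragment = fragment.strip()
        if not fragment:
            return
        self._parts.append(fragment)
        self._words += len(fragment.split())
        if self._words >= self.min_words:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered (for example when recording is paused)."""
        if not self._parts:
            return
        self.send(" ".join(self._parts))
        self._parts.clear()
        self._words = 0
```

Calling `flush()` when the user pauses would make sure a half-filled buffer is still sent before the next thought begins.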

You can connect it to:

  • A local harness available on TaskSquad (powered by a sub-agent)
  • A direct API or local model via a custom prompt (oMLX, Ollama, etc.)
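For the direct-model option, a request to a local Ollama server might look like the sketch below. It assumes Ollama is running on its default port and uses its `/api/generate` endpoint; the function names, the model tag, and the system prompt are placeholders, and how stmd itself talks to the model is not described here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, system_prompt: str, chunk: str) -> urllib.request.Request:
    """Build a non-streaming generate request for one transcript chunk."""
    payload = {
        "model": model,           # whatever model tag you have pulled locally
        "system": system_prompt,  # your custom prompt (placeholder text here)
        "prompt": chunk,
        "stream": False,          # ask for a single JSON reply instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def clean_chunk(model: str, system_prompt: str, chunk: str) -> str:
    """Send the chunk to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, system_prompt, chunk)) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the request builder separate from the network call makes the payload easy to inspect without a running server.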

I currently use Claude Code as my main harness and have also tested Gemma 4 with oMLX extensively — both work well at comparable speed.

Two Modes

stmd works in two modes: append and edit.

Append mode — each spoken chunk is cleaned up and appended to the existing Markdown document. Great for brain-dumping a first draft.

Edit mode — your spoken words become edit commands. Instead of adding content, the agent modifies what's already there. Say "make the intro shorter" or "replace the second bullet" — no keyboard required.
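The difference between the two modes can be sketched in a few lines. Everything here is an assumption made for illustration: the `Mode` enum, the prompt wording, and the `model` callable (any function that takes a prompt and returns text) are hypothetical, and stmd's real prompts are not shown in this post.

```python
from enum import Enum
from typing import Callable


class Mode(Enum):
    APPEND = "append"
    EDIT = "edit"


def build_prompt(mode: Mode, document: str, chunk: str) -> str:
    """Assemble an illustrative prompt for the chosen mode."""
    if mode is Mode.APPEND:
        return (
            "Clean up this spoken text and return it as Markdown to add "
            "to the end of the document.\n\n"
            f"Document so far:\n{document}\n\nSpoken text:\n{chunk}"
        )
    return (
        "Apply this spoken instruction to the document and return the "
        "full revised Markdown.\n\n"
        f"Document:\n{document}\n\nInstruction:\n{chunk}"
    )


def apply_chunk(mode: Mode, document: str, chunk: str,
                model: Callable[[str], str]) -> str:
    """Return the updated document after one spoken chunk."""
    reply = model(build_prompt(mode, document, chunk)).strip()
    if mode is Mode.APPEND:
        # Append mode: the reply is new content added after what exists.
        return f"{document}\n\n{reply}" if document else reply
    # Edit mode: the reply replaces the whole document.
    return reply
```

In this sketch, append mode only ever grows the document, while edit mode trusts the model to return the complete revised text.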

Output

All raw transcripts and generated Markdown files are saved to your .tsq folder. The model is completely your choice — there are no lock-ins.


If you're the kind of developer who thinks faster than you type, give it a try.

  1. Download the TaskSquad daemon from tasksquad.ai
  2. Run ./tsq
  3. Open "Control Plane"
  4. Switch to "Speech to MD" tab
  5. Select the Whisper model you want to download
  6. Select a harness or model to process input.
  7. Start talking!
