You ask OpenClaw something via voice message on Telegram.
It responds with something off. You read it again. Then you realize: it transcribed "Kubernetes" as "Cuban eties" and your entire prompt made no sense. You go back, re-record, send again.
That loop is the problem.
When you send a voice message on Telegram, OpenClaw receives an audio file and transcribes it somewhere in its pipeline before the AI sees your words. You never see that transcript. So when you get a bad response, you can't tell whether the AI misunderstood your intent or your words arrived garbled.
With text, that problem disappears. The AI gets exactly what you typed. You can re-read it before hitting send.
The missing piece was a fast way to get from your voice to reviewed text before it hits the chat.
What Diction Does
Diction is an iPhone keyboard that transcribes as you speak — inside the keyboard, before you send.
The workflow becomes:
- Open Telegram, find OpenClaw
- Switch to Diction keyboard (globe icon)
- Tap the mic, speak your prompt
- The keyboard shows you the transcription immediately
- Read it. Fix "Cuban eties" to "Kubernetes." Fix a name, a command, whatever
- Hit send
OpenClaw receives clean text. You know exactly what it's working with.
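The difference between the two paths can be sketched in a few lines of Python. This is only an illustration: `fake_transcribe` is a made-up stand-in for whatever speech-to-text step is involved, hard-coded to garble one word, and none of these function names come from OpenClaw or Diction.

```python
def fake_transcribe(spoken: str) -> str:
    """Hypothetical speech-to-text stand-in that mangles one technical term."""
    return spoken.replace("Kubernetes", "Cuban eties")


def voice_message_path(spoken: str) -> str:
    # Transcription happens on the bot's side; the sender never sees this string.
    return fake_transcribe(spoken)


def keyboard_path(spoken: str, corrections: dict[str, str]) -> str:
    # The transcript lands in the text field first, so the sender can edit it.
    draft = fake_transcribe(spoken)
    for wrong, right in corrections.items():
        draft = draft.replace(wrong, right)
    return draft


spoken = "Restart the Kubernetes deployment"
print(voice_message_path(spoken))
# Restart the Cuban eties deployment
print(keyboard_path(spoken, {"Cuban eties": "Kubernetes"}))
# Restart the Kubernetes deployment
```

Both paths run the same imperfect transcription. The only thing that changes is whether a human gets to look at the result before the model does.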
Why This Beats Voice Messages for AI Prompts
Voice messages work fine for casual conversation. For AI prompts, they introduce a failure mode you can't debug.
A few things that go wrong with voice-to-OpenClaw:
- Technical terms get mangled. "n-grams" becomes "engrams." "React hook" becomes "react who."
- Proper nouns the transcription layer hasn't seen before come out as phonetic guesses.
- Long prompts with multiple conditions can lose a clause in the middle, and you never find out.
When you see the transcription first, you catch these before they reach the model. The AI responds to what you actually meant.
Setup
Diction is a keyboard extension — install it once, switch to it whenever you want to dictate.
After install:
- Go to Settings → General → Keyboard → Keyboards → Add New Keyboard → Diction
- Enable Full Access (iOS requires it before a keyboard extension can use the network or exchange data with its containing app)
- Open any app, tap a text field, press the globe icon to switch to Diction
- Tap the mic
There are three transcription options: on-device (runs on your iPhone, no network), self-hosted (your own server, if you run one), and Diction One (Diction's cloud).
If you use OpenClaw because you prefer keeping data local, the on-device or self-hosted options mean your voice never leaves your phone or network. OpenClaw gets text either way.
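If it helps to see the trade-off laid out, here is a small Python sketch of where the audio goes in each mode. The names (`Mode`, `AUDIO_DESTINATION`, `audio_stays_under_your_control`) are invented for this example and are not Diction settings or API; the mapping simply restates the three options described above.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical labels for the three transcription options.
    ON_DEVICE = "on-device"
    SELF_HOSTED = "self-hosted"
    DICTION_ONE = "Diction One"


# Where the recorded audio is processed in each mode, per the descriptions above.
AUDIO_DESTINATION = {
    Mode.ON_DEVICE: "your iPhone",
    Mode.SELF_HOSTED: "a server you run",
    Mode.DICTION_ONE: "Diction's cloud",
}


def audio_stays_under_your_control(mode: Mode) -> bool:
    """True when the audio never leaves hardware you own or operate."""
    return mode in (Mode.ON_DEVICE, Mode.SELF_HOSTED)


for mode in Mode:
    print(f"{mode.value}: audio goes to {AUDIO_DESTINATION[mode]}, "
          f"local={audio_stays_under_your_control(mode)}")
```

Whichever mode you pick, the chat app on the other end only ever receives the text you chose to send.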
It Works Everywhere, Not Just Telegram
Since Diction is a keyboard, it works in any app with a text field. Same workflow for:
- WhatsApp bots
- NanoGPT
- The Claude app
- ChatGPT
- Any web app in Safari
You don't configure anything per-app. Switch keyboard, tap mic, speak, review, send.
The app is free to download and try. Self-hosted mode is free. The basic on-device model downloads automatically.
Download on the App Store · diction.one · github.com/omachala/diction