I dictate a lot. Messages to family, notes to myself, things I'd rather say than type. At some point I got curious about what was actually happening when I used a voice keyboard app.
Most of them bundle analytics SDKs, crash reporters, attribution libraries. Services that have nothing to do with transcription but run in the background every time you open the keyboard. Crash reports can include fragments of text from memory. Analytics events capture metadata about what you were doing. None of it is the transcription service you signed up for.
I didn't want that. I wanted a keyboard that did one thing: turn my voice into text. Nothing running in the background. Nothing phoning home except the transcription itself, and only when I ask it to.
So I built Diction.
Three modes, all of them clean
On-device: transcription runs locally on your iPhone. Nothing leaves the device. No network request at all.
Self-hosted: audio goes to your own server. You control what runs there. The server code is on GitHub if you want to read it.
Diction One: my cloud, same pipeline. Audio processed in real time, not stored.
Pick whichever fits. Switch any time. The keyboard is identical across all three.
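For readers curious what "nothing leaves the device" can look like in practice, here is a minimal sketch of the on-device pattern using Apple's Speech framework. This is an illustration under assumptions, not Diction's actual code (the post doesn't say which local engine it uses): `transcribeLocally` is a hypothetical helper, and the important line is `requiresOnDeviceRecognition`, which makes the request fail outright rather than silently fall back to Apple's servers.

```swift
import Speech

// Hypothetical sketch of an on-device mode using Apple's Speech framework.
// Not Diction's implementation; it only illustrates keeping audio local.
func transcribeLocally(audioFile: URL,
                       completion: @escaping (String?) -> Void) {
    guard let recognizer = SFSpeechRecognizer(),
          recognizer.supportsOnDeviceRecognition else {
        // This device or locale can't transcribe offline; don't fall
        // back to a network service behind the user's back.
        completion(nil)
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: audioFile)
    // The key switch: recognition stays on the device or fails.
    request.requiresOnDeviceRecognition = true

    recognizer.recognitionTask(with: request) { result, _ in
        guard let result = result, result.isFinal else { return }
        completion(result.bestTranscription.formattedString)
    }
}
```

The self-hosted and cloud modes would replace the body of that function with a single upload of the audio, which is why the keyboard itself can stay identical across all three.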
Why it matters
Voice input is personal in a way that typing isn't. You speak more naturally than you write. You say things you wouldn't bother typing. That's the whole point of dictation.
It also means a voice keyboard that runs background analytics has access to something more candid than most apps ever see.
Diction doesn't do that. Transcription, nothing else.