
Ondrej Machala


Your iPhone Keyboard Is Phoning Home. Stop It in 3 Commands.

If you're running a homelab, you've probably already got speech-to-text somewhere in your stack.

Maybe you use it for Home Assistant voice commands. Or local LLM integrations. Or just transcribing meeting recordings.

Here's something you might not have considered: you can use that same transcription server as a keyboard on your iPhone.


The 3 Commands

git clone https://github.com/omachala/diction
cd diction
docker compose --profile small up -d

That's the server running. Now install Diction on your iPhone, point it at your server URL, and you have a voice keyboard backed by your own speech-to-text instance.

The --profile small flag picks which model to run. The repo also ships medium, large, and parakeet profiles — more on those below.


What's Actually Running

The compose file spins up two services:

services:
  gateway:
    image: ghcr.io/omachala/diction-gateway:latest
    ports:
      - "8080:8080"
    environment:
      DEFAULT_MODEL: small

  whisper-small:
    image: fedirz/faster-whisper-server:latest-cpu
    profiles: ["small"]
    environment:
      WHISPER__MODEL: Systran/faster-whisper-small
      WHISPER__INFERENCE_DEVICE: cpu

whisper-small: the transcription engine. It runs open-source Whisper behind a REST API. CPU works fine for real-time dictation.

gateway: a small open-source Go service that handles communication between the iOS app and the transcription backend. It accepts WebSocket connections from the phone, buffers audio frames, and forwards them to Whisper. This is what makes dictation feel instant instead of "record, upload, wait."

The gateway exposes port 8080. That's the URL you put into the Diction app.
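
Before touching the phone, it can help to confirm that something is answering on that port. The sketch below is not part of the repo: gateway_reachable is a hypothetical helper, and it makes no assumption about which paths the gateway serves. It requests the base URL and looks only at the HTTP status, so any status code (even 404) counts as reachable, while curl's 000 means no connection was made.

```shell
# Hypothetical helper: succeeds if anything answers HTTP at the given URL.
gateway_reachable() {
  url=$1
  # -w '%{http_code}' prints only the status code; curl reports 000
  # when it could not connect at all.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  [ -n "$code" ] && [ "$code" != "000" ]
}
```

On the server itself, gateway_reachable http://localhost:8080 && echo reachable is a reasonable first check; run it again from another machine with the address you plan to give the app.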


Making It Accessible From Your Phone

Your phone needs to reach your server. A few options depending on your setup:

Tailscale (easiest): Install Tailscale on both your server and iPhone. You get a private IP accessible from anywhere. No port forwarding, no firewall rules.

http://100.x.x.x:8080

Reverse proxy (for existing homelabbers): If you're already running Caddy, nginx, or Traefik, add a route to port 8080.

https://diction.yourdomain.com

Direct LAN (simplest for home-only use): Just use your server's local IP. Works on home WiFi, not outside.

http://192.168.1.100:8080
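
For the reverse-proxy option, here is a minimal sketch assuming Caddy. The helper prints a Caddyfile site block that forwards a hostname to the gateway. The function name caddy_site is made up for illustration, and the hostname and localhost:8080 are the placeholder values used above. Because the app talks to the gateway over WebSocket, the proxy has to pass upgrade requests through: Caddy's reverse_proxy does that by default, while nginx needs the Upgrade and Connection headers set explicitly.

```shell
# Hypothetical helper: print a Caddyfile site block that proxies a
# hostname to the gateway. Not part of the Diction repo.
caddy_site() {
  host=$1                        # public hostname, e.g. diction.yourdomain.com
  upstream=${2:-localhost:8080}  # where the gateway listens
  printf '%s {\n\treverse_proxy %s\n}\n' "$host" "$upstream"
}
```

Running caddy_site diction.yourdomain.com >> Caddyfile and then reloading Caddy would add the route; adjust the upstream if Caddy runs on a different host than the gateway.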

Choosing a Model

The repo ships four profiles. Pick one when you start the stack with docker compose --profile <name> up -d:

| Profile | Model | RAM | Speed (CPU) | Notes |
| --- | --- | --- | --- | --- |
| small | Whisper small | ~850 MB | ~3-4s | Good default for everyday dictation |
| medium | Whisper medium | ~2.1 GB | ~8-12s | Better with accents and background noise |
| large | Whisper large-v3-turbo | ~2.3 GB | <2s on GPU | Highest accuracy, benefits from GPU |
| parakeet | NVIDIA Parakeet TDT v3 | ~2 GB | ~10x faster than Whisper | 25 European languages, more accurate than Whisper for English |

For most home servers handling English or mixed-language dictation, small hits the sweet spot. If you dictate mostly in a European language (German, French, Spanish, Italian, Polish, Czech, …), Parakeet is the better engine.
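
If memory is the deciding factor, the RAM column can be turned into a quick check. This is only a sketch: profile_ram_mb and profile_fits are hypothetical helpers, the numbers are the approximate figures from the table above (GB converted at 1 GB = 1000 MB), and real usage will vary with your hardware and audio.

```shell
# Approximate RAM per profile in MB, copied from the table above.
profile_ram_mb() {
  case $1 in
    small)    echo 850 ;;
    medium)   echo 2100 ;;
    large)    echo 2300 ;;
    parakeet) echo 2000 ;;
    *)        echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Succeeds if the profile's estimate fits in the given free memory (MB).
profile_fits() {
  need=$(profile_ram_mb "$1") || return 1
  [ "$2" -ge "$need" ]
}
```

For example, profile_fits medium 4096 && docker compose --profile medium up -d starts the medium profile only if the estimate fits in 4 GB. Leave some headroom for the rest of the system.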

Switching profiles: tear the stack down and bring it back up with the new profile. The DEFAULT_MODEL on the gateway is already wired in the compose file, so no extra config is needed.

docker compose down
docker compose --profile parakeet up -d

Models download automatically on first start and cache in a shared Docker volume.
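
If you switch often, the two commands can be wrapped in a small function that rejects typos before tearing anything down. A sketch, assuming you run it from the cloned diction directory: switch_profile is a hypothetical name, and the profile list is the four names mentioned in this post.

```shell
# Hypothetical helper: validate the profile name, then run the same
# two commands shown above.
switch_profile() {
  case $1 in
    small|medium|large|parakeet) ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
  docker compose down && docker compose --profile "$1" up -d
}
```

Calling switch_profile parakeet is then equivalent to the down/up pair above, and switch_profile tiny fails without stopping the running stack.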


Already Running a Whisper Server?

If you already have a speech-to-text container running, you don't need to spin up another one. Just run the gateway and point it at your existing server:

services:
  gateway:
    image: ghcr.io/omachala/diction-gateway:latest
    ports:
      - "8080:8080"
    environment:
      CUSTOM_BACKEND_URL: http://your-server:8000
      CUSTOM_BACKEND_MODEL: your-model-name
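
To avoid hand-editing the file for each backend, the same compose file can be generated from two arguments. This is a sketch only: gateway_compose is a hypothetical helper, and it uses nothing beyond the image and the two CUSTOM_BACKEND_* variables shown above. One thing to watch: the URL is resolved from inside the gateway container, so localhost there points at the gateway itself, not at your host.

```shell
# Hypothetical helper: print a gateway-only compose file for an
# existing backend. $1 is the backend URL, $2 the model name.
gateway_compose() {
  cat <<EOF
services:
  gateway:
    image: ghcr.io/omachala/diction-gateway:latest
    ports:
      - "8080:8080"
    environment:
      CUSTOM_BACKEND_URL: "$1"
      CUSTOM_BACKEND_MODEL: "$2"
EOF
}
```

For example, gateway_compose http://your-server:8000 your-model-name > docker-compose.yml followed by docker compose up -d brings up just the gateway.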

More details on connecting to existing servers in this post.


The server and gateway are fully open source: github.com/omachala/diction
