DEV Community

Apoorv Darshan

Pick your ears: 8 speech-to-text backends, all BYOK

Voice input is only as good as your transcription, so Scowld lets you choose the speech-to-text engine instead of forcing one.

The STT options are bring-your-own-key (or built-in where it makes sense):

  • Native iOS speech recognition
  • OpenAI Whisper
  • Groq Whisper
  • Deepgram
  • AssemblyAI
  • Google Cloud STT
  • Browser Whisper
  • Text-only (skip STT entirely and just type)

Why give this many choices? Because the tradeoffs are real:

  • Native iOS needs no extra key.
  • Hosted Whisper variants (OpenAI, Groq) are handy when you care most about accuracy and language coverage.
  • Deepgram and AssemblyAI are strong if you already use them.
  • Text-only is there for when you just want to type to the companion from the iOS composer.

Your keys for these providers live in the iOS Keychain, and your selection is saved in local preferences. Pair your STT choice with a wake word, an LLM provider, and a TTS voice, and you've assembled a full voice loop out of parts you control.
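To make the "keys in one place, selection in another" split concrete, here is a minimal sketch of how a saved choice and a stored key could be combined into a usable STT config. It is not Scowld's actual code: the backend identifiers, the `sttBackend` preference name, the `stt.<backend>.apiKey` naming, the fallback to text-only, and the guess about which backends need a key are all assumptions made for illustration. The two small interfaces stand in for the iOS Keychain and local preferences.

```typescript
// Hypothetical identifiers for the eight options listed above.
type SttBackend =
  | "native-ios"
  | "openai-whisper"
  | "groq-whisper"
  | "deepgram"
  | "assemblyai"
  | "google-cloud-stt"
  | "browser-whisper"
  | "text-only";

// Assumption: only the hosted services need a user-supplied key.
// The post states this for native iOS; the other entries are guesses.
const NEEDS_KEY: Record<SttBackend, boolean> = {
  "native-ios": false,
  "openai-whisper": true,
  "groq-whisper": true,
  "deepgram": true,
  "assemblyai": true,
  "google-cloud-stt": true,
  "browser-whisper": false,
  "text-only": false,
};

// Stand-ins for the secure key store and the local preferences store.
interface SecretStore {
  get(name: string): string | undefined;
}
interface Preferences {
  get(name: string): string | undefined;
}

type SttConfig =
  | { backend: SttBackend; apiKey?: string; ready: true }
  | { backend: SttBackend; ready: false; reason: string };

// Accept only strings that are real keys of NEEDS_KEY.
function isSttBackend(value: string | undefined): value is SttBackend {
  return (
    value !== undefined &&
    Object.prototype.hasOwnProperty.call(NEEDS_KEY, value)
  );
}

function resolveStt(prefs: Preferences, secrets: SecretStore): SttConfig {
  const saved = prefs.get("sttBackend");
  // Assumed default: fall back to text-only when nothing valid is saved.
  const backend: SttBackend = isSttBackend(saved) ? saved : "text-only";

  if (!NEEDS_KEY[backend]) {
    return { backend, ready: true };
  }

  const apiKey = secrets.get(`stt.${backend}.apiKey`);
  return apiKey
    ? { backend, apiKey, ready: true }
    : { backend, ready: false, reason: `missing API key for ${backend}` };
}
```

A plain `Map<string, string>` satisfies both interfaces, so `resolveStt(new Map([["sttBackend", "deepgram"]]), new Map())` reports that Deepgram is selected but not ready until a key is added. Keeping the selection separate from the secret means switching backends never touches stored keys.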

The app is on the App Store. Set it up with whatever transcription stack you trust.

https://apps.apple.com/in/app/scowld-ai-voice-companion/id6760672848
