A companion with a face needs a voice to match. Scowld handles text-to-speech with bring-your-own-key providers and gives you real control over how it sounds.
TTS options:
- ElevenLabs: voices like Celine and Claire, presets, custom voice IDs, and model selection
- OpenAI voices
The ElevenLabs path is the flexible one. You can use the built-in presets, drop in a custom voice ID, and pick which model to run, so the companion's voice can be tuned to taste rather than locked to a single default.
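To make the preset / custom ID / model choice concrete, here is a minimal sketch of how those three settings could resolve into one ElevenLabs request. It is not taken from the Scowld source: the `VoiceChoice` and `TtsSettings` types, the `PRESET_VOICE_IDS` table, and both function names are hypothetical, and the preset IDs are placeholders rather than real voice IDs. The endpoint path, the `xi-api-key` header, and the `model_id` body field follow ElevenLabs' public text-to-speech API.

```typescript
// Hypothetical settings shape; Scowld's real types may look different.
type VoiceChoice =
  | { kind: "preset"; name: string }
  | { kind: "custom"; voiceId: string };

interface TtsSettings {
  voice: VoiceChoice;
  modelId: string;
}

interface TtsRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Placeholder IDs for illustration only; real ElevenLabs voice IDs differ.
const PRESET_VOICE_IDS: Record<string, string> = {
  Celine: "PLACEHOLDER_CELINE_ID",
  Claire: "PLACEHOLDER_CLAIRE_ID",
};

// A custom voice ID is used as-is; a preset name is looked up in the table.
function resolveVoiceId(choice: VoiceChoice): string {
  if (choice.kind === "custom") return choice.voiceId;
  const id = PRESET_VOICE_IDS[choice.name];
  if (id === undefined) throw new Error(`Unknown preset: ${choice.name}`);
  return id;
}

// Build the request without sending it, so the key never leaves the caller.
function buildElevenLabsRequest(
  text: string,
  settings: TtsSettings,
  apiKey: string,
): TtsRequest {
  const voiceId = resolveVoiceId(settings.voice);
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    init: {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text, model_id: settings.modelId }),
    },
  };
}
```

With this shape, changing the voice or the model only changes the settings object passed in; the caller would hand `url` and `init` to `fetch` and play the returned audio.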
The voice output also drives the VRM character's lip sync. The three-vrm avatar moves its mouth to the audio, so the face and the voice stay together as it talks back.
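One common way to get a mouth to follow audio is to map the loudness of each audio frame to a mouth-open weight. The sketch below shows that idea; it assumes an amplitude-based approach, which may not be what Scowld actually does, and the function names plus the `gain`, `floor`, and `factor` defaults are made-up tuning values.

```typescript
// Root-mean-square loudness of one frame of PCM samples in [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}

// Map loudness to a mouth-open weight in [0, 1].
// Frames quieter than `floor` keep the mouth closed; `gain` scales the rest.
function mouthWeight(samples: Float32Array, gain = 4, floor = 0.02): number {
  const level = rms(samples);
  if (level < floor) return 0;
  return Math.min(1, level * gain);
}

// Move part of the way from the previous weight to the target each frame,
// so the mouth does not snap open and shut.
function smooth(previous: number, target: number, factor = 0.5): number {
  return previous + (target - previous) * factor;
}
```

In a render loop, the smoothed weight could then be applied to the avatar each frame, for example with three-vrm 1.x's `vrm.expressionManager?.setValue("aa", weight)`, where the samples would come from something like a Web Audio `AnalyserNode` attached to the TTS playback.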
As with everything else in Scowld, your provider keys are stored in the iOS Keychain and your voice choices live in local preferences. You're paying your own TTS provider directly, and swapping voices is a settings change.
Scowld is MIT licensed and open source, so all of the TTS wiring is readable.
Read the source here.