This is a simplified guide to an AI model called chat-tts, maintained by thlz998. If you like this kind of analysis, you should join AImodels.fyi or follow us on Twitter.
Model overview
chat-tts packages the ChatTTS text-to-speech model as a Cog model and is maintained by thlz998. It is similar to other text-to-speech models such as bel-tts, neon-tts, and xtts-v2, which also aim to convert text into human-like speech.
Model inputs and outputs
chat-tts takes text and synthesizes it into speech. Parameters such as voice, temperature, and top-k sampling control the generated audio.
Inputs
- text: The text to be synthesized into speech.
- voice: A number that determines the voice tone, with options like 2222, 7869, 6653, 4099, 5099.
- prompt: Sets laughter, pauses, and other audio cues.
- temperature: Adjusts the sampling temperature.
- top_p: Sets the nucleus sampling top-p value.
- top_k: Sets the top-k sampling value.
- skip_refine: Determines whether to skip the text refinement step.
- custom_voice: Allows specifying a seed value for custom voice tone generation.
Outputs
- The generated speech audio based on the provided text and parameters.
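To make the input list above concrete, here is a minimal Python sketch that assembles an input payload using the parameter names from this guide. The helper name `build_chat_tts_input`, the default values, the Python types (for example, treating skip_refine as a boolean), and the validation ranges are all assumptions made for illustration; they are not taken from the model's published schema, so check the model page before relying on them.

```python
from typing import Any, Dict, Optional


def build_chat_tts_input(
    text: str,
    voice: int = 2222,            # one of the voice numbers listed in this guide
    prompt: str = "",             # audio cues such as laughter and pauses
    temperature: float = 0.3,     # hypothetical default, not from the model schema
    top_p: float = 0.7,           # hypothetical default, not from the model schema
    top_k: int = 20,              # hypothetical default, not from the model schema
    skip_refine: bool = False,    # assumed to be a boolean flag
    custom_voice: Optional[int] = None,  # optional seed for a custom voice tone
) -> Dict[str, Any]:
    """Build a dict keyed by the input names described in this guide."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in the range (0, 1]")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    payload: Dict[str, Any] = {
        "text": text,
        "voice": voice,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "skip_refine": skip_refine,
    }
    # Only send custom_voice when the caller actually supplied a seed.
    if custom_voice is not None:
        payload["custom_voice"] = custom_voice
    return payload


if __name__ == "__main__":
    payload = build_chat_tts_input("Hello, this is a short test.", voice=7869)
    print(payload)
```

If the model is hosted on Replicate, a payload like this could plausibly be passed to the Replicate Python client, for example `replicate.run("thlz998/chat-tts:<version>", input=payload)`. The model reference and the `<version>` placeholder are assumptions here and need to be replaced with the values shown on the model page.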
Capabilities
chat-tts can generate human-like spe...