An open-source project for creating AI voice clones using Coqui TTS (XTTS v2). This project enables you to generate natural-sounding speech in any voice by providing a sample audio file and text input.
🎯 Features
- Voice Cloning: Clone any voice from a sample audio file
- Multilingual Support: Works with multiple languages (English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, and more)
- High-Quality Output: Powered by Coqui TTS XTTS v2 model for natural-sounding speech
- Easy to Use: Simple notebook-based interface for quick voice generation
- GPU Support: Automatically uses CUDA if available for faster processing
📋 Requirements
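- Python 3.10 or newer (current Coqui TTS releases no longer support Python 3.7; check the Coqui TTS README for the exact supported range)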
- PyTorch
- CUDA-capable GPU (optional, but recommended for faster processing)
- Google Colab or Jupyter Notebook environment
🚀 Installation
- Clone this repository:
git clone https://github.com/yourusername/ai-twin.git
cd ai-twin
- Install dependencies:
pip install -U scipy torch
- Install Coqui TTS:
git clone https://github.com/idiap/coqui-ai-TTS.git
cd coqui-ai-TTS
pip install -e .
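Once the steps above have finished, a quick check can confirm that the packages are visible to your Python environment. The snippet below is only a sketch and not part of the project: the helper name `check_install` is hypothetical, and it assumes the packages are importable as `TTS`, `torch`, and `scipy`. It uses the standard library's `importlib.util.find_spec`, so it looks the modules up without loading them.

```python
from importlib.util import find_spec

# Import names of the packages installed in the steps above (assumed).
REQUIRED_MODULES = ("TTS", "torch", "scipy")


def check_install(modules=REQUIRED_MODULES):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If any entry prints `MISSING`, rerun the corresponding install step in the same environment that runs the notebook.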
💻 Usage
- Open TorTTS_API.ipynb in Jupyter Notebook or Google Colab
- Run the first cell to install dependencies and clone Coqui TTS
- Run the second cell to initialize the TTS model
- Upload a voice sample (MP3 or WAV format) - this will be used as the reference voice
- Upload a text file containing the text you want to convert to speech
- Run the final cell to generate the audio output
- Download the generated audio file
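If you run the workflow outside the notebook, the two upload steps amount to pointing at a voice sample and a text file on disk. The helper below is a minimal sketch of that input handling, not the notebook's actual code: `load_inputs` is a hypothetical name, and treating the text file as UTF-8 is an assumption.

```python
from pathlib import Path

# Formats the usage steps name for the reference voice sample.
VOICE_SUFFIXES = {".mp3", ".wav"}


def load_inputs(voice_path, text_path):
    """Validate the voice sample path and return (voice_path, text)."""
    voice = Path(voice_path)
    if voice.suffix.lower() not in VOICE_SUFFIXES:
        raise ValueError(f"Expected an MP3 or WAV voice sample, got: {voice.name}")
    if not voice.is_file():
        raise FileNotFoundError(f"Voice sample not found: {voice}")
    # Assumption: the text file is UTF-8 encoded.
    text = Path(text_path).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"Text file is empty: {text_path}")
    return str(voice), text
```

The returned pair can be passed as `speaker_wav` and `text` when generating speech.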
Example Workflow
# Initialize TTS
from TTS.api import TTS
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

# Generate speech
tts.tts_to_file(
    text="Your text here",
    speaker_wav="path/to/voice_sample.mp3",
    language="en",
    file_path="output.wav"
)
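For a longer text file, it may be convenient to write one audio file per paragraph rather than a single large file. The function below is a sketch of that idea, not something the project ships: `synthesize_paragraphs` and the `output_001.wav` naming scheme are hypothetical. It expects an already initialized `tts` object like the one above and calls `tts_to_file` with the same four keyword arguments.

```python
from pathlib import Path


def synthesize_paragraphs(tts, text, speaker_wav, language="en", out_dir="."):
    """Write one WAV per blank-line-separated paragraph and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Split on blank lines and drop empty chunks.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    paths = []
    for i, paragraph in enumerate(paragraphs, start=1):
        file_path = str(out / f"output_{i:03d}.wav")
        # Same call as in the example workflow, once per paragraph.
        tts.tts_to_file(
            text=paragraph,
            speaker_wav=speaker_wav,
            language=language,
            file_path=file_path,
        )
        paths.append(file_path)
    return paths
```

Because the model object is passed in, it is loaded only once and reused for every paragraph.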
📝 Supported Languages
English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese (Simplified), Japanese, Hungarian, Korean, and Hindi.
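The `language` argument in the example workflow takes a short code such as `"en"` rather than a language name. The lookup below is a sketch under the assumption that XTTS v2 accepts these codes; the table and the `language_code` helper are illustrative and not part of the project, so verify the codes against the model documentation before relying on them.

```python
# Language names mapped to the codes XTTS v2 is assumed to accept.
# Hungarian, Korean, and Hindi are included on the assumption that they
# are among the supported languages.
LANGUAGE_CODES = {
    "english": "en",
    "spanish": "es",
    "french": "fr",
    "german": "de",
    "italian": "it",
    "portuguese": "pt",
    "polish": "pl",
    "turkish": "tr",
    "russian": "ru",
    "dutch": "nl",
    "czech": "cs",
    "arabic": "ar",
    "chinese": "zh-cn",
    "chinese (simplified)": "zh-cn",
    "japanese": "ja",
    "hungarian": "hu",
    "korean": "ko",
    "hindi": "hi",
}


def language_code(name):
    """Return the language code for a language name, ignoring case and spacing."""
    key = name.strip().lower()
    try:
        return LANGUAGE_CODES[key]
    except KeyError:
        raise ValueError(f"Unsupported or unlisted language: {name!r}") from None
```

For example, `language_code("French")` gives the value to pass as `language` when generating French speech.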
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
📄 License
This project is open source and available under the MIT License.
📧 Contact
- Email: devpuffer0807@gmail.com
- Telegram: @devpuffer0807
🙏 Acknowledgments
- Coqui TTS - The amazing text-to-speech library that powers this project
- XTTS v2 - The voice cloning model used in this project
⚠️ Disclaimer
This tool is for educational and research purposes. Please ensure you have proper authorization before cloning voices, especially for commercial use. Always respect privacy and consent when working with voice data.
Made with ❤️ by the open source community