Puffer

Posted on Dec 16, 2025

AI Twin — Voice Cloning with Text-to-Speech

#ai #opensource #python #showdev

AI Twin is an open-source project for cloning voices with Coqui TTS (XTTS v2). Given a sample audio file and some text, it generates natural-sounding speech in the sampled voice.

🎯 Features

Voice Cloning: Clone any voice from a sample audio file
Multilingual Support: Covers the 17 languages XTTS v2 supports, including English, Spanish, French, German, Chinese, and Japanese (full list under Supported Languages)
High-Quality Output: Powered by Coqui TTS XTTS v2 model for natural-sounding speech
Easy to Use: Simple notebook-based interface for quick voice generation
GPU Support: Automatically uses CUDA if available for faster processing

📋 Requirements

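Python 3.9+ (the Coqui TTS releases that ship XTTS v2 do not support 3.7 or 3.8; see the coqui-ai-TTS README for the exact range)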
PyTorch
CUDA-capable GPU (optional, but recommended for faster processing)
Google Colab or Jupyter Notebook environment

🚀 Installation

Clone this repository:

git clone https://github.com/yourusername/ai-twin.git
cd ai-twin

Install dependencies:

pip install -U scipy torch

Install Coqui TTS:

git clone https://github.com/idiap/coqui-ai-TTS.git
cd coqui-ai-TTS
pip install -e .
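
To confirm the installation before opening the notebook, a quick import check can help. This is a minimal sketch: the helper name missing_packages is hypothetical and not part of the project, and it only checks that the modules can be found, not that they work correctly.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "TTS" is the import name used by Coqui TTS (as in `from TTS.api import TTS`).
    missing = missing_packages(["scipy", "torch", "TTS"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, rerun the matching install step above.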

💻 Usage

Open TorTTS_API.ipynb in Jupyter Notebook or Google Colab
Run the first cell to install dependencies and clone Coqui TTS
Run the second cell to initialize the TTS model
Upload a voice sample (MP3 or WAV) to use as the reference voice
Upload a text file containing the text you want to convert to speech
Run the final cell to generate the audio output
Download the generated audio file

Example Workflow

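# Initialize TTS
from TTS.api import TTS
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load XTTS v2 (the model weights are downloaded on first use)
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

# Generate speech in the voice of the reference sample
tts.tts_to_file(
    text="Your text here",
    speaker_wav="path/to/voice_sample.mp3",  # reference voice to clone
    language="en",                           # language code of the text
    file_path="output.wav",                  # where the audio is written
)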
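
The notebook workflow takes its text from an uploaded file rather than a string literal. The sketch below shows one way to do that, assuming a tts object created as in the example above. The function name synthesize_from_file and the file names are hypothetical and are not taken from the notebook.

```python
from pathlib import Path


def synthesize_from_file(tts, text_path, speaker_wav, out_path="output.wav", language="en"):
    """Read text from `text_path` and synthesize it in the reference voice.

    `tts` is expected to expose `tts_to_file(...)` like the TTS object above.
    """
    text = Path(text_path).read_text(encoding="utf-8").strip()
    if not text:
        # Fail early instead of asking the model to speak an empty string.
        raise ValueError(f"{text_path} contains no text to synthesize")
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

A call such as synthesize_from_file(tts, "script.txt", "voice_sample.mp3") would then write the result to output.wav. Passing tts in as an argument keeps the model loading separate, so the same loaded model can be reused for several text files.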

📝 Supported Languages

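English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese (Simplified), Japanese, Hungarian, Korean, and Hindi. Pass the matching language code (for example "en", "es", or "zh-cn") as the language argument.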
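
The language argument of tts_to_file takes a short code rather than a language name. The mapping below is a sketch based on the codes listed for XTTS v2. The helper resolve_language is hypothetical, so verify the codes against the Coqui TTS version you have installed.

```python
# Language codes as listed for XTTS v2; verify against your installed version.
XTTS_V2_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "pl": "Polish",
    "tr": "Turkish",
    "ru": "Russian",
    "nl": "Dutch",
    "cs": "Czech",
    "ar": "Arabic",
    "zh-cn": "Chinese (Simplified)",
    "ja": "Japanese",
    "hu": "Hungarian",
    "ko": "Korean",
    "hi": "Hindi",
}


def resolve_language(name_or_code):
    """Return the language code for a language name or code (case-insensitive)."""
    key = name_or_code.strip().lower()
    if key in XTTS_V2_LANGUAGES:
        return key
    for code, name in XTTS_V2_LANGUAGES.items():
        if name.lower() == key:
            return code
    raise ValueError(f"Unsupported language: {name_or_code!r}")
```

Resolving the code up front, for instance with resolve_language("Spanish"), surfaces a typo before the model is asked to synthesize anything.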

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

📄 License

This project is open source and available under the MIT License.

📧 Contact

Email: devpuffer0807@gmail.com
Telegram: @devpuffer0807

🙏 Acknowledgments

Coqui TTS - The amazing text-to-speech library that powers this project
XTTS v2 - The voice cloning model used in this project

⚠️ Disclaimer

This tool is for educational and research purposes. Please ensure you have proper authorization before cloning voices, especially for commercial use. Always respect privacy and consent when working with voice data.

Made with ❤️ by the open source community
