DEV Community

n3on


Build Your Own Offline AI Voiceover Tool in Python — No APIs, No Limits

Are you looking for a high-quality text-to-speech (TTS) solution that works entirely offline, supports multiple voices, and outputs 320 kbps MP3 files? Look no further!

I just released my open-source project: Offline Python TTS Voiceover Tool. You can check it out on GitHub here: voiceover_tool.

Features

  • Offline Operation: No API calls, no internet required.
  • Multi-Voice Support: Switch between multiple speaker embeddings easily.
  • High-Quality MP3 Output: 320 kbps CBR MP3 files with ID3 metadata.
  • Fast and Lightweight: Optimized ONNX Runtime inference with quantized VITS models.
  • Flexible Input: CLI accepts inline text, text files, and batch directories.
  • Voice Cloning: Generate new voices with speaker embeddings.
  • Optional GUI: Minimal Tkinter interface for preview, queue management, and rendering.
  • Programmatic Integration: Local REST API for integrating TTS into your workflow.
  • Prosody Controls: Adjust pitch, rate, volume, and emphasis per phrase.
  • Subtitle and Timing Support: Export word-level timestamps in JSON, SRT, or VTT formats.
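To make the subtitle export feature more concrete, here is a minimal sketch of how word-level timestamps could be turned into SRT cues. It is not the tool's actual code: the input schema (a list of dicts with `word`, `start`, and `end` keys, times in seconds) and the helper names `srt_timestamp` and `words_to_srt` are assumptions made for illustration.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Dict], max_words: int = 7) -> str:
    """Group word timings into numbered SRT cues of up to max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues in SRT.
    return "\n".join(cues)


# Hypothetical word-level timings (assumed schema, not the tool's real output).
words = [
    {"word": "Hello", "start": 0.0, "end": 0.42},
    {"word": "offline", "start": 0.42, "end": 0.95},
    {"word": "world", "start": 0.95, "end": 1.5},
]

print(words_to_srt(words, max_words=2))
```

Grouping by a fixed word count keeps the sketch short; a real exporter would more likely break cues on punctuation or a maximum duration.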

Why This Tool

Many TTS solutions require cloud APIs or come with licensing restrictions. This tool is:

  • Open source, with the app code released under the MIT license.
  • Cross-platform for Linux, macOS, and Windows.
  • Compact and efficient, designed for developers, content creators, and educators.
