DEV Community

Ryan Winston


Building a Voice AI Platform with 28 Modules in Python

What I Built

Omni-VRAM is an open-source voice AI platform with 28 modules.

GitHub: https://github.com/Liangchenxu/Omni-VRAM

Features

  • Speech Recognition: Whisper with 5 backends (faster-whisper, whisper.cpp, ONNX, TensorRT, OpenAI API)
  • Real-time Streaming: <200ms latency
  • Speaker Diarization: Who spoke when
  • Emotion Recognition: 6 emotions
  • TTS Synthesis: Edge-TTS + pyttsx3
  • Chinese Processing: Punctuation, tokenization, dialects
  • Meeting Assistant: Auto summarization with LLM
  • APIs: REST, WebSocket, gRPC
  • Docker: GPU and CPU support
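
To make the "who spoke when" idea concrete, here is a minimal sketch of one common way to combine speech recognition output with diarization output: label each transcript segment with the speaker whose turn overlaps it the most. This is not taken from the Omni-VRAM codebase. The `Segment` and `Turn` classes, the `assign_speakers` function, and the `SPEAKER_00`-style labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed chunk of audio (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarization result: one speaker talking over a time range (hypothetical)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the length in seconds of the intersection of two time ranges."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="unknown"):
    """Label each segment with the speaker whose turn overlaps it the most.

    Segments that overlap no turn at all keep the `unknown` label.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


# Made-up example data, not real model output.
segments = [
    Segment(0.0, 2.5, "Hello everyone."),
    Segment(2.5, 5.0, "Hi, thanks for joining."),
    Segment(9.0, 10.0, "One more thing."),
]
turns = [
    Turn(0.0, 2.4, "SPEAKER_00"),
    Turn(2.4, 6.0, "SPEAKER_01"),
]

labeled = assign_speakers(segments, turns)
for speaker, text in labeled:
    print(f"{speaker}: {text}")
```

Maximum overlap is a simple heuristic. It works when segments are short, but a long segment that spans a speaker change gets a single label, so a real pipeline would probably split on word-level timestamps instead.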

Tech Stack

Python, PyTorch, CUDA, FastAPI, Whisper

Installation


pip install omni-vram
