DEV Community

Darien Calcedo Aguirre

Real-Time Spanish Voice Agent with Python, AssemblyAI & <100ms Latency

AssemblyAI Voice Agents Challenge: Real-Time

🧠 Real-Time Voice Assistant with AssemblyAI

This is a submission for the AssemblyAI Voice Agents Challenge.

💡 What I Built

This project is a real-time voice assistant designed for the Real-Time Voice Performance category of the AssemblyAI Challenge.

It listens continuously and reacts instantly to spoken commands like:

  • 🕒 “Dime la hora” (“Tell me the time”)
  • 💡 “Enciende la luz” (“Turn on the light”)
  • 🔕 “Apaga la luz” (“Turn off the light”)
  • 🚨 “Activa la alarma” (“Activate the alarm”)

With latency under 100 ms, it demonstrates fast, natural voice interaction, which makes it a good fit for use cases such as smart homes and accessibility tools.

🎥 Demo Video

Here's a short demo showing how the voice assistant works in real time:

📎 Watch the video on Google Drive

🔗 GitHub Repository

🗂️ GitHub – Calcedo87/AI-Voice-Agent

⚙️ Technical Overview

The assistant uses a modular architecture based on the following components:

  • 🎙️ Audio Input: MicrophoneStream captures real-time audio.
  • 🧠 Command Matching: handle_command() detects commands using fuzzy matching.
  • 🗣️ Text-to-Speech: Uses pyttsx3 for voice responses.
  • 🔌 AssemblyAI Integration: Real-time transcription via WebSocket streaming API.
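
As a rough illustration of the command-matching step, here is a minimal sketch built on `difflib` from the Python standard library. It is not the code from the repository: `match_command`, the `COMMANDS` table, its action names, and the 0.75 cutoff are all assumptions made for this example. Only the Spanish phrases come from the list above.

```python
from difflib import get_close_matches

# Hypothetical command table: the phrases are from the post,
# the action names on the right are invented for this sketch.
COMMANDS = {
    "dime la hora": "tell_time",
    "enciende la luz": "light_on",
    "apaga la luz": "light_off",
    "activa la alarma": "alarm_on",
}


def match_command(transcript, cutoff=0.75):
    """Return the action for the closest known phrase, or None if nothing is close."""
    # Normalize case and trim surrounding spaces and punctuation.
    text = transcript.lower().strip(" .,!?¿¡")
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff.
    hits = get_close_matches(text, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[hits[0]] if hits else None
```

With a table like this, a slightly misheard transcript such as "Dime la ora" would still resolve to the time command, while unrelated speech returns `None`. The cutoff is a tuning knob: a lower value tolerates more transcription noise but raises the risk of false triggers.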

🔧 AssemblyAI Integration Snippet


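```python
import websocket  # provided by the websocket-client package
from urllib.parse import urlencode

# MY_API_KEY, CONNECTION_PARAMS, and the on_* callbacks are assumed
# to be defined elsewhere in the project.
API_ENDPOINT = f"wss://streaming.assemblyai.com/v3/ws?{urlencode(CONNECTION_PARAMS)}"

# Create the streaming client; the API key travels in the Authorization header.
ws_app = websocket.WebSocketApp(
    API_ENDPOINT,
    header={"Authorization": MY_API_KEY},
    on_open=on_open,
    on_message=on_message,
    on_error=on_error,
    on_close=on_close,
)
```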
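
The snippet above passes an `on_message` callback without showing it. Here is a minimal sketch of what such a handler could look like, not the repository's actual implementation. The helper name `extract_final_transcript` is hypothetical, and the JSON field names (`type`, `Turn`, `end_of_turn`, `transcript`) reflect my understanding of the v3 streaming message format, so treat them as assumptions and check them against the AssemblyAI docs.

```python
import json


def extract_final_transcript(message):
    """Return the transcript of a completed turn, or None for any other message."""
    data = json.loads(message)
    # Assumed message shape: {"type": "Turn", "end_of_turn": true, "transcript": "..."}
    if data.get("type") == "Turn" and data.get("end_of_turn"):
        return data.get("transcript", "")
    return None


def on_message(ws, message):
    # websocket-client calls this with the app instance and the raw text frame.
    text = extract_final_transcript(message)
    if text:
        # In the project, this is where the text would go to handle_command().
        print(f"Heard: {text}")
```

Acting only on completed turns means partial transcripts never trigger a command twice. The trade-off is a short wait for end-of-turn detection before the assistant responds.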

---

👤 Built by [@Calcedo87](https://github.com/Calcedo87)  
Thanks for checking out my project!