DEV Community

Hainan Zhao

Building Audio Alerts with TTS for AI Coding Agents

Ever wish your AI coding agent could actually speak to you? I've been working on adding audio alerts with text-to-speech to Gemini CLI, and here's how it works.

The Problem

When running AI coding agents in the background, you often miss important moments:

  • When the agent needs your permission to run a tool
  • When an error occurs
  • When a long task completes

You either have to constantly check the terminal or miss important updates entirely.

The Solution: Audio Alerts + TTS

I built a hook system that plays a themed sound followed by a spoken message.

Sounds

Each theme has its own sound:

  • retro - 8-bit game sounds
  • espionage - High-tech clicks
  • hero - Dramatic fanfare
  • portal - Magical transitions
  • premium - Elegant chimes

Spoken Messages

After each sound, a TTS message plays:

Event               Retro Theme            Espionage Theme
Permission needed   "Permission needed"    "Agent requesting permission"
Error               "Error detected"       "Critical failure detected"
Done                "Game over. You win"   "Mission accomplished"
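
The table above maps naturally onto a lookup function. Here is a minimal sketch of how that could look in Bash; the function name theme_message and the event keys (permission, error, done) are hypothetical names chosen for illustration, not necessarily what the extension uses:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a (theme, event) pair to its spoken message.
# The message strings come from the table; the key names are assumptions.
theme_message() {
  local theme="$1" event="$2"
  case "$theme:$event" in
    retro:permission)     echo "Permission needed" ;;
    retro:error)          echo "Error detected" ;;
    retro:done)           echo "Game over. You win" ;;
    espionage:permission) echo "Agent requesting permission" ;;
    espionage:error)      echo "Critical failure detected" ;;
    espionage:done)       echo "Mission accomplished" ;;
    *)                    return 1 ;;  # unknown theme or event: caller decides the fallback
  esac
}
```

A caller could then write something like speak "$(theme_message retro done)" and fall back to a generic message when the function returns non-zero.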

Cross-Platform Support

The implementation works on:

  • macOS: Uses the built-in say and afplay commands
  • Linux: Supports espeak and gtts-cli for speech, and play (from SoX) for sound files
  • Windows: Uses PowerShell with .NET's System.Speech and SoundPlayer
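
To make the platform split concrete, here is a rough sketch of how the play_audio and speak helpers could dispatch on macOS and Linux. This is an illustration under assumptions, not the extension's actual code: first_available is a hypothetical helper, the Windows branch is left out, and the gtts-cli path assumes SoX was built with MP3 support.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first command from the arguments found on PATH.
first_available() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      return 0
    fi
  done
  return 1
}

# Play a sound file with the platform's player.
play_audio() {
  local file="$1"
  case "$(uname -s)" in
    Darwin) afplay "$file" ;;
    Linux)  play -q "$file" ;;   # play ships with SoX; -q hides the progress display
  esac
}

# Speak a message with whichever TTS tool is installed.
speak() {
  local message="$1" tool
  case "$(uname -s)" in
    Darwin)
      say "$message" ;;
    Linux)
      # No TTS tool installed: stay silent rather than fail the hook.
      tool="$(first_available espeak gtts-cli)" || return 0
      case "$tool" in
        espeak)   espeak "$message" ;;
        # gtts-cli writes MP3 to stdout; assumes SoX can decode MP3.
        gtts-cli) gtts-cli "$message" | play -q -t mp3 - ;;
      esac ;;
  esac
}
```

Checking for tools at call time keeps the hook usable on machines where only some of the supported commands are installed.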

How It Works

The hook intercepts Gemini CLI events. When a task completes, for example, it plays the sound, pauses briefly, then speaks the message:

# Play sound
play_audio "$ASSETS_DIR/done.wav"

# Wait a bit
sleep 0.3

# Speak the message
speak "Task complete"

Everything runs in the background without blocking, so the agent keeps working while you get audio feedback.
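
One common way to get this non-blocking behavior in Bash is to run the whole sequence in a backgrounded subshell. The sketch below assumes that approach; the alert wrapper and the ALERT_LOG variable are hypothetical, and play_audio and speak are replaced with stand-ins that only log, so the example runs without audio hardware:

```shell
#!/usr/bin/env bash
# Stand-ins for the real helpers: they only record what would be played.
# ALERT_LOG is a hypothetical variable used for this demonstration.
play_audio() { echo "play: $1" >>"${ALERT_LOG:-/dev/null}"; }
speak()      { echo "speak: $1" >>"${ALERT_LOG:-/dev/null}"; }

# Hypothetical wrapper: start the sound-then-speech sequence and return at once.
alert() {
  local sound="$1" message="$2"
  (
    play_audio "$sound"
    sleep 0.3            # short gap between the sound and the voice
    speak "$message"
  ) >/dev/null 2>&1 &    # background the subshell and detach its output
}

alert "done.wav" "Task complete"
# Control returns here immediately; the audio sequence finishes on its own.
```

Redirecting the subshell's output is a deliberate choice in this sketch: a caller that waits for the hook's stdout to close would otherwise be held up until the audio finishes.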

Installation

Clone the repo:

git clone https://github.com/HainanZhao/gemini-extension-audio-alerts
cd gemini-extension-audio-alerts

Set your theme:

export AUDIO_ALERTS_THEME=retro
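
Inside a hook script, the variable could be read with a fallback so that an unset or misspelled theme still produces sound. This is only a sketch: resolve_theme is a hypothetical function, and treating retro as the default is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the active theme from AUDIO_ALERTS_THEME.
# Falling back to "retro" is an assumed default, not a documented one.
resolve_theme() {
  local theme="${AUDIO_ALERTS_THEME:-retro}"
  case "$theme" in
    retro|espionage|hero|portal|premium) echo "$theme" ;;
    *) echo "retro" ;;   # unknown theme name: use the assumed default
  esac
}
```

To keep the setting across terminal sessions, the export line can go in your shell profile, such as ~/.bashrc or ~/.zshrc.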

What's Next

Possible enhancements:

  • Custom voice selection
  • More themes
  • Sound mixing
  • Volume control

The code is open source and available at: https://github.com/HainanZhao/gemini-extension-audio-alerts


This post was written with assistance from AI coding agents.