This is a submission for the AssemblyAI Voice Agents Challenge
What I Built
Domain Expert Voice Agent
How’s My Day? is a one-shot voice check-in app that helps users feel heard — emotionally, not just functionally.
In just one tap:
- The user speaks how they’re feeling
- The app transcribes their voice in real-time using AssemblyAI Universal Streaming
- It detects the emotional tone of their voice using AssemblyAI’s Speech Understanding
- It matches that emotion to a hand-curated emotional tip using Algolia MCP Server
- Finally, it reads the tip aloud using AssemblyAI’s Text-to-Speech — or falls back to ElevenLabs if needed
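The emotion-to-tip step above can be pictured as a lookup with a safe default. This is only a minimal local sketch: the app itself does the matching through Algolia MCP Server, and the `TIPS` table, the emotion labels, the tip wording, and the `tipFor` helper are all hypothetical placeholders.

```javascript
// Hypothetical hand-curated tips keyed by emotion label.
// Labels and wording are placeholders, not the app's real content.
const TIPS = new Map([
  ["sad", ["It's okay to slow down today. One small kind thing for yourself is enough."]],
  ["anxious", ["Try breathing in for four counts and out for six, three times."]],
  ["happy", ["Take a moment to notice what went well so you can come back to it later."]],
]);

const DEFAULT_TIP = "Thank you for checking in. However today went, you showed up.";

// Return a tip for the detected emotion, or a neutral default when the
// label is missing or unknown. `pick` chooses an index and is injectable
// so the choice can be made deterministic.
function tipFor(emotion, pick = (n) => Math.floor(Math.random() * n)) {
  const key = String(emotion ?? "").trim().toLowerCase();
  const candidates = TIPS.get(key);
  if (!candidates || candidates.length === 0) return DEFAULT_TIP;
  return candidates[pick(candidates.length)];
}
```

Falling back to a neutral tip keeps the one-tap flow from ending in silence when the detected label is something the curated set does not cover.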
✨ The experience feels like being heard by someone who cares — not a chatbot.
🧠 How I Used AssemblyAI
I used AssemblyAI’s Universal-Streaming API to:
- Capture and transcribe voice input with <300ms latency
- Show live transcript to the user while they speak
- Handle punctuation and natural pauses beautifully
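Showing a live transcript mostly comes down to folding streaming messages into one display string. The sketch below assumes each message is a parsed JSON object with a `type`, a `turn_order`, and the `transcript` for that turn so far, which is my reading of the Universal-Streaming message format; treat the field names as assumptions and check them against the current API reference. `createLiveTranscript` is a hypothetical helper, not part of any SDK.

```javascript
// Fold streaming messages into the text shown on screen.
// Assumption: "Turn" messages carry `turn_order` and the latest
// `transcript` for that turn; other message types are ignored.
function createLiveTranscript() {
  const turns = new Map(); // turn_order -> latest transcript for that turn

  function text() {
    // Map keeps insertion order, so turns appear in the order they began.
    return [...turns.values()].filter(Boolean).join(" ");
  }

  function handle(message) {
    if (message && message.type === "Turn") {
      // Overwrite the turn in place: partial, final, and formatted
      // versions of the same turn all replace the previous text.
      turns.set(message.turn_order, message.transcript);
    }
    return text();
  }

  return { handle, text };
}
```

In the browser, `handle` would be called from the WebSocket `message` listener and its return value written into the transcript element. Keying by turn means a later, punctuated version of a turn replaces the raw one instead of being appended after it.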
What I Learned
- AssemblyAI’s emotion detection is shockingly accurate: tone alone can reveal so much more than words
- Transcription feels like magic when done right, and AssemblyAI nailed it
- Using voice as input and output feels more natural than a chatbot for mental wellness apps
- People want calm, 1-shot interactions, not 20-message bots
Challenges
- Browser-based mic streaming and latency management were tricky
- Emotion ↔ tip mapping needed thoughtful writing
- Not all users want to hear their feelings read back, so we added a toggle
- AssemblyAI TTS is clean, but a fallback was needed for broader support
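The text-to-speech fallback mentioned above can be sketched as an ordered list of providers tried one after another. Everything here is an assumption for illustration: `speakWithFallback` is a hypothetical helper, and each provider is a hypothetical `{ name, synthesize(text) }` object whose `synthesize` would wrap the real SDK or HTTP call. No vendor API is shown.

```javascript
// Try each speech provider in order and return the first audio result.
// A provider that throws, rejects, or exceeds `timeoutMs` is skipped.
async function speakWithFallback(text, providers, timeoutMs = 5000) {
  const errors = [];
  for (const provider of providers) {
    let timer;
    try {
      const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error("timed out")), timeoutMs);
      });
      // Whichever settles first wins: the synthesis or the timeout.
      const audio = await Promise.race([provider.synthesize(text), timeout]);
      return { provider: provider.name, audio };
    } catch (err) {
      errors.push(`${provider.name}: ${err.message}`);
    } finally {
      clearTimeout(timer); // never leave a stray timer running
    }
  }
  throw new Error(`All TTS providers failed (${errors.join("; ")})`);
}
```

A call might look like `await speakWithFallback(tip, [primaryTts, backupTts])`, where both objects are placeholders. Putting a timeout on each attempt matters for a voice app, since a slow primary provider feels just as broken as a failing one.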
Demo
GitHub Repository
How's My Day? - AI-Powered Voice Mood Tracker
A sophisticated, voice-powered mood tracking web application that listens to your feelings and provides supportive, human-like responses using cutting-edge AI technology.
✨ Features
- 🎤 Professional voice recording - File-based audio capture with high quality
- 🎯 AI-powered transcription - AssemblyAI integration for accurate speech-to-text
- 🧠 Enhanced mood detection - Local algorithm with scoring and emotion mapping
- 🤖 GPT-4o-mini responses - Human-like, empathetic AI-generated support messages
- 🔊 High-quality TTS - OpenAI text-to-speech with natural voice synthesis
- ⌨️ Real-time typing animation - Text appears character-by-character during speech
- 🎨 Modern UI - Clean, responsive design with smooth animations
- 🚀 Full-stack architecture - Node.js backend with Express server
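The README describes mood detection as a local algorithm with scoring and emotion mapping but does not show it. One plausible shape for such an algorithm is weighted keyword scoring, sketched below. The `MOOD_KEYWORDS` table, its weights, and the `detectMood` function are hypothetical and are not taken from the repository.

```javascript
// Hypothetical keyword weights per mood (placeholder values).
const MOOD_KEYWORDS = new Map([
  ["happy", new Map([["great", 2], ["good", 1], ["excited", 2], ["grateful", 2]])],
  ["sad", new Map([["sad", 2], ["lonely", 2], ["tired", 1], ["down", 1]])],
  ["anxious", new Map([["anxious", 2], ["worried", 2], ["stressed", 2], ["nervous", 1]])],
]);

// Score a transcript against each mood and return the best match.
// Returns "neutral" when no keyword matches; ties go to the mood
// listed first in MOOD_KEYWORDS.
function detectMood(transcript) {
  const words = String(transcript).toLowerCase().match(/[a-z']+/g) ?? [];
  const scores = {};
  let best = { mood: "neutral", score: 0 };
  for (const [mood, weights] of MOOD_KEYWORDS) {
    const score = words.reduce((sum, w) => sum + (weights.get(w) ?? 0), 0);
    scores[mood] = score;
    if (score > best.score) best = { mood, score };
  }
  return { mood: best.mood, scores };
}
```

Keyword scoring is cheap and runs without a network call, but it only sees words, so it would miss tone of voice and can misread negation such as "not good".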
🚀 Quick Setup Guide
Prerequisites
- Node.js (v14 or higher) - Download here
- Git (optional) - For cloning the repository
- Modern web browser - Chrome, Firefox, Safari, or Edge
1. Installation
# Clone or download the project
git clone <
…