This is a submission for the AssemblyAI Voice Agents Challenge
What I Built
AudiaGenix is an intelligent real-time voice assistant built for next-generation customer support and business automation.
It goes beyond simple transcription by:
- Recognizing business terms, proper nouns & domain-specific vocabulary.
- Detecting sentiment and emotion to adapt its tone.
- Anticipating issues and proactively suggesting workflows (like appointment scheduling or troubleshooting).
- Seamlessly escalating to human agents with full context transfer.
- Learning dynamically from conversations to improve over time.
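To make the escalation item concrete, here is a minimal sketch of what a "full context" handoff could contain. Everything in it is an assumption for illustration: `ConversationTurn`, `HandoffContext`, and `buildHandoffContext` are hypothetical names that do not come from the AudiaGenix repository, and keeping the three most recent customer messages is an arbitrary cutoff.

```typescript
// Hypothetical types for illustration; not taken from the AudiaGenix codebase.
type TurnSentiment = 'positive' | 'neutral' | 'negative' | 'frustrated';

interface ConversationTurn {
  speaker: 'customer' | 'assistant';
  text: string;
  sentiment: TurnSentiment;
}

interface HandoffContext {
  reason: string;
  lastSentiment: TurnSentiment;
  recentCustomerMessages: string[];
  turnCount: number;
}

// Package what a human agent needs so the customer does not have to repeat themselves.
function buildHandoffContext(turns: ConversationTurn[], reason: string): HandoffContext {
  const customerTurns = turns.filter(turn => turn.speaker === 'customer');
  const lastCustomerTurn = customerTurns[customerTurns.length - 1];
  return {
    reason,
    // Fall back to 'neutral' when the customer has not spoken yet.
    lastSentiment: lastCustomerTurn ? lastCustomerTurn.sentiment : 'neutral',
    // Keep only the three most recent customer messages (an arbitrary cutoff).
    recentCustomerMessages: customerTurns.slice(-3).map(turn => turn.text),
    turnCount: turns.length,
  };
}
```

A human agent's dashboard could render this object directly, so the agent sees why the call was escalated and what the customer said last.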
This project addresses all three prompt categories:
- Business Automation — automating routine support & scheduling tasks.
- Real-Time Performance — leveraging ultra-fast transcription (≤300ms latency) and live sentiment detection.
- Domain Expert — dynamically recognizing and adapting to business-specific terms.
Demo Video
https://vimeo.com/1104968010?share=copy
GitHub Repository
https://github.com/AberTheCreator/AudiaGenix
Technical Implementation & AssemblyAI Integration
import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!
});

export interface TranscriptionResult {
  text: string;
  confidence: number;
  words?: Array<{
    text: string;
    start: number;
    end: number;
    confidence: number;
  }>;
}

export async function transcribeAudio(audioBuffer: Buffer): Promise<TranscriptionResult> {
  try {
    // Upload the audio to AssemblyAI and get back a URL for it
    const uploadUrl = await client.files.upload(audioBuffer);
    // Request a transcript of the uploaded audio with the audio intelligence features enabled
    const transcript = await client.transcripts.transcribe({
      audio: uploadUrl,
      speaker_labels: true,
      auto_highlights: true,
      sentiment_analysis: true,
      entity_detection: true,
      punctuate: true,
      format_text: true,
    });
    if (transcript.status === 'error') {
      throw new Error(transcript.error || 'Transcription failed');
    }
    return {
      text: transcript.text || '',
      confidence: transcript.confidence || 0,
      // Keep only the per-word fields declared in TranscriptionResult
      words: transcript.words?.map(word => ({
        text: word.text,
        start: word.start,
        end: word.end,
        confidence: word.confidence
      }))
    };
  } catch (error) {
    console.error('AssemblyAI transcription error:', error);
    throw new Error('Failed to transcribe audio');
  }
}
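
The request above enables `sentiment_analysis`, but `transcribeAudio` returns only the text, confidence, and words. AssemblyAI reports per-sentence sentiment in the transcript's `sentiment_analysis_results` array, where each entry has a `sentiment` of `POSITIVE`, `NEUTRAL`, or `NEGATIVE` and a `confidence`. The sketch below shows one way such results could be reduced to a single label. `SentenceSentiment` is a local type that mirrors only the fields used here, and `summarizeSentiment` with its confidence-weighted vote is an assumption for illustration, not part of the SDK or of AudiaGenix.

```typescript
// Local mirror of the fields this sketch reads from each sentiment result.
interface SentenceSentiment {
  text: string;
  sentiment: 'POSITIVE' | 'NEUTRAL' | 'NEGATIVE';
  confidence: number;
}

type OverallSentiment = 'positive' | 'neutral' | 'negative';

// Sum the confidence per label and return the label with the strictly highest total.
// Ties and empty input fall back to 'neutral'.
function summarizeSentiment(results: SentenceSentiment[]): OverallSentiment {
  const totals = { POSITIVE: 0, NEUTRAL: 0, NEGATIVE: 0 };
  for (const result of results) {
    totals[result.sentiment] += result.confidence;
  }
  if (totals.NEGATIVE > totals.POSITIVE && totals.NEGATIVE > totals.NEUTRAL) return 'negative';
  if (totals.POSITIVE > totals.NEGATIVE && totals.POSITIVE > totals.NEUTRAL) return 'positive';
  return 'neutral';
}
```

Inside `transcribeAudio`, a call would look roughly like `summarizeSentiment(transcript.sentiment_analysis_results ?? [])`, with the result added as an extra field on `TranscriptionResult`.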

export async function createRealtimeTranscription() {
  try {
    // Create a real-time transcriber; the caller must call connect() on it to open the session
    const rt = client.realtime.transcriber({
      sampleRate: 16000,
      // Bias recognition toward common support vocabulary
      wordBoost: ['customer', 'support', 'billing', 'technical', 'issue']
    });
    return rt;
  } catch (error) {
    console.error('Real-time transcription setup error:', error);
    throw new Error('Failed to setup real-time transcription');
  }
}
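
The transcriber returned above does nothing until listeners are attached and the session is opened. In the AssemblyAI SDK, `transcript` events carry a `message_type` of `PartialTranscript` or `FinalTranscript`. The sketch below shows one way to route them: partial results go to a live-caption handler and final results go to the handler that drives the assistant's reply. `TranscriptMessage`, `TranscriptSource`, `TranscriptHandlers`, and `routeTranscripts` are hypothetical names defined here so the example runs without a network connection; they model only the small part of the transcriber this sketch assumes.

```typescript
// Minimal shape of a real-time transcript message, limited to the fields used here.
interface TranscriptMessage {
  message_type: 'PartialTranscript' | 'FinalTranscript';
  text: string;
}

// Hypothetical stand-in for the part of the transcriber this sketch relies on.
interface TranscriptSource {
  on(event: 'transcript', listener: (message: TranscriptMessage) => void): void;
}

interface TranscriptHandlers {
  onPartial: (text: string) => void; // e.g. update live captions
  onFinal: (text: string) => void;   // e.g. run sentiment detection and reply
}

// Forward non-empty transcripts to the matching handler.
function routeTranscripts(source: TranscriptSource, handlers: TranscriptHandlers): void {
  source.on('transcript', message => {
    if (!message.text) return; // skip messages with no recognized speech
    if (message.message_type === 'FinalTranscript') {
      handlers.onFinal(message.text);
    } else {
      handlers.onPartial(message.text);
    }
  });
}
```

With the real SDK, the remaining steps would be `await rt.connect()`, `rt.sendAudio(chunk)` for each audio chunk, and `await rt.close()` when the call ends.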

// Keyword-based fallback classifier: any frustration keyword wins outright,
// otherwise the negative and positive keyword counts are compared.
export function detectSentiment(text: string): 'positive' | 'neutral' | 'negative' | 'frustrated' {
  const lowerText = text.toLowerCase();
  // Keyword lists, checked by substring match against the lowercased text
  const frustrationWords = ['frustrated', 'angry', 'terrible', 'awful', 'hate', 'broken', 'stupid', 'worst', 'horrible', 'mad', 'annoyed'];
  const negativeWords = ['bad', 'poor', 'disappointing', 'wrong', 'problem', 'issue', 'trouble', 'difficult', 'slow'];
  const positiveWords = ['great', 'good', 'excellent', 'perfect', 'love', 'amazing', 'wonderful', 'fantastic', 'helpful', 'thank'];
  // Count how many distinct keywords from each list appear in the text
  const frustrationScore = frustrationWords.filter(word => lowerText.includes(word)).length;
  const negativeScore = negativeWords.filter(word => lowerText.includes(word)).length;
  const positiveScore = positiveWords.filter(word => lowerText.includes(word)).length;
  if (frustrationScore > 0) return 'frustrated';
  if (negativeScore > positiveScore) return 'negative';
  if (positiveScore > 0) return 'positive';
  return 'neutral';
}
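
Because `detectSentiment` uses `includes`, it matches substrings as well as words: "made" contains "mad" and "badge" contains "bad". One possible refinement, sketched below, splits the text into tokens and counts whole words only. `countWholeWordMatches` is a hypothetical helper, and the tokenization rule (lowercase, then split on anything that is not a letter or an apostrophe) is an assumption that suits English input only.

```typescript
// Count how many keywords occur in the text as whole words rather than substrings.
function countWholeWordMatches(text: string, keywords: string[]): number {
  // Lowercase, split on non-letter characters, and drop empty tokens.
  const tokens = new Set(text.toLowerCase().split(/[^a-z']+/).filter(Boolean));
  return keywords.filter(keyword => tokens.has(keyword)).length;
}
```

The trade-off is that inflected forms such as "thanks" or "problems" no longer match "thank" or "problem", so the keyword lists would need those variants added, or the tokens would need stemming first.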
Future Work
- Live real-time transcription session support with AssemblyAI’s Universal-Streaming.
- Adaptive persona that changes tone and language based on user sentiment or urgency.
- Proactive issue anticipation by detecting user intent and emotional cues.
- Seamless escalation to a human agent with full context transfer.
- Supervisor “whisper” mode for human feedback during live calls.
- Real-time visual push of images, diagrams, or co-browsing links synced to the voice conversation.
These features would help AudiaGenix evolve from a helpful assistant into a fully adaptive, empathetic, and explainable customer support partner.
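As a rough picture of the adaptive persona idea, the sketch below maps a detected sentiment and an urgency flag to a response style. It is only an illustration under assumptions: `UserSentiment`, `ResponseStyle`, `STYLE_BY_SENTIMENT`, and `chooseResponseStyle` are hypothetical names, and the specific tones are placeholders rather than a design that exists in AudiaGenix today.

```typescript
type UserSentiment = 'positive' | 'neutral' | 'negative' | 'frustrated';

// Hypothetical response style; the fields and values are illustrative only.
interface ResponseStyle {
  tone: 'upbeat' | 'neutral' | 'apologetic' | 'calm';
  offerHumanAgent: boolean;
}

const STYLE_BY_SENTIMENT: Record<UserSentiment, ResponseStyle> = {
  positive: { tone: 'upbeat', offerHumanAgent: false },
  neutral: { tone: 'neutral', offerHumanAgent: false },
  negative: { tone: 'apologetic', offerHumanAgent: false },
  frustrated: { tone: 'calm', offerHumanAgent: true },
};

// Pick a style from the sentiment, then let urgency force the human-agent offer.
function chooseResponseStyle(sentiment: UserSentiment, urgent: boolean): ResponseStyle {
  const base = STYLE_BY_SENTIMENT[sentiment];
  return urgent ? { ...base, offerHumanAgent: true } : base;
}
```

Keeping the mapping in a plain lookup table means the persona can be tuned per business without touching the selection logic.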
Conclusion
Thanks to Dev.to and AssemblyAI for hosting this challenge. It pushed me to design and build AudiaGenix, an intelligent customer support voice assistant that goes beyond typical transcription.
I am excited to keep developing AudiaGenix into a fully empathetic, explainable, and truly helpful voice agent for businesses and users alike.
If you found this interesting, feel free to react, leave a comment with your thoughts, and follow me.