This is a submission for the AssemblyAI Challenge: Really Rad Real-Time.
What I Built
SpeechCraft - Real-time Speech Analytics Platform
Overview
SpeechCraft is a real-time speech analytics platform that turns spoken words into feedback a speaker can act on. It uses AssemblyAI's real-time transcription to convert speech to text as it is spoken, and it scores the transcript on several dimensions: pace, clarity, fluency, rhythm, and vocabulary.
Key Features
1. Real-Time Transcription
- Instant speech-to-text conversion
 - High-accuracy transcription
 - Support for natural conversation flow
 
2. Advanced Speech Metrics
Speaking Pace Analysis
- Real-time words-per-minute tracking
 - Optimal pace guidance
 - Speed variation detection
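
As an illustration of how pace tracking could work, here is a minimal sketch that derives words per minute from the transcript text and the elapsed speaking time. The helper names (`countWords`, `wordsPerMinute`, `paceGuidance`) and the 120 to 160 WPM guidance band are assumptions for this example, not SpeechCraft's actual values.

```javascript
// Hypothetical helper: count whitespace-separated words in a transcript.
function countWords(text) {
    const trimmed = text.trim();
    return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Words per minute over the elapsed speaking time (milliseconds).
function wordsPerMinute(text, elapsedMs) {
    if (elapsedMs <= 0) return 0;
    return Math.round(countWords(text) / (elapsedMs / 60000));
}

// Assumed guidance band; adjust min and max to suit the audience.
function paceGuidance(wpm, min = 120, max = 160) {
    if (wpm < min) return 'slow';
    if (wpm > max) return 'fast';
    return 'optimal';
}
```

Calling `wordsPerMinute` on each new final transcript and comparing successive values would also give a simple signal for speed variation.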
 
Clarity Measurement
- Filler word detection
 - Sentence structure analysis
 - Pronunciation clarity scoring
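
Filler word detection can be sketched as a word-boundary search over the transcript. The filler list and the function names below are assumptions chosen for illustration; the sketch also assumes the fillers are plain words with no regex special characters.

```javascript
// Assumed filler list; extend or replace it as needed.
const FILLERS = ['um', 'uh', 'like', 'you know', 'basically', 'actually'];

// Count occurrences of each filler, matching whole words only.
function countFillerWords(text, fillers = FILLERS) {
    const counts = {};
    const lower = text.toLowerCase();
    for (const filler of fillers) {
        // Allow any whitespace run inside multi-word fillers such as "you know".
        const pattern = new RegExp(`\\b${filler.replace(/\s+/g, '\\s+')}\\b`, 'g');
        const matches = lower.match(pattern);
        if (matches) counts[filler] = matches.length;
    }
    return counts;
}

// Filler occurrences divided by the total word count (0 for empty text).
function fillerRatio(text) {
    const words = text.trim().split(/\s+/).filter(Boolean).length;
    if (words === 0) return 0;
    const total = Object.values(countFillerWords(text)).reduce((a, b) => a + b, 0);
    return total / words;
}
```

A plain word match like this cannot tell the filler "like" from the verb "like", so a score built on it is best treated as a rough indicator.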
 
Fluency Assessment
- Speech flow analysis
 - Transition word usage tracking
 - Pause pattern analysis
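
Pause analysis needs timing information. The sketch below assumes each transcribed word is available as an object with `text`, `start`, and `end` fields in milliseconds; that shape, the `findPauses` name, and the 700 ms threshold are all assumptions for illustration.

```javascript
// Return the gaps between consecutive words that last at least minPauseMs.
function findPauses(words, minPauseMs = 700) {
    const pauses = [];
    for (let i = 1; i < words.length; i++) {
        const gap = words[i].start - words[i - 1].end;
        if (gap >= minPauseMs) {
            pauses.push({
                after: words[i - 1].text,
                before: words[i].text,
                durationMs: gap
            });
        }
    }
    return pauses;
}
```

The number of pauses and their average duration could then feed a fluency score.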
 
Speech Rhythm
- Sentence length variation
 - Speaking pattern analysis
 - Rhythm consistency scoring
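
One possible way to quantify sentence length variation is the coefficient of variation (standard deviation divided by mean) of the words-per-sentence counts. This is a sketch of that idea, not necessarily the formula SpeechCraft uses; the function names are hypothetical.

```javascript
// Split on sentence-ending punctuation and count the words in each sentence.
function sentenceLengths(text) {
    return text
        .split(/[.!?]+/)
        .map((sentence) => sentence.trim())
        .filter(Boolean)
        .map((sentence) => sentence.split(/\s+/).length);
}

// Coefficient of variation of sentence lengths: 0 means all sentences
// have the same length, larger values mean more variation.
function lengthVariation(text) {
    const lengths = sentenceLengths(text);
    if (lengths.length === 0) return 0;
    const mean = lengths.reduce((a, b) => a + b, 0) / lengths.length;
    const variance =
        lengths.reduce((sum, n) => sum + (n - mean) ** 2, 0) / lengths.length;
    return Math.sqrt(variance) / mean;
}
```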
 
Vocabulary Analysis
- Word variety measurement
 - Complex word usage tracking
 - Vocabulary richness scoring
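
Word variety is often approximated with a type-token ratio (unique words divided by total words). The sketch below shows that measure plus a simple "complex word" share based on word length. The English-only tokenizer and the 8-letter cutoff are assumptions made for this example.

```javascript
// Lowercase the text and extract word tokens (ASCII letters and apostrophes only).
function tokenize(text) {
    return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Unique words divided by total words; 1 means no word is repeated.
function typeTokenRatio(text) {
    const tokens = tokenize(text);
    return tokens.length === 0 ? 0 : new Set(tokens).size / tokens.length;
}

// Share of words with at least minLength letters (assumed proxy for complexity).
function complexWordShare(text, minLength = 8) {
    const tokens = tokenize(text);
    if (tokens.length === 0) return 0;
    return tokens.filter((t) => t.length >= minLength).length / tokens.length;
}
```

The type-token ratio tends to fall as a transcript gets longer, so comparing scores is most meaningful between sessions of similar length.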
 
3. Visual Analytics
- Real-time metric visualization
 - Progress tracking
 - Performance trend analysis
 
Applications
Public Speaking
- Speech practice and improvement
 - Real-time feedback
 - Performance analytics
 
Education
- Language learning assistance
 - Speaking skill development
 - Pronunciation training
 
Professional Development
- Presentation skills enhancement
 - Communication training
 - Interview preparation
 
Content Creation
- Podcast transcription
 - Video content analysis
 - Speech quality improvement
 
Benefits
For Users
- Instant feedback on speaking performance
 - Comprehensive speech analytics
 - Objective performance metrics
 - Personal development tracking
 
For Organizations
- Communication skills training
 - Quality assurance for speakers
 - Standardized assessment tools
 - Data-driven improvement strategies
 
Future Enhancements
- Advanced sentiment analysis
 - Multi-language support
 - Custom metric configuration
 - Speech pattern recognition
 - Integration with learning management systems
 
Impact
SpeechCraft gives users real-time feedback and detailed analysis of their speech, so they can see where their communication skills need work and track how they improve.
Demo
https://speechcraft.onrender.com/
Journey
Core Implementation
- Server-Side Token Management
 
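// Secure proxy server for token generation: the API key never reaches the browser
app.get('/get-token', async (req, res) => {
    try {
        const response = await fetch('https://api.assemblyai.com/v2/realtime/token', {
            method: 'POST',
            headers: { 'Authorization': ASSEMBLY_AI_TOKEN, 'Content-Type': 'application/json' },
            // expires_in sets the temporary token's lifetime in seconds (value assumed here)
            body: JSON.stringify({ expires_in: 3600 })
        });
        // Forward the upstream status so the client can detect failed token requests
        res.status(response.status).json(await response.json());
    } catch (err) {
        res.status(502).json({ error: 'Token request failed' });
    }
});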
- Real-time Audio Processing Pipeline
 
// Audio capture with optimized settings
const stream = await navigator.mediaDevices.getUserMedia({ 
    audio: {
        channelCount: 1,
        sampleRate: 16000,
        echoCancellation: true
    }
});
// WebSocket connection for real-time streaming
wsRef.current = new WebSocket(`wss://api.assemblyai.com/v2/realtime/ws?sample_rate=16000&token=${token}`);
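// Recorder for the captured stream (assumed setup; not shown in the original snippet)
const mediaRecorder = new MediaRecorder(stream);
// Send audio chunks every 250ms
mediaRecorder.ondataavailable = async (event) => {
    // Skip empty chunks and do not send before the socket is open
    if (event.data.size > 0 && wsRef.current.readyState === WebSocket.OPEN) {
        const base64Audio = await convertToBase64(event.data);
        wsRef.current.send(JSON.stringify({ audio_data: base64Audio }));
    }
};
// A 250ms timeslice makes the recorder emit one chunk per interval
mediaRecorder.start(250);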
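
The snippet above calls `convertToBase64` without showing it. A possible implementation, assuming the chunk is a `Blob` and that `btoa` is available (browsers and recent Node.js versions), might look like the sketch below. The `bytesToBase64` helper is a name introduced for this example.

```javascript
// Encode raw bytes as Base64. The bytes are converted in slices so that
// String.fromCharCode is never called with too many arguments at once.
function bytesToBase64(bytes) {
    let binary = '';
    const chunkSize = 0x8000;
    for (let i = 0; i < bytes.length; i += chunkSize) {
        binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
    }
    return btoa(binary);
}

// Read a Blob (for example a MediaRecorder chunk) and return its Base64 text.
async function convertToBase64(blob) {
    return bytesToBase64(new Uint8Array(await blob.arrayBuffer()));
}
```

Base64 output is roughly a third larger than the binary input, which is the cost of carrying audio inside a JSON text message.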
- Real-time Transcript Processing
 
wsRef.current.onmessage = (message) => {
    const data = JSON.parse(message.data);
    // Only final transcripts feed the metrics; empty finals (silence) are skipped
    if (data.message_type === 'FinalTranscript' && data.text) {
        updateTranscription(data.text);
        updateMetrics(data.text);
    }
};
Key Features
- Real-time audio streaming with optimized chunk size (250ms)
 - Secure WebSocket connection with token authentication
 - Automatic audio format handling
 - Error recovery and reconnection logic
 - Resource cleanup and memory management
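
The reconnection logic is not shown in this post. One common approach is exponential backoff, sketched below under stated assumptions: `createSocket` is a hypothetical factory that returns a new WebSocket, and the delays and attempt limit are illustrative values.

```javascript
// Delay before reconnect attempt n (0-based): baseMs * 2^n, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 10000) {
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Open a socket and reopen it with growing delays after unclean closes.
// Note: the returned reference is the first socket only; real code would
// also store each replacement socket (for example in a React ref).
function connectWithRetry(createSocket, maxAttempts = 5, attempt = 0) {
    const socket = createSocket();
    socket.onopen = () => {
        attempt = 0; // a successful connection resets the backoff
    };
    socket.onclose = (event) => {
        if (event.wasClean || attempt >= maxAttempts) return;
        setTimeout(
            () => connectWithRetry(createSocket, maxAttempts, attempt + 1),
            backoffDelay(attempt)
        );
    };
    return socket;
}
```

With the default values, successive retries wait 500 ms, 1 s, 2 s, 4 s, and 8 s before the attempt limit is reached.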
 
Technical Highlights
- Sample rate: 16kHz mono audio
 - WebSocket protocol for low-latency communication
 - Base64 encoding so binary audio can be sent inside JSON WebSocket messages
 - Automatic handling of partial and final transcripts
 - Integration with React state management
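
To make the 16 kHz mono format concrete, here is a sketch of converting floating-point samples (as produced by the Web Audio API) into 16-bit PCM. It assumes the streaming endpoint expects 16-bit little-endian PCM and that the audio is available as a `Float32Array`. The post does not show how SpeechCraft handles this step, so treat it as one possible approach.

```javascript
// Convert samples in the range [-1, 1] to signed 16-bit integers.
// Out-of-range samples are clamped first to avoid wrap-around distortion.
function floatTo16BitPCM(samples) {
    const pcm = new Int16Array(samples.length);
    for (let i = 0; i < samples.length; i++) {
        const s = Math.max(-1, Math.min(1, samples[i]));
        pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    return pcm;
}
```

At 16 kHz mono, a 250 ms chunk holds 4,000 samples, which is 8,000 bytes of 16-bit PCM before Base64 encoding.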
 
Credits:
This solution was built by binarygarage.dev using assemblyai.com. For further information, please contact contact@binarygarage.dev
              
    