DEV Education Track Submission Draft
What I Built
I built VoxVerify AI, a web application that analyzes audio samples and detects whether a voice is AI-generated or human using Google Gemini and OpenAI audio models.
The app allows users to:
- Record audio in real time
- Upload audio files
- Analyze speech across multiple languages
- View forensic metrics and confidence scores
To generate parts of the application and analysis logic, I used prompt-based audio analysis workflows in Google AI Studio, focusing on:
- Speech naturalness detection
- Pitch and prosody evaluation
- Structured output for confidence scoring
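The workflow above can be sketched as a request-body builder for the Gemini `generateContent` REST endpoint: one audio part plus a text prompt that asks for JSON only. The prompt wording, field names, and model configuration here are illustrative assumptions, not VoxVerify's actual prompts.

```typescript
// Sketch: a generateContent request body that pairs an audio clip with a
// prompt asking Gemini for structured JSON output. Prompt text and result
// fields are illustrative assumptions, not the app's real prompts.

interface GeminiRequestBody {
  contents: Array<{
    parts: Array<
      | { text: string }
      | { inlineData: { mimeType: string; data: string } }
    >;
  }>;
  generationConfig: { responseMimeType: string };
}

const ANALYSIS_PROMPT = [
  "Analyze this audio sample and decide whether the voice is AI-generated or human.",
  "Respond with JSON only, using this shape:",
  '{ "verdict": "ai" | "human", "confidence": number between 0 and 1, "explanation": string }',
].join("\n");

function buildAnalysisRequest(base64Audio: string, mimeType: string): GeminiRequestBody {
  return {
    contents: [
      {
        parts: [
          { inlineData: { mimeType, data: base64Audio } },
          { text: ANALYSIS_PROMPT },
        ],
      },
    ],
    // Ask the API for JSON output rather than free-form prose.
    generationConfig: { responseMimeType: "application/json" },
  };
}
```

Keeping the prompt and the expected JSON shape in one place makes it easier to evolve the scoring fields without breaking the parser downstream.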
I also implemented a visual dashboard to display forensic audio metrics such as spectral purity, prosodic fluidity, and pitch variance.
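The metric names above come from the dashboard; how they are scaled for display is not shown in the post, so the 0–100 clamping below is an assumed presentation step, sketched for illustration.

```typescript
// Sketch: clamping raw forensic metrics into a 0-100 display range for the
// dashboard. Metric names are from the app; the scaling is an assumption.

interface ForensicMetrics {
  spectralPurity: number;   // assumed 0-1 from the model
  prosodicFluidity: number; // assumed 0-1
  pitchVariance: number;    // assumed 0-1
}

function toDisplayScore(value: number): number {
  // Clamp to [0, 1] first so out-of-range model output cannot break the chart.
  const clamped = Math.min(1, Math.max(0, value));
  return Math.round(clamped * 100);
}

function toDashboardRows(m: ForensicMetrics): Array<{ label: string; score: number }> {
  return [
    { label: "Spectral purity", score: toDisplayScore(m.spectralPurity) },
    { label: "Prosodic fluidity", score: toDisplayScore(m.prosodicFluidity) },
    { label: "Pitch variance", score: toDisplayScore(m.pitchVariance) },
  ];
}
```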
Demo
Features demonstrated:
- Real-time audio recording
- Upload and analysis workflow
- AI-generated vs Human classification
- Confidence scoring and explanations
- Metrics visualization dashboard
Screenshots:
- Recording interface
- Results dashboard
- Confidence and explanation output
Live Demo: https://voxverify-ai-voice-detection.vercel.app/
GitHub Repo: https://github.com/YOGARATHNAM-S/voxverify-ai-voice-detection.git
My Experience
Working through the Build Apps with Google AI Studio track was a valuable learning experience.
Key Takeaways
1. Prompt Engineering for Structured Outputs
I learned how to design prompts that return structured and interpretable results instead of plain text responses. This was important for generating metrics and confidence scores.
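A minimal sketch of what "structured and interpretable" means in practice: validating the model's JSON reply into a typed result before it reaches the UI. The field names are assumptions for illustration.

```typescript
// Sketch: validating the model's JSON reply into a typed result instead of
// trusting free-form text. Field names are illustrative assumptions.

interface AnalysisResult {
  verdict: "ai" | "human";
  confidence: number; // expected in [0, 1]
  explanation: string;
}

function parseAnalysisResult(raw: string): AnalysisResult {
  const data = JSON.parse(raw) as Partial<AnalysisResult>;
  if (data.verdict !== "ai" && data.verdict !== "human") {
    throw new Error(`Unexpected verdict: ${String(data.verdict)}`);
  }
  if (typeof data.confidence !== "number" || data.confidence < 0 || data.confidence > 1) {
    throw new Error("Confidence must be a number in [0, 1]");
  }
  return {
    verdict: data.verdict,
    confidence: data.confidence,
    explanation: typeof data.explanation === "string" ? data.explanation : "",
  };
}
```

Rejecting malformed replies at this boundary keeps bad model output from silently producing wrong metrics or confidence scores.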
2. Multimodal AI Capabilities
It was interesting to see how Gemini can process not only text but also audio inputs, enabling real-world applications like speech analysis.
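Gemini accepts audio as base64-encoded inline data sent alongside text parts. In the browser the bytes would come from a recorded Blob via FileReader; the Node version below uses Buffer purely to illustrate the same part shape.

```typescript
// Sketch: packaging raw audio bytes as a base64 inlineData part, the shape
// Gemini expects for audio input. A browser app would obtain the bytes from
// a recorded Blob; Buffer is used here only for illustration.

function toInlineAudioPart(bytes: Uint8Array, mimeType: string) {
  return {
    inlineData: {
      mimeType,                                    // e.g. "audio/wav" or "audio/webm"
      data: Buffer.from(bytes).toString("base64"), // raw samples -> base64
    },
  };
}
```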
3. Integrating AI into Real Applications
One of the most useful lessons was understanding how to integrate AI APIs into a full-stack application using:
- React
- TypeScript
- REST API workflows
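To illustrate the REST workflow from that stack, here is a thin typed client for a hypothetical `/api/analyze` backend route; the route name and response shape are assumptions, and the fetch function is injected so the wrapper can be unit-tested with a stub.

```typescript
// Sketch: a thin typed client for a hypothetical /api/analyze backend route.
// The route and response shape are assumptions; fetch is injected so the
// wrapper can be exercised with a stub instead of a live server.

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function analyzeAudio(
  base64Audio: string,
  mimeType: string,
  fetchImpl: FetchLike,
): Promise<unknown> {
  const res = await fetchImpl("/api/analyze", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ audio: base64Audio, mimeType }),
  });
  if (!res.ok) {
    throw new Error(`Analysis request failed with status ${res.status}`);
  }
  return res.json();
}
```

Routing the Gemini call through a backend endpoint like this also keeps the API key out of the browser bundle.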
4. UX Matters in AI Apps
Presenting AI results clearly through charts and explanations significantly improves usability and trust.
What Surprised Me
What surprised me most was how effective AI models can be at identifying subtle speech characteristics like:
- Prosody patterns
- Pitch irregularities
- Acoustic consistency
This opened my perspective on how AI can assist in digital forensics and misinformation detection.
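The model performs the actual analysis, but the kind of statistic behind a "pitch irregularity" signal can be as simple as the variance of frame-level pitch estimates; this sketch is illustrative, not VoxVerify's implementation.

```typescript
// Sketch: population variance of frame-level pitch estimates (in Hz).
// Synthetic voices often show unusually low pitch variance; this is an
// illustrative statistic, not the app's actual detection logic.

function pitchVariance(pitchesHz: number[]): number {
  if (pitchesHz.length === 0) return 0;
  const mean = pitchesHz.reduce((a, b) => a + b, 0) / pitchesHz.length;
  const squaredDeviations = pitchesHz.reduce((a, b) => a + (b - mean) ** 2, 0);
  return squaredDeviations / pitchesHz.length;
}
```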
What I'd Improve Next
If I continue developing this project, I plan to add:
- Real-time streaming detection
- Speaker fingerprinting
- Model-assisted waveform and MFCC visualization
- Batch audio analysis
Closing Thoughts
This track helped me understand how to move from:
Experimenting with AI → Building real applications with AI
That shift from experimentation to engineering was the most valuable part of the experience.