TL;DR: I built boloaurlikho.com — a free tool that transcribes calls and runs quality audits on them. Supports 99+ languages, no signup required. Here's the technical journey and what I learned about building AI-powered QA tools.
The Problem I Was Solving
I run a telecalling operation in India. Our QA process was painful — team leads manually listening to call recordings, filling Excel scorecards, spending 3x the call duration just to audit one conversation. At 500+ calls/day, we were only auditing about 5% of total volume.
I wanted something that could:
- Transcribe every call automatically (Hindi, English, Hinglish — we use all three)
- Flag compliance issues without human review
- Score calls on parameters like greeting, pitch delivery, objection handling, and closure
- Work without expensive enterprise contracts
The Tech Stack
The core is OpenAI's Whisper model for speech-to-text. Whisper's multilingual capability was the deciding factor: the other transcription APIs we tested struggled with code-switched Hindi-English speech, where a speaker changes language mid-sentence. Whisper handles a line like "aap ka account number bata dijiye, I'll check the status" without breaking.
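Here is a minimal sketch of the transcription step, assuming the open-source openai-whisper Python package running locally (a hosted API would look different, and this is not necessarily how the site itself is wired up). The function names, the "small" model size, and the `[mm:ss]` output format are illustrative choices of mine.

```python
from __future__ import annotations

from typing import Any


def transcribe_call(audio_path: str, model_size: str = "small") -> dict[str, Any]:
    """Transcribe one recording with the open-source openai-whisper package."""
    # Imported lazily so the formatting helper below works without the
    # package installed. Requires: pip install openai-whisper, plus ffmpeg.
    import whisper

    model = whisper.load_model(model_size)
    # No language argument is passed, so Whisper detects the language itself.
    # The result dict contains "text", "language", and a list of "segments".
    return model.transcribe(audio_path)


def format_segments(result: dict[str, Any]) -> list[str]:
    """Turn Whisper's segment list into '[mm:ss] text' lines for reviewers."""
    lines = []
    for seg in result.get("segments", []):
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return lines
```

Calling `format_segments(transcribe_call("call.mp3"))` (with a hypothetical file name) would give a list of timestamped lines. Larger model sizes are generally more accurate on mixed-language audio but slower, so the right size depends on your hardware and volume.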
On top of transcription, I built audit layers:
- Keyword compliance: Checks if mandatory disclosures, greetings, and CTAs were spoken
- Sentiment analysis: Tracks tone shifts through the conversation
- Dead air detection: Flags excessive silence (usually means the agent was scrambling)
- Timestamp markers: So reviewers can jump to problem spots instead of listening to full calls
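Two of these layers, keyword compliance and dead air detection, can be sketched directly on top of timestamped transcript segments. This is a rough illustration, not the tool's actual code: it assumes each segment is a dict with `start`, `end`, and `text` keys (the shape Whisper returns), and the phrase list and 5-second silence threshold are made-up placeholders.

```python
from __future__ import annotations

import string

# Hypothetical rules; a real list would come from the QA team's scorecard.
REQUIRED_PHRASES = {
    "greeting": ["namaste", "good morning", "thank you for calling"],
    "disclosure": ["this call is being recorded"],
}

# Replace ASCII punctuation with spaces. Non-ASCII text such as Devanagari
# is left untouched.
_PUNCT = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT).split())


def check_keywords(segments, required=REQUIRED_PHRASES):
    """Map each rule to the start time of its first match, or None if missing."""
    found = {rule: None for rule in required}
    for seg in segments:
        text = normalize(seg["text"])
        for rule, phrases in required.items():
            if found[rule] is None and any(normalize(p) in text for p in phrases):
                found[rule] = seg["start"]
    return found


def find_dead_air(segments, min_gap: float = 5.0):
    """Return (gap_start, gap_end) pairs with at least min_gap seconds of silence."""
    gaps = []
    for prev, nxt in zip(segments, segments[1:]):
        if nxt["start"] - prev["end"] >= min_gap:
            gaps.append((prev["end"], nxt["start"]))
    return gaps
```

The returned timestamps double as the jump-to markers for reviewers. Two limitations of this sketch: a phrase split across two segments is missed by the plain substring match, and segment timestamps are approximate, so the silence threshold needs tuning against real calls.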
What Surprised Me
1. Whisper's accuracy on Indian accents is genuinely impressive. We tested against Google Speech-to-Text and AWS Transcribe — Whisper won on Hindi and mixed-language content by a significant margin.
2. The audit layer is more valuable than the transcription. Everyone builds transcription tools. The real unlock is what you do with the text after. Automated scoring against custom QA parameters saves 80%+ of manual review time.
3. People use it for things I never expected. Sales teams auditing their own cold calls. Podcast creators checking interview quality. Students verifying lecture transcription accuracy. A lawyer transcribing witness depositions.
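To make the automated scoring from point 2 concrete, here is one way a weighted scorecard could work. The four parameter names come from the goals listed at the top of the post; the weights and the pass/fail representation are assumptions of mine for illustration, not the tool's real scoring model.

```python
from __future__ import annotations

# Hypothetical weights (in points) for each QA parameter.
WEIGHTS = {
    "greeting": 15,
    "pitch_delivery": 35,
    "objection_handling": 30,
    "closure": 20,
}


def score_call(checks: dict[str, bool], weights: dict[str, int] = WEIGHTS) -> float:
    """Return the weighted share of passed parameters as a 0-100 score."""
    total = sum(weights.values())
    # A parameter that was never evaluated counts as a fail.
    earned = sum(w for name, w in weights.items() if checks.get(name, False))
    return round(100 * earned / total, 1)
```

A call that passes everything except objection handling would score 70.0 under these weights. Keeping the weights in a plain dict makes it easy for each team to plug in its own scorecard.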
Try It
The tool is free at boloaurlikho.com. No signup, no paywall. Supports MP3, WAV, M4A, OGG, WEBM, FLAC. Currently handles files up to 20 minutes, with longer audio support coming soon.
If you're building something similar or have questions about the Whisper integration, happy to chat in the comments.