Hey everyone, I built NOVA-VAD, a lightweight, explainable Voice Activity Detector (VAD) that outperforms WebRTC VAD, Pyannote VAD, and Silero VAD on real-world noisy audio in my 100-file held-out benchmark. GitHub: https://github.com/monishmal3375/nova-vad

M M — Wed, 24 Jun 2026 03:08:00 +0000


I built a VAD that beats Silero, Pyannote, and WebRTC on noisy audio — here's how

M M — Sun, 21 Jun 2026 02:50:48 +0000

I built NOVA-VAD, a lightweight, explainable Voice Activity Detector (VAD) that beats the major open-source VADs I tested (WebRTC, Pyannote, and Silero) on real-world noisy audio.

GitHub: https://github.com/monishmal3375/nova-vad

Benchmark (100 held-out files, never seen during training)

Model	Accuracy	Lightweight	Explainable
WebRTC VAD	58.0%	✅	❌
Pyannote VAD	62.0%	❌	❌
Silero VAD	87.0%	❌	❌
NOVA-VAD	93.0%	✅	✅
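
For reference, the accuracy column is the fraction of held-out files classified correctly. A minimal sketch, assuming one speech/non-speech label per file; the labels below are made up for illustration and are not the benchmark data:

```python
def accuracy(y_true, y_pred):
    """Fraction of files whose predicted label matches the ground truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("need at least one file")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for 100 held-out files: 1 = speech, 0 = non-speech.
y_true = [1] * 50 + [0] * 50

# Pretend the model gets 7 of the 100 files wrong.
y_pred = y_true.copy()
for i in (3, 17, 42, 58, 71, 86, 99):
    y_pred[i] = 1 - y_pred[i]

print(f"Accuracy: {accuracy(y_true, y_pred):.1%}")
```

On a 100-file test set each file is worth one percentage point, so the gap between 93.0% and 87.0% amounts to six files.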

What makes it different

No PyTorch or GPU required — pure scikit-learn
Explains every decision with confidence scores and feature importance
Built-in denoiser pipeline
Retrainable on your own data

As far as I know, no other open-source VAD combines all of these.
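
The points above (pure scikit-learn, per-decision confidence, feature importance, retraining) can be sketched in a few lines. This is only an illustration under stated assumptions: the post does not say which scikit-learn estimator NOVA-VAD uses, so a random forest stands in for it, and the feature names and training data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Assumption: a random forest stands in for whatever scikit-learn model
# NOVA-VAD actually uses; feature names and data below are synthetic.
feature_names = ["mfcc_delta1_std", "mfcc_delta2_std", "silence_ratio"]
rng = np.random.default_rng(0)

# Synthetic training set: "speech" rows have larger delta stds.
n = 200
speech = np.column_stack([
    rng.normal(2.0, 0.3, n),
    rng.normal(1.5, 0.3, n),
    rng.uniform(0.2, 0.6, n),
])
noise = np.column_stack([
    rng.normal(0.5, 0.3, n),
    rng.normal(0.4, 0.3, n),
    rng.uniform(0.0, 1.0, n),
])
X = np.vstack([speech, noise])
y = np.array([1] * n + [0] * n)  # 1 = speech, 0 = non-speech

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)  # retraining on your own data is another fit() call

# Classify one new clip and report the class probability as confidence.
clip = np.array([[1.9, 1.4, 0.56]])
proba = clf.predict_proba(clip)[0]  # columns follow clf.classes_ == [0, 1]
label = "SPEECH" if proba[1] >= 0.5 else "NON-SPEECH"
print(f"Prediction: {label} ({proba.max():.2%} confidence)")

# Global feature importances (they sum to 1.0), largest first.
for idx in np.argsort(clf.feature_importances_)[::-1]:
    print(f"{feature_names[idx]:<16} {clf.feature_importances_[idx]:.2%}")
```

Everything here runs on CPU with NumPy and scikit-learn only, which is the property the list above is pointing at.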

Example output
File: speech_001.wav

Prediction: SPEECH (93.47% confidence)

MFCC Delta 1 std (10.63%) → HIGH spectral change rate — dynamic audio like speech
MFCC Delta 2 std ( 6.14%) → HIGH acceleration — rapidly changing audio, speech-like
Silence ratio ( 5.92%) → 56% silence — mix of speech and pauses
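
To make the feature names in this output concrete, here is a rough NumPy sketch of how a silence ratio and a delta standard deviation could be computed. The frame sizes, the RMS threshold, and the use of a plain first difference as the delta are all assumptions for illustration; the sketch does not compute MFCCs and is not taken from the NOVA-VAD code.

```python
import numpy as np


def frame_rms(signal, frame_len=400, hop=160):
    """RMS energy per frame (400/160 samples = 25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + max(0, len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return np.sqrt(np.mean(frames ** 2, axis=1))


def silence_ratio(signal, threshold=0.01):
    """Fraction of frames whose RMS energy falls below `threshold`."""
    return float(np.mean(frame_rms(signal) < threshold))


def delta_std(features):
    """Std of frame-to-frame differences, averaged over coefficients.

    `features` is a (frames x coefficients) matrix, e.g. MFCCs computed
    elsewhere; a first difference stands in for the delta here.
    """
    delta = np.diff(features, axis=0)
    return float(np.mean(np.std(delta, axis=0)))


# Synthetic 1-second clip at 16 kHz: half a 220 Hz tone, half silence.
sr = 16000
t = np.arange(sr // 2) / sr
signal = np.concatenate([0.5 * np.sin(2 * np.pi * 220 * t), np.zeros(sr // 2)])

print(f"Silence ratio: {silence_ratio(signal):.0%}")
```

Applying `delta_std` a second time to the differenced matrix would give the "Delta 2" (acceleration) counterpart under the same simplification.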

DEV Community: M M
