Benchmarking five live translation systems with an open-source eval harness (including OpenAI's GPT-Realtime-Translate)

#ai #openai #machinelearning #opensource

We built an open-source evaluation harness for live speech-to-speech translation and used it to benchmark five platforms head-to-head. This post walks through the methodology (GEMBA-MQM v2 for accuracy, Ear-Voice Span for latency) and the results.

Eval harness: https://github.com/VoiceFrom/live-s2st-eval
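
For readers new to the latency metric: Ear-Voice Span is the lag between the moment a piece of source speech is heard and the moment its translation is spoken. Below is a minimal sketch of that idea, not the harness's actual implementation. It assumes you already have aligned source and target onset times in seconds; the function name `ear_voice_span` and the sample timings are hypothetical and chosen only for illustration.

```python
from statistics import mean


def ear_voice_span(aligned_pairs):
    """Compute per-segment lags and their mean, in seconds.

    aligned_pairs: list of (source_onset_s, target_onset_s) tuples, where
    each pair links a source segment to its translated counterpart.
    How the alignment is produced is outside this sketch (assumption).
    """
    if not aligned_pairs:
        raise ValueError("need at least one aligned pair")
    # Lag for each segment: when the translation starts minus when the source started.
    lags = [target - source for source, target in aligned_pairs]
    return lags, mean(lags)


# Hypothetical timings for three aligned segments.
pairs = [(0.0, 1.5), (2.0, 4.0), (5.0, 6.5)]
lags, average_lag = ear_voice_span(pairs)
print(lags, round(average_lag, 3))
```

A lower mean lag means the translated audio trails the speaker less; reporting the per-segment lags as well helps show whether a system is steadily slow or only occasionally falls behind.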
