Free Registration, Free Dataset, and $20K Prize Pool: Join the 2nd MLC-SLM Challenge 2026

#ai #machinelearning #llm #nlp

The 2nd Multilingual Conversational Speech Language Models Challenge 2026 is now open for registration.

This year’s challenge focuses on advancing Speech Large Language Models for real-world multilingual conversational speech, with tasks covering speaker diarization, speech recognition, and conversational speech understanding.

Why join?

The 2nd MLC-SLM Challenge offers:

Free registration
Free access to a large-scale multilingual conversational speech dataset for registered participants, featuring around 2,100 hours of data across 14 languages
A total prize pool of **USD 20,000**
Support for both academic and industry teams, as well as individual researchers

The first MLC-SLM Challenge attracted 78 teams from 13 countries and regions, with 489 valid leaderboard submissions and 14 technical reports. Its summary paper has also been accepted to ICASSP 2026.

Challenge tasks

Participants can work on two tasks:

Task 1: Multilingual Conversational Speech Diarization and Recognition
Build systems that identify who is speaking when and transcribe multilingual conversational speech. No oracle segmentation or speaker labels will be provided during evaluation.

Task 2: Multilingual Conversational Speech Understanding
Build systems that understand multilingual conversations through acoustic and semantic information. Evaluation will be based on multiple-choice questions about the full conversation.

Both pipeline-based and end-to-end Speech LLM systems are welcome. External datasets and pretrained models are allowed, as long as they are freely accessible and clearly reported.

Dataset highlights

The challenge dataset contains around 2,100 hours of two-speaker conversational speech across 14 languages.

It also includes diverse regional accents, such as **Canadian French, Mexican Spanish, Brazilian Portuguese, British English, American English, Australian English, Indian English, and Philippine English**.

This makes the challenge a valuable testbed for researchers working on multilingual ASR, speaker diarization, Speech LLMs, and spoken language understanding.