Free Registration, Free Dataset, and $20K Prize Pool: Join the 2nd MLC-SLM Challenge 2026

Nexdata AI — Wed, 29 Apr 2026 07:15:49 +0000

The 2nd Multilingual Conversational Speech Language Models Challenge 2026 is now open for registration.

This year’s challenge focuses on advancing Speech Large Language Models for real-world multilingual conversational speech, with tasks covering speaker diarization, speech recognition, and conversational speech understanding.

Why join?

The 2nd MLC-SLM Challenge offers:

Free registration
Free access to a large-scale multilingual conversational speech dataset for registered participants, featuring around 2,100 hours of data across 14 languages
A total prize pool of **USD 20,000**
Support for both academic and industry teams, as well as individual researchers

The first MLC-SLM Challenge attracted 78 teams from 13 countries and regions, with 489 valid leaderboard submissions and 14 technical reports. Its summary paper has also been accepted by ICASSP 2026.

Challenge tasks

Participants can work on two tracks:

Task 1: Multilingual Conversational Speech Diarization and Recognition
Build systems that identify who is speaking when and transcribe multilingual conversational speech. No oracle segmentation or speaker labels will be provided during evaluation.
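To make the expected output of a Task 1 system more concrete, here is a minimal sketch of a "who spoke when, and what did they say" record. It is only an illustration: the `Segment` fields, the `spk0`/`spk1` labels, and the printed format are assumptions for this example, not the challenge's official submission format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One speaker turn: hypothetical fields, not the official format."""
    speaker: str  # system-assigned label such as "spk0" (assumed naming)
    start: float  # turn start time in seconds
    end: float    # turn end time in seconds
    text: str     # recognized transcript for this turn


def format_segments(segments):
    """Return one readable line per segment, ordered by start time."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start):
        lines.append(f"[{seg.start:.2f}-{seg.end:.2f}] {seg.speaker}: {seg.text}")
    return lines


# Toy two-speaker conversation, deliberately listed out of order.
example = [
    Segment("spk1", 2.10, 4.00, "fine, thanks"),
    Segment("spk0", 0.00, 1.95, "hello, how are you"),
]

for line in format_segments(example):
    print(line)
```

Because no oracle segmentation or speaker labels are given at evaluation time, a system has to produce both the time boundaries and the speaker labels itself before or while transcribing.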

Task 2: Multilingual Conversational Speech Understanding
Build systems that understand multilingual conversations through acoustic and semantic information. Evaluation will be based on multiple-choice questions about the full conversation.
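As a rough illustration of multiple-choice evaluation, the sketch below scores predictions with plain accuracy. This is an assumption: the announcement does not name the exact metric, and the question IDs, option letters, and the `multiple_choice_accuracy` helper are hypothetical.

```python
def multiple_choice_accuracy(predictions, answers):
    """Fraction of questions answered correctly.

    predictions and answers map a question ID to a chosen option.
    A question with no prediction counts as wrong (an assumed policy).
    """
    if not answers:
        raise ValueError("no reference answers")
    correct = sum(
        1 for qid, ans in answers.items() if predictions.get(qid) == ans
    )
    return correct / len(answers)


# Toy data: four questions, one unanswered, one answered incorrectly.
answers = {"q1": "A", "q2": "C", "q3": "B", "q4": "D"}
predictions = {"q1": "A", "q2": "B", "q3": "B"}

score = multiple_choice_accuracy(predictions, answers)
print(f"accuracy = {score:.2f}")
```

Participants should check the official challenge page for the actual scoring rules before relying on a local metric like this one.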

Both pipeline-based and end-to-end Speech LLM systems are welcome. External datasets and pretrained models are allowed, as long as they are freely accessible and clearly reported.

Dataset highlights

The challenge dataset contains around 2,100 hours of two-speaker conversational speech across 14 languages.

It also includes diverse regional accents, such as **Canadian French, Mexican Spanish, Brazilian Portuguese, British English, American English, Australian English, Indian English, and Philippine English**.

This makes the challenge a valuable testbed for researchers working on multilingual ASR, speaker diarization, Speech LLMs, and spoken language understanding.

Registration

Registration is now open.

Participation is free, and the dataset will be provided free of charge to registered participants.

Registration Link: https://forms.gle/jfAZ95abGy4ZiNHo7

More Details: https://www.nexdata.ai/competition/mlc-slm

Contact Email: mlc-slmw@nexdata.ai

Join the challenge and help advance the next generation of multilingual Speech LLMs.

Interspeech 2025 Multilingual Conversational Speech Language Model (MLC-SLM) Challenge

Nexdata AI — Thu, 20 Mar 2025 08:11:26 +0000

The Multilingual Conversational Speech LLM (MLC-SLM) Challenge is now open as a satellite event of Interspeech 2025!

Hosted by Meta, Google, Samsung Electronics, NAVER Corp, China Mobile, Northwestern Polytechnical University and Nexdata, this challenge aims to advance multilingual conversational AI by developing cutting-edge speech language models and providing access to a real-world multilingual conversational speech dataset.

The challenge consists of two tasks, both of which require participants to explore the development of speech language models (SLMs):

Task I: Multilingual Conversational Speech Recognition

Objective: Develop a multilingual LLM-based ASR model. Participants will be provided with oracle segmentation and speaker labels for each conversation.

Task II: Multilingual Conversational Speech Diarization and Recognition

Objective: Develop a system for both speaker diarization (identifying who is speaking when) and recognition (transcribing speech to text). No prior or oracle information, such as pre-segmented utterances or speaker labels, will be provided during evaluation. Both pipeline-based and end-to-end systems are encouraged, giving participants flexibility in system design and implementation.

The training set (Train) covers 11 languages: English (en), French (fr), German (de), Italian (it), Portuguese (pt), Spanish (es), Japanese (jp), Korean (ko), Russian (ru), Thai (th), and Vietnamese (vi).

Important Dates (AoE Time)

March 10, 2025: Registration opens

March 15, 2025: Training data release

April 1, 2025: Development set and baseline system release

May 15, 2025: Evaluation set release and Leaderboard open

May 30, 2025: Leaderboard freeze and paper submission portal opens (CMT system)

June 15, 2025: Paper submission deadline

July 1, 2025: Notification of acceptance

August 18, 2025: Workshop date

We have set a prize pool of $20,000 for the winners. Based on performance, the top three teams in each track will be awarded:

1st Prize: $5,000

2nd Prize: $3,000

3rd Prize: $2,000
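As a quick sanity check, the per-track prizes add up to the stated pool if each of the two tasks is treated as one track. That mapping of tasks to tracks is an assumption made for this sketch.

```python
# Prize amounts per track, in USD, as listed above.
PRIZES_PER_TRACK = {"1st": 5000, "2nd": 3000, "3rd": 2000}

# Assumption: the two tasks (Task I and Task II) correspond to two tracks.
NUM_TRACKS = 2

per_track_total = sum(PRIZES_PER_TRACK.values())
pool_total = per_track_total * NUM_TRACKS

print(f"per track: ${per_track_total:,}")
print(f"total pool: ${pool_total:,}")
```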

🔗 Join now: https://lnkd.in/gwR8dvVp

📩 Register here: https://lnkd.in/gUYs9M4Y