Dalmas Chituyi

Multimodal multilingual LLM

SeamlessM4T is a single multilingual, multimodal model that handles both translation and transcription across many input and output languages.

🗒 Some of the tasks the SeamlessM4T model can perform:

☑️ Speech-to-speech translation
☑️ Speech-to-text translation
☑️ Text-to-speech translation
☑️ Text-to-text translation
☑️ Automatic speech recognition
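
To make the list above concrete, here is a small Python sketch that lays out the five tasks by input and output modality. It is only an illustration and does not call the model. The `Task` class, the `TASKS` table, and the `tasks_for` helper are names made up for this example, and the short codes (S2ST, S2TT, T2ST, T2TT, ASR) are the usual abbreviations for these tasks rather than anything defined in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One task from the list above (hypothetical helper type, not a library API)."""
    code: str         # common abbreviation for the task
    name: str         # human-readable task name
    source: str       # input modality: "speech" or "text"
    target: str       # output modality: "speech" or "text"
    translates: bool  # False when the output stays in the source language


TASKS = [
    Task("S2ST", "Speech-to-speech translation", "speech", "speech", True),
    Task("S2TT", "Speech-to-text translation", "speech", "text", True),
    Task("T2ST", "Text-to-speech translation", "text", "speech", True),
    Task("T2TT", "Text-to-text translation", "text", "text", True),
    Task("ASR", "Automatic speech recognition", "speech", "text", False),
]


def tasks_for(source: str, target: str) -> list[str]:
    """Return the codes of every task with the given input and output modality."""
    return [t.code for t in TASKS if t.source == source and t.target == target]


if __name__ == "__main__":
    # Print one row per task: code, modalities, and full name.
    for t in TASKS:
        print(f"{t.code:5} {t.source:>6} -> {t.target:<6} {t.name}")
```

Calling `tasks_for("speech", "text")` returns both S2TT and ASR. The two share the same modalities and differ only in whether the output language changes, which is part of why a single model can cover both.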

↗ This is a significant improvement over earlier translation systems, which typically handled a single task such as speech-to-text translation and supported only a handful of input and output languages. 💡 SeamlessM4T can also recognize the source language implicitly, without a separate language identification model.

It builds on the work behind several earlier models and datasets:

🔎 No Language Left Behind (NLLB). A text-to-text machine translation model that supports 200 languages.

🔎 Massively Multilingual Speech. Provides automatic speech recognition, language identification, and speech synthesis technology across more than 1,100 languages.

🔎 Universal Speech Translator. A speech-to-speech translation system aimed at primarily spoken languages that lack a widely used writing system, such as Hokkien.

🔎 SpeechMatrix. A large-scale mined corpus of multilingual speech-to-speech translations.
