Akse1588

๐Ÿง‘โ€๐Ÿ’ป AI-Powered Multilingual Translator โ€” Kaggle Notebook & Telegram Bot Project

Welcome to my AI-powered multilingual translation project, which combines transcription, translation, and voice synthesis. It has two main components, a Kaggle notebook and a Telegram bot, both powered by Whisper, Transformers, and gTTS. Let's dive into each of them!

📊 Kaggle Project: AI Translator Notebook

The Kaggle Notebook serves as the backbone of this project, providing an end-to-end transcription and translation pipeline that can be easily replicated and adapted.

Purpose
The Kaggle notebook demonstrates how to:

  1. Transcribe audio files using Whisper (OpenAI's speech recognition model)
  2. Translate transcribed text into multiple languages using the M2M100 model (developed by Meta AI, loaded through Hugging Face Transformers)
  3. Generate speech in the translated language using gTTS (Google Text-to-Speech) or other TTS libraries
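
The three steps above could be sketched roughly as follows. This is a minimal sketch under assumptions, not the notebook's actual code: the function names, the Whisper model size (`base`), the M2M100 checkpoint (`facebook/m2m100_418M`), and the output path are all illustrative choices. The heavy libraries are imported inside each function so the pipeline wiring can be tried with stand-in functions first.

```python
def transcribe(audio_path, model_size="base"):
    """Transcribe audio with Whisper; returns (text, detected language code)."""
    import whisper  # assumed install: pip install openai-whisper (needs ffmpeg)

    model = whisper.load_model(model_size)
    result = model.transcribe(audio_path)
    return result["text"].strip(), result["language"]


def translate(text, src_lang, tgt_lang, model_name="facebook/m2m100_418M"):
    """Translate text with M2M100 through Hugging Face Transformers."""
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_name)
    model = M2M100ForConditionalGeneration.from_pretrained(model_name)
    tokenizer.src_lang = src_lang
    encoded = tokenizer(text, return_tensors="pt")
    # M2M100 selects the output language through the forced first token.
    generated = model.generate(
        **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


def speak(text, lang, out_path="translation.mp3"):
    """Write the text as spoken MP3 audio with gTTS; returns the file path."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(out_path)
    return out_path


def run_pipeline(audio_path, tgt_lang,
                 transcribe_fn=transcribe, translate_fn=translate, speak_fn=speak):
    """Chain the three stages; each stage can be swapped out for testing."""
    text, src_lang = transcribe_fn(audio_path)
    translated = translate_fn(text, src_lang, tgt_lang)
    audio_out = speak_fn(translated, tgt_lang)
    return {
        "source_lang": src_lang,
        "transcript": text,
        "translation": translated,
        "audio": audio_out,
    }
```

Whisper, M2M100, and gTTS mostly share two-letter language codes, but the sets are not identical, so a real pipeline would probably need a small mapping between them.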

This project aims to provide an easy-to-use solution for transcription and translation, making it useful for language learners, travelers, or anyone who needs quick language support.

Features

  1. Audio Transcription: Uses Whisper to convert audio files (MP3, WAV, OGG) into text.
  2. Multilingual Translation: Translates transcribed text from one language to another with high accuracy.
  3. Text-to-Speech: Converts translated text back into audio, making it accessible in the target language.
  4. PDF Text Translation: Extracts text from PDF documents and translates it.
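
For the PDF feature, one possible approach is to extract the text page by page and split it into smaller chunks before translating, since translation models only accept a limited input length. The sketch below assumes the `pypdf` library (the post does not say which PDF library is used), and the `max_chars` limit is an arbitrary illustrative value.

```python
import re


def extract_pdf_text(pdf_path):
    """Return the text of every page in the PDF, joined with newlines."""
    from pypdf import PdfReader  # assumed library: pip install pypdf

    reader = PdfReader(pdf_path)
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def chunk_text(text, max_chars=400):
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to the translation step separately and the results joined back together.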

How to Use It

  1. Load your dataset: If you want to test with your own audio, PDF, or text, upload them directly into the notebook.
  2. Run the transcription cell: Use Whisper to transcribe the audio to text.
  3. Run the translation cell: Translate the transcription into the target language using the M2M100 model.
  4. Generate audio: Use gTTS to convert the translated text into speech.
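
Before running the cells, it can help to sort uploaded files by type so each one goes to the right step. Here is a small sketch: files added to a Kaggle notebook appear under `/kaggle/input`, the audio formats are the ones listed above, and treating `.txt` as the plain-text format is an assumption. The function names are hypothetical.

```python
from pathlib import Path

# Audio formats named in the feature list; extend as needed.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg"}


def classify_input(path):
    """Map a file path to 'audio', 'pdf', 'text', or 'unsupported'."""
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".txt":
        return "text"
    return "unsupported"


def collect_inputs(root="/kaggle/input"):
    """Walk the input directory and group file paths by detected type."""
    grouped = {"audio": [], "pdf": [], "text": [], "unsupported": []}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            grouped[classify_input(path)].append(str(path))
    return grouped
```

Audio files would then go to the transcription cell, while PDF and text files skip straight to translation.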

Who Can Use It?

  1. Students: To transcribe and translate audio notes, lectures, and podcasts.
  2. Travelers: To easily convert voice messages into the local language.
  3. Language Enthusiasts: Learn new languages by transcribing, translating, and listening to sentences in the target language.
  4. Researchers: Process audio datasets or translate academic materials automatically.

🤖 Telegram Bot: AI-Powered Multilingual Translation

The Telegram bot offers the same powerful multilingual translation capabilities in a user-friendly format. The goal of the bot is to allow anyone to easily transcribe, translate, and listen to translated text/audio from within Telegram.

Purpose
The Telegram bot is designed to:

  1. Transcribe voice messages: Convert speech to text automatically.
  2. Translate text/audio: Convert transcribed or typed text into a different language.
  3. Generate speech: Translate text and speak it out loud in the target language.

Features

  1. Voice Message Transcription: The bot accepts voice messages, transcribes them using Whisper, and returns the transcription.
  2. Multilingual Translation: Translates the transcription to the target language (supports 10+ languages).
  3. Text-to-Speech: Generates speech in the translated language using gTTS or pyttsx3.
  4. PDF Parsing: Lets users upload PDF documents, extracts the text, and translates it.
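
Since the text-to-speech step can use either gTTS or pyttsx3, one way to combine them is to try gTTS first (it needs a network connection) and fall back to pyttsx3, which runs offline. The post does not say how the bot chooses between the two, so the fallback order and function names below are assumptions.

```python
def synthesize_with_gtts(text, lang, out_path):
    """Online synthesis through gTTS; writes an MP3 file."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(out_path)


def synthesize_with_pyttsx3(text, lang, out_path):
    """Offline synthesis through pyttsx3.

    This sketch ignores `lang` and uses the default system voice; the
    audio format written depends on the platform's speech driver.
    """
    import pyttsx3

    engine = pyttsx3.init()
    engine.save_to_file(text, out_path)
    engine.runAndWait()


def synthesize(text, lang, out_path, backends=None):
    """Try each backend in order; return the name of the one that worked."""
    backends = backends or [
        ("gtts", synthesize_with_gtts),
        ("pyttsx3", synthesize_with_pyttsx3),
    ]
    errors = []
    for name, backend in backends:
        try:
            backend(text, lang, out_path)
            return name
        except Exception as exc:  # keep going so the next backend gets a try
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS backends failed: " + "; ".join(errors))
```

Passing a custom `backends` list makes it easy to add another engine later without touching the calling code.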

How to Use It

  1. Start the Bot: Open the bot in Telegram and send /start.
  2. Send a Voice Message: Record a voice message and the bot will transcribe and translate it automatically.
  3. Send Text: You can type or paste any text into the bot, and it will translate it for you.
  4. Send a PDF: Upload a PDF document, and the bot will extract text and translate it into the chosen language.
  5. Get Speech Back: After translation, the bot will provide the translated text as a voice message, so you can hear the translation.
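
The flow above could be wired up roughly like this. The sketch assumes the `python-telegram-bot` library (version 20 or later); the post does not name the bot framework, so treat that as an assumption. The three pipeline callables (`transcribe_fn`, `translate_fn`, `speak_fn`) are hypothetical placeholders for the Whisper, M2M100, and gTTS steps, with the target language already fixed inside `translate_fn`.

```python
import os
import tempfile


def make_handlers(transcribe_fn, translate_fn, speak_fn):
    """Build async message handlers around the three pipeline callables."""

    async def start(update, context):
        await update.message.reply_text(
            "Send a voice message, some text, or a PDF and I will translate it."
        )

    async def on_text(update, context):
        # Typed or pasted text goes straight to translation.
        await update.message.reply_text(translate_fn(update.message.text))

    async def on_voice(update, context):
        with tempfile.TemporaryDirectory() as tmp:
            ogg_path = os.path.join(tmp, "voice.ogg")
            voice_file = await update.message.voice.get_file()
            await voice_file.download_to_drive(ogg_path)
            transcript = transcribe_fn(ogg_path)
            translation = translate_fn(transcript)
            await update.message.reply_text(f"{transcript}\n\n{translation}")
            # speak_fn is assumed to write an audio file and return its path.
            audio_path = speak_fn(translation, os.path.join(tmp, "reply.mp3"))
            with open(audio_path, "rb") as audio:
                await update.message.reply_audio(audio=audio)

    return start, on_text, on_voice


def run_bot(token, transcribe_fn, translate_fn, speak_fn):
    """Register the handlers and start polling for messages."""
    # Imported here so the handlers can be tested without the library.
    from telegram.ext import Application, CommandHandler, MessageHandler, filters

    start, on_text, on_voice = make_handlers(transcribe_fn, translate_fn, speak_fn)
    app = Application.builder().token(token).build()
    app.add_handler(CommandHandler("start", start))
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, on_text))
    app.add_handler(MessageHandler(filters.VOICE, on_voice))
    app.run_polling()
```

A PDF handler would follow the same pattern as `on_voice`: download the document, extract its text, and reply with the translation. This sketch sends the reply as an audio file; delivering it as a Telegram voice note may require converting the MP3 first.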

Who Can Use It?

  1. Travelers: If you're traveling to a country where you don't speak the language, this bot can help translate your speech and provide an audio translation instantly.
  2. Students and Teachers: Easily translate lecture notes or class discussions. Teachers can use it for multilingual classroom support.
  3. Language Learners: Great for practicing pronunciation in different languages by hearing the translated speech.
  4. Professionals: For those who need quick translations in the field, whether in meetings, calls, or interviews.

🔮 Future Plans

  1. Speech Improvement: Implement advanced speech synthesis models for more natural-sounding voice outputs (e.g., ElevenLabs).
  2. Mobile App: Create a mobile version of the bot to help users access translations on the go.
  3. Customizable Voice Profiles: Allow users to choose from multiple voices and accents for translations.

๐Ÿง‘โ€๐Ÿ’ป About Me

Hi! Iโ€™m Aksel, a 16-year-old self-taught developer from Armenia ๐Ÿ‡ฆ๐Ÿ‡ฒ
Iโ€™m passionate about building useful tools with AI, back-end tech, and modern software engineering.

🔧 Skills & Interests

  1. 👨‍💻 Backend Dev: Python, PHP, Laravel, C++, MySQL
  2. 🤖 AI & NLP: Whisper, Transformers, LLMs
  3. 📱 Telegram Bots, Automation, Web Development
  4. 🎮 Game Dev: Unreal Engine
  5. 📚 Lifelong Learner | Passionate about building with purpose

๐ŸŒ Connect with Me

This project combines the power of AI with real-world accessibility. By integrating tools like Whisper, Hugging Face Transformers, and gTTS, it enables seamless transcription, translation, and voice synthesis across multiple languages. Whether you're using the Telegram bot for quick translations or exploring the full pipeline on Kaggle, it's designed to help break language barriers. Built with care by a passionate young developer, it's a step toward smarter, more inclusive global communication. ✨
