DEV Community

Kai Sereni


A Discord Bot that Teaches ASL

This is a submission for the Built with Google Gemini: Writing Challenge

What I Built with Google Gemini

I go to RIT, a school in New York State that shares a campus with the National Technical Institute for the Deaf (NTID). I really wanted to learn ASL, but the learning resources online were either unnecessarily time-intensive, behind a paywall, or both. I could have spent 30 hours learning ASL from YouTube, but I'm a programmer, so instead I spent 3 months developing a Discord bot. It uses basic language acquisition principles to get the user functionally conversational in ASL quickly, so they can communicate effectively.
The bot intentionally doesn't teach deaf culture or slang, because I believe that knowledge is best acquired by interacting with the real ASL community. I decided to prioritize basic vocabulary and syntax, and to rely on the user to take these basic tools and go meet and talk to people who are more embedded in the culture.
The hardest part of this bot was creating a system to translate a string of text into ASL. I found a database online that contained recordings of people demonstrating different signs, but I couldn't translate a string of text word for word: ASL syntax differs from English syntax, and there are sets of English words that are homonyms but have different signs. To generate all the necessary translations, I had to have the Google Gemini API convert English syntax to ASL syntax, complete a homonym clarification step, get each word in the resulting string from the database, and string them all together in a GIF for the Discord bot to finally send. This was a really cool project that combined both Python and JavaScript.
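As a rough illustration of that pipeline (not the code from the repos), here is a minimal Python sketch. Every name in it is hypothetical: `to_gloss` stands in for the Gemini call that reorders English into ASL syntax, `pick_sense` stands in for the homonym clarification step, and `SIGN_DB` stands in for the sign recording database. The canned word order is a placeholder, not a vetted ASL translation, and the GIF stitching step is left out.

```python
from typing import Callable, Dict, List, Tuple


def text_to_sign_clips(
    sentence: str,
    to_gloss: Callable[[str], List[str]],
    pick_sense: Callable[[str, str], str],
    sign_db: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Return (clip paths in signing order, keys with no recording)."""
    clips: List[str] = []
    missing: List[str] = []
    for gloss in to_gloss(sentence):       # step 1: English -> ASL-ordered words
        key = pick_sense(gloss, sentence)  # step 2: resolve homonyms
        if key in sign_db:                 # step 3: look up the recording
            clips.append(sign_db[key])
        else:
            missing.append(key)
    return clips, missing                  # step 4 (GIF stitching) is omitted


# Hypothetical stand-ins so the sketch runs without network access.
def canned_gloss(sentence: str) -> List[str]:
    # A real version would ask a language model; this order is a placeholder.
    return ["BAT", "CAVE", "FLY"]


def keyword_sense(gloss: str, sentence: str) -> str:
    # Toy disambiguation: choose the sign for "bat" from sentence context.
    if gloss == "BAT":
        return "BAT(animal)" if "cave" in sentence.lower() else "BAT(baseball)"
    return gloss


SIGN_DB = {
    "BAT(animal)": "clips/bat_animal.mp4",
    "BAT(baseball)": "clips/bat_baseball.mp4",
    "FLY": "clips/fly.mp4",
}

clips, missing = text_to_sign_clips(
    "The bat flew out of the cave", canned_gloss, keyword_sense, SIGN_DB
)
print(clips)    # ['clips/bat_animal.mp4', 'clips/fly.mp4']
print(missing)  # ['CAVE']
```

Passing the two model-dependent steps in as plain functions keeps the lookup logic testable offline; words with no recording are collected rather than silently dropped.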
Gemini helped by writing a lot of the more tedious portions of the codebase. I wrote the lesson plan as a JSON file containing tags, each of which gets replaced with a random word from a word bank that I also wrote. Words are introduced to the user gradually as they go through different modules, and each module begins with a set of flash cards introducing the new words. Gemini helped with parsing the JSON, as well as with the syntax for the library I used for the video processing (moviepy) and for the Gemini API. I was also very unfamiliar with the syntax for the newer versions of the Discord applications API (the API for Discord bots), so Gemini helped me with that.
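To make the tag idea concrete, here is a small Python sketch of one way such a lesson file could be filled in. The JSON layout, the `[noun]`-style tag syntax, and all function names are assumptions made for illustration; the actual schema in the repo may look quite different.

```python
import json
import random
import re

# Hypothetical lesson file: each module unlocks new words and has prompts
# whose [category] tags are filled from the words unlocked so far.
LESSON_JSON = """
{
  "modules": [
    {"new_words": {"noun": ["cat", "dog"], "verb": ["eat"]},
     "prompts": ["Sign: the [noun] will [verb]"]},
    {"new_words": {"noun": ["book"], "verb": ["read"]},
     "prompts": ["Sign: I [verb] the [noun]"]}
  ]
}
"""

TAG = re.compile(r"\[(\w+)\]")


def unlocked_words(modules, upto):
    """Merge the word banks of modules 0..upto (inclusive)."""
    bank = {}
    for module in modules[: upto + 1]:
        for category, words in module["new_words"].items():
            bank.setdefault(category, []).extend(words)
    return bank


def flashcards(module):
    """Words to show as flash cards at the start of a module."""
    return [w for words in module["new_words"].values() for w in words]


def fill_prompt(template, bank, rng):
    """Replace every [category] tag with a random unlocked word."""
    return TAG.sub(lambda m: rng.choice(bank[m.group(1)]), template)


lesson = json.loads(LESSON_JSON)
modules = lesson["modules"]
bank = unlocked_words(modules, 1)
print(flashcards(modules[1]))  # ['book', 'read']
print(fill_prompt(modules[1]["prompts"][0], bank, random.Random()))
```

Because the bank for module N merges every earlier module, later prompts can mix old and new vocabulary while early prompts only ever draw on words the user has already seen.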

Demo and Repos

I haven't implemented the ASL GIFs in the bot yet, so I'm using placeholder text on the backs of the flashcards in this demo.

>> DEMO VIDEO <<
Text-to-ASL Repo: github.com/KaiSereni/text-to-asl
ASL Discord Bot Repo: github.com/KaiSereni/discord-asl

What I Learned

When I code with AI, I try to use AI as a learning tool first and foremost. Gemini explained the syntax of some of the libraries I was using, which was extremely helpful as the documentation can occasionally be rather hard to find online, and it's rare to see these niche Python libraries well-documented.

Google Gemini Feedback

One of my biggest issues with Gemini is that it can't be run locally on my own machine. I've taken a look at Gemma, but I wish Google would release some larger models (400B parameters, for example) as open-weight or even open-source.
I also wish Gemini could be built into a browser, especially alongside an IDE integration like Antigravity. I had to figure out how to scrape data from the ASL website manually, which involved a lot of tedious, rather dull HTML parsing.
Additionally, language models aren't very good at making sure the library documentation they're working from is up to date, and I'm sure this is something Google could fix either during training or with some other technique. Ask Gemini to write anything at all with moviepy and you'll see what I mean.
Lastly, this is somewhat off-topic, but I hope Google makes their TPUs available for the consumer market. It's been absurdly difficult to get your hands on a GPU with enough VRAM to run an LM lately, not to mention the fact that it's wildly outside my price range.
