DEV Community

Vivan Rajath

ASL Hand Sign Recognition using Neural Networks and Mediapipe

This project uses MediaPipe to extract hand landmarks and a Random Forest model to recognize American Sign Language (ASL) alphabet letters. It also includes real-time sign recognition using your webcam.

Features

  • Uses the ASL Alphabet dataset from Kaggle
  • Extracts hand landmarks using MediaPipe
  • Trains a Random Forest classifier on landmark data
  • Tests accuracy on validation data
  • Predicts ASL hand signs in real-time via webcam

This model supports all 26 ASL letters (A–Z). Note that "J" and "Z" are motion-based signs, so they are recognized from a single representative static pose rather than from the full motion.

Tools & Technologies Used

  • MediaPipe: For extracting 21 hand landmarks (x, y, z) from each hand
  • OpenCV: For webcam input and image processing
  • scikit-learn: For training and evaluating the Random Forest model
  • Python: Language used for scripting and development

How It Works

  • Data Collection: Hand images for each ASL letter are passed through MediaPipe to extract 21 landmarks per hand.
  • Data Formatting: Each landmark includes (x, y, z), resulting in 63 values per frame.
  • Model Training: A Random Forest classifier is trained using this data.
  • Real-Time Prediction: The webcam captures live hand gestures, which are processed and classified into ASL letters.
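The data-formatting step above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `Landmark` tuple stands in for MediaPipe's landmark objects, which expose the same `x`, `y`, `z` attributes.

```python
from typing import NamedTuple

class Landmark(NamedTuple):
    """Stand-in for a MediaPipe hand landmark with normalized coordinates."""
    x: float
    y: float
    z: float

def flatten_landmarks(landmarks):
    """Flatten 21 (x, y, z) landmarks into one 63-value feature row."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    return [coord for lm in landmarks for coord in (lm.x, lm.y, lm.z)]

# 21 landmarks x 3 coordinates = 63 features per frame
row = flatten_landmarks([Landmark(0.1, 0.2, 0.3)] * 21)
print(len(row))  # 63
```

Each frame therefore becomes a fixed-length row, which is exactly the tabular shape a Random Forest expects.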

Steps to Build It Yourself

Step 1: Collect Data
Use the ASL Alphabet dataset from Kaggle. Extract hand landmarks from each image using MediaPipe and save them in a CSV file for training.
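A sketch of this step is below. The `data/asl_alphabet_train/<LETTER>/` layout and the `landmarks.csv` filename are assumptions about how the Kaggle dataset is unpacked; adjust the paths to your setup.

```python
import csv
import os

def landmarks_to_row(label, coords):
    """Turn a letter label plus 63 flattened coordinates into one CSV row."""
    return [label, *coords]

if __name__ == "__main__":
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)
    with open("landmarks.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for letter in sorted(os.listdir("data/asl_alphabet_train")):
            folder = os.path.join("data/asl_alphabet_train", letter)
            for name in os.listdir(folder):
                image = cv2.imread(os.path.join(folder, name))
                result = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
                if not result.multi_hand_landmarks:
                    continue  # skip images where no hand was detected
                lms = result.multi_hand_landmarks[0].landmark
                coords = [c for lm in lms for c in (lm.x, lm.y, lm.z)]
                writer.writerow(landmarks_to_row(letter, coords))
```

Skipping images where MediaPipe finds no hand keeps the CSV free of empty rows, at the cost of losing a few training samples.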

Step 2: Train the Model
Train a Random Forest classifier using the landmark CSV. This model learns to distinguish between different hand poses.
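With scikit-learn, training looks roughly like this. Synthetic arrays stand in for the real landmark CSV, and the hyperparameters are illustrative defaults rather than the project's actual settings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the landmark CSV: 200 rows of 63 features each.
rng = np.random.default_rng(0)
X = rng.random((200, 63))
y = rng.choice(list("ABC"), size=200)  # placeholder letter labels

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.n_features_in_)  # 63 features, one per landmark coordinate
```

In practice you would load `landmarks.csv` into `X` and `y` instead, and persist the fitted model (for example with `joblib.dump`) so the real-time step can reuse it.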

Step 3: Test the Accuracy
Hold out a validation set and measure how well the model performs on data it has never seen. Random Forests usually reach high accuracy on this kind of low-dimensional landmark data.
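A minimal evaluation sketch, again using synthetic data in place of the real CSV (so the printed accuracy here is only chance-level, unlike on real landmarks):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((300, 63))
y = rng.choice(list("ABC"), size=300)  # random labels, so expect ~chance accuracy

# Hold out 20% of the rows as a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
acc = accuracy_score(y_val, clf.predict(X_val))
print(f"validation accuracy: {acc:.2f}")
```

`sklearn.metrics.classification_report` is also worth a look, since it shows per-letter precision and recall and quickly reveals which signs the model confuses.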

Step 4: Real-Time Recognition
Use your webcam to capture hand signs in real-time, feed them through MediaPipe, and classify them with the trained model. The predicted letter is shown on screen.
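The real-time loop can be sketched as follows. The `asl_rf.joblib` filename is an assumption about how the trained model was saved; the webcam and MediaPipe work sits under the `__main__` guard, with the pure `flatten` helper pulled out on its own.

```python
def flatten(hand_landmarks):
    """Flatten one MediaPipe hand into the 63-value row the model expects."""
    return [c for lm in hand_landmarks.landmark for c in (lm.x, lm.y, lm.z)]

if __name__ == "__main__":
    import cv2
    import joblib
    import mediapipe as mp

    clf = joblib.load("asl_rf.joblib")  # assumed filename from the training step
    hands = mp.solutions.hands.Hands(max_num_hands=1)
    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            row = flatten(result.multi_hand_landmarks[0])
            letter = clf.predict([row])[0]
            cv2.putText(frame, letter, (30, 60),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3)
        cv2.imshow("ASL", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break
    cap.release()
    cv2.destroyAllWindows()
```

MediaPipe processes RGB frames while OpenCV captures BGR, hence the `cvtColor` call before every prediction.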

Why Use Landmarks Instead of Images?
Using landmark coordinates is far more efficient than training heavy image-based models like CNNs. It reduces training time, improves performance on low-resource devices, and works surprisingly well with static signs.

Final Thoughts
This project is a practical, lightweight introduction to real-time hand gesture recognition. If you're interested in computer vision, sign language, or accessibility tech, this is a great way to dive in. By combining MediaPipe and a simple machine learning model, we’ve built something that can make communication more inclusive — one hand sign at a time.

Check out the full code: https://github.com/VivanRajath/ASL
