DEV Community
Urooj Fatima

Building a Movie Recommender System: A Journey from Pre-Med to ML 🎬

Transitioning from a pre-medical background to Electrical Engineering at NUST taught me one thing: Math is the universal language of logic. Recently, I decided to dive deep into Machine Learning to build a Content-Based Movie Recommender System.

In this post, I’ll walk you through how I used NLP and Cosine Similarity to suggest movies based on user preferences.

The Tech Stack 🛠️
Language: Python

Libraries: Pandas, NumPy, Scikit-learn, NLTK

Dataset: TMDB 5000 Movies Dataset

The Workflow 🧠

  1. Data Cleaning & Feature Selection
    The first step was to merge datasets and extract relevant features like genres, keywords, cast, and crew. I created a "tags" column that combines all these textual descriptions.
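Here is a minimal sketch of what that merge-and-combine step can look like. It is not necessarily the code in the repo: the two tiny DataFrames are made-up stand-ins for the TMDB tables, the helpers `names`, `director`, and `build_tags` are hypothetical, and merging on `title`, keeping only the top three cast members, and stripping spaces inside names are all assumptions made for illustration.

```python
import json
import pandas as pd

# Toy stand-ins for the two TMDB tables (made-up rows, not real data).
# In the real dataset these columns hold JSON-encoded lists of dicts.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genres": ['[{"id": 28, "name": "Action"}]', '[{"id": 35, "name": "Comedy"}]'],
    "keywords": ['[{"id": 1, "name": "space war"}]', '[{"id": 2, "name": "friendship"}]'],
})
credits = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "cast": ['[{"name": "Jane Doe"}]', '[{"name": "John Roe"}]'],
    "crew": ['[{"job": "Director", "name": "Ann Lee"}]',
             '[{"job": "Director", "name": "Bo Chen"}]'],
})

def names(json_text, limit=None):
    """Return the 'name' values from a JSON list, with inner spaces removed (assumption)."""
    items = json.loads(json_text)
    return [item["name"].replace(" ", "") for item in items[:limit]]

def director(json_text):
    """Return the names of crew members whose job is 'Director'."""
    crew = json.loads(json_text)
    return [m["name"].replace(" ", "") for m in crew if m.get("job") == "Director"]

def build_tags(row):
    """Join genres, keywords, top cast, and director into one lowercase string."""
    words = (names(row["genres"]) + names(row["keywords"])
             + names(row["cast"], limit=3) + director(row["crew"]))
    return " ".join(words).lower()

# Merge the two tables (on title here, as an assumption) and build the tags column.
df = movies.merge(credits, on="title")
df["tags"] = df.apply(build_tags, axis=1)
print(df[["title", "tags"]])
```

Removing the space inside multi-word names keeps "Jane Doe" as a single token, so two different Janes are not treated as a match later on.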

  2. Text Preprocessing (Stemming)
    To make sure "action" and "actions" are treated the same, I used NLTK's PorterStemmer.

Python
from nltk.stem.porter import PorterStemmer
ps = PorterStemmer()

# Applied to the tags column
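One way the stemmer might be applied to the tags column is sketched below. The helper name `stem_text` and the two sample rows are hypothetical; the only thing taken from the post is that `PorterStemmer` is run over the `tags` text.

```python
import pandas as pd
from nltk.stem.porter import PorterStemmer

ps = PorterStemmer()

def stem_text(text):
    """Stem every whitespace-separated word and join the results back together."""
    return " ".join(ps.stem(word) for word in text.split())

# Made-up sample rows standing in for the real tags column.
df = pd.DataFrame({"tags": ["action actions", "hero heroes"]})
df["tags"] = df["tags"].apply(stem_text)
print(df["tags"].tolist())
```

Stemming before vectorization matters because the vectorizer counts exact tokens: without it, "action" and "actions" would land in two separate columns.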

  3. Vectorization (Bag of Words)
    I converted the text tags into 5,000-dimensional vectors using CountVectorizer, removing common English stop words.

  4. The Mathematical Engine: Cosine Similarity
    Instead of Euclidean distance, I used Cosine Similarity, which measures the cosine of the angle between two movie vectors rather than the straight-line distance between them. The smaller the angle (that is, the closer the score is to 1), the more similar the movies!
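To make the similarity step concrete, here is a minimal sketch on three made-up count vectors. The `recommend` function, its `top_n` parameter, and the `titles` list are hypothetical names for illustration; `cosine_similarity` is the scikit-learn function and computes dot(a, b) / (||a|| * ||b||) for every pair of rows.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Made-up titles and bag-of-words rows (one row per movie).
titles = ["Movie A", "Movie B", "Movie C"]
vectors = np.array([
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
])

# similarity[i, j] is the cosine of the angle between movie i and movie j.
similarity = cosine_similarity(vectors)

def recommend(title, top_n=2):
    """Return the top_n titles most similar to the given one, excluding itself."""
    idx = titles.index(title)
    ranked = np.argsort(similarity[idx])[::-1]  # indices from highest to lowest score
    return [titles[i] for i in ranked if i != idx][:top_n]

print(recommend("Movie A"))
```

Cosine similarity ignores vector length, so a movie with a long tag list is not penalized against one with a short list the way it would be under Euclidean distance.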

Key Challenges 🚧
The biggest hurdle was managing the large similarity matrix in a cloud environment. Dealing with memory limits and "truncated files" taught me a lot about efficient data handling and the importance of proper serialization using pickle.
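The post does not show how the matrix was saved, so the following is only a sketch of one defensive pattern, under my own assumptions: cast the matrix to `float32` to halve its size, write it inside a `with` block so the file is flushed and closed, then reload it to confirm the file is complete. The random matrix, the temp path, and the variable names are all placeholders.

```python
import os
import pickle
import tempfile

import numpy as np

# Placeholder matrix; the real one would be movies x movies.
similarity = np.random.default_rng(0).random((100, 100))

# float32 uses half the memory of the default float64 (a size/precision trade-off).
compact = similarity.astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "similarity.pkl")

# The with-block guarantees the file is closed, so no partial write is left behind.
with open(path, "wb") as f:
    pickle.dump(compact, f, protocol=pickle.HIGHEST_PROTOCOL)

# Round-trip check: a truncated file would fail to load or fail the comparison.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(np.array_equal(restored, compact), os.path.getsize(path))
```

Reloading right after writing is a cheap way to catch a truncated file before it gets uploaded anywhere.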

Conclusion & Future Scope
This project was a fantastic way to apply linear algebra and NLP concepts. My next step is to deploy this as a full web app and integrate movie posters via an API.

Check out the full source code on my GitHub: 👉 https://github.com/Urooj25/Movie-Recommender-System.git

Let’s Connect!
I'm always open to feedback and collaboration. Drop a comment or connect with me on LinkedIn!