Today, I’m excited to guide you through building a graph-based movie recommendation system in Python. We’ll model movies as nodes and "similar to" links as edges, use breadth-first search (BFS) to walk those links for recommendations, and add title and genre search on top.
What We’re Building
By the end of this project, our CLI will allow us to:
- Search movies by title or genre
- Find similar movies based on direct connections
- Explore extended recommendations (movies similar to similar movies)
- Display top-rated movies sorted by rating
Project Setup
Here’s the recommended folder structure:
movie_recommender/
├── README.md
├── data/
│   └── movies_sample.json
├── main.py
└── src/
    ├── movie_data.py
    └── search.py
- data/movies_sample.json → Small internal movie database
- main.py → CLI entry point
- src/ → Movie logic and BFS algorithm
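As a rough idea of what the CLI entry point in main.py could look like, here is a minimal argument-parsing sketch using the standard-library argparse module. The subcommand names (search, recommend, top) and the options (--by, --depth, --limit, --data) are assumptions made for illustration, not part of the original project.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per feature (all names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="main.py", description="Movie recommender")
    parser.add_argument("--data", default="data/movies_sample.json",
                        help="path to the movie JSON file")
    sub = parser.add_subparsers(dest="command", required=True)

    # search <query> [--by title|genre]
    search = sub.add_parser("search", help="search by title or genre")
    search.add_argument("query")
    search.add_argument("--by", choices=["title", "genre"], default="title")

    # recommend <movie_id> [--depth N]
    recommend = sub.add_parser("recommend", help="find similar movies")
    recommend.add_argument("movie_id", type=int)
    recommend.add_argument("--depth", type=int, default=1)

    # top [--limit N]
    top = sub.add_parser("top", help="list top-rated movies")
    top.add_argument("--limit", type=int, default=5)
    return parser


# Parse an explicit argument list instead of sys.argv to try it out
args = build_parser().parse_args(["recommend", "1", "--depth", "2"])
print(args.command, args.movie_id, args.depth)
```

In a real main.py you would call `build_parser().parse_args()` with no arguments and dispatch on `args.command`.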
Sample Movie Data
{
  "movies": [
    {
      "id": 1,
      "title": "Inception",
      "year": 2010,
      "genres": ["Sci-Fi", "Thriller"],
      "rating": 8.8
    },
    {
      "id": 2,
      "title": "28 Days Later",
      "year": 2002,
      "genres": ["Sci-Fi", "Drama", "Thriller"],
      "rating": 7.3
    }
  ],
  "connections": [
    {
      "movie_id": 1,
      "similar_to": [2]
    }
  ]
}
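Because connections refer to movies only by id, a typo in the data file can point at a movie that does not exist. The sketch below is one way to check for that before building the graph. The helper name `find_dangling_ids` is hypothetical, and the JSON is inlined here only so the example runs on its own.

```python
import json

SAMPLE = """
{
  "movies": [
    {"id": 1, "title": "Inception", "year": 2010, "genres": ["Sci-Fi", "Thriller"], "rating": 8.8},
    {"id": 2, "title": "28 Days Later", "year": 2002, "genres": ["Sci-Fi", "Drama", "Thriller"], "rating": 7.3}
  ],
  "connections": [
    {"movie_id": 1, "similar_to": [2]}
  ]
}
"""


def find_dangling_ids(data):
    """Return ids used in "connections" that have no entry in "movies"."""
    known = {movie["id"] for movie in data["movies"]}
    dangling = set()
    for connection in data.get("connections", []):
        referenced = [connection["movie_id"], *connection["similar_to"]]
        dangling.update(i for i in referenced if i not in known)
    return sorted(dangling)


data = json.loads(SAMPLE)
print(find_dangling_ids(data))  # [] for the sample above
```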
Creating the MovieDatabase Class
This class loads the movie data and connections:
import json


class MovieDatabase:
    def __init__(self):
        self.movies = {}       # id -> movie info
        self.connections = {}  # id -> list of connected movie ids

    def load_data(self, filepath):
        with open(filepath, 'r') as f:
            data = json.load(f)
        for movie in data['movies']:
            self.movies[movie['id']] = movie
        for connection in data.get('connections', []):
            self.connections[connection['movie_id']] = connection['similar_to']

    def get_movie(self, movie_id):
        return self.movies.get(movie_id)

    def get_all_movies(self):
        return list(self.movies.values())
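One thing to notice: the loader stores connections exactly as the file lists them, so with the sample data movie 1 links to movie 2, but movie 2 has no link back. If you want similarity to be mutual (an assumption on my part, the original data does not say either way), a small helper along these lines could mirror every edge. The name `make_symmetric` is hypothetical.

```python
def make_symmetric(connections):
    """Return a new adjacency dict where every edge a -> b also has b -> a."""
    # Copy the neighbor lists so the input dict is left untouched
    symmetric = {movie_id: list(neighbors) for movie_id, neighbors in connections.items()}
    for movie_id, neighbors in connections.items():
        for neighbor in neighbors:
            reverse = symmetric.setdefault(neighbor, [])
            if movie_id not in reverse:
                reverse.append(movie_id)
    return symmetric


connections = {1: [2]}  # as loaded from the sample data
print(make_symmetric(connections))  # {1: [2], 2: [1]}
```

You could apply this to `self.connections` at the end of `load_data`, or keep the graph directed if one-way recommendations are what you want.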
Creating the MovieSearcher Class
This class handles title and genre search, plus the BFS-based recommendations:
from collections import deque


class MovieSearcher:
    def __init__(self, database):
        self.db = database

    def search_by_title(self, query):
        # Case-insensitive substring match on the title
        results = []
        query_lower = query.lower()
        for movie in self.db.movies.values():
            if query_lower in movie['title'].lower():
                results.append(movie)
        return results

    def search_by_genre(self, genre):
        # Case-insensitive substring match against each genre
        results = []
        genre_lower = genre.lower()
        for movie in self.db.movies.values():
            movie_genres = [g.lower() for g in movie['genres']]
            if any(genre_lower in g for g in movie_genres):
                results.append(movie)
        return results

    def find_recommendations(self, movie_id, depth=1):
        # BFS outward from movie_id, up to `depth` hops away
        if movie_id not in self.db.movies:
            return []
        visited = set()
        queue = deque([(movie_id, 0)])  # deque gives O(1) pops from the left
        recommendations = []
        while queue:
            current_id, current_depth = queue.popleft()
            if current_id in visited or current_depth > depth:
                continue
            visited.add(current_id)
            if current_id != movie_id:
                recommendations.append({
                    'movie': self.db.movies[current_id],
                    'distance': current_depth
                })
            for connected_id in self.db.connections.get(current_id, []):
                # Skip ids already seen or missing from the movie table
                if connected_id not in visited and connected_id in self.db.movies:
                    queue.append((connected_id, current_depth + 1))
        return recommendations
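To see how the depth limit shapes the results, here is a stripped-down sketch of the same idea on a bare adjacency dict. It is a simplified variant rather than the class method above: it marks nodes as seen when they are queued, and returns hop counts instead of movie records. The function name `bfs_distances` and the four-movie chain are made up for the example.

```python
from collections import deque


def bfs_distances(connections, start, depth):
    """Return {movie_id: hops from start} for every id within `depth` hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == depth:
            continue  # do not expand past the depth limit
        for neighbor in connections.get(current, []):
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    del distances[start]  # the starting movie is not a recommendation
    return distances


# Hypothetical graph: 1 -> 2 -> 3 -> 4
graph = {1: [2], 2: [3], 3: [4]}
print(bfs_distances(graph, 1, depth=1))  # {2: 1}
print(bfs_distances(graph, 1, depth=2))  # {2: 1, 3: 2}
```

With depth=1 you get only direct connections. Raising it to 2 adds "movies similar to similar movies", and because BFS visits nodes level by level, each movie is recorded at its shortest distance from the start.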
Wrapping Up
This project shows how the BFS algorithm can recommend movies based on **direct and extended connections**. You can now extend it to support:
- Search by actor or director
- Top-rated recommendations
- CLI improvements or a GUI front-end
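As a starting point for the top-rated idea, sorting by the rating field is enough. This is a minimal sketch, not the project's actual code: the function name `top_rated` and its `limit` and `genre` parameters are assumptions, and the movie list simply mirrors the sample data.

```python
def top_rated(movies, limit=5, genre=None):
    """Return up to `limit` movies sorted by rating, highest first."""
    if genre is not None:
        genre_lower = genre.lower()
        # Keep only movies that list this genre (exact, case-insensitive match)
        movies = [m for m in movies if any(genre_lower == g.lower() for g in m["genres"])]
    return sorted(movies, key=lambda m: m["rating"], reverse=True)[:limit]


movies = [
    {"id": 2, "title": "28 Days Later", "year": 2002, "genres": ["Sci-Fi", "Drama", "Thriller"], "rating": 7.3},
    {"id": 1, "title": "Inception", "year": 2010, "genres": ["Sci-Fi", "Thriller"], "rating": 8.8},
]
for movie in top_rated(movies):
    print(f"{movie['rating']:.1f}  {movie['title']}")
```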
Thank you for reading! I hope you enjoyed this guide and learned more about graph algorithms in practical Python applications.
Happy coding,
Dwayne John