“MongoDB: The Netflix Data Saga”

#dataengineering #mongodb #database #learningjourney

Episode 1: Data Never Sleeps

Tonight’s episode takes us deep into the world of Netflix titles, but not through binge-watching. Instead, we’re stepping behind the scenes — into the database where all the magic begins.

Our tool of choice? MongoDB.
Our mission? To clean, query, and command the data — just like the mastermind of a thrilling heist.

Act 1: The Pilot — Setting Up MongoDB

Every great show starts with a setup.
We installed MongoDB, spun up a database called netflixDB, and created a collection called titles inside it.
Think of it as the stage where our actors (movies & TV shows) perform.
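The setup commands are not shown here, so the following is only a minimal sketch of one way to do it in mongosh. The helper name setUpNetflixDB is hypothetical; getSiblingDB and createCollection are standard mongosh methods. MongoDB also creates a database and collection automatically on the first insert, so the explicit createCollection call is optional.

```javascript
// Hypothetical helper, meant to be pasted into mongosh, where the global `db`
// is the handle to the current database.
function setUpNetflixDB(currentDb) {
  // Get a handle to netflixDB without the interactive `use netflixDB` command.
  const netflixDB = currentDb.getSiblingDB("netflixDB");
  // Explicitly create the collection (optional: the first insert would create it too).
  netflixDB.createCollection("titles");
  return netflixDB;
}
```

In mongosh this could be called as `db = setUpNetflixDB(db)`. Typed interactively, `use netflixDB` followed by `db.createCollection("titles")` gets to the same place.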

Act 2: Enter the Cast — Insert Records

Just like introducing characters in the first season, we manually inserted 10 Netflix titles into our titles collection. Each came with its own attributes: show_id, title, country, release_year, description, and even a ratingValue. The snippet below is abbreviated to the first two records and a few of their fields.

db.titles.insertMany([
  { "show_id": "s1", "title": "The Irishman", "country": "United States", "ratingValue": 4.7 },
  { "show_id": "s2", "title": "Sacred Games", "country": "India", "ratingValue": 4.5 },
  ...
])

Act 3: The Drama — Who Rules the Ratings?

Every series has its critics, and in our dataset, ratings decide who takes the throne.
We asked MongoDB: “Show us the top 5 Netflix titles with the highest average rating.”

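db.titles.aggregate([
  // Group documents by title and average ratingValue within each group
  { $group: { _id: "$title", avgRating: { $avg: "$ratingValue" } } },
  // Highest average first
  { $sort: { avgRating: -1 } },
  // Keep only the top 5
  { $limit: 5 }
])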

And just like in a finale twist, Inception and Dangal battled for the top spot!
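To see what those three stages actually do, here is a rough plain-JavaScript sketch of the same logic. It is not how MongoDB runs the pipeline internally; the function name topRated is made up for illustration, and it assumes every document has a numeric ratingValue. The two sample documents are the ones from the insert snippet. Note that if each title appears only once, the "average" is simply that title's stored ratingValue.

```javascript
// Sample documents copied from the insertMany snippet earlier in the post.
const titles = [
  { show_id: "s1", title: "The Irishman", country: "United States", ratingValue: 4.7 },
  { show_id: "s2", title: "Sacred Games", country: "India", ratingValue: 4.5 },
];

// Hypothetical helper mirroring $group -> $sort -> $limit.
// Assumption: every document carries a numeric ratingValue.
function topRated(docs, n) {
  // $group: collect a running sum and count per title
  const groups = new Map();
  for (const doc of docs) {
    const group = groups.get(doc.title) ?? { sum: 0, count: 0 };
    group.sum += doc.ratingValue;
    group.count += 1;
    groups.set(doc.title, group);
  }
  return [...groups.entries()]
    // $avg: sum divided by count for each group
    .map(([title, g]) => ({ _id: title, avgRating: g.sum / g.count }))
    // $sort: descending by avgRating
    .sort((a, b) => b.avgRating - a.avgRating)
    // $limit: keep the first n results
    .slice(0, n);
}

console.log(topRated(titles, 5));
```

Asking for 5 results when fewer groups exist just returns all of them, which matches how $limit behaves.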

Act 4: The Mystery of the Word “Good”

What if we only counted descriptions containing the word “good”?
MongoDB, our detective, quickly scanned all storylines with a case-insensitive search:

db.titles.countDocuments({
  description: { $regex: "good", $options: "i" }
})
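One detail worth knowing: a plain "good" pattern matches any description that contains those four letters, so words like "Goodbye" count too. The sketch below uses ordinary JavaScript regular expressions to show the difference between a substring match and a whole-word match. The first sample description comes from the update step later in this post; the other two are made up purely for illustration, as is the helper name countMatching.

```javascript
const substring = /good/i;      // same idea as { $regex: "good", $options: "i" }
const wholeWord = /\bgood\b/i;  // \b marks a word boundary on each side

const descriptions = [
  "Updated review: A very good and insightful Netflix documentary.", // from this post
  "Goodbye to all that", // hypothetical sample
  "Nothing special",     // hypothetical sample
];

// Hypothetical helper: count how many descriptions match a pattern.
function countMatching(texts, pattern) {
  return texts.filter((text) => pattern.test(text)).length;
}

console.log(countMatching(descriptions, substring)); // 2 ("good" and "Goodbye")
console.log(countMatching(descriptions, wholeWord)); // 1 (only the standalone "good")
```

MongoDB's $regex also understands \b, so a whole-word count could be written as `description: { $regex: "\\bgood\\b", $options: "i" }` if that is the behavior you want.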

Act 5: Stories from India

Every drama has its backdrop.
We filtered our data to see only those titles that originated in India:

db.titles.find({ country: "India" })

Act 6: Plot Twists — Update & Delete

Just like characters evolve, so does data.
We updated one record (s3) with a refreshed description:

db.titles.updateOne(
  { show_id: "s3" },
  { $set: { description: "Updated review: A very good and insightful Netflix documentary." } }
)

But sometimes, characters leave the show.
So we deleted record s6:

db.titles.deleteOne({ show_id: "s6" })

Finale: Exporting the Series

And no season is complete without taking the story global.
We exported our data to JSON and CSV formats for safekeeping and further analysis.
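The export tool is not named here; mongoexport is the usual command-line choice, and for CSV output it needs an explicit list of fields. To show what the two formats look like, here is a small plain-JavaScript sketch that serializes the two sample documents from the insert step. The helpers csvCell and toCsv are hypothetical names for illustration, not part of MongoDB.

```javascript
// Sample documents copied from the insertMany snippet earlier in the post.
const titles = [
  { show_id: "s1", title: "The Irishman", country: "United States", ratingValue: 4.7 },
  { show_id: "s2", title: "Sacred Games", country: "India", ratingValue: 4.5 },
];

// Quote a CSV cell only when it contains a comma, a double quote, or a line break.
function csvCell(value) {
  const text = value === undefined || value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: a header row plus one row per document, in the given field order.
function toCsv(docs, fields) {
  const rows = docs.map((doc) => fields.map((field) => csvCell(doc[field])).join(","));
  return [fields.join(","), ...rows].join("\n");
}

const asJson = JSON.stringify(titles, null, 2);
const asCsv = toCsv(titles, ["show_id", "title", "country", "ratingValue"]);

console.log(asJson);
console.log(asCsv);
```

JSON keeps whatever fields each document has, while CSV forces you to pick the columns up front. That is worth remembering when documents in the same collection do not all share the same fields.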

Closing Credits 🎬

By the end of this binge-worthy database journey, we:
✅ Inserted Netflix titles
✅ Found the top 5 highest-rated titles
✅ Counted descriptions with “good”
✅ Filtered titles by country
✅ Updated and deleted records
✅ Exported query results

💡 Why it matters:
This exercise mirrors real-world data engineering tasks: inserting raw data, performing aggregations, filtering for insights, maintaining data quality, and finally exporting results for downstream use.

And just like any great show, this is only Season 1.
Stay tuned for more adventures in MongoDB & Data Engineering.