DEV Community

Hari Venkatesh

Building a Simple Yelp-Style Dataset in MongoDB (Step-by-Step Guide)

When I first started exploring MongoDB, I wanted to try something practical instead of just going through documentation. So, I decided to build a mini Yelp-style dataset where businesses have reviews, and then run queries to analyze them.

This blog walks you through how I created the dataset, inserted it into MongoDB, and ran useful queries like finding top-rated businesses and analyzing reviews. 🚀

📌 Why This Project?

Data is everywhere, and being able to store, organize, and query it effectively is an essential skill for developers. MongoDB’s flexible schema makes it perfect for datasets like business reviews, where each review might have slightly different details.

📌 Deliverables

Here’s what I aimed to achieve in this project:

✅ Create a dataset of 25 businesses with reviews.
✅ Insert data into MongoDB (using mongosh).
✅ Run meaningful queries:

  • Find top 5 businesses by rating
  • Count reviews containing “good”
  • Fetch reviews for a specific business
  • Update and delete records

✅ Export data if needed for further analysis

📌 Step 1: Creating the Dataset

I prepared a dataset of 25 businesses, each with:

  • business_id
  • name
  • rating
  • review

Here’s a small sample from the dataset:

{ "business_id": "B1", "name": "Cafe Mocha", "rating": 4, "review": "Good coffee and snacks" }
{ "business_id": "B2", "name": "Pizza House", "rating": 5, "review": "Excellent pizza, very good taste!" }
{ "business_id": "B3", "name": "Burger Point", "rating": 3, "review": "Average burger but good fries" }
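Before loading anything, it can help to check that every document really has the four fields listed above. Here is a minimal sketch in plain JavaScript (it should run in Node.js or in mongosh). The helper name `validateReview` and the 1-to-5 integer rating range are my assumptions for illustration; nothing in MongoDB enforces them by default.

```javascript
// Hypothetical helper: returns a list of problems found in one review document.
// The 1-5 integer rating range is an assumption; adjust it to match your data.
function validateReview(doc) {
  const problems = [];
  for (const field of ["business_id", "name", "review"]) {
    if (typeof doc[field] !== "string" || doc[field].trim() === "") {
      problems.push(`${field} must be a non-empty string`);
    }
  }
  if (!Number.isInteger(doc.rating) || doc.rating < 1 || doc.rating > 5) {
    problems.push("rating must be an integer from 1 to 5");
  }
  return problems;
}

const sample = [
  { business_id: "B1", name: "Cafe Mocha", rating: 4, review: "Good coffee and snacks" },
  { business_id: "B2", name: "Pizza House", rating: 5, review: "Excellent pizza, very good taste!" },
];

// Keep only the documents that fail at least one check.
const invalid = sample.filter((doc) => validateReview(doc).length > 0);
console.log(`${invalid.length} invalid document(s)`);
```

Running the check over the whole array before inserting makes typos (a missing rating, an empty name) show up early instead of in a later query.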

📌 Step 2: Setting Up MongoDB

I used MongoDB Compass for visualization and mongosh (MongoDB Shell) for running commands.

mongosh

Switch to a database (I called it yelp):

use yelp

📌 Step 3: Inserting Data

Using insertMany(), I inserted all 25 records:

db.reviews.insertMany([
{ "business_id": "B1", "name": "Cafe Mocha", "rating": 4, "review": "Good coffee and snacks" },
{ "business_id": "B2", "name": "Pizza House", "rating": 5, "review": "Excellent pizza, very good taste!" },
...
{ "business_id": "B25", "name": "Coffee Day", "rating": 3, "review": "Average coffee, good sitting place" }
])
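One thing worth knowing: MongoDB generates a unique `_id` for every inserted document, so `insertMany()` will not complain if the same `business_id` appears twice (unless you create a unique index on it). If you want one record per business, a quick pre-insert check like the sketch below can catch copy-paste mistakes. The helper name `findDuplicateIds` is hypothetical, and the array here is shortened to two of the sample documents.

```javascript
// Hypothetical pre-insert check: list business_id values that appear more than once.
function findDuplicateIds(docs) {
  const seen = new Set();
  const duplicates = new Set();
  for (const doc of docs) {
    if (seen.has(doc.business_id)) duplicates.add(doc.business_id);
    seen.add(doc.business_id);
  }
  return [...duplicates];
}

const docs = [
  { business_id: "B1", name: "Cafe Mocha", rating: 4, review: "Good coffee and snacks" },
  { business_id: "B2", name: "Pizza House", rating: 5, review: "Excellent pizza, very good taste!" },
];

// An empty array means every business_id is unique.
console.log(findDuplicateIds(docs));

// In mongosh, once the check passes:
// db.reviews.insertMany(docs)
// db.reviews.countDocuments()   // should equal docs.length on a fresh collection
```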

📌 Step 4: Running Queries
🔹 1. Top 5 Businesses by Rating
db.reviews.aggregate([
  { $group: { _id: "$business_id", name: { $first: "$name" }, avgRating: { $avg: "$rating" } } },
  { $sort: { avgRating: -1 } },
  { $limit: 5 }
])

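👉 This gave me a leaderboard of the highest-rated businesses. Because each business has a single review in this dataset, avgRating is simply that review’s rating; the $group stage starts to matter once a business has several reviews.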
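To make the pipeline less of a black box, here is a rough plain-JavaScript equivalent of the three stages. It is only meant to show what each stage computes, not how MongoDB implements it, and the function name `topByAverageRating` is my own.

```javascript
// Plain-JavaScript equivalent of the $group / $sort / $limit pipeline.
function topByAverageRating(docs, limit) {
  const groups = new Map();
  for (const { business_id, name, rating } of docs) {
    // $group: one bucket per business_id; keep the first name seen ($first).
    if (!groups.has(business_id)) {
      groups.set(business_id, { _id: business_id, name, sum: 0, count: 0 });
    }
    const group = groups.get(business_id);
    group.sum += rating;
    group.count += 1;
  }
  return [...groups.values()]
    .map(({ _id, name, sum, count }) => ({ _id, name, avgRating: sum / count })) // $avg
    .sort((a, b) => b.avgRating - a.avgRating) // $sort: { avgRating: -1 }
    .slice(0, limit); // $limit
}

const docs = [
  { business_id: "B1", name: "Cafe Mocha", rating: 4, review: "Good coffee and snacks" },
  { business_id: "B2", name: "Pizza House", rating: 5, review: "Excellent pizza, very good taste!" },
  { business_id: "B3", name: "Burger Point", rating: 3, review: "Average burger but good fries" },
];

console.log(topByAverageRating(docs, 2)); // Pizza House first, then Cafe Mocha
```

With ratings on a small scale, ties are likely. MongoDB does not guarantee the order of documents that compare equal in a $sort, so adding a second key (for example `{ $sort: { avgRating: -1, _id: 1 } }`) keeps the top 5 stable between runs.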

🔹 2. Count Reviews Containing the Word “Good”
db.reviews.countDocuments({ review: /good/i })

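👉 MongoDB’s regex matching makes keyword counts like this easy. Treat the result as a rough proxy for sentiment only: the pattern /good/i also matches “good” inside longer words and in phrases such as “not good”.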
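Here is a small sketch of how the pattern behaves, using JavaScript’s own regex engine on a handful of strings. The first three reviews come from the sample dataset; the last two are made up for illustration. The `\b` word-boundary variant is also supported by MongoDB’s regex engine, so it could be passed to countDocuments() in the same way.

```javascript
const reviews = [
  "Good coffee and snacks",
  "Excellent pizza, very good taste!",
  "Average burger but good fries",
  "Goodness, that was slow", // hypothetical review
  "Not good at all",         // hypothetical review
];

// Count how many strings match a pattern at least once.
const countMatches = (texts, pattern) => texts.filter((text) => pattern.test(text)).length;

const substringCount = countMatches(reviews, /good/i);     // matches "Goodness" too
const wholeWordCount = countMatches(reviews, /\bgood\b/i); // whole word only

console.log(substringCount, wholeWordCount); // 5 4
```

Note that “Not good at all” is counted either way, which is why a keyword count alone says little about whether a review is positive.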

🔹 3. Get Reviews for a Specific Business
db.reviews.find({ business_id: "B2" })

👉 Perfect for pulling all reviews for Pizza House. 🍕

🔹 4. Update a Review
db.reviews.updateOne(
{ business_id: "B5" },
{ $set: { review: "Service improved, food is good now" } }
)

🔹 5. Delete a Record
db.reviews.deleteOne({ business_id: "B25" })
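Both updateOne() and deleteOne() affect at most one document (the first one matching the filter) and return a result object. In mongosh that object includes `matchedCount` and `modifiedCount` for updates and `deletedCount` for deletes, which is a handy way to confirm the command did what you expected. The helper below is a hypothetical sketch of reading those fields; the objects passed to it are hand-written stand-ins for real results.

```javascript
// Hypothetical helper: turn the result of updateOne() or deleteOne() into a message.
function describeWrite(result) {
  if ("deletedCount" in result) {
    return result.deletedCount === 0 ? "nothing deleted" : `deleted ${result.deletedCount}`;
  }
  if (result.matchedCount === 0) return "no document matched the filter";
  return result.modifiedCount === 0
    ? "matched, but nothing changed"
    : `modified ${result.modifiedCount}`;
}

// Stand-in result objects (other fields omitted).
console.log(describeWrite({ acknowledged: true, matchedCount: 1, modifiedCount: 1 }));
console.log(describeWrite({ acknowledged: true, matchedCount: 0, modifiedCount: 0 }));
console.log(describeWrite({ acknowledged: true, deletedCount: 1 }));

// In mongosh:
// describeWrite(db.reviews.deleteOne({ business_id: "B25" }))
```

A `matchedCount` of 0 usually means the filter has a typo (for example "b5" instead of "B5"), which is easy to miss because the command itself still succeeds.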

📌 Step 5: Insights & Learnings

MongoDB made it very easy to store flexible data (reviews don’t need a rigid schema).

Regex queries are enough for simple keyword matching, a first rough step toward sentiment analysis.

Aggregations are powerful for ranking and analytics.

This small project gave me confidence to handle real-world datasets.

📌 Next Steps

Add more fields like location, date, user_id.

Perform sentiment analysis on reviews.

Build a simple frontend to display results from MongoDB.

🚀 Final Thoughts

This project may look small, but it’s a great stepping stone for anyone starting with databases. By simulating a real-world use case (like Yelp), I not only practiced MongoDB commands but also learned how to analyze data effectively.

If you’re new to MongoDB, I highly recommend creating your own dataset and experimenting with queries. It’s one of the best ways to learn! 🙌
