Babar Bilal

Posted on Mar 9, 2025

Best Practices for Designing Scalable MongoDB Models with Mongoose

Complex Mongoose models need careful planning to stay scalable, maintainable, and efficient. The practices below cover schema design, query optimization, large datasets, soft deletes, and virtual fields.

1. Schema Design Best Practices

✅ Use Embedded Documents for One-to-Few Relationships

If the related data is small and read together frequently, embed it inside the document.

Example: A User with multiple addresses

const mongoose = require("mongoose");

const AddressSchema = new mongoose.Schema({
  street: String,
  city: String,
  zip: String,
  country: String,
});

const UserSchema = new mongoose.Schema({
  name: String,
  email: { type: String, required: true, unique: true },
  addresses: [AddressSchema], // Embedded subdocument
});

const User = mongoose.model("User", UserSchema);

✔ Pros: Faster read operations, fewer queries

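❌ Cons: The document grows with every embedded item (MongoDB caps a single document at 16 MB), and embedded data cannot be queried or managed independently of its parent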

Use when:

Data is frequently read together
The number of embedded documents is small and bounded (roughly under 10, as a rule of thumb)
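
As a rough sketch of how one embedded address could be changed in place, the helper below builds the filter and update for MongoDB's positional `$` operator. The name `buildAddressUpdate` and its arguments are hypothetical; only `addresses` and its fields come from the schema above, and the sketch assumes each embedded address keeps the `_id` that Mongoose assigns to array subdocuments by default.

```javascript
// Hypothetical helper: builds arguments for User.updateOne() that change
// fields of one embedded address, matched by the subdocument's _id.
function buildAddressUpdate(userId, addressId, changes) {
  const set = {};
  for (const [field, value] of Object.entries(changes)) {
    // "addresses.$.city" targets the array element matched by the filter
    set[`addresses.$.${field}`] = value;
  }
  return {
    filter: { _id: userId, "addresses._id": addressId },
    update: { $set: set },
  };
}

// Usage (assumes an open connection and the User model defined above):
// const { filter, update } = buildAddressUpdate(userId, addressId, { city: "Lahore" });
// await User.updateOne(filter, update);
```

Only the listed fields of the matched address are sent to the server; the other addresses stay untouched.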

✅ Use References (Normalization) for One-to-Many Relationships

If the related data is large or frequently updated separately, store references (ObjectIds).

Example: A User with multiple Orders (large dataset)

const OrderSchema = new mongoose.Schema({
  user: { type: mongoose.Schema.Types.ObjectId, ref: "User" }, // Reference to User
  totalPrice: Number,
  items: [{ product: String, quantity: Number }],
});

const Order = mongoose.model("Order", OrderSchema);

✔ Pros: Efficient updates, avoids document bloat

❌ Cons: Reading related data takes an extra query, typically through populate()

Use when:

The sub-collection grows large or without a clear upper bound (more than about 10 items, as a rule of thumb)
You need independent CRUD operations on the sub-collection

🔹 Fetching referenced data with populate:

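// Mongoose 7+ no longer accepts callbacks, so await the query instead
async function listOrdersWithUsers() {
  const orders = await Order.find().populate("user").exec();
  console.log(orders);
}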
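
When only a few user fields are needed, populate can be told which ones to load. The sketch below is one possible shape, not the only one: the function name `findOrdersWithUserSummary` is hypothetical, and the model is passed in as a parameter so the snippet stays self-contained.

```javascript
// Sketch: `Order` is expected to be the Mongoose model defined above,
// passed in so the function has no hidden dependencies.
async function findOrdersWithUserSummary(Order, userId) {
  return Order.find({ user: userId })
    .populate({ path: "user", select: "name email" }) // load only these user fields
    .lean() // plain objects are enough for read-only output
    .exec();
}

// Usage (assumes an open connection):
// const orders = await findOrdersWithUserSummary(Order, userId);
```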

✅ Hybrid Approach (Partial Embedding + References)

For medium-sized related data, embed only frequently used fields and reference the rest.

Example: Reference the user, but embed the items and shipping fields that are read with every order

const OrderSchema = new mongoose.Schema({
  user: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
  totalPrice: Number,
  items: [{ product: String, quantity: Number }],
  shipping: {
    address: String,
    status: { type: String, default: "Processing" }, // Frequently queried field
  },
});

✔ Fast reads for the embedded fields, while the referenced user can still be updated on its own

2. Schema Design Optimizations

✅ Indexing for Fast Queries

Indexes speed up queries at the cost of slower writes and extra storage, so index the fields you query, sort, or filter on frequently rather than every field.

const UserSchema = new mongoose.Schema({
  email: { type: String, required: true, unique: true }, // unique already creates an index, so lookups by email are fast
  createdAt: { type: Date, default: Date.now, index: -1 }, // Descending index for newest-first sorting
});

✔ Use indexes on:

Frequently queried fields (email, username)
Fields used in sorting (createdAt)
Fields used in filtering (status, category)

🔹 Check Index Usage

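// Run in mongosh; userId is a variable holding the user's ObjectId
db.users.getIndexes();
db.orders.find({ user: userId }).explain("executionStats"); // Look for IXSCAN rather than COLLSCAN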
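
Reading explain output by eye gets tedious, so here is a minimal sketch that checks whether the winning plan used an index. The helper names `collectStages` and `usesIndex` are hypothetical, and the sample object is hand-written to resemble explain output rather than captured from a real server.

```javascript
// Walks an explain() result and collects every `stage` name it finds.
// The nesting (inputStage, inputStages, queryPlan) varies by server version,
// so this sketch simply visits every nested object.
function collectStages(node, stages = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectStages(child, stages));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") stages.push(node.stage);
    Object.values(node).forEach((child) => collectStages(child, stages));
  }
  return stages;
}

// True when the winning plan scans an index and never scans the whole collection.
function usesIndex(explainResult) {
  const stages = collectStages(explainResult.queryPlanner.winningPlan);
  return stages.includes("IXSCAN") && !stages.includes("COLLSCAN");
}

// Hand-written sample shaped like explain output, not real server output
const sample = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "user_1" } },
  },
};
console.log(usesIndex(sample)); // true
```

From Mongoose, the same kind of result should be obtainable with `await Order.find({ user: userId }).explain("executionStats")`.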

✅ Timestamps for Tracking

Use timestamps: true in your schema to automatically store createdAt and updatedAt.

const OrderSchema = new mongoose.Schema({
  totalPrice: Number,
}, { timestamps: true });

✅ Use `lean()` for Read-Only Queries

lean() improves performance by returning plain JavaScript objects instead of full Mongoose documents.

Order.find().lean().exec();

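✔ Usually faster and lighter on memory, because Mongoose skips building full documents; the gain depends on document size and result count

Use when:

You don’t need save(), getters, virtuals, or other document features on the result
You only need raw JSON output for API responses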

3. Handling Large Data Efficiently

✅ Pagination for Large Datasets

Use pagination to limit query results for better performance.

const page = 1;
const limit = 10;

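Order.find()
  .sort({ _id: 1 }) // Stable order so pages don't overlap
  .skip((page - 1) * limit)
  .limit(limit)
  .exec();

✔ Cap the page size (avoid values like limit(1000)), and keep in mind that skip() still walks past every skipped document, so deep pages get slower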
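
Page and limit values often arrive from query strings, so it can help to sanitize them before they reach the query. The sketch below shows one way to do that, plus a keyset filter as an alternative to skip(). Both helper names and the default cap of 100 are assumptions made for this example.

```javascript
// Hypothetical helper: turns untrusted page/limit input into safe skip/limit values.
// maxLimit = 100 is an arbitrary cap chosen for this sketch.
function buildPageOptions(page, limit, maxLimit = 100) {
  const safePage = Math.max(1, Math.floor(Number(page)) || 1);
  const safeLimit = Math.min(maxLimit, Math.max(1, Math.floor(Number(limit)) || 10));
  return { skip: (safePage - 1) * safeLimit, limit: safeLimit };
}

// Keyset alternative: instead of skipping, continue after the last _id already seen.
function buildKeysetFilter(baseFilter, lastSeenId) {
  return lastSeenId ? { ...baseFilter, _id: { $gt: lastSeenId } } : { ...baseFilter };
}

console.log(buildPageOptions(3, 20)); // { skip: 40, limit: 20 }
console.log(buildPageOptions("abc", 5000)); // { skip: 0, limit: 100 }

// Usage (assumes an open connection and the Order model defined above):
// const { skip, limit } = buildPageOptions(req.query.page, req.query.limit);
// await Order.find().sort({ _id: 1 }).skip(skip).limit(limit);
// await Order.find(buildKeysetFilter({}, lastSeenId)).sort({ _id: 1 }).limit(limit);
```

The keyset variant only works when results are sorted by the same field used in the filter, here `_id`.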

✅ Aggregation Pipeline for Complex Queries

Use aggregation for reporting and complex queries.

Order.aggregate([
  { $match: { status: "Completed" } }, // Assumes orders have a top-level status field
  { $group: { _id: "$user", totalSpent: { $sum: "$totalPrice" } } }, // Total spent per user
]);
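
Because a pipeline is just an array of stage objects, it can be assembled by an ordinary function. The sketch below extends the example above into a "top spenders" report; the function name `buildTopSpendersPipeline` and the `orderCount` field are hypothetical, and it assumes the same `status`, `user`, and `totalPrice` fields as the pipeline above.

```javascript
// Hypothetical builder: top spenders among orders with a given status.
function buildTopSpendersPipeline(status, topN) {
  return [
    // Filter first so later stages see fewer documents
    { $match: { status } },
    // One result per user, with the summed total and the number of orders
    { $group: { _id: "$user", totalSpent: { $sum: "$totalPrice" }, orderCount: { $sum: 1 } } },
    // Highest spenders first
    { $sort: { totalSpent: -1 } },
    { $limit: topN },
  ];
}

const pipeline = buildTopSpendersPipeline("Completed", 5);
console.log(pipeline.length); // 4

// Usage (assumes an open connection and the Order model defined above):
// const rows = await Order.aggregate(pipeline);
```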

4. Soft Deletes Instead of Permanent Deletion

Instead of removing a document, set a deletedAt timestamp on it and filter it out of normal queries.

const UserSchema = new mongoose.Schema({
  name: String,
  email: String,
  deletedAt: { type: Date, default: null },
});

✔ Hides deleted items without losing data

🔹 Query only active users:

User.find({ deletedAt: null });
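
To keep the deletedAt convention consistent across a codebase, the filter and update documents can be built in one place. The helpers below are a minimal sketch with hypothetical names (`activeOnly`, `softDeleteUpdate`, `restoreUpdate`), not part of Mongoose itself.

```javascript
// Adds the "not deleted" condition to any filter.
function activeOnly(filter = {}) {
  return { ...filter, deletedAt: null };
}

// Update document that marks a record as deleted at the given time.
function softDeleteUpdate(now = new Date()) {
  return { $set: { deletedAt: now } };
}

// Update document that brings a soft-deleted record back.
function restoreUpdate() {
  return { $set: { deletedAt: null } };
}

// Usage (assumes an open connection and the User model defined above):
// await User.updateOne({ _id: userId }, softDeleteUpdate());
// const users = await User.find(activeOnly({ email }));
// await User.updateOne({ _id: userId }, restoreUpdate());
```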

5. Virtual Fields for Computed Values

Virtual fields do not get stored in the database but are calculated dynamically.

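// Assumes the schema defines firstName and lastName
UserSchema.virtual("fullName").get(function () {
  return `${this.firstName} ${this.lastName}`;
});

✔ Use for derived data without increasing DB size. Virtuals are left out of JSON output unless the schema sets toJSON: { virtuals: true }, and they are not present on lean() results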

Conclusion

🚀 Best Practices Summary
✅ Embed small data, reference large data

✅ Use lean(), pagination, and caching

✅ Index frequently queried fields

✅ Use soft deletes instead of actual deletion

✅ Use environment variables for security

✅ Use middleware for automation

Following these practices will help you build efficient, scalable, and maintainable MongoDB applications with Mongoose! 🚀
