Dmitry Romanoff

Posted on Jul 12

Playing with Indexes in MongoDB

#database #programming #mongodb #learning

Indexes in MongoDB can dramatically improve the performance of your queries—if used right. In this article, we'll walk through a practical example of how indexes affect query performance, using a simple book database.

We'll go from inserting mock data to analyzing query execution plans, adding indexes, and understanding index limitations.

1. Generate Sample Data

Let's start by populating a collection with mock book data:

// Choose the database
use mylibrary;

// Define helper functions (basic randomizer)
function getRandomElement(arr) {
  return arr[Math.floor(Math.random() * arr.length)];
}

function getRandomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

var genres = ['Fiction', 'Non-Fiction', 'Science', 'Fantasy', 'Biography', 'History', 'Mystery', 'Romance'];
var formats = ['epub', 'pdf', 'txt', 'audio'];
var firstNames = ['John', 'Mary', 'Alice', 'Robert', 'Linda', 'Michael', 'Sarah', 'David'];
var lastNames = ['Smith', 'Johnson', 'Williams', 'Brown', 'Jones', 'Miller', 'Davis', 'Garcia'];

for (var i = 0; i < 1000; i++) {
  var author = getRandomElement(firstNames) + " " + getRandomElement(lastNames);
  var title = "Book Title " + (i + 1);
  var book = {
    author: author,
    title: "title,"
    genre: getRandomElement(genres),
    format: getRandomElement(formats),
    num_of_pages: getRandomInt(50, 1000),
    year_when_published: getRandomInt(1900, 2025)
  };

  db.books.insertOne(book);
}

2. Check Query Execution Without Index

Let’s look at the query plan for a simple .find() operation:

db.books.find().explain("executionStats")

{
  queryPlanner: {
    plannerVersion: 1,
    namespace: 'mylibrary;.books',
    winningPlan: { stage: 'COLLSCAN' }
  },
  executionStats: {
    executionSuccess: true,
    executionTimeMillis: '0.794',
    planningTimeMillis: '0.079',
    executionStages: {
      stage: 'COLLSCAN',
      nReturned: '1915',
      executionTimeMillisEstimate: '0.614'
    }
  },
  ...
}

This uses a COLLSCAN (collection scan), which means MongoDB is scanning every document—inefficient for large datasets.

3. Create an Index on the `author` Field

Let’s improve performance by indexing the author field:

db.books.createIndex({ author: 1 })

4. List All Indexes

db.books.getIndexes()

[
  { v: 4, key: { _id: 1 }, name: '_id_', ns: 'mylibrary;.books' },
  {
    v: 4,
    key: { author: 1 },
    name: 'author_1',
    ns: 'mylibrary;.books'
  }
]

5. Query by Author

db.books.find({ author: "John Smith" }).limit(5).pretty()

6. Query Plan for Author Search

db.books.find({ author: "John Smith" }).explain("executionStats")

{
  queryPlanner: {
    ...
    winningPlan: { stage: 'IXSCAN', indexName: 'author_1', direction: 'forward' }
  },
  executionStats: {
    executionSuccess: true,
    executionTimeMillis: '0.293',
    ...
  }
}

✅ The query now uses an IXSCAN (index scan) which is much faster!

7. Query by Title (No Index)

db.books.find({ title: "Book Title 10" }).explain("executionStats")

{
  queryPlanner: {
    ...
    winningPlan: { stage: 'COLLSCAN' }
  },
  executionStats: {
    executionSuccess: true,
    executionTimeMillis: '1.775',
    ...
  }
}

Still doing a full collection scan—not ideal.

8. Query by Author + Title

db.books.find({ author: "John Smith", title: "Book Title 10" }).explain("executionStats")

{
  queryPlanner: {
    ...
    winningPlan: { stage: 'IXSCAN', indexName: 'author_1', direction: 'forward' }
  },
  ...
}

This is better than a full scan, but still not ideal if you're frequently querying by title.

9. Compound Index on `year_when_published` and `title`

db.books.createIndex({ year_when_published: 1, title: 1 })

db.books.getIndexes()

[
  ...
  {
    v: 4,
    key: { year_when_published: 1, title: 1 },
    name: 'year_when_published_1_title_1',
    ns: 'mylibrary;.books'
  }
]

Query plan now improves:

db.books.find({ title: "Book Title 10" }).explain("executionStats")

{
  queryPlanner: {
    ...
    winningPlan: {
      stage: 'IXSCAN',
      indexName: 'year_when_published_1_title_1',
      ...
    }
  },
  ...
}

10. Index Size Limitations

Let's test MongoDB's index size limit by trying to insert a document with a huge title:

var longTitle = "A".repeat(5000); // 5000-character string

db.books.insertOne({
  title: longTitle,
  author: "Test Author",
  year_when_published: 2020,
  genre: "Fiction",
  format: "pdf",
  num_of_pages: 300
});

Error:

MongoServerError: key too large to index. Index name: year_when_published_1_title_1

🚫 MongoDB’s index key size is limited (1024 bytes). Long strings in indexed fields can break inserts if the index becomes too large.

🔚 Conclusion

Indexes can drastically improve query performance.
Use .explain("executionStats") to analyze and optimize.
Compound indexes are powerful, but mindful of their field sizes.
Avoid indexing large strings unless absolutely necessary.

DEV Community

Playing with Indexes in MongoDB

1. Generate Sample Data

2. Check Query Execution Without Index

3. Create an Index on the `author` Field

4. List All Indexes

5. Query by Author

6. Query Plan for Author Search

7. Query by Title (No Index)

8. Query by Author + Title

9. Compound Index on `year_when_published` and `title`

10. Index Size Limitations

🔚 Conclusion

Top comments (0)

1. Generate Sample Data

2. Check Query Execution Without Index

3. Create an Index on the author Field

4. List All Indexes

5. Query by Author

6. Query Plan for Author Search

7. Query by Title (No Index)

8. Query by Author + Title

9. Compound Index on year_when_published and title

10. Index Size Limitations

🔚 Conclusion

3. Create an Index on the `author` Field

9. Compound Index on `year_when_published` and `title`