Indexing makes the read queries faster. We can apply indexing on any of the fields in the document including the embedded fields. When documents are indexed MongoDB will search with filtered offset first rather than doing COLLSCAN ( column-span) on all documents and then applying filter criteria to it.
Let us understand this by creating an index to the existing collection and comparing the query execution stats before and after applying the indexing.
By the way, I am using Mongo DB Atlas Free cluster for this walk-through. Mongo Altas is a cloud-based solution to host and manage NoSQL data and it also provides a free cluster for learning purposes with one click option to load sample data set into your free cluster. I liked it :)
To connect to the Atlas cluster I am using VS Code extension “MongoDB VS Code”. There are other ways as well such as using mongo shell, mongo Atlas UI. Please refer to the link below https://www.mongodb.com/docs/atlas/getting-started/
I am using the “sample_airbnb” data set loaded into my Atlas account and querying them using MongoDB for VS code extension.
Please note In case the image screenshot are not clearly visible kindly open them in a new tab or click on the image description to open in new tab.
In above query , I am executing find query on collection: “listingAndReviews” where the property_type is house. I am also using regex to fetch result ignoring case sensitive and the explain(‘executionStats’) function to get query plan.
In the presence of indexing, the winningPlan section will have inputStage.stage as IXSCAN
I removed all pre-created indexes from the “listingAndReviews” collection and re-execute the same query. We can see the winningPlan.stage as “COLLSPAN” and the executionStats.executionTimeMillis as 118. The totalDocsExamined as 5555 to find the 606 records matching our search criteria.
Let us create an index on the field “property_ type” to fetch results faster.
To define an index , We need to understand the queries required and the volume of read and write operations that going to be performed on the collection. Indexing will have some amount of taxation on the write operation that’s the reason we shouldn’t apply indexing to all the document fields.
Note* Creating indices on the fields where the fetch query will return close to all documents from the collection will not help and in fact will slow the query execution. Create index based on the query use case such that the query result should return only the limited fetch result.
Compound Index
The above examples are on the single index field. There is another type of index called “Compound Index” where two or more fields are considered for indexing in the same order they were added.
If we have two fields added as compound indexes the precedence will start from left to right. For example, let us create a compound index on “listingAndReviews” collection using “beds and address.country_code ” (embedded document) fields.
The order of the fields is important here and precedence is from left to right. i.e. the field beds get first priority and followed by address.country_code.
In the above query, Even though we use two filters in the find query to fetch results having beds count as 1 and can accommodate 4 people. here “accommodates” field is not part of our index keys. Mongo DB performs IXSCAN i.e. index scan and not the COLLSPAN column spanning all the documents within the collection because we have the “beds” added as part of our key index and the precedence is from left to right.
In case we query only with accommodates fields, Mongo DB performs COLLSPAN and examines all the documents of a total count of 5555
Rules for defining an index
- Use the ESR (Equality, Sort, Range) Rule
- Create Indexes to Support Your Queries
- Use Indexes to Sort Query Results
We need index not only for fetching the results quickly but also to use sorting. In case of non indexed document, Mongo DB has limit of 32 MB in memory and this will time out in case sorting (Default in- memory sorting) of huge millions of document without indexing.
- Ensure Indexes Fit in RAM
- Create Queries that Ensure Selectivity
In the next part of this series, we will drill down to other types of indexes and the best practices to follow while creating indexes.
To be continued.
Top comments (0)