Welcome to a series on enriching the search experience by using Views with MongoDB Search!
MongoDB lexical and vector indexes are built directly from the data in the associated collection. Every document is mapped through an index configuration into one Lucene document, or more than one when using embeddedDocuments. A mapping determines which fields are indexed and, primarily for string fields, how they are indexed. A mapping can only map what it sees: the fields on each available document.
There are situations where filtering which documents are indexed is necessary, for example leaving out documents where archived is true. Rather than indexing all documents and filtering them out at $search time, we can simply avoid indexing them altogether. In this case, the index size depends only on the non-archived documents.
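As a rough sketch of that filtering idea, a view could drop archived documents before they ever reach the index. The view name docs_active is hypothetical, and the $expr form of $match is used on the assumption that search indexes on views only accept $match stages written that way; check the documentation for your server version.

```javascript
// Pipeline that keeps only documents whose `archived` field is not true.
// Documents without an `archived` field are kept as well.
const activeOnlyPipeline = [
  { $match: { $expr: { $ne: ["$archived", true] } } }
];

// "docs_active" is a hypothetical view name used for illustration.
// `db` is the database handle available in mongosh.
function createActiveView(db) {
  return db.createView("docs_active", "docs", activeOnlyPipeline);
}
```

In mongosh, calling createActiveView(db) would create the view, and a search index built on it would only ever see the non-archived documents.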
There are also situations where enriching a document before it is indexed can enhance searches, such as indexing the size of an array rather than evaluating a query-time expression that scans the full collection, or transforming a boolean into a string to support faceting.
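To make the boolean-to-string case concrete, here is a minimal sketch of a view pipeline that adds a string copy of a boolean field. The view name docs_facetable and the field name archived_label are hypothetical, and the index mapping needed for faceting on that field is left out.

```javascript
// Adds `archived_label`, a string version of the boolean `archived` field.
// $toString yields "true" or "false" for booleans and null when the field is missing.
const booleanToStringPipeline = [
  { $addFields: { archived_label: { $toString: "$archived" } } }
];

// "docs_facetable" is a hypothetical view name used for illustration.
function createFacetableView(db) {
  return db.createView("docs_facetable", "docs", booleanToStringPipeline);
}
```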
Index it like you want to search it—check out the recipes in this series to learn more.
First view: Indexing array size
Our first enrichment recipe fits when your application needs to query over the size of an array. Here are a couple of documents:
[
  {
    _id: 1,
    values: [1, 2, 3, 4]
  },
  {
    _id: 2,
    values: [1, 2, 3, 4, 5]
  }
]
We would like to find all documents that have exactly four values. We could use a brute-force $match aggregation pipeline like this:
[
  {
    $match: {
      values: { $size: 4 }
    }
  }
]
That aggregation returns the right results, but at the expense of a COLLSCAN (collection scan) that evaluates the expression for every document in the collection. Scale matters: this is fine up to a point, but it becomes too slow as the collection grows.
Rather than recomputing the size at query time, and visiting documents that do not match the criteria along the way, it would be far more efficient to have the size of the array as a separate field and index that instead. Sure, we could use the Computed Pattern, though that would require our application to compute the size, store it alongside the array, and keep it in sync. There's an alternative: compute the size during indexing.
Creating an enriched view
First, create a standard view on the docs collection, called docs_with_sizes:
db.createView(
  "docs_with_sizes",
  "docs",
  [
    {
      $addFields: {
        num_values: { $size: "$values" }
      }
    }
  ]
)
This is using JavaScript in mongosh, though you can create a view from other environments as well.
This view acts as a collection such that db.docs_with_sizes.find({}) returns:
[
{ _id: 1, values: [1, 2, 3, 4], num_values: 4 },
{ _id: 2, values: [1, 2, 3, 4, 5], num_values: 5 }
]
The num_values field isn't stored in the database—it's computed on the fly.
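One caveat: the $size aggregation operator raises an error when its argument is not an array, which includes documents where values is missing. If the collection might contain such documents, a more defensive pipeline could look like the sketch below. The view name docs_with_sizes_safe and the fallback value of 0 are assumptions for illustration, not part of the recipe above.

```javascript
// Defensive variant of the view pipeline: fall back to 0 when `values`
// is missing or not an array, since $size only accepts arrays.
const safeSizePipeline = [
  {
    $addFields: {
      num_values: {
        $cond: [{ $isArray: "$values" }, { $size: "$values" }, 0]
      }
    }
  }
];

// "docs_with_sizes_safe" is a hypothetical view name used for illustration.
function createSafeSizeView(db) {
  return db.createView("docs_with_sizes_safe", "docs", safeSizePipeline);
}
```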
Here's where the powerful part comes in—creating a search index on a view:
db.docs_with_sizes.createSearchIndex(
  "view_index",
  {
    "mappings": {
      "dynamic": true
    }
  }
)
Our view_index has indexed the view, incorporating the added num_values field as an indexed, and thus queryable, value.
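Dynamic mapping indexes every supported field the view exposes. If only the array size needs to be searchable, a static mapping keeps the index smaller. The sketch below assumes a second, hypothetical index name (view_index_static) and maps only num_values as a number.

```javascript
// Static mapping: index only `num_values`, as a numeric field.
const staticDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      num_values: { type: "number" }
    }
  }
};

// "view_index_static" is a hypothetical index name used for illustration.
// `db` is the database handle available in mongosh.
function createStaticViewIndex(db) {
  return db
    .getCollection("docs_with_sizes")
    .createSearchIndex("view_index_static", staticDefinition);
}
```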
The final result
With this enriched index, we can straightforwardly query on num_values...
[
  {
    $search: {
      index: "view_index",
      equals: {
        path: "num_values",
        value: 4
      }
    }
  }
]
...yielding these results:
[ { _id: 1, values: [1, 2, 3, 4], num_values: 4 } ]
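Since num_values is indexed as a number, it should also work with numeric operators such as range. Here is a minimal sketch that looks for documents with at least five values. It assumes the pipeline is run against the view itself; depending on the server version, search queries for a view's index may need to be run against the source collection instead.

```javascript
// Matches documents whose indexed array size is 5 or more.
const atLeastFivePipeline = [
  {
    $search: {
      index: "view_index",
      range: { path: "num_values", gte: 5 }
    }
  }
];

// `db` is the database handle available in mongosh.
function findAtLeastFive(db) {
  return db
    .getCollection("docs_with_sizes")
    .aggregate(atLeastFivePipeline)
    .toArray();
}
```

With the two sample documents above, this query should match only the document with _id: 2.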
Conclusion
Index what you want to search. Using Views with MongoDB Search enhances and enables a number of interesting use cases that would otherwise be difficult or burdensome to tackle. Add these techniques to your search toolkit. Stay tuned for additional recipes in this series!