Audience:
This guide is intended for backend engineers working with MongoDB in production,
especially those using aggregation pipelines and$lookup.
MongoDB indexing issues rarely come from missing indexes.
They come from incorrect query and aggregation shape.
This document captures hard rules learned from real production systems, where
indexes existed but were never used, queries were slow, and COLLSCAN appeared
despite correct-looking indexes.
(Collection names used below are illustrative but represent real production patterns.)
Core Principles
- Every index must map to a real query
- Every index must have a documented reason
- Indexes are validated using
explain() - Unused indexes are technical debt
- Aggregation shape determines index usability
How to Verify Index Usage (Read This First)
Before creating or blaming an index, always verify usage.
Query Planner
db.collection.explain("queryPlanner").aggregate([...])
Check:
winningPlan.stage = IXSCAN
Execution Stats (MANDATORY)
db.collection.explain("executionStats").aggregate([...])
Check:
-
totalDocsExaminedis low -
totalKeysExaminedis reasonable - No
COLLSCAN
Note:
queryPlannershows possible plans.
executionStatsshows what actually executed.
Index Usage Counter
db.collection.aggregate([{ $indexStats: {} }])
Interpretation:
-
ops > 0→ active index -
ops = 0→ candidate for removal
Index Design Rules
Rule 1: Entry Point Matters
MongoDB executes aggregation pipelines left to right.
Indexes apply only to the collection where the pipeline starts
$lookupdoes not reorder execution like SQL joins.
Example (Form ↔ FormStep)
✅ Good (filter FormStep first):
FormStep → $match → $lookup Form
❌ Bad (filter after join):
Form → $lookup FormStep → $match
Example (Payment ↔ WalletPayment)
✅ Good (filter WalletPayment first):
WalletPayment → $match → $lookup Payment
❌ Bad:
Payment → $lookup WalletPayment → $match
If you start on the wrong collection, no index can save you.
Rule 2: Unique Equality Ends Index Traversal
If a query includes equality on a unique field:
paymentId = <value>
Then:
- MongoDB uses only that unique index
- No compound index fields after it will be used
- This is expected behavior
Example (unique paymentId)
WalletPayment.find({
paymentId: paymentId,
status: 'SUCCESS',
date: { $gte: start, $lte: end },
})
Even if this index exists:
@@index([status, date])
MongoDB will still use only:
WalletPayment_paymentId_key
Because paymentId is unique, the query is resolved after the equality match, and the remaining predicates are evaluated in memory.
Rule 3: $expr Prevents Index Optimization
-
$exprmay still touch an index -
BUT it prevents:
- range bounds
- compound index usage
- efficient scans
❌ Avoid:
$expr: {
$gte: ['$createdAt', date]
}
✅ Use:
createdAt: {
$gte: date
}
If you use $expr, assume reduced index effectiveness.
Rule 4: $lookup Index Rules
Indexes on the foreign collection are used only if:
-
$matchis inside$lookup.pipeline -
$matchis the first stage - No
$expr - Predicates are sargable (direct field comparisons)
- No unique equality dominates the lookup
Note:
$lookupwithletvariables internally uses$expr.
Index usage is still possible only when equality predicates remain sargable.
Example (Form → FormStep)
$lookup: {
from: 'FormStep',
let: { formId: '$_id' },
pipeline: [
{
$match: {
formId: '$$formId',
status: 'PENDING'
}
}
],
as: 'steps'
}
Index used on FormStep:
@@index([formId, status])
Example (Payment → WalletPayment)
$lookup: {
from: 'WalletPayment',
let: { paymentId: '$_id' },
pipeline: [
{
$match: {
paymentId: '$$paymentId',
status: { $in: ['SUCCESS', 'FAILED'] },
date: {
$gte: startDate,
$lte: endDate,
},
},
},
],
as: 'walletPayment',
}
Important:
- If
paymentIdis unique, MongoDB will still resolve the lookup using only the uniquepaymentIdindex - Indexes on
statusordatewill not be traversed
Rule 5: $match After $lookup Is In-Memory
$lookup → $unwind → $match
❌ Indexes on the joined collection will NOT be used.
Bad example
Form → $lookup FormStep → $match { 'steps.status': 'PENDING' }
Good example
FormStep → $match { status: 'PENDING' } → $lookup Form
Always filter before or inside $lookup.
Aggregation-Specific Guidelines
Date Filtering
Always filter dates using direct field predicates:
createdAt: {
$gte: new Date(start),
$lte: new Date(end)
}
Never use $expr for date ranges.
Sorting
If you sort:
$sort: {
createdAt: -1
}
Then createdAt must appear last in the index.
$facet
Each facet runs its own pipeline.
Indexes must support each facet independently.
Index Removal Policy
An index must be reviewed for removal if:
-
ops = 0for 14 days in production - No query references it in the registry
- It was created speculatively
Removing unused indexes is safe and healthy.
Golden Rules (Memorize These)
- One index = one query pattern
-
$matchearly or inside$lookup - Unique equality dominates everything
-
$exprkills optimization -
$lookupis not SQL JOIN -
ops = 0means debt
Final Note
Indexes do not make queries fast.
Correct query shape makes indexes usable.
Top comments (0)