DEV Community


Detective mode: Fixing the MongoDB Aggregation pipeline

This article was written by Darshan Jayarama.

It was a Friday, and I was quickly wrapping up my week, finishing the tasks that were assigned. Suddenly, the board became red — another S1. I was like, not now!!!

The application team started complaining: API response times against the database had jumped from 600ms to 17 seconds. Booooom!!! That’s insane. No other choice — since it was production, I had to jump in. The war room was ready. The monitoring dashboard looked like a heart patient’s ECG — spikes, timeouts, and the main app was choking.

What's the Mystery?

From the monitoring dashboard, we traced the spike to a query that had recently changed: the VP had asked for a breakdown by region on the dashboard. It was just one more field. But it was ONE. MORE. FIELD.

Here’s the innocent query that had been harmless for years:

db.orders.aggregate([
 {
   $match: {
     status: "completed",
     created_at: { $gte: new Date("2025-01-31") }
   }
 },
 {
   $group: {
     _id: "$customer_id",
     total_orders: { $sum: 1 },
     total_revenue: { $sum: "$amount" },
     avg_order_value: { $avg: "$amount" }
   }
 },
 {
   $sort: { total_revenue: -1 }
 },
 {
   $limit: 100
 }
])

Simple, right? A top-100-customers-by-revenue dashboard.

Execution time: 500–700ms. Docs examined: ~2.5M.

Then we added just one more field:

db.orders.aggregate([
 {
   $match: {
     status: "completed",
     created_at: { $gte: new Date("2024-01-01") }
   }
 },
 {
   $lookup: { 
     from: "customers",
     localField: "customer_id",
     foreignField: "_id",
     as: "customer"
   }
 },
 {
   $unwind: "$customer"
 },
 {
   $group: {
     _id: {
       customer_id: "$customer_id",
       region: "$customer.region"
     },
     total_orders: { $sum: 1 },
     total_revenue: { $sum: "$amount" },
     avg_order_value: { $avg: "$amount" }
   }
 },
 {
   $sort: { total_revenue: -1 }
 },
 {
   $limit: 100
 }
])

Execution time: >15,000ms. Docs examined: 2.5M orders, each one triggering an unindexed lookup into 2.5M customers. Memory exploded.

The Investigation: Agent Darshan reporting

Evidence #1:

The MongoDB profiler, set to level 2, collects every query being run against the given database. Set this level only when needed, or you’re gonna blow up the log file:

db.setProfilingLevel(2, { slowms: 100 })
db.system.profile.find().sort({ ts: -1 }).limit(1).pretty()
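And once you’re done collecting evidence, remember to switch the profiler back off — level 2 on a busy production database is expensive. A quick mongosh sketch:

```
db.getProfilingStatus()   // confirm the current level and slowms threshold
db.setProfilingLevel(0)   // stop profiling every operation
```

Level 0 disables profiling entirely; level 1 would keep logging only operations slower than slowms, which is a reasonable steady-state setting.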

The profiler output showed our query brutally attacking the MongoDB cluster, no mercy:

{
 "op": "command",
 "ns": "production.orders",
 "command": { /* our aggregation */ },
 "keysExamined": 2489234,      //  Used index on status
 "docsExamined": 2489234,      // But still examined all matched docs
 "numYield": 18943,
 "nreturned": 100,
 "responseLength": 8234,
 "millis": 647,
 "planSummary": "IXSCAN { status: 1 }",  //  Used status index
 "execStats": {
   "stage": "FETCH",
   "filter": {
     "created_at": { "$gte": ISODate("2025-01-31") }  // 👈 Filter applied AFTER index
   },
   "inputStage": {
     "stage": "IXSCAN",
     "keyPattern": { "status": 1 },
     "indexName": "status_1",
     "keysExamined": 2489234,
     "seeks": 1,
     "dupsTested": 0,
     "dupsDropped": 0,
     "direction": "forward"
   },
   "totalDocsExamined": 2489234,
   "totalKeysExamined": 2489234,
   "executionTimeMillisEstimate": 447
 }
}

Red Flags:

  • docsExamined = ~2.5M, whereas nReturned was only 100.
  • planSummary: IXSCAN on the single-field status_1 index — a low-cardinality field.
  • High numYield (18,943 yields).

Evidence #2:

Time to review the explain plan and make the query confess to the brutal attack:
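This kind of plan comes from running the aggregation through explain with execution statistics — roughly like this in mongosh (pipeline body abbreviated; note that "executionStats" actually executes the query, so be careful on production):

```
db.orders.explain("executionStats").aggregate([
  { $match: { status: "completed", created_at: { $gte: new Date("2024-01-01") } } },
  { $lookup: { from: "customers", localField: "customer_id", foreignField: "_id", as: "customer" } },
  // ...remaining stages as above
])
```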

{
 "stages": [
   {
     "$cursor": {
       "queryPlanner": {
         "winningPlan": {
          "stage": "IXSCAN",
           "keyPattern": { "status": 1 },
           "filter": {
             "status": { "$eq": "completed" },
             "created_at": { "$gte": new Date("2025-01-31") }
           }
         }
       },
       "executionStats": {
         "executionTimeMillis": 647,  // $match stage: 647ms
         "totalDocsExamined": 2489234,
          "totalKeysExamined": 2489234,
         "nReturned": 2489234  // Passed 2.48M docs to next stage
       }
     }
   },
   {
     "$lookup": {  // 💀💀💀 THE SERIAL KILLER
       "from": "customers",
       "as": "customer",
       "executionTimeMillis": 13934,  // 13.9 seconds just for $lookup!
       "totalDocsExamined": 2489234,  // Looked up 2.48M customers
       "stage": "NESTED_LOOP_JOIN",  // No index on customers._id
       "indexesUsed": []  // 👈 DISASTER
     }
   },
   {
     "$unwind": {
       "executionTimeMillis": 823
     }
   },
   {
     "$group": {
       "executionTimeMillis": 1156,
       "usedDisk": true,  // 💀 Spilled to disk (>100MB memory)
       "spilledRecords": 847293
     }
   },
   {
     "$sort": {
       "executionTimeMillis": 87,
       "sortPattern": { "total_revenue": -1 },
       "usedDisk": false  // Lucky, only 100 docs to sort after $limit
     }
   }
 ],
 "executionTimeMillis": 16847
}

Time breakdown:

  • $match: 647ms → $lookup: 13.9 seconds → $unwind: 823ms → $group: 1156ms → $sort: 87ms → Total 16.9 seconds

Suspects list

Suspect #1:

An unoptimized index for $match. We had a single-field index on status, but no compound index on (status, created_at) — so the date filter was applied only after the index scan had fetched every "completed" document.

  • Motive: Index inefficiency
  • Proof: a huge nReturned (~2.48M docs) passed to the next stage.
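The fix for this suspect is a compound index covering both filter fields — equality field first, range field second — so the $match becomes a tight index scan instead of a scan-then-filter:

```
// Equality on status first, then range on created_at:
// the date predicate is now answered by the index itself.
db.orders.createIndex({ status: 1, created_at: 1 })
```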

Suspect #2:

The brutal $lookup: for each of the ~2.48M documents coming out of the previous stage, it ran a join against the customers collection, with indexesUsed coming back empty. Isn’t that brutal?

  • Motive: just one more field request
  • Proof: empty indexesUsed and 13.9s execution time.

Suspect #3:

$group became a memory hog, spilling ~847K records to disk once it blew past MongoDB’s 100MB per-stage limit.

  • Motive: Grouping 2.4M records with usedDisk: true
  • Proof: 847K spilledRecords.
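As a stopgap (not a fix!), the allowDiskUse option lets memory-hungry stages spill to disk instead of erroring at the 100MB limit — on MongoDB 6.0+ this spilling is already the default behavior. It trades an error for slow disk I/O, so the real cure is still shrinking the data before $group:

```
// Band-aid: permit disk spills for stages exceeding 100MB of RAM,
// where `pipeline` is the stage array shown above.
db.orders.aggregate(pipeline, { allowDiskUse: true })
```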

Suspect #4:

We were performing an expensive $lookup operation before we could filter down to 100 records.

  • Motive: Not thinking about pipeline optimization
  • Proof: Processing 2.48M docs when we only needed 100

Recapping all mistakes to fix!

We were doing: $match: 647ms (inefficient index) → $lookup: 13.9s (massive 2.4M join) → $unwind: 823ms → $group: 1,156ms (memory overflow) → $sort: 87ms → total: 16.9 seconds.

We redesigned the pipeline as follows:

  • $match with a compound index for efficient filtering
  • $group by customer_id to reduce the dataset
  • $sort and $limit to the top 100 customers — only 100 records left
  • $lookup for just those 100 customers
  • $project to add the region information

Does it fix the issues?

The Solution:

2.5M orders
  ↓ $match (INDEXED — fast!)
2.48M orders
  ↓ $group (by customer_id only)
~500K customer groups
  ↓ $sort + $limit
100 top customers
  ↓ $lookup (only 100 lookups!) ← ✅ FIXED
100 results with customer data
  ↓ $project (extract region)
100 final results

The best way to use $lookup is to use it as little — and as late — as possible!

db.orders.aggregate([
 {
   $match: {
     status: "completed",
     created_at: { $gte: new Date("2024-01-01") }
   }
 },
 {
   $group: {
     _id: "$customer_id",  // Group BEFORE lookup
     total_orders: { $sum: 1 },
     total_revenue: { $sum: "$amount" },
     avg_order_value: { $avg: "$amount" }
   }
 },
 {
   $sort: { total_revenue: -1 }
 },
 {
   $limit: 100  // Reduce to 100 BEFORE lookup
 },
 {
   $lookup: {  // Now only looking up 100 customers
     from: "customers",
     localField: "_id",
     foreignField: "_id",
     as: "customer"
   }
 },
 {
   $unwind: "$customer"
 },
 {
   $project: {
     customer_id: "$_id",
     region: "$customer.region",
     total_orders: 1,
     total_revenue: 1,
     avg_order_value: 1
   }
 }
])

Curious about the metrics now? I was excited to see them too:

Execution time: 847ms. docsExamined: 2.5M (the $match still scans the matching orders), but the $lookup now touches only 100 documents.

✅ Issue resolved

  • Root cause: $lookup before aggregation.
  • Fix: Aggregate first, then join the top 100
  • Performance: 17s → 0.8s (20x improvement)
  • Added indexes: (status, created_at)

The Costly Lessons Learned:

  1. One field can kill you. That “simple” region addition cost us $24,000 in downtime.
  2. $lookup is not free. It’s actually very expensive. Treat it like a loaded gun.
  3. Order matters. Aggregate → Limit → Join. Not Join → Aggregate → Limit.
  4. Indexes aren’t optional. They’re the difference between 1.8 seconds and 0.15 seconds.
  5. Explain is your friend. If you’re not running explain(), you’re flying blind.
  6. The 100MB limit is real. When $group says “usedDisk: true”, you’re already in trouble.
  7. Test with production data volumes. 1,000 docs is not the same as 2,500,000 docs.
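On that last lesson: you don’t need real production data to test at production volume. A hypothetical helper like this (plain JavaScript, with made-up field distributions) generates synthetic orders you can bulk-insert into a staging cluster before the pipeline ships:

```javascript
// Hypothetical data generator: synthetic orders at production-like scale.
// Field names match the article's schema; the distributions are invented.
function makeOrders(n) {
  const statuses = ["completed", "pending", "cancelled"];
  const orders = [];
  for (let i = 0; i < n; i++) {
    orders.push({
      customer_id: i % 500000,                          // ~500K distinct customers
      status: statuses[i % statuses.length],            // cycle through statuses
      amount: Math.round(Math.random() * 50000) / 100,  // 0.00 - 500.00
      created_at: new Date(2024, 0, 1 + (i % 365))      // spread across a year
    });
  }
  return orders;
}

// In mongosh you would insert this in batches, e.g.:
// db.orders.insertMany(makeOrders(10000))
```

Seed 2.5M of these in batches and the pathological $lookup behavior reproduces long before a VP ever asks for one more field.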
