<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mohammad Zbib</title>
    <description>The latest articles on DEV Community by Mohammad Zbib (@mhmd_zbib).</description>
    <link>https://dev.to/mhmd_zbib</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2576006%2Fae31333f-a8a3-4288-befc-39909b3d94fd.jpg</url>
      <title>DEV Community: Mohammad Zbib</title>
      <link>https://dev.to/mhmd_zbib</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mhmd_zbib"/>
    <language>en</language>
    <item>
      <title>MongoDB at Scale: Common Anti Patterns That Silently Kill Performance</title>
      <dc:creator>Mohammad Zbib</dc:creator>
      <pubDate>Wed, 18 Feb 2026 15:41:18 +0000</pubDate>
      <link>https://dev.to/mhmd_zbib/mongodb-at-scale-common-anti-patterns-that-silently-kill-performance-jci</link>
      <guid>https://dev.to/mhmd_zbib/mongodb-at-scale-common-anti-patterns-that-silently-kill-performance-jci</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Is your MongoDB app running slow? The problem might not be your code, but how your database is set up. MongoDB is very flexible, but this can sometimes lead to big performance issues if you don't use it right. This article will explain common MongoDB problems and give you smart ways to fix them, so your apps can be fast and handle a lot of users.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Good and Bad of MongoDB's Flexibility
&lt;/h2&gt;

&lt;p&gt;MongoDB lets you build apps quickly because its data structure is easy to change. But this freedom can also hide problems. If you treat MongoDB like a traditional SQL database, you'll run into issues like making too many small requests (N+1 queries), scanning huge amounts of data, and using complicated data processing steps that slow everything down. To truly master MongoDB, you need to understand how it works inside and design your data and queries to match its strengths.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Performance Problems and How to Solve Them
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The N+1 Query Problem: A Hidden Performance Killer
&lt;/h3&gt;

&lt;p&gt;The N+1 query problem happens when your app first gets a list of main items, and then for &lt;em&gt;each&lt;/em&gt; of those items, it makes a separate request to get related details (the "N" queries). This means your app talks to the database too much, which wastes time and makes the database work harder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine you have &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;customers&lt;/code&gt; collections. To show a list of orders with customer names, a common mistake is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Get all orders (the "+1" query)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;orders&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// 2. For each order, get its customer (the "N" queries)&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;customer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;customers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;findOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;customerId&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have 100 orders, this runs 101 queries. Each lookup by &lt;code&gt;_id&lt;/code&gt; is fast on its own (the &lt;code&gt;_id&lt;/code&gt; index always exists), but the 100 extra round trips add network latency and connection overhead on every request. Join on a non-indexed field instead, and each of those queries becomes a full collection scan, making things far worse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart Solutions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keep Related Data Together (Embedding):&lt;/strong&gt; If data is often used together and doesn't change much, put it directly inside the main document. For example, put the customer's name and key details right into the &lt;code&gt;order&lt;/code&gt; document. This means you only need one request to get all the info.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batching Requests:&lt;/strong&gt; If you can't embed data, gather all the IDs you need and ask for them in one go. Instead of 100 separate requests for customers, make one request asking for all 100 customer IDs at once. This saves a lot of back-and-forth communication.&lt;/li&gt;
&lt;/ul&gt;
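&lt;p&gt;The batching idea can be sketched with the same &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;customers&lt;/code&gt; collections from the example above (Node.js driver assumed): collect the IDs, fetch every needed customer in a single &lt;code&gt;$in&lt;/code&gt; query, and merge in memory.&lt;/p&gt;

```javascript
// Batched alternative to the N+1 loop: one query fetches every
// customer the current page of orders needs.
async function attachCustomers(db, orders) {
  // Collect the unique customer IDs from all orders.
  const ids = [...new Set(orders.map((o) => o.customerId))];

  // One round trip instead of N: fetch all matching customers at once.
  const customers = await db
    .collection('customers')
    .find({ _id: { $in: ids } })
    .toArray();

  // Index the results by _id for O(1) lookup while merging.
  const byId = new Map(customers.map((c) => [c._id, c]));
  for (const order of orders) {
    order.customer = byId.get(order.customerId) ?? null;
  }
  return orders;
}
```

&lt;p&gt;One customer query replaces the N separate lookups, no matter how many orders are on the page.&lt;/p&gt;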

&lt;h3&gt;
  
  
  2. The High Cost of Scanning Millions of Documents
&lt;/h3&gt;

&lt;p&gt;When MongoDB can't use an index to find data, it has to read every single document in a collection. This is called a &lt;code&gt;COLLSCAN&lt;/code&gt; (collection scan). If your collection has millions of documents, reading all of them takes a very long time, uses a lot of your computer's power, and makes your app feel very slow. In systems with many servers (sharded clusters), this problem gets even worse because it has to scan across all of them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart Solutions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Good Indexing Strategy:&lt;/strong&gt; Indexes are like a book's table of contents. They help MongoDB find data quickly. Create indexes on all fields you search by, sort by, or use in &lt;code&gt;$lookup&lt;/code&gt; operations. For searches that use multiple fields, create &lt;strong&gt;compound indexes&lt;/strong&gt; (indexes on several fields). Also, try to make &lt;strong&gt;covered queries&lt;/strong&gt;, where all the data needed for a query is found directly in the index, so MongoDB doesn't even need to look at the main documents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Fields Carefully for Indexes:&lt;/strong&gt; Put indexes on fields that have many different values (high cardinality) and are good at narrowing down search results. Avoid indexing fields with only a few different values unless they are part of a compound index that helps a lot.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partial Indexes:&lt;/strong&gt; If only some of your documents have a certain field, or if you only care about a specific group of documents, you can use &lt;strong&gt;partial indexes&lt;/strong&gt;. These indexes are smaller and faster because they only cover a part of your collection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTL Indexes:&lt;/strong&gt; For data that expires (like old logs), &lt;strong&gt;TTL indexes&lt;/strong&gt; automatically delete old documents. This keeps your collections from getting too big and helps queries stay fast.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;explain()&lt;/code&gt; to See What's Happening:&lt;/strong&gt; Use &lt;code&gt;db.collection.explain("executionStats")&lt;/code&gt; to understand how your queries are running. Look for &lt;code&gt;COLLSCAN&lt;/code&gt; (bad!), and check &lt;code&gt;totalKeysExamined&lt;/code&gt; (how many index entries it looked at) versus &lt;code&gt;totalDocsExamined&lt;/code&gt; (how many documents it looked at). If &lt;code&gt;totalDocsExamined&lt;/code&gt; is much higher than the number of results you get, your index isn't working well.&lt;/li&gt;
&lt;/ul&gt;
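&lt;p&gt;A small sketch of two of these habits, under assumed collection and field names (&lt;code&gt;orders&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;createdAt&lt;/code&gt;): create the compound index a query needs, and read &lt;code&gt;explain("executionStats")&lt;/code&gt; output with a simple ratio check.&lt;/p&gt;

```javascript
// Hypothetical example: orders are filtered by status and sorted by
// createdAt, so one compound index serves both the filter and the sort.
async function ensureOrderIndexes(db) {
  await db.collection('orders').createIndex({ status: 1, createdAt: -1 });
}

// Quick sanity check on the "executionStats" section of explain() output:
// if MongoDB examined an order of magnitude more documents than it
// returned, the query is not using a selective index. The 10x threshold
// is an illustrative heuristic, not an official rule.
function indexLooksHealthy(executionStats) {
  const { totalDocsExamined, nReturned } = executionStats;
  if (nReturned === 0) return totalDocsExamined === 0;
  return !(totalDocsExamined / nReturned > 10);
}
```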

&lt;h3&gt;
  
  
  3. The &lt;code&gt;$lookup&lt;/code&gt; (Join) Cost: A Tricky Balance
&lt;/h3&gt;

&lt;p&gt;MongoDB's &lt;code&gt;$lookup&lt;/code&gt; feature lets you combine data from different collections, similar to a &lt;code&gt;JOIN&lt;/code&gt; in SQL. It's powerful, but it's not a magic bullet. Each &lt;code&gt;$lookup&lt;/code&gt; step uses up resources and can slow things down, especially with large amounts of data or if you don't have the right indexes. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart Solutions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Index the Joined Fields:&lt;/strong&gt; Always make sure the fields you use to connect collections in &lt;code&gt;$lookup&lt;/code&gt; (the &lt;code&gt;localField&lt;/code&gt; and &lt;code&gt;foreignField&lt;/code&gt;) have indexes. This helps MongoDB find matching documents quickly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;$lookup&lt;/code&gt; Less Often:&lt;/strong&gt; Don't use &lt;code&gt;$lookup&lt;/code&gt; for every connection between data. If you often need related data together, embedding it is usually better. Use &lt;code&gt;$lookup&lt;/code&gt; only when embedding isn't practical, like for very large related data or complex many-to-many relationships.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch Memory Use:&lt;/strong&gt; &lt;code&gt;$lookup&lt;/code&gt; operations can use a lot of memory. Aggregation stages have a 100 MB memory limit by default; passing &lt;code&gt;allowDiskUse: true&lt;/code&gt; lets a pipeline spill temporary data to disk, which avoids errors but is very slow. Prefer filtering earlier or simplifying the query so the limit is never reached.&lt;/li&gt;
&lt;/ul&gt;
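&lt;p&gt;A minimal sketch of an indexed &lt;code&gt;$lookup&lt;/code&gt;, reusing the &lt;code&gt;orders&lt;/code&gt;/&lt;code&gt;customers&lt;/code&gt; example from earlier (the &lt;code&gt;status&lt;/code&gt; filter is an assumption for illustration):&lt;/p&gt;

```javascript
// Join orders to customers; $match runs first so the join touches
// as few documents as possible.
function ordersWithCustomerPipeline() {
  return [
    { $match: { status: 'shipped' } }, // shrink the input before joining
    {
      $lookup: {
        from: 'customers',
        localField: 'customerId',
        foreignField: '_id', // _id is always indexed on the foreign side
        as: 'customer',
      },
    },
    { $unwind: '$customer' }, // one customer per order, so flatten the array
  ];
}

// Usage, assuming a connected db handle:
// const docs = await db.collection('orders')
//   .aggregate(ordersWithCustomerPipeline())
//   .toArray();
```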

&lt;h3&gt;
  
  
  4. The Aggregation Pipeline Trap: Powerful but Dangerous
&lt;/h3&gt;

&lt;p&gt;MongoDB's aggregation framework is great for complex data tasks. But if you build long, complicated pipelines (sequences of operations), they can become very slow. Each step in the pipeline processes the results of the previous one, so one slow step early on can make the whole pipeline crawl.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart Solutions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keep Pipelines Small and Focused:&lt;/strong&gt; MongoDB is not a relational database, and long pipelines with many &lt;code&gt;$lookup&lt;/code&gt; stages degrade quickly. Break big tasks into several small, focused queries with minimal lookups, and compose the results in your application code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Order Steps Smartly:&lt;/strong&gt; The order of steps in your pipeline matters a lot. Always use &lt;code&gt;$match&lt;/code&gt; (filter) and &lt;code&gt;$project&lt;/code&gt; (choose specific fields) early. This reduces the amount of data that later, more expensive steps have to process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;$unwind&lt;/code&gt; with Caution:&lt;/strong&gt; The &lt;code&gt;$unwind&lt;/code&gt; step creates a separate document for each item in an array. If your arrays are very large, this can create a huge number of documents, using up a lot of memory and slowing things down. Look for other ways to handle arrays if &lt;code&gt;$unwind&lt;/code&gt; is too slow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduce What &lt;code&gt;$group&lt;/code&gt; Sees:&lt;/strong&gt; &lt;code&gt;$group&lt;/code&gt; cannot use an index directly; it processes whatever documents reach it. The best way to help it is an indexed &lt;code&gt;$match&lt;/code&gt; (and, where possible, an indexed &lt;code&gt;$sort&lt;/code&gt;) beforehand, so far fewer documents need to be grouped.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch Memory (Again):&lt;/strong&gt; Many aggregation steps can use a lot of memory and spill to disk. Use &lt;code&gt;explain()&lt;/code&gt; to check how much memory and disk space your pipelines are using. For very large tasks, consider doing some processing outside of MongoDB, like with Apache Spark.&lt;/li&gt;
&lt;/ul&gt;
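&lt;p&gt;The ordering advice above can be sketched as one pipeline; the field names (&lt;code&gt;status&lt;/code&gt;, &lt;code&gt;createdAt&lt;/code&gt;, &lt;code&gt;total&lt;/code&gt;) are illustrative:&lt;/p&gt;

```javascript
// $match first (index-friendly), $project next (smaller documents),
// then the expensive $group runs on an already-reduced stream.
function dailyRevenuePipeline(since) {
  return [
    { $match: { status: 'paid', createdAt: { $gte: since } } },
    {
      $project: {
        total: 1,
        day: { $dateToString: { format: '%Y-%m-%d', date: '$createdAt' } },
      },
    },
    { $group: { _id: '$day', revenue: { $sum: '$total' } } },
    { $sort: { _id: 1 } },
  ];
}
```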

&lt;h2&gt;
  
  
  How to Keep MongoDB Fast
&lt;/h2&gt;

&lt;p&gt;Making MongoDB fast for big applications is an ongoing effort. It means always thinking about how you design your data, how you use indexes, and how you write your queries. You need to constantly check how your database is performing and make changes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Always Check for Slow Queries:&lt;/strong&gt; Turn on the database profiler (&lt;code&gt;db.setProfilingLevel(1)&lt;/code&gt;) to find queries that are taking too long. Look at the results to see which queries are run often, take a long time, or look at too many documents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor Your Database:&lt;/strong&gt; Use tools like MongoDB Cloud Manager or Ops Manager to watch important numbers like how many operations are happening, memory use, CPU use, and how fast data is copied between servers. Pay attention to how much data is being read from and written to disk.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sharding for Huge Data:&lt;/strong&gt; For extremely large amounts of data, you need to split it across many servers (sharding). But choosing the wrong shard key can leave some servers overloaded while others sit idle. Pick a shard key that spreads data evenly and matches your common queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tune Your Hardware:&lt;/strong&gt; Make sure your servers have enough CPU, RAM, and fast storage (SSDs). Also, adjust MongoDB settings (like &lt;code&gt;wiredTigerCacheSizeGB&lt;/code&gt;) to match how your app uses the database.&lt;/li&gt;
&lt;/ul&gt;
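&lt;p&gt;As a sketch of the profiler workflow: after enabling profiling, slow operations land in the &lt;code&gt;system.profile&lt;/code&gt; collection, and a small helper can surface the worst offenders. The 100 ms threshold is an illustrative choice.&lt;/p&gt;

```javascript
// Pull the slowest recent operations recorded by the profiler
// (enable it first with db.setProfilingLevel(1)).
async function slowestQueries(db, limit) {
  return db
    .collection('system.profile')
    .find({ millis: { $gt: 100 } }) // only ops slower than 100 ms
    .sort({ millis: -1 })           // worst first
    .limit(limit)
    .toArray();
}
```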

&lt;h2&gt;
  
  
  In Short
&lt;/h2&gt;

&lt;p&gt;To make MongoDB perform well at an advanced level, you need to deeply understand how it works. It's about smart data design, careful indexing, and writing efficient queries. By avoiding common mistakes and using these advanced tips, you can build strong, fast, and scalable applications that handle heavy use with ease.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>database</category>
      <category>backend</category>
    </item>
    <item>
      <title>Building Real Time Systems That Actually Scale</title>
      <dc:creator>Mohammad Zbib</dc:creator>
      <pubDate>Sat, 30 Aug 2025 15:23:02 +0000</pubDate>
      <link>https://dev.to/mhmd_zbib/building-real-time-systems-that-actually-scale-om6</link>
      <guid>https://dev.to/mhmd_zbib/building-real-time-systems-that-actually-scale-om6</guid>
      <description>&lt;p&gt;There’s something magical about building apps that feel alive. When a message appears instantly in a chat. When notifications pop up the moment something happens. When multiple people can edit the same document together without friction. That’s the power of real time applications.&lt;/p&gt;

&lt;p&gt;But behind the magic is a story of architecture. A journey where each step solves one problem, but often creates the next. Let’s walk through that journey, from the simplest setups to production ready systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Beauty of Starting Simple
&lt;/h3&gt;

&lt;p&gt;Let's start where most of us do, putting everything in one place. Your app handles WebSocket connections, processes business logic, talks to the database, and sends real time updates all from the same server. It's like having one super capable person running your entire operation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5m3d0mmhsmetyzwgcf62.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5m3d0mmhsmetyzwgcf62.png" alt=" " width="800" height="183"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And honestly, this setup works well at the start. Deployment is simple. Debugging is easier since everything lives in one place. Real time updates are fast because there's no extra network hop. For early users, it feels almost magical.&lt;/p&gt;

&lt;h3&gt;
  
  
  When Your Success Becomes Your Problem
&lt;/h3&gt;

&lt;p&gt;Success brings more users, and more users mean your single server starts juggling a lot more balls. Each WebSocket connection needs memory and processing power. Your database queries take longer. Background tasks compete for attention.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mvqzw3ymhqq9xbrn4o3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mvqzw3ymhqq9xbrn4o3.png" alt=" " width="800" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Suddenly, everything starts slowing down together. When you need to deploy a small bug fix, all your users get disconnected. When your database gets busy, your real time updates lag. It's like having one person trying to answer phones, cook meals, and greet customers all at once.&lt;/p&gt;

&lt;h3&gt;
  
  
  Breaking Things Apart: Microservices
&lt;/h3&gt;

&lt;p&gt;Breaking things apart solves these problems. Scaling one giant server is expensive and wasteful: need more payment processing power? You have to scale everything, even the parts that don't need it. Worse, deploying a tiny notification fix disconnects every user.&lt;/p&gt;

&lt;p&gt;The natural next step is breaking things apart. Each microservice gets its own responsibility (user management, payments, notifications), and each one handles its own WebSocket connections too. When something happens, services broadcast events to each other using a message system like RabbitMQ.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe3iwabbtfhc4haty95be.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe3iwabbtfhc4haty95be.png" alt=" " width="800" height="271"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxz7oyconeb708t4kyhjc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxz7oyconeb708t4kyhjc.png" alt=" " width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This feels better. Services can scale independently, deployments don't kill everything, and the code is cleaner with each service focused on its specific job. RabbitMQ makes sure all services stay in sync, and you can add more instances when you need them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The WebSocket Headache
&lt;/h3&gt;

&lt;p&gt;But wait: now you have a different challenge. Microservices come with tradeoffs. With 5 services and 3 instances each, you suddenly have 15 WebSocket servers running. That's a lot of overhead for something that used to be simple. And scaling becomes tricky when you need more instances just to handle WebSocket connections, not business logic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyq82svcxfwphps77b9cu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyq82svcxfwphps77b9cu.png" alt=" " width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And let's talk about your frontend team: they're not happy. Now they need to manage multiple WebSocket connections, figure out which service to connect to for what data, and handle all the complexity that brings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7x9wckiwrth864lav9wp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7x9wckiwrth864lav9wp.png" alt=" " width="800" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  One Gateway to Rule Them All
&lt;/h3&gt;

&lt;p&gt;Here's where the socket gateway pattern comes into play. Instead of each service managing its own WebSocket connections, you create one dedicated service just for real time communication. Think of it as a specialized receptionist who knows exactly where to route every message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5gp827wqh16vs5ntnqv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5gp827wqh16vs5ntnqv.png" alt=" " width="800" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your frontend connects to just one place, making development much simpler. The gateway listens for events from all your services and delivers them to the right users. Meanwhile, your business services can focus on what they do best, without worrying about WebSocket management.&lt;/p&gt;
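&lt;p&gt;The gateway's routing core can be sketched in a few lines. The event shape (&lt;code&gt;userId&lt;/code&gt;, &lt;code&gt;payload&lt;/code&gt;) and the method names here are assumptions for illustration, not a real library's API:&lt;/p&gt;

```javascript
// Minimal gateway core: a map of userId to live socket, plus a
// deliver() hook called for every event consumed from the broker.
function createGateway() {
  const sockets = new Map(); // userId -> socket-like object with send()

  return {
    register(userId, socket) {
      sockets.set(userId, socket);
    },
    unregister(userId) {
      sockets.delete(userId);
    },
    // Forward an incoming event to the right connected user, if any.
    deliver(event) {
      const socket = sockets.get(event.userId);
      if (!socket) return false; // user not connected to this instance
      socket.send(JSON.stringify(event.payload));
      return true;
    },
  };
}
```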

&lt;h3&gt;
  
  
  The Empty Room Problem
&lt;/h3&gt;

&lt;p&gt;Multiple gateway instances introduce a new problem. When a service broadcasts an event, every gateway instance receives it, even if most of them don't have the target users connected. It's like shouting an announcement in every room of a building when the person you're looking for is only in one room.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpncp8o8ac1udmgtgtgua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpncp8o8ac1udmgtgtgua.png" alt=" " width="800" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Imagine sending a notification to 1,000 users across 10 gateway instances. Each instance has to check if it has those users connected, resulting in 10,000 lookups where most come back empty. Your system spends more time looking for users it doesn't have than delivering messages to users it does have.&lt;/p&gt;

&lt;h3&gt;
  
  
  Smart Routing with Redis
&lt;/h3&gt;

&lt;p&gt;The solution is simple and elegant. Each gateway registers its connected users in Redis, creating a shared directory of who’s where. When a service needs to send an update, it looks up the user in Redis and sends the event directly to the right gateway. No guessing, no broadcasting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff2g85s35h0knq354n2s0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff2g85s35h0knq354n2s0.png" alt=" " width="800" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Connections are dynamic, but Redis keeps everything up to date. Users can move between instances, gateways can scale out, and messages always find their destination. With optional optimizations like pub/sub and TTLs, the system stays fast, precise, and scalable.&lt;/p&gt;
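&lt;p&gt;The directory itself is only a few Redis calls. Here's a sketch assuming a node-redis v4 style client; the key naming and the 60 second TTL are illustrative choices:&lt;/p&gt;

```javascript
// Each gateway records its connected users in Redis so services can
// route events directly to the right instance.
const GATEWAY_ID = 'gateway-1'; // this instance's identity

async function onUserConnected(redis, userId) {
  // The TTL lets stale entries expire if a gateway dies without
  // cleaning up; refresh it periodically while the socket is alive.
  await redis.set(`ws:user:${userId}`, GATEWAY_ID, { EX: 60 });
}

async function onUserDisconnected(redis, userId) {
  await redis.del(`ws:user:${userId}`);
}

// A business service asks the directory where to send an event.
async function gatewayFor(redis, userId) {
  return redis.get(`ws:user:${userId}`); // null if the user is offline
}
```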




&lt;p&gt;Scaling real time systems always brings new challenges. How do you keep messages fast and reliable as users and regions grow? If you have any suggestions, tips, or ideas, I’d love to hear from you, connect with me on &lt;a href="https://www.linkedin.com/in/mhmd-zbib/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>microservices</category>
      <category>distributedsystems</category>
      <category>backend</category>
    </item>
  </channel>
</rss>
