The Relational Hangover — How I Tried to Force SQL into MongoDB (and Cried in BSON)

#scalingdiaries #mongodb #nosql #databasedesign

When I made the leap from PostgreSQL to MongoDB, I thought I was ready. I had read the brochures, watched the webinars, and even memorized the sales pitch: "A flexible schema database that scales horizontally!".

But there's a dark side to database migrations that no one warns you about: The Relational Hangover.

It’s that splitting headache you get when you try to force NoSQL to act like a relational database. You look at MongoDB’s clean, document-based universe and, instead of embracing its flexible document-first distributed engine, you start drawing imaginary lines, building invisible tables, and whispering "SELECT * FROM..." to yourself in the dark.

Today in ScalingDiaries, we are going to look at three schema design sins I committed during my transition—and how to fix them before your Atlas bill starts looking like a phone number.

Sin #1: The Postgres with a Mustache (Too Many Collections)
In PostgreSQL, normalization is law. You have a table for users, a table for orders, a table for items, and a table just to map user_roles. When I arrived in MongoDB, I did the exact same thing. I created a collection for every single entity.

I essentially drew a mustache on PostgreSQL, called it NoSQL, and went to lunch.

The Problem: MongoDB can handle multiple collections, but when you build a "relational schema in disguise," you are going to run into performance issues. Having too many collections degrades storage engine performance and creates unnecessary database complexity. In SQL, table joins are highly optimized. In MongoDB, trying to join ten different collections using native $lookup operations will drag your read performance down into the abyss.

The Cure: Embrace the document model! If data is always read together (like a user's address profile), embed it. MongoDB is built to store nested, rich JSON/BSON structures. Denormalize intentionally to avoid expensive reads.

Sin #2: The Unbounded Array (The "Infinite Comment" Trap)
My SQL brain told me: "Hey, a blog post has comments. Since MongoDB documents can hold arrays, let's just dump the comments into an array inside the blog post document!"

// My terrible, horrible, no-good schema
{
  "_id": ObjectId("64b2c3d4e5f6a7b8c9d0e2f3"),
  "title": "My First NoSQL App",
  "body": "It's going great!",
  "comments": [
    { "user": "alice", "text": "Awesome post!" },
    { "user": "bob", "text": "This schema is questionable." }
    // ... 10,000 comments later
  ]
}

It worked beautifully. Until it didn't.

The Problem: This is the classic Unbounded Array Anti-Pattern. In MongoDB, a single BSON document has a strict limit of 16 Megabytes. While 16MB sounds huge (that’s a lot of text!), a growing, unbounded array will eventually exceed this limit.

Long before you hit the 16MB ceiling, you’ll face severe performance degradation. Reading a blog post means loading all comments into memory. Building indexes on a rapidly growing array will put an immense strain on your database resources, slowing down read and write operations.

The Cure: If an array can grow indefinitely, do not embed it. Instead, use References. Store comments in a separate collection, with each comment pointing back to the post’s _id.

Alternatively, use the Subset Pattern: embed only the first 5 or 10 most recent comments in the blog post document for fast loading, and store the rest in a referenced collection for when the user clicks "View All."

Sin #3: The $lookup Addiction (Faking Joins)
Because I normalized everything (Sin #1), I needed a way to stitch my data back together. Enter $lookup—MongoDB's native way to perform left outer joins.

I became addicted. Need to show a user profile, their orders, and their order items? Just stack three $lookup stages in an aggregation pipeline!

db.users.aggregate([
  { $lookup: { from: "orders", localField: "_id", foreignField: "userId", as: "user_orders" } },
  { $unwind: "$user_orders" },
  { $lookup: { from: "items", localField: "user_orders.itemId", foreignField: "_id", as: "item_details" } }
])

The Problem: While $lookup is an incredibly powerful tool for analytics and stream processing (and works beautifully inside the new Atlas Stream Processing engine [2, 4]), using it in your core, high-frequency transactional queries is a recipe for disaster.

Each $lookup is computationally expensive. If you are running multiple $lookup operations on high-throughput queries, you are destroying MongoDB's performance advantage. You’re essentially paying a performance premium for a distributed database, only to force it to behave like a slow relational database.

The Cure: Design your schema around how your application queries data, not how the data naturally relates. If your dashboard needs to show a customer's name next to their order, denormalize and duplicate the customerName field directly inside the orders document. Yes, you have to manage updating it if their name changes, but your read queries will run at machine speed—sub-100ms latency.

The 2026 Verdict: Unlearn to Scale
Moving from SQL to NoSQL isn’t just a syntax change; it’s a psychological transition. If you are constantly fighting the document model, trying to enforce foreign key constraints, or relying on endless joins, you’re missing out on what makes MongoDB a powerhouse.

In 2026, with the release of MongoDB 8.3 delivering massive performance increases (up to 45% more reads and 35% more writes over MongoDB 8.0), the database is faster than ever. But a database is only as fast as the schema you design.

Stay tuned for Chapter 3 of ScalingDiaries, where we will explore how to right-size our Atlas nodes to save up to 50% on storage costs using the brand new storage-optimized search nodes!

Until then—stop joining, start embedding, and let your data be flexible. 🚀

DEV Community

The Relational Hangover — How I Tried to Force SQL into MongoDB (and Cried in BSON)

Top comments (0)