DEV Community

Cover image for PostgreSQL Design at Scale: Normalization vs JSONB (A Real-World Guide)
Abdullah al Mubin
Abdullah al Mubin

Posted on

PostgreSQL Design at Scale: Normalization vs JSONB (A Real-World Guide)

When you start building an app, everyone gives the same advice:

“Normalize your database.”

And they’re right… at the beginning.

But once your app starts scaling?

That same “clean design” can become your biggest bottleneck.

This is where denormalization (especially JSONB in PostgreSQL) becomes a powerful tool.

Let’s break this down with a real-world example.


Table of Contents

  • The Scenario: A High-Traffic Food Delivery App
  • Part 1: Normalization — The Clean Approach
  • The Problem with JOINs at Scale
  • Part 2: Denormalization (JSONB) — The Fast Approach
  • Better Query Example (Real-World)
  • The Trade-Off (This Is Important)
  • When Should You Use JSONB
  • When to Use Normalization
  • Hybrid Approach
  • JSONB Indexing
  • Final Thoughts

The Scenario: A High-Traffic Food Delivery App

Imagine you built:

bitedash.com

At scale:

  • Millions of users
  • Millions of orders
  • Heavy traffic during peak hours

Now a user opens:

“Order History”

Seems simple, right?

But under the hood, your database might be struggling.


Part 1: Normalization — The Clean Approach

Normalization is like a perfectly organized filing system.

Every piece of data has exactly one place.


Typical Schema

You design your database like this:

  • orders
  • order_items
  • products
  • users

Everything is clean, structured, and reusable.


The Problem with JOINs at Scale

To show one "Order History" page, PostgreSQL has to:

  • Join orders
  • Join order_items
  • Join products
  • Join users

That’s a 4-table JOIN


At Small Scale

Works perfectly.


At Large Scale

When your database grows:

  • Orders → 10 million rows
  • Order_items → 50 million rows

Every request now requires expensive JOIN operations


The Cost

  • High CPU usage
  • Slower queries
  • Increased latency
  • Poor user experience

Mental Model

Normalization = Clean but computationally expensive at scale


Part 2: Denormalization (JSONB) — The Fast Approach

Now let’s flip the approach.

Instead of storing data separately…

Store everything together.


What is JSONB?

In PostgreSQL:

  • Binary JSON format
  • Faster than plain JSON
  • Supports indexing
  • Efficient querying

New Schema

Instead of multiple tables:

You store everything inside one table:

orders

With a column:

order_details (JSONB)


Example Data

{
  "items": [
    { "name": "Pizza", "price": 15 },
    { "name": "Coke", "price": 2 }
  ],
  "user_name": "John Doe",
  "user_id": 123,
  "total": 17.00
}
Enter fullscreen mode Exit fullscreen mode

The Result

SELECT order_details FROM orders WHERE user_id = 123;
Enter fullscreen mode Exit fullscreen mode
  • No JOINs
  • Single query
  • Extremely fast

Why It’s Fast

  • One row fetch
  • No table stitching
  • Less CPU work

Mental Model

JSONB = Pre-packaged data ready to serve


Better Query Example (Real-World)

Let’s say your product team asks:

“Show all orders where the user bought Pizza and total is greater than $10”


With Normalization

SELECT o.id, o.total
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
WHERE p.name = 'Pizza'
AND o.total > 10;
Enter fullscreen mode Exit fullscreen mode

Problem

  • Multiple JOINs
  • Large tables
  • Slower at scale

With JSONB (use numeric cast for totals)

SELECT order_details
FROM orders
WHERE (order_details->>'total')::numeric > 10
  AND order_details @> '{"items":[{"name":"Pizza"}]}';
Enter fullscreen mode Exit fullscreen mode

More robust match using jsonb_path_exists (recommended for arrays of objects)

SELECT order_details
FROM orders
WHERE (order_details->>'total')::numeric > 10
  AND jsonb_path_exists(
        order_details,
        '$.items[*] ? (@.name == "Pizza")'
      );
Enter fullscreen mode Exit fullscreen mode

jsonb_path_exists is more flexible for matching nested arrays/objects than the simple @> containment operator.


Add Index for Speed

-- GIN index for general JSONB containment/search
CREATE INDEX idx_orders_order_details_gin ON orders USING GIN (order_details);

-- Expression index for numeric total (fast range/greater-than queries)
CREATE INDEX idx_orders_total_numeric ON orders (((order_details->>'total')::numeric));
Enter fullscreen mode Exit fullscreen mode

Result

  • Fast filtering
  • No joins
  • Efficient queries

The Trade-Off (This Is Important)

Feature Normalization JSONB (Denormalization)
Read Speed Slower (JOINs) Very fast (single read)
Write Speed Faster (small rows) Slower (large JSON writes)
Data Integrity Strong (FK constraints) Risk of inconsistency
Storage Efficient Larger (data duplication)

There is no “perfect” solution. It’s always a trade-off.


Part 3: When Should You Use JSONB?

Use JSONB strategically, not everywhere


Use JSONB for: “Snapshots of the Past”

Example:

A user places an order.

Tomorrow:

  • Pizza price changes
  • Product name changes

If you rely on JOINs:

Old orders will show new values and its Wrong


Solution

Store order as JSONB:

  • Captures exact state at that moment
  • Keeps historical data correct
  • Makes reads extremely fast

Benefits

  • Accurate history
  • Faster dashboards
  • Better scalability

Part 4: When to Use Normalization

Normalization is still critical.

Use it for live, changing data.


Examples

  • User accounts
  • Passwords
  • Emails
  • Inventory
  • Account balances

Why?

You need a single source of truth

You cannot afford:

  • Duplicate data
  • Inconsistent updates

Part 5: The Hybrid Approach (Best of Both Worlds)

At scale, real systems use both approaches together.


Example Architecture

  • Users → Normalized
  • Products → Normalized
  • Orders → JSONB snapshot

Practical performance note

If you frequently filter by fields inside the JSONB (for example user_id or total), store those as first-class columns too:

  • orders.user_id (bigint)
  • orders.total (numeric)

Then keep order_details (JSONB) as the immutable snapshot. This gives you indexable scalars for fast filters/ranges and a JSON snapshot for state and flexible reads.


Result

  • Fast reads
  • Data consistency
  • Scalability

Pro Tip: Indexing JSONB

You can still make JSONB queries fast.

CREATE INDEX idx_order_user_name 
ON orders ((order_details->>'user_name'));
Enter fullscreen mode Exit fullscreen mode

What this does:

  • Fast search inside JSON
  • Works like a normal column index

Combine GIN indexes for general JSONB containment with expression indexes for frequently filtered scalars.


Mental Model

  • Normalization = Clean + consistent
  • JSONB = Fast + flexible

Final Thoughts

Database design is always a balancing act.

Early stage:

Normalize everything

At scale:

Denormalize strategically

The best systems aren’t fully normalized or fully denormalized.
They’re carefully designed hybrids.

Top comments (0)