DEV Community

Cover image for Does PostgreSQL support as much "schema flexibility" as MongoDB? Not for indexing!
Franck Pachot for MongoDB

Posted on • Edited on

Does PostgreSQL support as much "schema flexibility" as MongoDB? Not for indexing!

Another day, another myth. Someone on Medium claims PostgreSQL offers as much "schema flexibility" as MongoDB, and tells me that I'm misinformed when I warn about such misconception: you-are-misinformed. I think I'm informed, reading documentation and testing reproducible examples, and none of the examples worked: it returns wrong results, and didn't use the index.
Let's go though it. Others' mistakes can be valuable learning material and PostgreSQL with JSON can be tricky.

The article it refers to provides the following example of document:

{
  name: "Amit",
  email: "amit@example.com",
  order: [
    { amount: 1200, status: "paid"   , createdAt: "2025-08-15" },
    { amount:  300, status: "failed" , createdAt: "2025-08-16" }
  ]
}
Enter fullscreen mode Exit fullscreen mode

I create a PostgreSQL table to store this document, as well as another with older dates, to validate the query filtering:

postgres=# CREATE TABLE my_table (
    id BIGSERIAL PRIMARY KEY,
    jsonb_col JSONB
);
CREATE TABLE

postgres=# INSERT INTO my_table(jsonb_col) VALUES  ('
{
  "name": "Amit",
  "order": [
    { "amount": 1200, "status": "paid",   "createdAt":  "2025-08-15" },
    { "amount":  300, "status": "failed", "createdAt": "2025-08-16" }
  ]
}
'),('
{
  "name": "Older",
  "order": [
    { "amount": 1200, "status": "paid",   "createdAt":  "2024-08-15" },
    { "amount":  300, "status": "failed", "createdAt": "2024-08-16" }
  ]
}
');

INSERT 0 2
Enter fullscreen mode Exit fullscreen mode

I used a string for the date because JSONB data types are limited to object, array, string, number, boolean, and null, unlike MongoDB's BSON types which can map all your application data.

1. Incorrect operator (exists vs. match)

The response on medium starts like this:

You are misinformed.
CREATE INDEX my_index ON my_table USING GIN (jsonb_col);
SELECT * FROM my_table WHERE jsonb_col @? '$.order.createdAt >= "2025-08-12"';

I copy/paste the example and it provides a wrong result as it includes a user with all orders created before "2025-08-12":

postgres=# CREATE INDEX my_index ON my_table USING GIN (jsonb_col);

CREATE INDEX

postgres=# SELECT * FROM my_table 
           WHERE jsonb_col @? '$.order.createdAt >= "2025-08-12"';

 id |                                                                           jsonb_col
----+----------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | {"name": "Amit", "order": [{"amount": 1200, "status": "paid", "createdAt": "2025-08-15"}, {"amount": 300, "status": "failed", "createdAt": "2025-08-16"}]}
  2 | {"name": "Older", "order": [{"amount": 1200, "status": "paid", "createdAt": "2024-08-15"}, {"amount": 300, "status": "failed", "createdAt": "2024-08-16"}]}
(2 rows)
Enter fullscreen mode Exit fullscreen mode

Did you spot the error? PostgreSQL JSON operators are tricky:

  • @? is a jsonb_path_exists operator
  • @@ is a jsonb_path_match operator

The query returned all rows because the path exists. What the author wanted was filtering on the date in this path. Here is the query corrected:

postgres=# SELECT * FROM my_table 
           WHERE jsonb_col @@ '$.order.createdAt >= "2025-08-12"';
 id |                                                                         jsonb_col
----+------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | {"name": "Amit", "order": [{"amount": 1200, "status": "paid", "createdAt": "2025-08-15"}, {"amount": 300, "status": "failed", "createdAt": "2025-08-16"}]}
(1 row)
Enter fullscreen mode Exit fullscreen mode

The result is the expected one: only one user ordered after "2025-08-12". Let's check if the index was used.

2. Index unused for range query

With such small dataset, the query runs a full table scan, but that may be a cost-based decision:

postgres=# explain (analyze, costs off, summary off)
           SELECT * FROM my_table
           WHERE jsonb_col @@ '$.order.createdAt >= "2025-08-12"'
;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on my_table (actual time=0.016..0.018 rows=1.00 loops=1)
   Filter: (jsonb_col @@ '($."order"."createdAt" >= "2025-08-12")'::jsonpath)
   Rows Removed by Filter: 1
   Buffers: shared hit=1
 Planning:
   Buffers: shared hit=1
Enter fullscreen mode Exit fullscreen mode

I disable sequential scan and the index is used:

postgres=# set enable_seqscan to off;
SET

postgres=# explain (analyze, costs off, summary off)
           SELECT * FROM my_table
           WHERE jsonb_col @@ '$.order.createdAt >= "2025-08-12"'
;

                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Bitmap Heap Scan on my_table (actual time=0.023..0.025 rows=1.00 loops=1)
   Recheck Cond: (jsonb_col @@ '($."order"."createdAt" >= "2025-08-12")'::jsonpath)
   Rows Removed by Index Recheck: 1
   Heap Blocks: exact=1
   Buffers: shared hit=4
   ->  Bitmap Index Scan on my_index (actual time=0.009..0.009 rows=2.00 loops=1)
         Index Cond: (jsonb_col @@ '($."order"."createdAt" >= "2025-08-12")'::jsonpath)
         Index Searches: 1
         Buffers: shared hit=3
 Planning:
   Buffers: shared hit=1
Enter fullscreen mode Exit fullscreen mode

However, the index was not used to find the rows, despite seeing "createdAt" >= "2025-08-12" in Index Cond. Look at the number of rows returned by the index scan: rows=2.00.
All rows were read, it is an index full scan, the index condition was not used because it cannot be used on an GIN index. The rows were filtered later when accessing the table, with Rows Removed by Index Recheck in order to return rows=1. If you had ten million rows in the table, and only a few with recent orders, it would have done the same: read all index entries, read all rows from the table, and finally reduce the result.

Reading PostgreSQL execution plans is tricky, but the actual rows number of explain analyze does not lie. The GIN index was not used to seek to a range of values. You can also read the documentation:

The standard distribution of PostgreSQL includes a GIN operator class for arrays, which supports indexed queries using these operators:
<@ @> = &&

Those are only equality operators:

  • @> - Contains: arrayA @> arrayB means that Array A contains all elements of Array B.
  • <@ - Is contained by: arrayA <@ arrayB means that Array A is contained in Array B (all elements of A are in B).
  • = - Equals: arrayA = arrayB means that arrays are equal (same elements, same order).
  • && - Overlap: arrayA && arrayB means that arrays have any elements in common (intersection is not empty).

GIN indexes cannot access ranges of values or return sorted values. They are used to find exact values, efficiently with bitmap scans, but constrained to equality predicates and without preserving the order of values.

3. Expression-based index: wrong result

The author adds another solution: use an expression based index rather than a GIN index:

Alternatively you could just make a more specific (and faster) index just on the creation time.
CREATE INDEX my_index ON my_table ((jsonb_col #>> '{order,createdAt}')::timestamptz);
SELECT * FROM my_table WHERE (jsonb_col #>> '{order,createdAt}')::timestamptz > now() - '2 days'::interval;

I copy/paste the index creation but it fails:

postgres=# CREATE INDEX my_index ON my_table ((jsonb_col #>> '{order,createdAt}')::timestamptz);

ERROR:  syntax error at or near "::"
LINE 1: ...x ON my_table ((jsonb_col #>> '{order,createdAt}')::timestam...
Enter fullscreen mode Exit fullscreen mode

Expression-based indexes must include the expression in parentheses:

postgres=# CREATE INDEX my_index ON my_table ( ( (jsonb_col #>> '{order,createdAt}')::timestamptz) );

ERROR:  functions in index expression must be marked IMMUTABLE
Enter fullscreen mode Exit fullscreen mode

Casting to a timestamp is non-deterministic, as it depends on the timezone, so I keep it as a character string in the index:

postgres=# CREATE INDEX my_index2 ON my_table ((jsonb_col #>> '{order,createdAt}'));

CREATE INDEX
Enter fullscreen mode Exit fullscreen mode

The index is created, let's run the query:

postgres=# SELECT * FROM my_table 
           WHERE (jsonb_col #>> '{order,createdAt}') > '2025-08-12'
;

 id | jsonb_col
----+-----------
(0 rows)
Enter fullscreen mode Exit fullscreen mode

The result is empty because jsonb_col #>> '{order,createdAt}' reads a scalar value at "order.createdAt" and none exist. It is different from the field in the array, which is, when there are two items in the array:

  • "order.0.createdAt" or jsonb_col #>> '{order,0,createdAt}'
  • "order.1.createdAt" or jsonb_col #>> '{order,1,createdAt}'

In MongoDB, if you create an index on "order.createdAt" it will transparently index all those possible paths, because MongoDB is a document database that supports flexible schema. However, in PostgreSQL, you need to use different index types for scalars and arrays.

In short, the solution proposed by the author uses the index but returns a wrong result (rows=0.00):

postgres=# explain (analyze, costs off, summary off)
           SELECT * FROM my_table
           WHERE (jsonb_col #>> '{order,createdAt}') > '2025-08-12'
;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Index Scan using my_index2 on my_table (actual time=0.012..0.012 rows=0.00 loops=1)
   Index Cond: ((jsonb_col #>> '{order,createdAt}'::text[]) > '2025-08-12'::text)
   Index Searches: 1
   Buffers: shared hit=1
(4 rows)
Enter fullscreen mode Exit fullscreen mode

With expression-based index, you can get equality, sort or ranges, but they can be used only when there's no array in the path. There are designed for the flat tabular tables of SQL databases and relational model in normal forms. Indexing a path like "order.createdAt" or (jsonb_col #>> '{order,createdAt}') is like having an "order_createdAt" column without JSON and cannot apply to a one-to-many relationship document.

4. Immutable function-based index: wrong result

The author proposes an alternative solution, to force the cast to timestamp being considered immutable (which it is not) and index it:

Making a function for that lookup cleans things up nicely.
CREATE FUNCTION order_created(jsonb) RETURNS timestamptz IMMUTABLE STRICT LANGUAGE sql AS $$
SELECT ($1 #>> '{order,createdAt}')::timestamptz;
$$;
Then your index and query look like:
CREATE INDEX my_index ON my_table (order_created(jsonb_col));
SELECT * FROM my_table WHERE order_created(jsonb_col) > now() - '2 days'::interval;

Of course, this is not better as the function still reads the "order" array as if it was a scalar value. In addition, the function-based index materializes a value calculated for a specific timezone, but the function that will use it may be in a different timezone, so the comparison will be wrong.
To demonstrate that it is not immutable, despite the declaration, I create the function for the first item of the array, adding .0 to the path:

postgres=# CREATE FUNCTION order_created(jsonb) 
           RETURNS timestamptz IMMUTABLE STRICT LANGUAGE sql AS $$
            SELECT ($1 #>> '{order,0,createdAt}')::timestamptz;
           $$;

CREATE FUNCTION
Enter fullscreen mode Exit fullscreen mode

For the same value, the one in the table, the function returns different values:

postgres=# SET TIMEZONE TO 'UTC';
SET
postgres=# select now() - order_created( jsonb_col ) from my_table
;
         ?column?
--------------------------
 -2 days -02:19:56.097701
 362 days 21:40:03.902299
(2 rows)

postgres=# SET TIMEZONE TO 'Pacific/Apia';
SET
postgres=# select now() - order_created( jsonb_col ) from my_table
;
         ?column?
--------------------------
 -1 days -13:19:42.152896
 363 days 10:40:17.847104
(2 rows)
Enter fullscreen mode Exit fullscreen mode

If you ever create an immutable function for an expression that is not immutable, you must add enough tests to be sure it raises an error when used in a different context. And JSONB has limited datatypes, with no dates. Anyway, expression and function-based indexes are the same and cannot be used on documents with an array in the path.

Conclusion

PostgreSQL offers advanced datatypes and indexing, with JSONB allowing for some non-relational data modeling. However, achieving MongoDB-like schema flexibility in PostgreSQL is challenging regarding JSON operations and indexing.

For data models with arrays, nested fields, and various types like dates, MongoDB’s BSON and indexing prioritizes indexing and query simplicity, for performance and corectness. In contrast, PostgreSQL's JSONB integrates with many SQL features and index types but often requires complex syntax and indexing trade-offs, leading to potential errors or misconceptions. JSON and JSONB datatypes store and retrieve documents in PostgreSQL, but it remains a SQL database, with SQL tables and columns as its core strength, and not a document database.

  • For data-centric, normalized schemas: use PostgreSQL with a relational data model—even though you can add JSON and JSONB columns for occasional flexibility.
  • For flexible, application-centric document schemas: Use MongoDB. It is designed to process rich, evolving structures, arrays, and nested data—even though you can create references and perform lookups between documents for additional use-cases.

Be cautious about information on Medium or elsewhere that lacks demos and documentation. Many accept claims without verification. I always test, and I started with db<>fiddle for this example, which makes it easy to test different database versions. By fact-checking, you’ll learn, make informed decisions and avoid repeating others' mistakes. I have a blog post series on the Myths about MongoDB.

Top comments (0)