Extended RUM in DocumentDB extension for PostgreSQL: Efficient ESR (Equality, Sort, Range) Queries


Last year, I examined RUM indexes within this series on multi-key indexing, demonstrating that they cannot replace MongoDB's compound indexes for sorted queries. A year later, Microsoft has fixed this in the DocumentDB extension for PostgreSQL with an Extended RUM index that preserves the ordering of the keys, allowing an ordered scan rather than a bitmap scan. Let's revisit our pagination query to see how it performs now.

I start a container with the latest DocumentDB (version v0.112-0 from May 26, 2026):

docker run -d --name documentdb-local -p 10260:10260 -p 9712:9712 ghcr.io/documentdb/documentdb/documentdb-local:latest  --username franck --password franck --start-pg

I can connect to PostgreSQL on port 9712, where many extensions are installed, including the extended RUM index:

docker exec -it documentdb-local psql -p 9712 postgres

psql (17.10 (Debian 17.10-1.pgdg13+1))
Type "help" for help.

postgres=# \dx
                                        List of installed extensions
          Name           | Version |   Schema   |                        Description                         
-------------------------+---------+------------+------------------------------------------------------------
 documentdb              | 0.112-0 | public     | API surface for DocumentDB for PostgreSQL
 documentdb_core         | 0.112-0 | public     | Core API surface for DocumentDB on PostgreSQL
 documentdb_extended_rum | 0.112-0 | public     | DocumentDB Extended RUM index access method
 pg_cron                 | 1.6     | pg_catalog | Job scheduler for PostgreSQL
 plpgsql                 | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgis                 | 3.6.3   | public     | PostGIS geometry and geography spatial types and functions
 tsm_system_rows         | 1.0     | public     | TABLESAMPLE method which accepts number of rows as a limit
 vector                  | 0.8.2   | public     | vector data type and ivfflat and hnsw access methods
(8 rows)

postgres=#

I can also connect to the MongoDB-compatible API:

docker exec -it documentdb-local mongosh -u franck -p franck 'mongodb://localhost:10260/?tls=true&tlsAllowInvalidCertificates=true'

Current Mongosh Log ID: 6a0b3b537d2a1c3471d1a7ba
Connecting to:          mongodb://<credentials>@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true&directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.8.3
Using MongoDB:          7.0.0
Using Mongosh:          2.8.3

For mongosh info see: https://www.mongodb.com/docs/mongodb-shell/

[direct: mongos] test>

Like in the previous post, I created a simple collection with 10,000 documents:

[direct: mongos] test>
 for (let i = 0; i < 10000; i++) {
  db.demo.insertOne({
    a: 1,
    b: Math.random(),
    ts: new Date()
  });
}

I create a compound index that follows the MongoDB Equality, Sort, Range rule—designed for queries with an equality filter on a and a sort on ts:

[direct: mongos] test>
 db.demo.createIndex({ "a": 1, "ts": -1 });

I run the same query as in the previous post, which, with the standard RUM indexes, produced a Bitmap Index Scan followed by a Sort of all documents matching a: 1 before returning the top 10:

[direct: mongos] test>
 db.demo.find(
 { a: 1 }
).sort(
 { ts: -1 }
).limit(10).explain("executionStats");

The pleasant surprise is that, with the current version of DocumentDB, the execution plan looks like MongoDB's native IXSCAN, with no additional sort step:

[direct: mongos] test> db.demo.find(
  { a: 1 }).sort({ts:-1}).limit(10).explain("executionStats")
;

{
  explainVersion: 2,
...
  executionStats: {
    nReturned: Long('10'),
    executionTimeMillis: 0.286,
    executionStartAtTimeMillis: 0.256,
    totalDocsExamined: Long('10'),
    totalKeysExamined: Long('10'),
    executionStages: {
      stage: 'LIMIT',
      nReturned: Long('10'),
      executionTimeMillis: 0.286,
      executionStartAtTimeMillis: 0.256,
      totalDocsExamined: 10,
      totalKeysExamined: 10,
      numBlocksFromCache: 25,
      inputStage: {
        stage: 'FETCH',
        nReturned: Long('10'),
        executionTimeMillis: 0.267,
        executionStartAtTimeMillis: 0.253,
        totalKeysExamined: 10,
        numBlocksFromCache: 25,
        inputStage: {
          stage: 'IXSCAN',
          nReturned: Long('10'),
          executionTimeMillis: 0.267,
          executionStartAtTimeMillis: 0.253,
          indexName: 'a_1_ts_-1',
          totalKeysExamined: 10,
          numBlocksFromCache: 25
        }
      }
    }
  },
  ok: 1
}

It read only the minimum necessary to get the result: ten index entries (totalKeysExamined: 10) in the expected order and fetched only ten documents (totalDocsExamined: 10). This is the most efficient execution plan.
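To make the difference in work concrete, here is a minimal in-memory sketch in plain JavaScript. It is not DocumentDB code: the functions `buildIndex`, `orderedScan`, and `collectThenSort` are hypothetical names, and a sorted array stands in for the index. It only illustrates why an ordered scan can stop after ten keys while a collect-then-sort plan must touch every match.

```javascript
// Hypothetical stand-in for an index on { a: 1, ts: -1 }:
// entries sorted by a ascending, then ts descending.
function buildIndex(docs) {
  return docs
    .map((doc, id) => ({ a: doc.a, ts: doc.ts, id }))
    .sort((x, y) => x.a - y.a || y.ts - x.ts);
}

// Ordered scan: walk entries in index order and stop as soon as the limit is reached.
function orderedScan(index, a, limit) {
  const ids = [];
  let keysExamined = 0;
  // A real index would seek directly to the first entry for `a`;
  // findIndex is only a stand-in for that seek.
  const start = index.findIndex((e) => e.a === a);
  for (let i = start; i >= 0 && i < index.length && index[i].a === a; i++) {
    keysExamined++;
    ids.push(index[i].id);
    if (ids.length === limit) break;
  }
  return { ids, keysExamined };
}

// Collect-then-sort: gather every match first, sort them, then apply the limit.
function collectThenSort(index, a, limit) {
  const matches = index.filter((e) => e.a === a);
  const keysExamined = matches.length;
  const ids = matches
    .sort((x, y) => y.ts - x.ts)
    .slice(0, limit)
    .map((e) => e.id);
  return { ids, keysExamined };
}

// 10,000 documents with a: 1 and a distinct, increasing ts (a simplification).
const demoDocs = Array.from({ length: 10000 }, (_, i) => ({ a: 1, ts: i }));
const demoIndex = buildIndex(demoDocs);

const ordered = orderedScan(demoIndex, 1, 10);
const sorted = collectThenSort(demoIndex, 1, 10);
console.log("ordered scan keys examined:", ordered.keysExamined);
console.log("collect-then-sort keys examined:", sorted.keysExamined);
```

Both functions return the same ten documents, but the ordered scan examines 10 entries and the collect-then-sort variant examines all 10,000.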

Comparing the Two RUM Index Definitions

I connect to PostgreSQL to describe the table that stores my collection documents (you will see later how I obtained the name):

postgres=# \d documentdb_data.documents_7

            Table "documentdb_data.documents_7"

     Column      |  Type  | Collation | Nullable | Default 
-----------------+--------+-----------+----------+---------
 shard_key_value | bigint |           | not null | 
 object_id       | bson   |           | not null | 
 document        | bson   |           | not null | 

Indexes:

    "collection_pk_7" PRIMARY KEY, btree (shard_key_value, object_id)

    "documents_rum_index_25" documentdb_extended_rum (document documentdb_extended_rum_catalog.bson_extended_rum_composite_path_ops (pathspec='[ "a", { "ts" : -1 } ]', tl='2691'))

Check constraints:

    "shard_key_value_check" CHECK (shard_key_value = '7'::bigint)

postgres=#

What was a standard RUM index in the previous post is now an extended RUM index:

| Attribute | Previous post | Current test |
|---|---|---|
| Index Type | `documentdb_rum` | `documentdb_extended_rum` |
| Operator Class | two `bson_rum_single_path_ops` | one `bson_extended_rum_composite_path_ops` |
| Fields | `a` (asc), `ts` | `a` (asc), `ts` (desc) |
| Sort Direction on `ts` | Not specified | Explicitly `-1` (descending) |
| Path Encoding | Two separate `path=` entries | Single JSON `pathspec` array |

The extended RUM index acts as a sort-order-aware composite index, embedding the descending direction directly into the pathspec. Unlike the previous approach, which stored each path independently, this approach encodes all indexed fields as a single composite pathspec and generates a single composite index entry per document, preserving the relative ordering between fields. An index scan (RumOrderedScan) efficiently covers both filtering and sorting, eliminating the need for a separate Sort node in the PostgreSQL execution plan. This benefit is evident when executing the same query via the DocumentDB API in PostgreSQL:

postgres=# explain (analyze, buffers, verbose, costs off)
select document from bson_aggregation_find(
  'test',
  '{
    "find": "demo",
    "filter": { "a": 1 },
    "sort":   { "ts": -1 },
    "limit": 10
  }'::documentdb_core.bson
);
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=0.108..0.138 rows=10 loops=1)
   Output: document, (bson_orderby(document, '{ "ts" : { "$numberInt" : "-1" } }'::bson))
   Buffers: shared hit=4
   ->  Index Scan using "a_1_ts_-1" on documentdb_data.documents_7 collection (actual time=0.104..0.117 rows=10 loops=1)
         Output: document, bson_orderby(document, '{ "ts" : { "$numberInt" : "-1" } }'::bson)
         Index Cond: (collection.document @= '{ "a" : { "$numberInt" : "1" } }'::bson)
         Order By: (collection.document |-<> '{ "ts" : { "$numberInt" : "-1" } }'::bson)
         Buffers: shared hit=4
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.453 ms
 Execution Time: 0.192 ms
(12 rows)

postgres=#

Note: I got the name of the internal table I described above from this execution plan, which uses the collection name in the query. The MongoDB API's explain() shows a MongoDB-compatible execution plan, and EXPLAIN in PostgreSQL shows the PostgreSQL version of it.
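The ordering implied by the pathspec `[ "a", { "ts" : -1 } ]` can be sketched as a comparator in plain JavaScript. This is only an illustration, not the extension's encoding: `parsePathspec` and `compositeComparator` are hypothetical helpers, and I assume that a bare string means ascending and an object carries an explicit direction, as the comparison table above suggests.

```javascript
// Turn a pathspec such as [ "a", { ts: -1 } ] into [path, direction] pairs.
// Assumption: a bare string is ascending (1); an object gives { path: direction }.
function parsePathspec(pathspec) {
  return pathspec.map((entry) =>
    typeof entry === "string" ? [entry, 1] : Object.entries(entry)[0]
  );
}

// Build a comparator that orders entries field by field,
// flipping the comparison for descending fields.
function compositeComparator(pathspec) {
  const fields = parsePathspec(pathspec);
  return (x, y) => {
    for (const [path, direction] of fields) {
      if (x[path] < y[path]) return -direction;
      if (x[path] > y[path]) return direction;
    }
    return 0;
  };
}

const compareEntries = compositeComparator(["a", { ts: -1 }]);
const compositeEntries = [
  { a: 2, ts: 5 },
  { a: 1, ts: 10 },
  { a: 1, ts: 30 },
  { a: 2, ts: 7 },
  { a: 1, ts: 20 },
].sort(compareEntries);

console.log(compositeEntries);
```

Within each value of `a`, the entries come out with `ts` descending, which is the order the query asks for, so a scan restricted to `a = 1` can return rows without a separate sort.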

Comparison of Execution Plans

Here is how the new behavior with an ordered index scan compares to the previous bitmap scan.

| Feature | Ordered Index Scan | Bitmap Index Scan |
|---|---|---|
| PostgreSQL Node | `Index Scan` | `Bitmap Index Scan` |
| Ordering | Handled by sort direction in index | Lost: requires a `Sort` node |
| Scan Type | `scanType: RumOrderedScan` | `scanType: RumFastScan / RumRegularScan` |
| Efficiency | Supports early termination (`LIMIT`) | Must scan all matching TIDs into bitmap |
| RUM Entry Point | `rumgettuple()` | `rumgetbitmap()` |
| Sort Step | None: `useSimpleScan = true` | `rum_tuplesort_performsort()` required |
| Memory Usage | Low: one tuple at a time | High: full `TIDBitmap` + sort state |
| Index Structure Used | B-tree walk via `orderStack` | Posting list / posting tree dump |
| Filter Evaluation | Inline via `ValidateIndexEntry()` | Post-collection in `keyGetItem()` |
| Seek Optimization | Yes: advances `queryKey` as entries exhaust | No |
| Multi-column Support | Multi-column via composite pathspec | Multi-column via separate entries |
| `LIMIT` benefit | ✅ Full: stops after N rows | ❌ None: bitmap built before `LIMIT` applies |
| Recheck Behavior | `xs_recheckorderby` per tuple | `xs_recheck` on bitmap result |
| Trigger Condition | `RumEnableOrderedOperatorScans` + `willSort` + `norderbys > 0` | Default path |

Index Only Scan

Other improvements are coming to Extended RUM, like Index Only Scan, currently supported for COUNT:

postgres=# explain (analyze, buffers, verbose, costs off)  
SELECT document FROM bson_aggregation_count(  
  'test',  
  '{  
    "count": "demo",  
    "query": { "a": 1 }  
  }'::documentdb_core.bson  
);
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate (actual time=22.760..22.763 rows=1 loops=1)
   Output: documentdb_api_internal.bsoncommandcount(1)
   Buffers: shared hit=109
   ->  Index Only Scan using "a_1_ts_-1" on documentdb_data.documents_7 collection (actual time=0.089..14.292 rows=10000 loops=1)
         Output: collection.document
         Index Cond: (collection.document @= '{ "a" : { "$numberInt" : "1" } }'::bson)
         Heap Fetches: 0
         Buffers: shared hit=109
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.441 ms
 Execution Time: 22.883 ms
(12 rows)

Index Only Scan will support covering projections in the future (see IsQueryValidForIndexOnlyScan).

Conclusion

A year ago, DocumentDB's RUM indexes had a significant limitation for pagination queries: even with the right compound index, the planner would fall back to a Bitmap Index Scan followed by a full Sort, meaning every matching document had to be collected and sorted before the first result could be returned. A LIMIT 10 query on 10,000 documents would examine all 10,000—defeating the purpose of the compound index.

With v0.112-0, this is fixed. The new documentdb_extended_rum index type, combined with the RumOrderedScan execution path, reduces the gap with native MongoDB behavior:

The index encodes sort direction directly in the pathspec ({ "ts": -1 })
The planner chooses an Index Scan instead of a Bitmap Index Scan
No Sort node appears in the plan
LIMIT 10 examines exactly 10 index entries and 10 documents

This is more than a cosmetic change. For time-series queries, such as filtering on a low-cardinality field, sorting by timestamp descending, and retrieving the first page, the work drops from O(total size) with the bitmap plan to O(result) with the ordered scan. For OLTP systems, pagination queries are common and need to be quick, scalable, and predictable, since they show results to the user before she selects an item, refines filters, or moves to the next page.
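Moving to the next page follows the same Equality, Sort, Range shape: equality on `a`, sort on `ts`, and a range on `ts` below the last value already shown. The sketch below is plain JavaScript over an in-memory array, not DocumentDB code. `nextPageFilter` and `fetchPage` are hypothetical helpers, and it assumes `ts` values are unique, which would not hold for documents inserted within the same millisecond; a real implementation would need a tiebreaker such as `_id`.

```javascript
// Hypothetical helper: build the filter for the page that follows `lastDoc`.
// With no lastDoc, it returns the plain equality filter for the first page.
function nextPageFilter(equality, lastDoc) {
  return lastDoc ? { ...equality, ts: { $lt: lastDoc.ts } } : { ...equality };
}

// In-memory stand-in for find(filter).sort({ ts: -1 }).limit(limit),
// reading entries that are already stored in (a asc, ts desc) order.
// A real index would seek to the range start instead of skipping entries.
function fetchPage(indexEntries, filter, limit) {
  const page = [];
  for (const e of indexEntries) {
    if (e.a !== filter.a) continue;
    if (filter.ts && !(e.ts < filter.ts.$lt)) continue;
    page.push(e);
    if (page.length === limit) break;
  }
  return page;
}

// 100 entries with a: 1 and ts from 100 down to 1 (already in index order).
const pageEntries = Array.from({ length: 100 }, (_, i) => ({ a: 1, ts: 100 - i }));

const page1 = fetchPage(pageEntries, nextPageFilter({ a: 1 }, null), 3);
const page2 = fetchPage(pageEntries, nextPageFilter({ a: 1 }, page1[page1.length - 1]), 3);

console.log(page1.map((e) => e.ts));
console.log(page2.map((e) => e.ts));
```

In mongosh, the same filter object could be passed as `db.demo.find(nextPageFilter({ a: 1 }, lastDoc)).sort({ ts: -1 }).limit(10)`, under the same uniqueness assumption.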

This ordered scan is also essential for TTL indexes to efficiently identify expiration candidates.

The key ingredients that make this work together are visible from the DocumentDB open-source code:

documentdb_extended_rum: composite pathspec with explicit sort direction
bson_extended_rum_composite_path_ops: single operator class covering all fields
RumOrderedScan: B-tree walk in index order via orderStack, bypassing rumgetbitmap()
useSimpleScan: returns one tuple at a time, enabling true LIMIT pushdown
RumAllowOrderByRawKeys: the GUC that enables this path, now on by default

This behavior is enabled by default but not forced (it is a planner decision based on the query characteristics, with runtime adaptation to optimize the scan type):

postgres=# \dconfig *rum*order*

           List of configuration parameters
                  Parameter                   | Value
----------------------------------------------+-------
 documentdb_rum.enable_ordered_operator_scans | on
 documentdb_rum.forceRumOrderedIndexScan      | off
(2 rows)

In under a year, DocumentDB evolved from "RUM instead of GIN, but with the same pagination limitations" to "RUM with ordered scan, aligning more with MongoDB's IXSCAN behavior for ESR-pattern indexes". For developers implementing cursor-based pagination or queries with a selective filter and sorting on a time or sequence field, this marks the version at which it begins to function as expected. It also improves TTL index maintenance.