On Stack Overflow, the most frequent question for PostgreSQL is: "Select first row in each GROUP BY group?". I've written about it previously, presenting multiple alternatives and execution plans: How to Select the First Row of Each Set of Grouped Rows Using GROUP BY.
Solving this problem in a way that's both straightforward and high-performing can be challenging with SQL databases. However, MongoDB's aggregation provides a simple syntax and an efficient execution plan. When you use $first or $last alongside $sort and $group, MongoDB can perform an efficient loose index scan, which is similar to an index skip scan, reading only what is necessary for the result.
PostgreSQL
With PostgreSQL, the DISTINCT ON ... ORDER BY syntax is the easiest for the developer, but not the best for performance.
create table demo (
  primary key (a, b, c),
  a int,
  b timestamptz,
  c float,
  d text
);
-- insert a hundred thousand rows (5 values of a x 20,000 rows)
insert into demo
select
  a,
  now()         as b,
  random()      as c,
  repeat('x',5) as d
from generate_series(1,5) a
   , generate_series(1,20000) c
-- in the unlikely case two random() values collide, ignore the duplicate
on conflict do nothing
;
-- run 9 more times, for 10 runs in total (now() differs on each run):
\watch count=9
vacuum analyze demo;
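A quick sanity check of the data shape (an assumption worth verifying: ten runs should produce ten distinct values of b, and 200,000 rows for each value of a):
-- sanity check: expect 5 values of a, 10 distinct b per a,
-- and 200,000 rows per a (10 runs x 20,000 rows)
select a, count(distinct b) as groups, count(*) as n
from demo
group by a
order by a;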
In PostgreSQL 18, all rows are read, but most are eliminated so that only one row remains per group:
explain (buffers, analyze, verbose, costs off)
select
distinct on (b) a, b, c, d
from demo
where a=1
order by b, c
;
QUERY PLAN
----------------------------------------------------------------------------------------------------
 Unique (actual time=0.025..94.601 rows=10.00 loops=1)
   Output: a, b, c, d
   Buffers: shared hit=199959
   ->  Index Scan using demo_pkey on public.demo (actual time=0.024..77.263 rows=200000.00 loops=1)
         Output: a, b, c, d
         Index Cond: (demo.a = 1)
         Index Searches: 1
         Buffers: shared hit=199959
 Planning Time: 0.077 ms
 Execution Time: 94.622 ms
Although the DISTINCT ON ... ORDER BY syntax is straightforward, it is not efficient here: both the rows processed (rows=200000.00) and the buffers read (Buffers: shared hit=199959) are excessive compared to the result size (rows=10.00).
If you want to avoid unnecessary reads, you have to write a complex recursive CTE:
with recursive skip_scan as (
  (
    -- get the first row of the first group
    select * from demo
    where a=1
    order by b, c limit 1
  ) union all (
    -- jump to the next group and get its first row
    select demo.*
    from skip_scan, lateral(
      select * from demo
      where demo.a = skip_scan.a and demo.b > skip_scan.b
      order by b, c limit 1
    ) demo
  )
)
select * from skip_scan
;
This simulates a loose index scan with nested loops: each iteration of the recursive WITH clause seeks past the current group and fetches the first row of the next one.
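The same recursive pattern can return the last row of each group instead, equivalent to DISTINCT ON (b) ... ORDER BY b, c DESC. Here is a sketch, with the assumption that an additional index on (a, b, c desc) would be needed to keep each iteration to a single index probe:
with recursive skip_scan_last as (
  (
    -- get the last row of the first group
    select * from demo
    where a=1
    order by b, c desc limit 1
  ) union all (
    -- jump to the next group and get its last row
    select demo.*
    from skip_scan_last, lateral(
      select * from demo
      where demo.a = skip_scan_last.a and demo.b > skip_scan_last.b
      order by b, c desc limit 1
    ) demo
  )
)
select * from skip_scan_last
;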
MongoDB aggregation
In MongoDB, using an aggregation pipeline makes it easy and efficient to get either the first or last document from each group.
I create a collection and fill it with similar data:
// Create collection and compound unique index to mimic PRIMARY KEY(a, b, c)
db.demo.drop();
db.demo.createIndex(
  { a: 1, b: 1, c: 1 },
  { unique: true }
);
// Function to insert one batch of 100,000 documents
function insertDemoBatch() {
  const batch = [];
  const now = new Date();
  for (let a = 1; a <= 5; a++) {
    for (let j = 1; j <= 20000; j++) {
      batch.push({
        a: a,
        b: now,            // similar to now()
        c: Math.random(),  // random float in [0,1)
        d: 'x'.repeat(5)   // similar to repeat('x',5)
      });
    }
  }
  try {
    db.demo.insertMany(batch, { ordered: false }); // ignore duplicates
  } catch (e) {
    print(`Insert completed with some duplicates ignored: ${e.writeErrors?.length ?? 0} errors`);
  }
}

// Run 10 times: now() will be different each run
for (let i = 0; i < 10; i++) {
  insertDemoBatch();
}
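As with the SQL setup, a quick sanity check (assuming all ten batches ran, we expect one million documents, and ten distinct values of b for a=1):
// sanity check: expect 1,000,000 documents and 10 distinct b values
print(db.demo.countDocuments());               // 1000000
print(db.demo.distinct("b", { a: 1 }).length); // 10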
Here is the aggregation that groups and keeps the first value of each group:
db.demo.aggregate([
  { $match: { a: 1 } },       // equivalent to WHERE a=1
  { $sort: { b: 1, c: 1 } },  // equivalent to ORDER BY b, c
  { $group: {
      _id: "$b",              // equivalent to DISTINCT ON (b)
      a: { $first: "$a" },
      b: { $first: "$b" },
      c: { $first: "$c" },
      d: { $first: "$d" }
  }},
  { $project: {               // equivalent to SELECT a, b, c, d
      _id: 0, a: 1, b: 1, c: 1, d: 1
  }},
]).explain("executionStats");
The execution plan is efficient, reading only one document per group (totalDocsExamined: 10) and seeking to the end of each group in the index scan (keysExamined: 11):
...
executionStats: {
  executionSuccess: true,
  nReturned: 10,
  executionTimeMillis: 0,
  totalKeysExamined: 11,
  totalDocsExamined: 10,
  executionStages: {
    isCached: false,
    stage: 'FETCH',
    nReturned: 10,
    executionTimeMillisEstimate: 0,
    works: 11,
    advanced: 10,
    needTime: 0,
    needYield: 0,
    saveState: 1,
    restoreState: 1,
    isEOF: 1,
    docsExamined: 10,
    alreadyHasObj: 0,
    inputStage: {
      stage: 'DISTINCT_SCAN',
      nReturned: 10,
      executionTimeMillisEstimate: 0,
      works: 11,
      advanced: 10,
      needTime: 0,
      needYield: 0,
      saveState: 1,
      restoreState: 1,
      isEOF: 1,
      keyPattern: { a: 1, b: 1, c: 1 },
      indexName: 'a_1_b_1_c_1',
      isMultiKey: false,
      multiKeyPaths: { a: [], b: [], c: [] },
      isUnique: true,
      isSparse: false,
      isPartial: false,
      indexVersion: 2,
      direction: 'forward',
      indexBounds: {
        a: [ '[1, 1]' ],
        b: [ '[MinKey, MaxKey]' ],
        c: [ '[MinKey, MaxKey]' ]
      },
      keysExamined: 11
    }
  }
}
},
nReturned: Long('10'),
...
MongoDB uses DISTINCT_SCAN in aggregation when the pipeline starts with a $sort and a $group using $first or $last. The planner checks for a matching index with the correct sort order, adjusting the scan direction if needed, and verifies that the grouped fields are not multi-key. If these conditions are met, MongoDB rewrites the pipeline to use DISTINCT_SCAN and $groupByDistinct, skipping directly to the relevant index entries and fetching only the needed documents.
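Because the rewrite also applies to $last, the symmetric query, returning the last document of each group, is just as simple. Here is a sketch of that variant on the same collection and index; with the same ascending sort, $last keeps the highest c within each b:
// $last variant: keeps the row with the highest c within each b
db.demo.aggregate([
  { $match: { a: 1 } },
  { $sort: { b: 1, c: 1 } },
  { $group: {
      _id: "$b",
      a: { $last: "$a" },
      b: { $last: "$b" },
      c: { $last: "$c" },
      d: { $last: "$d" }
  }},
  { $project: { _id: 0, a: 1, b: 1, c: 1, d: 1 } }
]).explain("executionStats");
The explain output for this variant should again show DISTINCT_SCAN, with the scan direction adjusted, rather than a blocking sort.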
This pattern is common in real-world queries such as:
- Latest or earliest measure for each metric in a time-series database
- Last contract with each supplier
- Last purchase from each client
- Most recent transaction for each account (sketched after this list)
- Earliest login event for each user
- Lowest-paid employee in each department
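For instance, the transactions case maps directly to this pattern. Here is a sketch with a hypothetical transactions collection (the collection name, fields, and index are assumptions for illustration):
// hypothetical schema: one document per transaction, with account and at fields
// an index matching the sort is a prerequisite for the DISTINCT_SCAN rewrite
db.transactions.createIndex({ account: 1, at: -1 });

db.transactions.aggregate([
  { $sort: { account: 1, at: -1 } },  // newest first within each account
  { $group: {
      _id: "$account",
      last: { $first: "$$ROOT" }      // the whole most-recent document
  }}
]);
As always, appending .explain("executionStats") shows whether the planner chose DISTINCT_SCAN for such a pipeline.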