Choosing a database means making sure that performance stays fast as your data grows. For example, if a query takes 10 milliseconds on a small dataset, it should remain quick as the data volume increases and should never approach the 100ms threshold that users perceive as waiting. Here is a simple benchmark: we insert batches of 1,000 operations into random accounts, then query for the account with the most recent operation in a specific category, an OLTP scenario using filtering and pagination. As the collection grows, a full collection scan would slow down, so secondary indexes are essential.
We create an accounts collection where each account belongs to a category and holds multiple operations, a typical one-to-many relationship, with an index that supports our query on operations per category:
db.accounts.createIndex({
  "category": 1,
  "operations.date": 1,
  "operations.amount": 1,
});
The index follows the MongoDB Equality, Sort, Range guideline.
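For reference, an account document produced by this workload looks roughly like the following (a hypothetical example; the values are illustrative):

// Hypothetical sample document: one account, its category,
// and the embedded one-to-many array of operations.
{
  _id: 42,
  category: 1,
  operations: [
    { date: ISODate('2025-11-11T14:41:44.139Z'), amount: 68 },
    { date: ISODate('2025-11-11T17:50:08.232Z'), amount: 998 }
  ]
}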
To grow the data volume, this function inserts operations into accounts, randomly distributed across 10 million accounts and 3 categories:
function insert(num) {
  const ops = [];
  for (let i = 0; i < num; i++) {
    // Pick a random account and category, and build one operation.
    const account = Math.floor(Math.random() * 10_000_000) + 1;
    const category = Math.floor(Math.random() * 3);
    const operation = {
      date: new Date(),
      amount: Math.floor(Math.random() * 1000) + 1,
    };
    // Upsert the account and append the operation to its embedded array.
    ops.push({
      updateOne: {
        filter: { _id: account },
        update: {
          $set: { category: category },
          $push: { operations: operation },
        },
        upsert: true,
      }
    });
  }
  db.accounts.bulkWrite(ops);
}
This adds 1,000 operations and should take less than one second:
let time = Date.now();
insert(1000);
console.log(`Elapsed ${Date.now() - time} ms`);
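The benchmark uses the default ordered bulk write. If you prefer to let the server apply the batch without ordering guarantees, which can improve throughput and continues past individual errors, bulkWrite also accepts an unordered mode (a variant sketch, not what the measurements below use):

// Variant (not used in the measurements): unordered bulk write,
// letting the server apply operations in any order and keep going on errors.
db.accounts.bulkWrite(ops, { ordered: false });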
A typical query fetches the account in a given category that had the latest operation:
function query(category) {
  return db.accounts.find(
    { category: category },
    { "operations.amount": 1, "operations.date": 1 }
  ).sort({ "operations.date": -1 })
   .limit(1);
}
Such a query should take a few milliseconds:
let time = Date.now();
print(query(1).toArray());
console.log(`Elapsed ${Date.now() - time} ms`);
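To check that this query uses the index rather than a collection scan, you can look at its execution plan (a quick sketch, assuming mongosh; the exact plan output depends on the server version):

// Show the execution plan: look for an IXSCAN on the index created above.
// A COLLSCAN here would explain response times growing with the collection.
db.accounts.find(
  { category: 1 },
  { "operations.amount": 1, "operations.date": 1 }
).sort({ "operations.date": -1 })
 .limit(1)
 .explain("executionStats");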
I repeatedly insert new operations in batches of 1,000 in a loop, and measure the time taken by the query while the collection grows, stopping once I reach one billion operations randomly distributed across the accounts:
for (let i = 0; i < 1_000_000; i++) {
  // more data
  insert(1000);
  // same query
  const start = Date.now();
  const results = query(1).toArray();
  const elapsed = Date.now() - start;
  print(results);
  // elapsed time
  console.log(`Elapsed ${elapsed} ms`);
}
console.log(`Total accounts: ${db.accounts.countDocuments()}`);
In a scalable database, the response time should not increase significantly while the collection grows. I ran this in MongoDB, and the response time stays in single-digit milliseconds. I also ran it in an Oracle Autonomous Database with the MongoDB emulation, but I can't publish those results because Oracle Corporation forbids the publication of database benchmarks (the DeWitt Clause). However, you can copy and paste this test and watch the elapsed time while the data grows on your own infrastructure. I also tested Amazon DocumentDB with the New Query Planner, but since the index is not used for pagination, response time increases as the collection grows.
I ran it in MongoDB for a couple of hours, then checked the collection size and the elapsed query times:
test> db.accounts.countDocuments();
9797064
test> db.accounts.aggregate([
{ $project: { category: 1, opsCount: { $size: "$operations" } } },
{ $group: { _id: "$category", operations: { $sum: "$opsCount" } } },
{ $sort: { _id: 1 } }
] );
[
{ _id: 0, operations: 8772705 },
{ _id: 1, operations: 8771114 },
{ _id: 2, operations: 8771181 }
]
test> let time = Date.now(); insert(1000); console.log(`Elapsed ${Date.now() - time} ms`);
Elapsed 292 ms
test> let time = Date.now(); print(query(1).toArray()); console.log(`Elapsed ${Date.now() - time} ms`);
[
{
_id: 7733303,
operations: [
{ date: ISODate('2025-11-11T14:41:44.139Z'), amount: 68 },
{ date: ISODate('2025-11-11T14:50:58.409Z'), amount: 384 },
{ date: ISODate('2025-11-11T15:57:15.743Z'), amount: 890 },
{ date: ISODate('2025-11-11T17:09:52.410Z'), amount: 666 },
{ date: ISODate('2025-11-11T17:50:08.232Z'), amount: 998 }
]
}
]
Elapsed 3 ms
The collection holds over 26 million operations distributed across nearly 10 million accounts. Inserts still take less than a millisecond per document (about 300 ms for 1,000 documents), and the query still takes 3 milliseconds to find the account with the latest operation in a category. This is the scalability you want for an OLTP application.
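If you want to correlate those timings with the physical size of the data, mongosh can report the collection and index sizes (a sketch; the stats() output shape may vary with server version and storage engine):

// Collection and index sizes, scaled to megabytes.
const stats = db.accounts.stats({ scale: 1024 * 1024 });
console.log(`Documents: ${stats.count}`);
console.log(`Storage size: ${stats.storageSize} MB`);
console.log(`Index sizes: ${JSON.stringify(stats.indexSizes)} MB`);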
Here is a Docker Compose you can use as a template to do your own tests: mongo-oltp-scalability-benchmark.