Franck Pachot

How does it scale? A basic OLTP benchmark on MongoDB

Choosing a database requires ensuring that performance remains fast as your data grows. For example, if a query takes 10 milliseconds on a small dataset, it should still be quick as the data volume increases and should never approach the 100ms threshold that users perceive as waiting. Here’s a simple benchmark: we insert batches of 1,000 operations into random accounts, then query the account with the most recent operation in a specific category—an OLTP scenario using filtering and pagination. As the collection grows, a full collection scan would slow down, so secondary indexes are essential.

We create an accounts collection, where each account belongs to a category and holds multiple operations (a typical one-to-many relationship), with an index to support our query on operations per category:

db.accounts.createIndex({
  "category": 1,
  "operations.date": 1,
  "operations.amount": 1,
});

The index follows the MongoDB Equality, Sort, Range (ESR) guideline: equality on category, sort on operations.date, with operations.amount added to cover the projection.

To increase the data volume, this function inserts operations into accounts, randomly distributed across ten million accounts and three categories:

function insert(num) {
  const ops = [];
  for (let i = 0; i < num; i++) {
    const account  = Math.floor(Math.random() * 10_000_000) + 1;
    const category = Math.floor(Math.random() * 3);
    const operation = {
      date: new Date(),
      amount: Math.floor(Math.random() * 1000) + 1,
    };
    ops.push({
      updateOne: {
        filter: { _id: account },
        update: {
          $set: { category: category },
          $push: { operations: operation },
        },
        upsert: true,
      }
    });
  }
  db.accounts.bulkWrite(ops);
}
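Each upsert targets a different random _id, so the individual writes are independent of each other. As a variant (not part of the original test), you could pass ordered: false to the bulk write so the server keeps applying the remaining writes instead of stopping at the first error:

// Unordered variant: independent upserts can be applied in any order
db.accounts.bulkWrite(ops, { ordered: false });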

This adds 1,000 operations and should take less than one second:

let time = Date.now();
insert(1000);
console.log(`Elapsed ${Date.now() - time} ms`);

A typical query fetches the account, within a given category, that had the latest operation:

function query(category) {
  return db.accounts.find(
    { category: category },
    { "operations.amount": 1 , "operations.date": 1 }
  ).sort({ "operations.date": -1 })
   .limit(1);
}

Such a query should take a few milliseconds:

let time = Date.now();
print(query(1).toArray());
console.log(`Elapsed ${Date.now() - time} ms`);
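To verify that the index serves both the filter and the sort, with no in-memory sort stage, you can check the winning plan. This is standard mongosh explain; the exact plan shape varies by version:

// Expect an IXSCAN on { category, operations.date, operations.amount }
// and no blocking SORT stage in the winning plan
query(1).explain("executionStats");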

I repeatedly insert new operations, in batches of one thousand, and measure the query's response time while the collection grows, stopping once I reach one billion operations randomly distributed across the accounts:

for (let i = 0; i < 1_000_000; i++) { 
  // more data  
  insert(1000);  
  // same query
  const start = Date.now();  
  const results = query(1).toArray();  
  const elapsed = Date.now() - start;  
  print(results);  
  // elapsed time
  console.log(`Elapsed ${elapsed} ms`);  
}  
console.log(`Total accounts: ${db.accounts.countDocuments()}`);  
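To relate the measured latency to the data volume, you can also sample the collection statistics from time to time (a sketch using standard db.collection.stats() fields):

// Data and index sizes in bytes, to correlate with query latency
const s = db.accounts.stats();
console.log(`${s.count} documents, data: ${s.size} bytes, indexes: ${s.totalIndexSize} bytes`);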

In a scalable database, the response time should not increase significantly as the collection grows. I ran this on MongoDB, and response time stays in single-digit milliseconds. I also ran it on an Oracle Autonomous Database with the MongoDB emulation, but I can't publish the results because Oracle Corporation forbids the publication of database benchmarks (the DeWitt Clause). However, you can copy and paste this test and watch the elapsed time while the data grows on your own infrastructure. I also tested Amazon DocumentDB with the New Query Planner, but since the index is not used for pagination there, response time increases as the collection grows.

I ran it in MongoDB for a couple of hours, then checked the collection size and the elapsed query times:


test> db.accounts.countDocuments();

9797064

test> db.accounts.aggregate([
  { $project: { category: 1, opsCount: { $size: "$operations" } } },
  { $group: { _id: "$category", operations: { $sum: "$opsCount" } } },
  { $sort: { _id: 1 } }
 ] );

[
  { _id: 0, operations: 8772705 },
  { _id: 1, operations: 8771114 },
  { _id: 2, operations: 8771181 }
]

test> let time = Date.now(); insert(1000); console.log(`Elapsed ${Date.now() - time} ms`);

Elapsed 292 ms

test> let time = Date.now(); print(query(1).toArray()); console.log(`Elapsed ${Date.now() - time} ms`);

[
  {
    _id: 7733303,
    operations: [
      { date: ISODate('2025-11-11T14:41:44.139Z'), amount: 68 },
      { date: ISODate('2025-11-11T14:50:58.409Z'), amount: 384 },
      { date: ISODate('2025-11-11T15:57:15.743Z'), amount: 890 },
      { date: ISODate('2025-11-11T17:09:52.410Z'), amount: 666 },
      { date: ISODate('2025-11-11T17:50:08.232Z'), amount: 998 }
    ]
  }
]

Elapsed 3 ms


The collection now holds about 26 million operations distributed across nearly 10 million accounts. Inserts still take less than a millisecond per document (about 300 ms for 1,000 documents), and the query still returns the account with the latest operation in a category in 3 milliseconds. This is the scalability you want for an OLTP application.

Here is a Docker Compose setup you can use as a template for your own tests: mongo-oltp-scalability-benchmark
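If you prefer to start from scratch, a minimal compose file along these lines is enough to run the snippets above (a sketch assuming the official mongo image; the linked template may differ):

# docker-compose.yml: minimal single-node MongoDB for this benchmark
services:
  mongo:
    image: mongo:8.0
    ports:
      - "27017:27017"

Start the container with docker compose up -d, then open a shell with docker compose exec mongo mongosh and paste the snippets.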

Top comments (1)

jorge imperial

Tried this and it compares very well against more resource-hungry options.