DEV Community

DEV-AI
DEV-AI

Posted on

MongoDB Pagination: IXSCAN vs COLLSCAN and Building Efficient Page Bookmarks

IXSCAN (Index Scan) uses indexes to locate documents, reading only relevant index entries and then fetching matching documents. This approach is highly efficient when querying specific data or range queries with proper indexes.[1][2][3]

COLLSCAN (Collection Scan) reads every document in the collection sequentially, examining each one to find matches. While occasionally faster for small collections that fit in memory or when returning most documents, COLLSCAN becomes exponentially slower as collection size grows.[4][3][5][1]

The difference is dramatic: IXSCAN can reduce query time from seconds to milliseconds on large collections by examining only indexed entries instead of millions of documents.[6][1]

The Skip() Performance Problem

Even with IXSCAN, using skip() creates a critical performance bottleneck. MongoDB must traverse every skipped index entry sequentially — skip(1000) means reading and discarding 1,000 index entries before returning results.[7][8][9]

Performance degrades linearly with offset depth: skip(500) is twice as slow as skip(250), and skip(100000) becomes unusable. In 10M collection, skip(1000) took 3.5 seconds despite optimal IXSCAN — at skip(10000), this would exceed 30 seconds.[8][7]

Recommended Approach: Pre-Built Page Bookmarks

The most efficient solution for consistent pagination is creating a bookmark list of _id values at regular intervals (every 500 records). This requires one-time setup cost but provides instant access to any page afterward.[10][11][12]

Complete Implementation

Step 1: Build the Page Bookmark Array

async function buildPageBookmarks(pageSize = 500) {
  const bookmarks = [];
  let lastId = null;
  let position = 0;

  while (true) {
    // Build query - first page has no filter
    const query = lastId ? { _id: { $gt: lastId } } : {};

    // Fetch one batch
    const docs = await db.isr_records
      .find(query, { projection: { _id: 1 } })
      .sort({ _id: 1 })
      .limit(pageSize)
      .toArray();

    // No more documents
    if (docs.length === 0) break;

    // Store the first _id of this page as bookmark
    bookmarks.push({
      page: position,
      startId: docs[0]._id,
      count: docs.length
    });

    // Update cursor for next iteration
    lastId = docs[docs.length - 1]._id;
    position++;

    console.log(`Processed page ${position}, bookmark: ${docs[0]._id}`);
  }

  return bookmarks;
}

// Execute once to build bookmarks
const pageBookmarks = await buildPageBookmarks(500);
console.log(`Created ${pageBookmarks.length} page bookmarks`);

// Save to cache or database for reuse
await db.pagination_cache.insertOne({
  collection: 'isr_records',
  pageSize: 500,
  bookmarks: pageBookmarks,
  createdAt: new Date()
});
Enter fullscreen mode Exit fullscreen mode

Step 2: Fast Page Access Using Bookmarks

async function getPageFast(pageNumber, pageSize = 500) {
  // Retrieve cached bookmarks
  const cache = await db.pagination_cache.findOne({
    collection: 'isr_records',
    pageSize: pageSize
  });

  if (!cache || !cache.bookmarks[pageNumber]) {
    throw new Error('Page bookmark not found. Rebuild cache.');
  }

  const bookmark = cache.bookmarks[pageNumber];

  // Use bookmark _id for instant range query
  const query = pageNumber === 0 
    ? {} 
    : { _id: { $gte: bookmark.startId } };

  const results = await db.isr_records
    .find(query, { projection: { _id: 1 } })
    .sort({ _id: 1 })
    .limit(pageSize)
    .toArray();

  return {
    page: pageNumber,
    data: results,
    nextPage: pageNumber + 1 < cache.bookmarks.length ? pageNumber + 1 : null
  };
}

// Access any page instantly - no skip() required
const page0 = await getPageFast(0);    // Records 1-500
const page2 = await getPageFast(2);    // Records 1001-1500
const page10 = await getPageFast(10);  // Records 5001-5500
Enter fullscreen mode Exit fullscreen mode

Step 3: Simple Sequential Pagination (No Bookmarks)

async function getNextPage(lastSeenId = null, pageSize = 500) {
  const query = lastSeenId 
    ? { _id: { $gt: lastSeenId } } 
    : {};

  const results = await db.isr_records
    .find(query, { projection: { _id: 1 } })
    .sort({ _id: 1 })
    .limit(pageSize)
    .toArray();

  return {
    data: results,
    nextCursor: results.length > 0 
      ? results[results.length - 1]._id 
      : null,
    hasMore: results.length === pageSize
  };
}

// Usage for forward-only pagination
let cursor = null;
const page1 = await getNextPage(cursor);
cursor = page1.nextCursor;

const page2 = await getNextPage(cursor);
cursor = page2.nextCursor;
Enter fullscreen mode Exit fullscreen mode

Performance Comparison

Approach Page 1 Page 100 Page 1000 Random Access
skip() with IXSCAN 50ms 2s 30s+ Slow everywhere[7]
Range query (sequential) 50ms 50ms 50ms Not supported[10]
Bookmark-based 100ms* 50ms 50ms Fast everywhere[12]

*Initial bookmark creation is one-time cost[12]

When to Rebuild Bookmarks

Bookmarks become stale as data changes. Rebuild when:[12]

  • Significant inserts/deletes occur (>10% of collection)
  • Scheduled maintenance windows (daily/weekly)
  • Cache expiration triggers
  • User reports pagination inconsistencies

Key Takeaways

Always prioritize IXSCAN through proper indexing — _id is indexed by default. Never use skip() for large offsets — performance degrades linearly with offset value. Bookmark approach provides O(1) random access to any page while maintaining IXSCAN efficiency. For sequential-only pagination, simple range queries with $gt cursors are sufficient and performant.[11][1][7][6][10][8][12]

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

Top comments (0)