
Ankit Sood

🚀 Slow Queries on Cosmos DB? Here’s How I Fixed the Pain of Deep Pagination & Cross-Partition Queries

If you have ever paginated through thousands of records in Azure Cosmos DB (MongoDB API) and watched your queries get slow, painfully slow, you’re not alone.

This happened to me recently.

I had a collection partitioned on invoiceId, but my queries were filtering on invoiceNumber and on the combination of vendorNumber and createDate.

The result? The first few pages were fine. But by page 10+, response time exploded.

Before jumping into what was going on and how I fixed it, here’s a quick primer on what Cosmos DB is and how it works.


Cosmos DB Introduction

Azure Cosmos DB is Microsoft’s globally distributed, multi-model database service.

The MongoDB API mode means Cosmos DB will present itself as if you were connecting to a MongoDB server. Hence, your MongoDB drivers, tools, and queries mostly work as-is, but data is stored in Cosmos DB’s underlying storage engine.

Think of it as:

A MongoDB-compatible skin over Cosmos DB’s distributed, serverless, elastic infrastructure.

Now, let's discuss how Cosmos DB stores the data.

When you create a collection/container in Cosmos DB, you must pick a partition key (e.g., /invoiceId).
Cosmos DB then:

  • Hashes the value of the partition key for each document
  • Places the document in a logical partition (which maps to a physical partition) based on that hash
  • Each physical partition has its own storage and throughput budget

So:

  • All docs with the same invoiceId go to the same partition
  • Different invoiceId values may go to different partitions
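
For reference, this is roughly how a partitioned collection is created through the MongoDB API: Cosmos DB exposes the partition key as the collection’s shard key. The database name, collection name, and throughput below are illustrative:

// Cosmos DB (MongoDB API) extension command: create a collection whose
// partition key / shard key is invoiceId.
use invoicedb
db.runCommand({
    customAction: "CreateCollection",
    collection: "invoices",
    shardKey: "invoiceId",
    offerThroughput: 400
})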

Why It Matters?

Query by Partition Key = Laser Target 🎯

If you query using the partition key, Cosmos DB knows exactly which single physical partition has your data.

Example:
Partition key = /invoiceId

db.invoices.find({ invoiceId: "u123" })

Here’s what happens internally:

  1. Cosmos DB hashes "u123" → finds Partition #7.
  2. It queries only Partition #7.
  3. You get the result quickly and cheaply.

Advantages:

  • Low RU cost (minimal data scanned)
  • Fast (single partition)
  • No fan-out (doesn’t hit every partition)
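
If you want to see what a query actually costs, the MongoDB API exposes a Cosmos DB-specific command that reports the request charge of the last operation on the connection. A quick sketch, reusing the invoices collection from above:

// Run a single-partition query, then ask Cosmos DB what it cost.
db.invoices.find({ invoiceId: "u123" }).toArray()
db.runCommand({ getLastRequestStatistics: 1 })
// → the response includes the RU charge of the previous command
//   (a RequestCharge field), which makes the single-partition vs
//   cross-partition difference easy to measure.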

Cross-Partition Query = Scatter & Gather 🍂

If you don’t provide the partition key in your filter — or you filter by a field that isn’t the partition key — Cosmos DB must check all partitions.

Example:
Partition key = /invoiceId

db.invoices.find({ status: "Paid" })

What happens internally:

  1. Cosmos DB doesn’t know which partition contains matching docs.
  2. Sends the query to all partitions in parallel (scatter).
  3. Collects results from each (gather).
  4. Merges them, applies sorting, etc.
  5. Returns the final set.

Drawbacks:

  • Higher RU cost (every partition processes the query)
  • Slower (parallel calls + merge step)
  • Possible page-by-page retrieval if result is large



Real-World Scenario

Suppose you have documents like this, with a partition key on invoiceId:

{
    "invoiceId": 890,
    "source": "POS",
    "vendorNumber": 10,
    "invoiceNumber": "INV-123",
    "destStoreNbr": 11,
    "country": "IND",
    "totalAmt": 21.76,
    "createDate": "1970-01-01T00:14:05.691Z",
    "updateDate": "1970-01-01T00:14:05.691Z",
}

❌ The Problem

  • Our clients wanted to search invoices by:
    1. Invoice Number
    2. Vendor Number + Date
  • The primary collection was partitioned for write scalability, not for search queries.
  • Even worse, a vendor number + date could have over 10,000 invoice IDs in a single day in some cases.

Why were these search patterns a problem?

All of these searches ran into the two problems below:

  • Cross-partition queries, forcing the database to fan out across every partition (slower and costlier by design)
  • Deep pagination using skip and limit, forcing the database to read and discard thousands of docs before returning the requested page (the skip + limit problem), which can also trigger RU throttling when too much data is requested

It’s like flipping to page 10,000 in a book by reading every page before it. Inefficient, right?
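
Concretely, our “page 100” query looked roughly like this: a non-partition-key filter combined with skip and limit (the dates and page size are illustrative):

// The filter does not contain the partition key (invoiceId), so every
// partition is queried; skip() then forces Cosmos DB to read and discard
// 9,900 matching documents before returning the page we asked for.
db.invoices.find({
    vendorNumber: 10,
    createDate: {
        $gte: ISODate("2025-08-09T00:00:00Z"),
        $lt: ISODate("2025-08-10T00:00:00Z")
    }
}).sort({ createDate: -1 }).skip(9900).limit(100)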

Hence, we needed a way to:

  1. Make these queries fast and cheap.
  2. Avoid schema duplication or extra writes in the request path.

✅ Solution

Instead of querying the primary collection directly, we created a Lookup Collection that is populated ahead of time. You can think of it as a small, precomputed “routing table” for searches.

What’s in the Lookup Collection?

  • Type = inv → maps invoiceNumber → invoiceId (with partition key)
  • Type = vendorNumber → maps vendorNumber + date → multiple invoiceIds
  • A single lookup doc contains at most 1000 invoice IDs.
    • If more exist, we create multiple docs for the same vendor/date with a count field to track pagination (see the sketch after this list).
  • TTL = time to live for each lookup document.
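
The consumers that populate these docs are Java services (described below), but the roll-over rule itself is easy to sketch in MongoDB shell terms. Here, lookup is a placeholder collection name and MAX_IDS an assumed constant:

// Append an invoiceId to the current vendor/date lookup doc; once a doc
// holds 1000 IDs, start a new doc for the same vendor/date.
const MAX_IDS = 1000;

function addInvoiceToVendorLookup(vendorNumber, createdDate, invoiceId) {
    const res = db.lookup.updateOne(
        { type: "vendorNumber", prtnKey: String(vendorNumber),
          createdDate: createdDate, count: { $lt: MAX_IDS } },
        { $push: { invIds: invoiceId }, $inc: { count: 1 } }
    );
    if (res.matchedCount === 0) {
        // No doc with spare capacity yet → create a fresh one.
        db.lookup.insertOne({
            type: "vendorNumber",
            prtnKey: String(vendorNumber),
            createdDate: createdDate,
            invIds: [invoiceId],
            count: 1,
            ttl: 252288000
        });
    }
}

The real consumer also has to handle concurrent writers (retries on conflict), but the shape of the logic is the same.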

⚙️ The Architecture

The solution has 4 different components:

  1. Primary Collection:
    • It holds all invoice documents.
    • It is partitioned for writes, not for queries.
    • Acts as the source of truth.
  2. Change Data Capture (CDC):
    • It listens only to inserts in the primary collection.
    • Publishes those inserts to Kafka.
  3. Lookup Consumers
    • Two independent Java deployments, each running with its own Kafka consumer group, one for each type of search pattern.
      • Invoice Lookup Consumer → creates type="inv" docs.
      • Vendor Lookup Consumer → creates type="vendorNumber" docs, with pagination logic.
    • Both write to the same lookup collection.
  4. API Search Flow
    • Search by invoice number:
      • Query lookup collection for type="inv" and prtnKey=invoiceNumber.
      • If found → query primary collection with the exact partition key.
    • Search by vendor number + date:
      • Query lookup collection for type="vendorNumber" with pagination.
      • Gather invoice IDs → query primary collection with partition keys.
    • This ensures every primary collection query has the partition key → no cross-partition scans.

Lookup Ingestion

Sample Invoice mapping document:

{
    "_id" : "6370cfb35934c7afdcffbb6f",
    "prtnKey" : "INV-123",
    "type" : "inv",
    "invIds" : [
        890
    ],
    "ttl" : 252288000,
}

Sample Vendor mapping document:

{
    "_id" : "6370cfb35934c7afdcffbb6a",
    "prtnKey" : "10",
    "createdDate" : 1754739742,
    "type" : "vendorNumber",
    "invIds" : [
        890,
        891,
        892,
        893,
    ],
    "count" : 4
    "ttl" : 252288000,
}

In both documents above, prtnKey is the partition key of the lookup collection, and type signifies which kind of value is stored in prtnKey.
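
Putting it together, the read path is a two-step query: resolve the lookup doc first, then hit the primary collection with its partition key. A rough sketch, again with lookup as a placeholder collection name:

// 1) Search by invoice number: resolve the invoiceId(s), then run a
//    partition-key-targeted query against the primary collection.
const invLookup = db.lookup.findOne({ type: "inv", prtnKey: "INV-123" });
if (invLookup) {
    db.invoices.find({ invoiceId: { $in: invLookup.invIds } }).toArray();
}

// 2) Search by vendor number + date: page through the lookup docs
//    (at most 1000 IDs each), then fetch those invoices by partition key.
const vendorPage = db.lookup.find(
    { type: "vendorNumber", prtnKey: "10", createdDate: 1754739742 }
).limit(1).toArray();

if (vendorPage.length > 0) {
    db.invoices.find({ invoiceId: { $in: vendorPage[0].invIds } }).toArray();
}

Because every value in the $in list is a partition key, Cosmos DB can route these reads to just the partitions that own those invoice IDs instead of fanning out to all of them.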


📃 Why this works?

  • Speed: Lookup docs are tiny and indexed for lightning-fast reads.
  • Scalability: CDC keeps lookup data fresh without affecting write throughput.
  • Flexibility: Adding a new search type? Just add a new lookup doc type.
  • Pagination: The 1000-ID limit per doc avoids monster documents and RU spikes.

Final Thoughts

Deep pagination and cross-partition queries can cripple performance in Cosmos DB. The good news? With smarter strategies and thoughtful data modeling, you can keep queries fast—even at scale.

Have you faced similar issues? Share your experience or tips in the comments!
