Fabrice Grenouillet
Filtering 13M Records at 2ms: Where to Draw the Line Between UX and Performance?


I’m currently building Ekit Studio, a data-driven SaaS where users can manage dynamic schemas.

Naturally, I looked at tools like Airtable or Notion for inspiration.

From a UX perspective, the dream is simple:

“I should be able to filter and sort on any field, instantly.”

But when you're hitting 13 million records in a MongoDB collection, “instantly” becomes a real engineering constraint.


1. Context: A Scoped Architecture

To keep things fast, the data is multi-tenant and always scoped.

  • Total records: ~13M
  • Per scope: ~4.4M (filtered by project + proto + lang)
  • Stack: Node.js + MongoDB (running on a beefy Xeon)

To guarantee performance, I designed a slot-based indexing system, mapping user-defined fields to a fixed set of indexed slots:

  • 4 string slots (s1, s2, s3, s4)
  • 1 number slot (n1)
  • 1 date slot (d1)

Only those slots are guaranteed to be filterable and sortable at scale.
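To make the idea concrete, here's a minimal sketch of how such a slot allocator could work. The names (`SLOT_POOL`, `allocateSlots`) are illustrative, not Ekit Studio's actual API — only the slot names (`s1`–`s4`, `n1`, `d1`) come from the design above.

```javascript
// A fixed pool of indexed slots, one list per field type.
const SLOT_POOL = {
  string: ["s1", "s2", "s3", "s4"],
  number: ["n1"],
  date: ["d1"],
};

// Assign each user-defined field to a free slot of its type.
// Fields that don't fit are still stored, but won't be
// filterable/sortable at scale.
function allocateSlots(fields) {
  const remaining = {
    string: [...SLOT_POOL.string],
    number: [...SLOT_POOL.number],
    date: [...SLOT_POOL.date],
  };
  const mapping = {};  // user field name -> slot name
  const overflow = []; // fields with no indexed slot left

  for (const { name, type } of fields) {
    const pool = remaining[type];
    if (pool && pool.length > 0) {
      mapping[name] = pool.shift();
    } else {
      overflow.push(name);
    }
  }
  return { mapping, overflow };
}
```

The `overflow` list is what drives the UX decision later: those fields exist, but the UI can't promise fast filtering or sorting on them.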


2. The Surprising Benchmarks

During stress tests, I noticed a clear divide in how MongoDB behaves with unindexed queries.

Small datasets (< 10k rows)

Everything is basically fair game.

  • Multi-field filtering
  • Unindexed sorting
  • ~10 ms max

At this scale, you don’t need a strategy — you just need a database.


Large datasets (~13M total / ~4.4M scoped)

When queries are scoped using the main composite index (project + proto + lang), results look like this:

  • Pagination only (no filter, no sort): ~1.2 ms
  • Unindexed “contains” filter (scoped): ~2.2 ms
  • Pagination + contains filter: ~2.0 ms
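For reference, here's roughly what the scoped query shape looks like. The key point: the composite index prefix (`project` + `proto` + `lang`) is always present, so the unindexed “contains” only ever scans the ~4.4M scoped documents, never the full 13M. The helper names and collection name are illustrative.

```javascript
// Escape user input so it can be used safely inside a $regex.
function escapeRegex(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a filter that always starts with the composite index prefix,
// then optionally adds an unindexed "contains" on a slot field.
function buildScopedFilter(scope, contains) {
  const filter = {
    project: scope.project,
    proto: scope.proto,
    lang: scope.lang,
  };
  if (contains) {
    // case-insensitive substring match; unindexed, but scoped
    filter[contains.slot] = {
      $regex: escapeRegex(contains.value),
      $options: "i",
    };
  }
  return filter;
}

// Usage with the Node.js driver (collection name is illustrative):
// db.collection("records")
//   .find(buildScopedFilter(scope, { slot: "s1", value: "foo" }))
//   .limit(50)
//   .maxTimeMS(500)
```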

Takeaway:

Unindexed filters, when scoped properly, stay acceptable much longer than I expected.

The wall: unindexed sorting.

As soon as you try to sort millions of rows without an index, MongoDB hits its in-memory limit for blocking sort stages (~100 MB) and performance collapses — the query either spills to disk or fails outright.


3. The Strategy: “Performance Mode”

Rather than punishing small users for the needs of large datasets, I’m leaning toward an adaptive UX.

  • Below 10k rows

    → open filtering and sorting on all fields (Airtable-like UX)

  • Above 10k rows

    → automatically enable Performance Mode

    → only indexed slots are available for filtering and sorting
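The switch itself can be a few lines. This is a minimal sketch under my assumptions — the threshold constant and function names are illustrative, and I'm assuming each field carries its assigned slot (or `null` if it has none):

```javascript
const PERFORMANCE_MODE_THRESHOLD = 10_000;
const INDEXED_SLOTS = new Set(["s1", "s2", "s3", "s4", "n1", "d1"]);

// Below the threshold: every field is filterable/sortable.
// Above it: only fields mapped to an indexed slot remain available.
function allowedFields(rowCount, allFields) {
  if (rowCount < PERFORMANCE_MODE_THRESHOLD) {
    return { performanceMode: false, fields: allFields };
  }
  return {
    performanceMode: true,
    fields: allFields.filter((f) => INDEXED_SLOTS.has(f.slot)),
  };
}
```

The UI can then grey out the excluded fields instead of silently failing on them.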

Even in the “open” mode, there are guardrails:

  • hard query timeouts (maxTimeMS)
  • no deep offset pagination
  • no unbounded sorts
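The guardrails above can be sketched as a small wrapper around the query options. The cap values here are illustrative, not the exact numbers Ekit Studio ships with:

```javascript
const MAX_TIME_MS = 500;      // hard query timeout
const MAX_SKIP = 10_000;      // refuse deep offset pagination
const MAX_SORT_LIMIT = 1_000; // never sort an unbounded result set

// Normalize user-supplied pagination/sort into safe query options.
function guardQuery({ skip = 0, limit = 50, sort = null }) {
  if (skip > MAX_SKIP) {
    throw new Error(`offset ${skip} too deep; use a cursor/range instead`);
  }
  const options = { skip, limit, maxTimeMS: MAX_TIME_MS };
  if (sort) {
    options.sort = sort;
    options.limit = Math.min(limit, MAX_SORT_LIMIT); // bounded sort
  }
  return options;
}
```

These options map directly onto the Node.js driver's `find(filter, options)` signature, so the guard sits in one place instead of being re-checked in every endpoint.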

4. Lessons Learned

  • Sorting breaks before filtering.
  • Index prefixes are king. Always scope your queries with the composite index prefix.
  • UX vs engineering: a restricted UX that actually works is better than a “free” one that freezes.

I’d love to hear from you

For those who’ve built similar systems:

  • Where did you draw the line? 10k? 100k?
  • Did you allow unindexed sorting early on?
  • Have you tried on-demand or adaptive indexing?

I’m especially interested in real-world war stories.

Let’s discuss 👇
