Filtering 13M Records at 2ms: Where to Draw the Line Between UX and Performance?
I’m currently building Ekit Studio, a data-driven SaaS where users can manage dynamic schemas.
Naturally, I looked at tools like Airtable or Notion for inspiration.
From a UX perspective, the dream is simple:
“I should be able to filter and sort on any field, instantly.”
But when you're hitting 13 million records on a MongoDB collection, “instantly” becomes a real engineering constraint.
1. Context: A Scoped Architecture
To keep things fast, the data is multi-tenant and always scoped.
- Total records: ~13M
-
Per scope: ~4.4M (filtered by
project + proto + lang) - Stack: Node.js + MongoDB (running on a beefy Xeon)
To guarantee performance, I designed a slot-based indexing system, mapping user-defined fields to a fixed set of indexed slots:
- 4 string slots (
s1,s2,s3,s4) - 1 number slot (
n1) - 1 date slot (
d1)
Only those slots are guaranteed to be filterable and sortable at scale.
2. The Surprising Benchmarks
During stress tests, I noticed a clear divide in how MongoDB behaves with unindexed queries.
Small datasets (< 10k rows)
Everything is basically “open bar”.
- Multi-field filtering
- Unindexed sorting
- ~10 ms max
At this scale, you don’t need a strategy — you just need a database.
Large datasets (~13M total / ~4.4M scoped)
When queries are scoped using the main composite index (project + proto + lang), results look like this:
- Pagination only (no filter, no sort): ~1.2 ms
- Unindexed “contains” filter (scoped): ~2.2 ms
- Pagination + contains filter: ~2.0 ms
Takeaway:
Unindexed filters, when scoped properly, stay acceptable much longer than I expected.
The wall: unindexed sorting.
As soon as you try to sort millions of rows without an index, MongoDB hits its in-memory sort limits and performance collapses.
3. The Strategy: “Performance Mode”
Rather than punishing small users for the needs of large datasets, I’m leaning toward an adaptive UX.
Below 10k rows
→ open filtering and sorting on all fields (Airtable-like UX)Above 10k rows
→ automatically enable Performance Mode
→ only indexed slots are available for filtering and sorting
Even in the “open” mode, there are guardrails:
- hard query timeouts (
maxTimeMS) - no deep offset pagination
- no unbounded sorts
4. Lessons Learned
- Sorting breaks before filtering.
- Prefixes are king. Always scope your queries.
- UX vs engineering: a restricted UX that actually works is better than a “free” one that freezes.
I’d love to hear from you
For those who’ve built similar systems:
- Where did you draw the line? 10k? 100k?
- Did you allow unindexed sorting early on?
- Have you tried on-demand or adaptive indexing?
I’m especially interested in real-world war stories.
Let’s discuss 👇
Top comments (0)