Praveen Rajamani

Posted on Jun 5

The Untold Story of Why Your App Dies at 1,000,000 Rows

#webdev #database #performance #architecture

My colleague and I were chatting about data problems when he mentioned something called the coin flip.

We were talking about what happens to apps when the data gets big - the kind of big where your loading spinner stops spinning and just... stays. And then he asked me something I did not have an answer to:

"Have you ever thought about the coin flip approach?"

I had not. But I could not stop thinking about it after. So I did what any developer does when an idea refuses to leave their head - I decided to write about it.

But before we get to the coin flip, let me tell you a story. A very relatable, very painful story that most developers have lived through at least once.

The day everything was fine - until it wasn't

Picture this. You have built something. It works. It is fast, clean, and the demo went perfectly. Your manager clapped. Your team was impressed. Someone said "ship it" and you shipped it.

Then a user - one single user - uploaded a million rows of data.

The loading spinner appeared.

It kept spinning.

It kept spinning some more.

A colleague walked past and asked if the page was broken. You said "no no it is just loading." You smiled confidently. Inside you were already opening the terminal.

You saw the memory number. You closed the terminal. You opened it again hoping the number had changed.

It had not. It had become worse.

You closed the terminal again.

You considered a career in farming.

⚠️ What actually happened behind the scenes:

Database fetched all 1,000,000 rows → sent across the network → server loaded into memory → processed one by one → tried to send it all to the browser → browser rendered 1,000,000 DOM elements → laptop fan achieved liftoff → tab crashed → someone filed a bug report: "is this broken?"

Every step in that chain was doing work it was never built to do at that scale. And the fix - here is the part nobody tells you - is almost never to make the code faster. It is to make the code do less.

Back to the coin flip

So my colleague and I are talking, and he asks me this:

"If you wanted to know the average age of everyone in a city of one million people - would you actually need to ask all one million people?"

Obviously not. You ask a few thousand, make sure they are a representative mix, and the answer is statistically close enough to be useful. This is literally how elections get predicted before all the votes are counted. How weather forecasts work. How Netflix knows you will probably like that show even though you have never seen anything like it.

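The coin flip is this idea applied to data. For every row in your million-row dataset, you flip a coin: heads means process it, tails means skip it. Since the coin does not care which rows it lands on, you get a random cross-section of the whole dataset. For aggregate questions such as averages or proportions, the answer is statistically close to what you would get from all one million rows, for roughly half the processing work. And nothing forces the coin to be fair: weight it so only one row in a hundred comes up heads and you do about 1% of the work.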
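Here is a minimal sketch of the coin flip in Python, under a few assumptions of mine: the data is a list of ages I generated for illustration, the function name `coin_flip_mean` is hypothetical, and the question being asked is an average.

```python
import random

def coin_flip_mean(rows, keep_probability=0.5, seed=None):
    """Estimate the mean of `rows` by keeping each row with a fixed probability."""
    rng = random.Random(seed)
    total = 0.0
    kept = 0
    for value in rows:
        # Heads: process the row. Tails: skip it.
        if rng.random() < keep_probability:
            total += value
            kept += 1
    if kept == 0:
        raise ValueError("no rows were sampled; raise keep_probability")
    return total / kept, kept

# Made-up data for illustration: one million ages between 0 and 99.
ages = [i % 100 for i in range(1_000_000)]

estimate, kept = coin_flip_mean(ages, keep_probability=0.5, seed=42)
exact = sum(ages) / len(ages)
print(f"exact={exact:.2f} estimate={estimate:.2f} rows processed={kept}")
```

Note that this version still reads every row in order to flip the coin. The saving is in whatever expensive work happens after the flip, so it pays off most when the per-row processing is the slow part.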

You do not need to read every page of a book to know what it is about. A smart sample tells you almost as much as the full thing in a fraction of the time. This is not cheating. This is how the real world actually works.

This is called probabilistic processing. And once you see it, you will start noticing it everywhere.

The four techniques that come from the coin flip

The coin flip is the mindset. But what does it actually look like in practice? Here are four techniques that come directly from it. Each one is just a different way of doing less to get the same useful answer.

🍕 Pagination - the pizza slice approach

Nobody orders a million pizzas and eats them all at once. You order one, eat it, then decide if you want another. Pagination works the same way: load 50 rows, show them, wait for the user to ask for more. The other 999,950 rows sit quietly in the database not bothering anyone. Cursor-based pagination is even better: it stays fast no matter how deep you go, because the database can jump straight to your current position through an index instead of reading and throwing away every row before it, which is what OFFSET makes it do.
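A rough sketch of cursor-based (keyset) pagination, using Python's built-in `sqlite3` module so it runs anywhere. The `items` table, its columns, and the `fetch_page` helper are all made up for this example. The cursor is simply the last `id` the client has seen.

```python
import sqlite3

def fetch_page(conn, after_id=0, page_size=50):
    """Return the next page of rows whose id is greater than `after_id`."""
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    # The id of the last row becomes the cursor for the next request.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor

# Hypothetical table, filled with 1,000 rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    ((i, f"item-{i}") for i in range(1, 1001)),
)

first_page, cursor = fetch_page(conn)
second_page, cursor = fetch_page(conn, after_id=cursor)
print(first_page[0], second_page[0], cursor)
```

The `WHERE id > ?` condition is what lets the database seek through the primary key index rather than skip rows. If you sort or filter on other columns, the same idea applies, but the index has to cover those columns too.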

🪟 Virtual rendering - the movie theatre trick

A movie theatre has 300 seats. It does not build new seats for every person who walks in; it reuses the same seats. Virtual rendering works the same way. If you are showing a list of a million items, you only create DOM elements for the 20 rows visible on screen right now. As the user scrolls, the same elements get recycled with new data. The list feels infinite. The browser thinks it only has 20 items. Everyone is happy except the million rows that never got their moment on screen.
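The heart of virtual rendering is a small piece of arithmetic: given the scroll position, which rows are on screen? In a browser this would live in JavaScript, but the sketch below uses Python so the maths can be run on its own. The function name, the fixed row height, and the `overscan` buffer of extra rows above and below the viewport are my assumptions.

```python
def visible_range(scroll_top, viewport_height, row_height, total_rows, overscan=2):
    """Return (start, end) row indexes to render; `end` is exclusive."""
    first_visible = scroll_top // row_height
    # Ceiling division: a row that is only partly visible still needs rendering.
    last_visible = -(-(scroll_top + viewport_height) // row_height)
    start = max(0, first_visible - overscan)
    end = min(total_rows, last_visible + overscan)
    return start, end

# A 600px viewport with 30px rows shows 20 rows at a time.
start, end = visible_range(
    scroll_top=30_000, viewport_height=600, row_height=30, total_rows=1_000_000
)
print(start, end, end - start)  # 998 1022 24
```

Only the rows in that range get real elements. Everything above and below is represented by empty space of the right height, so the scrollbar still behaves as if all million rows were there.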

🚿 Streaming - the shower not the bathtub

Loading a million rows into memory is like filling a bathtub to wash your hands. Streaming is the shower: water flows through, does its job, and leaves. You process each chunk of data as it arrives and discard it. Memory stays flat no matter how large the dataset. This is how you process a 10GB file on a machine with 8GB of RAM without summoning any demons.
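A small sketch of the shower in Python, assuming the data is a CSV with a numeric column. The `stream_average` name and the `age` column are made up for the example. The function only ever holds one row plus two running numbers, so memory use does not grow with the size of the input.

```python
import csv
import io

def stream_average(lines, column):
    """Average one CSV column while reading the input one row at a time."""
    reader = csv.DictReader(lines)
    count = 0
    total = 0.0
    for row in reader:
        # Each row is used once and then dropped; nothing accumulates.
        total += float(row[column])
        count += 1
    return total / count if count else 0.0

# A tiny in-memory stand-in for a large file.
sample = io.StringIO("age\n30\n40\n50\n")
average = stream_average(sample, "age")
print(average)  # 40.0
```

Passing an open file object instead of the `StringIO` works the same way, because Python reads files line by line when you iterate over them.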

🎲 Reservoir sampling - the coin flip formalised

Think of it like a talent scout watching a parade of a million people walk past. They have to make their picks on the spot, without knowing how many more are coming. Reservoir sampling tells the scout exactly how to make those on-the-spot decisions so the final selection is still uniformly random: every person in the parade has the same chance of ending up in it. You get a representative sample of 1,000 rows from a million without ever holding the full dataset in memory. This is the coin flip with a computer science degree.
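The classic version of this is usually called Algorithm R, and it fits in a dozen lines. The sketch below is my own minimal take on it, with a hypothetical `reservoir_sample` function and a range of integers standing in for the rows.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Item number index+1 replaces a random slot with probability k/(index+1).
            j = rng.randint(0, index)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir

# An iterator stands in for a data source whose length is not known up front.
sample = reservoir_sample(iter(range(1_000_000)), k=1000, seed=7)
print(len(sample))  # 1000
```

The scout never looks back at the parade. The only state is the reservoir itself, so memory is proportional to the sample size and not to the dataset.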

The thing that surprised me most

After my conversation with my colleague, I went down a rabbit hole and found something that genuinely made me laugh out loud.

There is a data structure called HyperLogLog. It can estimate the number of unique items in a dataset of one billion rows using only a few kilobytes of memory. A few kilobytes. For a billion rows. The estimate is typically accurate to within about 2%.

What is HyperLogLog? It is a probabilistic algorithm - fancy words for "a smart way of guessing that is almost always right." Instead of remembering every unique item it has seen, it hashes each item and keeps track of the longest run of leading zeros it has seen in those hashes. Long runs are rare, so the longer the longest run, the more distinct items must have passed through. Averaging this over many small buckets turns it into a stable estimate. Think of it like a bouncer at a club who stops remembering individual faces after a while but can still give you a pretty accurate headcount. It trades a tiny bit of accuracy for a massive saving in memory. That trade-off is almost always worth it at scale.
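To make the idea less magical, here is a stripped-down HyperLogLog in Python. Treat it as a teaching sketch and not as what any particular library does. The class and method names are mine, SHA-1 is just a convenient hash, and I store one byte per register for simplicity, so 4,096 registers cost about 4 KB. Real implementations pack the registers tighter and add further corrections.

```python
import hashlib
import math

class HyperLogLog:
    """Teaching sketch of HyperLogLog; precision must be at least 7."""

    def __init__(self, precision=12):
        self.p = precision
        self.m = 1 << precision            # number of registers (buckets)
        self.registers = bytearray(self.m)
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")   # use 64 bits of the hash
        index = x >> (64 - self.p)              # first p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)   # remaining bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        # For small counts, fall back to linear counting on empty registers.
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return int(round(estimate))

hll = HyperLogLog()
for i in range(100_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates do not change the estimate
estimate = hll.count()
print(estimate)
```

With 4,096 registers the expected error is around 1.6% (1.04 divided by the square root of the register count). Adding the same item twice changes nothing, because the same hash always lands in the same register with the same rank.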

The alternative - counting exactly - would use gigabytes of memory and take minutes. Netflix, Reddit, Twitter, and Google all use HyperLogLog in production. The entire time you thought these platforms were being exact, they were flipping coins and getting close enough. "Approximately right" is doing an enormous amount of work in the real world.

We have been trained to think that approximate means sloppy. But in the world of large data, approximate delivered in milliseconds is almost always more valuable than exact delivered in four minutes.

The real lesson from the spinning loader

When your app dies at a million rows, the instinct is to optimise. Write better code. Add an index. Upgrade the server. Buy a more expensive laptop so the fan is quieter.

But the real lesson is simpler. Stop asking "how do I process all of this?" and start asking "how much of this do I actually need to process to get a useful answer?"

Usually, the answer is: a lot less than you think.

The spinner was never the problem. The spinner was just honest; it was telling you that you were trying to eat a million sandwiches at once by chewing faster. The fix was always to order less food.

The goal was never to process a million rows. The goal was to answer the question the million rows were hiding. Those are very different problems, and the coin flip is how you start telling them apart.

Have you ever solved a performance problem by doing less, not more? Drop the technique in the comments. I want to know if anyone else has a coin flip story. 👇

Top comments (6)

mote • Jun 8

HyperLogLog was my "wait, what?" moment too. I ran into this on edge devices where counting unique sensor readings was eating 200MB of RAM — switched to HLL and dropped to under 2KB with <1% error. The tradeoff feels wrong until you actually measure it, then you feel stupid for not doing it sooner.

The streaming section buried something I think deserves more emphasis: the hardest part isn't the code, it's unlearning the "load everything first" reflex. I've watched senior devs add ORDER BY RANDOM() to a 5M-row query "just to see a sample" instead of using TABLESAMPLE. The knowledge is there, the instinct isn't.

One thing I'd add: once you go cursor-based pagination, make sure your index covers all filter+sort columns. I've seen teams switch to keyset, see no improvement, and blame the pattern — when the real issue was a missing composite index that the old OFFSET query hid behind a full scan anyway.

Praveen Rajamani • Jun 17

Exactly! HLL is one of those things that needs one real example to click. After that, you start seeing places to use it everywhere.
And "moves the problem around" is a perfect way to describe bad keyset pagination. Stealing that line 😅

mote • Jun 17

Ha, glad it clicked for you too! The "moves the problem around" line was something I picked up from a senior engineer years ago — been using it ever since.

What got me with keyset pagination wasn't the theory, it was watching a query go from 2ms to 800ms on the same dataset over 6 months. The offset was just hiding the problem until the table grew enough.

One extension to the pattern: if you're doing keyset pagination across multiple columns, make sure your composite index matches the exact column order. Partial index matches are another sneaky performance killer that looks fine on small datasets.

Bhupesh • Jun 16

If you are fetching 1,000,000 rows by default and getting an incident report, then it's a bad design. This post reads too much like AI-generated content, and I could not derive any original insight from it.

Praveen Rajamani • Jun 17

Thanks for taking the time to comment.

You are absolutely right that fetching 1,000,000 rows by default would be a poor design choice. The example was intentionally exaggerated to illustrate how performance issues can surface when systems scale and data access patterns are not revisited.

The purpose of the article was not to present a specific production implementation, but to discuss broader concepts around scalability, indexing, query optimization, and pagination. While you may not agree with the approach or find the insights valuable, the content reflects my own perspective and experience on the topic.

If there are specific technical points you believe are inaccurate or misleading, I would welcome a constructive discussion.

Madhesh Kumar • Jun 17

A lot of developers start thinking about microservices and distributed systems when an index or proper pagination would solve the problem. Well explained.