DEV Community

Adrian Knapp


8 Million Followers in a Day: What a 'Glitchy' Instagram Counter Teaches About Distributed Systems

Cape Verde just made history. In their first-ever World Cup appearance, the small West African nation held Spain, one of the tournament favorites, to a scoreless draw. The hero was Vozinha, a 40-year-old goalkeeper who pulled off save after save and was named player of the match by FIFA.

Off the pitch, something funny happened. The match was carried by CazéTV, a wildly popular Brazilian streaming operation that broadcasts football to millions of concurrent live viewers. During the broadcast, the audience launched a campaign to follow the goalkeeper on Instagram. In under a day, his profile jumped from around 50 thousand to more than 8 million followers.

And that is when the videos started spreading: people refreshing the profile and watching the follower count go up and down, as if Instagram had broken.

Spoiler: it did not break. What those people saw is one of the most classic concepts in distributed systems, and one of the most common topics in system design interviews at big tech companies. It is called eventual consistency.

Why the number went up and down

Think about the difference between two operations on Instagram:

  • Reading a profile's follower count (a read) happens constantly. Every single time someone opens the profile, that is a read.
  • Following someone (a write) happens at a tiny fraction of that frequency.

On a profile getting millions of visits, it would be far too expensive to recompute the total follower count on every single visit by scanning the entire "who follows whom" table. Instead, that number is precomputed, cached, and replicated across many copies spread throughout the infrastructure. This buys two things that social networks prize above almost everything else: fast responses (low latency) and availability (if one replica goes down, the others answer).
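To make the difference concrete, here is a minimal sketch of the two read paths. The `FollowGraph` class and its method names are made up for illustration; this is not how Instagram stores follows, just the general idea of keeping a counter next to the edges.

```python
class FollowGraph:
    """Toy follow graph (hypothetical) with a precomputed follower counter."""

    def __init__(self):
        self.edges = set()          # (follower, followee) pairs
        self.follower_count = {}    # followee -> precomputed count

    def follow(self, follower, followee):
        # Write path: store the edge and bump the counter exactly once.
        edge = (follower, followee)
        if edge in self.edges:
            return
        self.edges.add(edge)
        self.follower_count[followee] = self.follower_count.get(followee, 0) + 1

    def count_by_scan(self, followee):
        # Naive read path: walks every edge on every profile view.
        return sum(1 for _, target in self.edges if target == followee)

    def count_precomputed(self, followee):
        # Cheap read path: a single lookup, no scan.
        return self.follower_count.get(followee, 0)


graph = FollowGraph()
for i in range(1000):
    graph.follow(f"fan{i}", "vozinha")
graph.follow("fan0", "vozinha")  # duplicate follow, ignored
```

Both reads return the same number here, but the scan does work proportional to the size of the whole graph, while the lookup does not. The price is that the counter is now a second copy of the truth that has to be kept up to date.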

The catch is that these replicas are not all updated at the same instant. When you refresh the profile, your request goes through a load balancer that routes you to one of the replicas. During a spike, like tens of thousands of follows per minute, each replica is at a different point in the update process: one has already counted the latest batch of followers, another is still a few thousand behind.

The result: each refresh hits a different copy, each one more or less stale. That is why the number seems to "go backward." You are not watching the real total bounce around. You are watching different snapshots, taken at different moments, of a number that has not converged yet.
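The effect is easy to reproduce in a toy simulation. The sketch below assumes three replicas with fixed lags and a round-robin load balancer; the `Replica` class, the lag values, and the batch size are all invented for the example.

```python
from itertools import cycle


class Replica:
    """Hypothetical replica that trails the primary by a fixed number of batches."""

    def __init__(self, lag):
        self.lag = lag
        self.value = None

    def sync(self, history):
        # The replica only sees the primary's history minus its lag,
        # but never less than the initial value.
        visible = max(1, len(history) - self.lag)
        self.value = history[visible - 1]


history = [50_000]                      # primary's count, starting point
replicas = [Replica(lag=0), Replica(lag=2), Replica(lag=1)]
balancer = cycle(replicas)              # round-robin load balancer

observed = []
for _ in range(6):
    history.append(history[-1] + 10_000)   # a new batch of follows lands
    for replica in replicas:
        replica.sync(history)
    observed.append(next(balancer).value)  # one page refresh

print(observed)
# [60000, 50000, 70000, 90000, 80000, 100000]
```

The primary's total only ever grows, yet the sequence of refreshes dips twice, simply because consecutive requests landed on replicas with different lags.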

At Meta, social graph data of this kind is served by a system called TAO, a distributed data store with a caching layer built on top of MySQL, designed specifically to serve social graph reads at absurd scale. But the principle applies to any large social network.

Once the spike passes and the updates reach every replica, the number settles. It was not a bug. It was a well-designed system working exactly as intended.


Consistency is not all-or-nothing

Here is the part that is worth an article, and that comes up in interviews: eventual consistency is just one end of a spectrum. Between "everyone always sees the most up-to-date value" and "everyone sees whatever their replica happens to have," there are several intermediate levels. Each product picks the level that makes sense for its use case.

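Let me go roughly from weakest to strongest, with an everyday example for each. (Read-your-writes and monotonic reads are independent guarantees, so neither one is strictly stronger than the other.)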

Eventual consistency

If writes stop, at some point all replicas converge to the same value. No guarantee about ordering or timing. It is cheap and fast, but a client can read stale values or watch the number regress.

Example: Vozinha's follower count. Likes, view counts, and social media counters in general. Nobody files a complaint because they saw a few thousand more or fewer for half a second. It is the perfect use case for the cheapest model.
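Here is a small sketch of what "converges once writes stop" means. It assumes each replica receives every update but applies it at its own pace; `QueueReplica` and the numbers are hypothetical.

```python
from collections import deque


class QueueReplica:
    """Hypothetical replica that queues incoming deltas and applies them later."""

    def __init__(self):
        self.count = 0
        self.pending = deque()

    def receive(self, delta):
        self.pending.append(delta)

    def apply_one(self):
        if self.pending:
            self.count += self.pending.popleft()


fast, slow = QueueReplica(), QueueReplica()

# During the spike: both replicas receive every delta,
# but the slow one only applies an update every other round.
for i, delta in enumerate([1000, 2000, 3000, 4000]):
    fast.receive(delta)
    slow.receive(delta)
    fast.apply_one()
    if i % 2 == 1:
        slow.apply_one()

mid_spike = (fast.count, slow.count)    # replicas disagree

# Writes stop: both replicas drain their queues.
while fast.pending or slow.pending:
    fast.apply_one()
    slow.apply_one()

converged = (fast.count, slow.count)    # replicas agree

print(mid_spike, converged)
# (10000, 3000) (10000, 10000)
```

Nothing in this model says how long the draining takes or what a reader sees in the meantime. That is the whole deal: cheap and fast, with no promise beyond eventual agreement.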

Read-your-writes

Guarantees that you always see your own changes immediately, even if other people still see the old value.

Example: a review you leave on a product on Amazon. When you submit the review and refresh the page, it needs to show up right away, otherwise you assume it did not save and you write it again. For other shoppers, it is fine if your review takes a few seconds to propagate.
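One common way to get this guarantee is to remember, per session, the version of the user's last write, and to fall back to the primary whenever the replica has not caught up to it. The sketch below assumes a single primary and a single replica; `Store`, `Session`, and `replicate` are illustrative names, not any real API.

```python
class Store:
    """Hypothetical store holding a list of reviews and a version number."""

    def __init__(self):
        self.version = 0
        self.reviews = []


def replicate(primary, replica):
    # Copy the primary's current state to the replica (normally asynchronous).
    replica.reviews = list(primary.reviews)
    replica.version = primary.version


class Session:
    """Per-user session that remembers the version of its own last write."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.last_written_version = 0

    def post_review(self, text):
        self.primary.reviews.append(text)
        self.primary.version += 1
        self.last_written_version = self.primary.version

    def read_reviews(self):
        # Use the replica only if it already contains this session's writes.
        if self.replica.version >= self.last_written_version:
            return list(self.replica.reviews)
        return list(self.primary.reviews)


primary, replica = Store(), Store()
author = Session(primary, replica)
other_shopper = Session(primary, replica)

author.post_review("Great product")
author_view = author.read_reviews()          # served by the primary
other_view = other_shopper.read_reviews()    # served by the stale replica

replicate(primary, replica)
other_view_later = other_shopper.read_reviews()
```

The author sees the review immediately, the other shopper sees an empty list until replication catches up, and nobody had to pay for coordination on every read.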

Monotonic reads

Once you have read a value, subsequent reads never return anything older. The value only moves forward, never backward.

Example: the tracking status of a package. If the status already showed "out for delivery," it cannot revert to "in transit" on the next refresh. That would make the customer panic. Notice that it is exactly the absence of this guarantee that produced the oscillation in Vozinha's follower count.
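A minimal sketch of one way to enforce this on the client side: keep the highest version seen so far and ignore any response that is older. It assumes every response carries a version number, which is an assumption of this example rather than something any particular API promises. Pinning a session to a single replica is another common approach.

```python
class MonotonicClient:
    """Hypothetical client that never shows a value older than one it has shown."""

    def __init__(self):
        self.last_version = -1
        self.last_value = None

    def read(self, replica_version, replica_value):
        # Accept the response only if it is at least as new as the last one seen.
        if replica_version >= self.last_version:
            self.last_version = replica_version
            self.last_value = replica_value
        return self.last_value


# (version, follower count) pairs as they might arrive from different replicas.
responses = [
    (1, 60_000),
    (0, 50_000),   # stale replica
    (2, 70_000),
    (4, 90_000),
    (3, 80_000),   # stale replica
    (5, 100_000),
]

client = MonotonicClient()
shown = [client.read(version, value) for version, value in responses]

print(shown)
# [60000, 60000, 70000, 90000, 90000, 100000]
```

The stale responses still arrive, but the user never sees the number go backward. The displayed value may lag behind the real total, which is a weaker promise than being up to date, and often good enough.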

Causal consistency

Operations that have a cause-and-effect relationship are seen in the correct order by everyone. Operations with no relationship between them can appear in different orders for different people.

Example: a thread of replies on Slack or X. If someone asks "is it going to rain today?" and another person replies "bring an umbrella then," nobody can see the reply appear before the question. The causal relationship has to be preserved. Meanwhile, two unrelated messages sent at the same time can show up in slightly different orders without anyone noticing.
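One simple way to picture this: each message names the message it depends on, and a replica holds a message back until that dependency has been delivered. The `CausalInbox` below is a toy sketch under that assumption (a single explicit dependency per message); real systems typically track causality with vector clocks or similar metadata.

```python
class CausalInbox:
    """Hypothetical inbox that delivers a message only after its dependency."""

    def __init__(self):
        self.delivered = []   # message ids in display order
        self.buffered = []    # (msg_id, depends_on) waiting for a dependency

    def receive(self, msg_id, depends_on=None):
        self.buffered.append((msg_id, depends_on))
        self._flush()

    def _flush(self):
        # Keep delivering until no buffered message can make progress.
        progress = True
        while progress:
            progress = False
            for item in list(self.buffered):
                msg_id, dependency = item
                if dependency is None or dependency in self.delivered:
                    self.delivered.append(msg_id)
                    self.buffered.remove(item)
                    progress = True


inbox = CausalInbox()
inbox.receive("reply", depends_on="question")   # arrives first, held back
after_reply = list(inbox.delivered)

inbox.receive("unrelated")                      # no dependency, shown at once
inbox.receive("question")                       # unblocks the reply

print(inbox.delivered)
# ['unrelated', 'question', 'reply']
```

The reply arrived over the network before the question, yet it is displayed after it. The unrelated message is free to land wherever it happens to arrive.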

Linearizability (strong consistency)

Every operation behaves as if it happened instantaneously, at a single point in time, and everyone sees the same global ordering. It is as if there were a single copy of the truth. In exchange, it is the most expensive model: it requires coordination between replicas, which costs latency and availability.

Example: a bank transfer. When you send $100, the debit from your account and the credit to the recipient's account need to reflect the real balance at the same time and in the same way for everyone. There cannot be one replica thinking the money is still yours while another thinks it is already gone, otherwise you could spend the same money twice (the famous double spending problem). That is why a bank is willing to pay the coordination cost that a follower counter refuses to pay. The same goes for selling tickets to a sold-out concert: the same seat cannot be sold to two people.
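The seat example can be sketched with a single authoritative copy guarded by a lock. This is only a single-process stand-in: the `threading.Lock` plays the role that consensus or quorum coordination would play across real replicas, and `SeatMap` is a made-up name.

```python
import threading


class SeatMap:
    """Hypothetical seat map: one authoritative copy, one reservation at a time."""

    def __init__(self):
        self._owner = {}
        self._lock = threading.Lock()

    def reserve(self, seat, buyer):
        # Check-and-set happens atomically, so every caller observes
        # the same order of reservations.
        with self._lock:
            if seat in self._owner:
                return False
            self._owner[seat] = buyer
            return True


seats = SeatMap()
results = {}


def try_buy(buyer):
    results[buyer] = seats.reserve("A1", buyer)


threads = [threading.Thread(target=try_buy, args=(f"buyer{i}",)) for i in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

winners = [buyer for buyer, ok in results.items() if ok]
```

Eight buyers race for seat A1 and exactly one wins, whichever thread gets there first. The cost is visible in the code: every reservation has to wait its turn on the lock, which is the latency a follower counter refuses to pay.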

Quick reference

| Model | Guarantee | Everyday example | What goes wrong without it |
| --- | --- | --- | --- |
| Eventual consistency | Converges over time | Instagram followers and likes | The number wobbles for a few seconds (fine) |
| Read-your-writes | You see your own writes instantly | A review you post on Amazon | You think it did not save and post twice |
| Monotonic reads | The value never goes backward | Package tracking status | The status "travels back in time" and alarms you |
| Causal consistency | Cause before effect | Threaded replies on Slack / X | A reply shows up before the question |
| Linearizability | One single source of truth | A bank transfer, concert tickets | Wrong balance, double spending |

It is a business decision, not a measure of skill

The most important takeaway is this: the consistency level does not measure how good the engineer is. It is a product decision.

The Instagram team was not lazy for leaving the follower counter eventually consistent, and the bank's team is not superior for choosing linearizability. Each one paid the right price for its problem. Strong consistency costs latency and availability. Weak consistency costs temporary precision. Err on one side and the app is needlessly slow. Err on the other and you sell the same seat twice, or let someone spend their money twice.

The next time a number appears to "glitch" in front of you on a social network, remember Vozinha. It probably is not a bug. It is someone who decided, on purpose, that speed matters more than precision in that specific case. And that decision, made well, is what lets a system survive a traffic spike without falling over.


Now your turn. Have you ever run into (or built) one of these in your own work? A counter that wobbled, a balance that had to be exact, a cache that betrayed you right in the middle of a demo? Drop a comment with which consistency model you would reach for, and why. I want to hear your war stories.
