
ufraaan

Posted on • Originally published at ufraan.dev

How Twitter Served 300,000 Timelines Per Second

This post is a plain breakdown of how Twitter (circa 2013) handled timeline scale, why the first approach broke, and why the final solution is a hybrid.

Based on concepts from Designing Data-Intensive Applications.


The Problem

When I strip Twitter down to the basics, there are really only two product operations that matter:

  • Post Tweet: a user publishes a new message to their followers
  • Home Timeline: a user views tweets from the people they follow

Here are the numbers Twitter published (Nov 2012):

| Operation | Average | Peak |
| --- | --- | --- |
| Post Tweet (writes) | 4,600 req/sec | 12,000 req/sec |
| Home Timeline (reads) | 300,000 req/sec | - |

The read-to-write ratio is roughly 65x. People read way more than they write. That asymmetry is the whole story here.

What surprised me the first time I studied this: 12,000 writes/sec is not the scary part. A solid relational setup can handle that. The hard part is fan-out: one tweet may need to show up in millions of home timelines almost immediately.


The Schema

At the core, Twitter's model has three tables.

tweets: the content

| id | sender_id | text | timestamp |
| --- | --- | --- | --- |
| 1 | 12 | "Excited to announce" | 1000 |
| 2 | 5 | "Grateful. Humbled. Dehydrated." | 1001 |
| 3 | 12 | "I didn't come this far to only come this far" | 1002 |
| 4 | 8 | "Rejected 12 times. Hired once. Now I speak at conferences." | 1003 |

Every tweet lives here. Notice sender_id = 12 appears twice: rows 1 and 3 are both from alice. This table is just tweet content, nothing else.

users: the profiles

| id | screen_name | profile_image |
| --- | --- | --- |
| 5 | bob | bob.jpg |
| 8 | charlie | charlie.jpg |
| 12 | alice | alice.jpg |

This is profile data only. No tweets, no follow graph. If I see sender_id = 12 in tweets, I resolve id = 12 here and get "alice".

follows: the relationships

| follower_id | followee_id | meaning |
| --- | --- | --- |
| 100 | 5 | User 100 follows bob |
| 100 | 12 | User 100 follows alice |
| 101 | 12 | User 101 follows alice |

This table stores the follow graph. One row = one relationship.


Follower vs. Followee

  • Follower: the person doing the following. If you follow alice, you are the follower.
  • Followee: the person being followed. alice, in this case, is the followee.

So when alice tweets, the question is: "Who has followee_id = 12?" That result is everyone whose timeline might need an update.


Approach 1: Query at Read Time

Twitter's original approach was straightforward: writes go into tweets, and timelines are computed on demand at read time.

```sql
SELECT tweets.*, users.*
FROM tweets
  JOIN users ON tweets.sender_id = users.id
  JOIN follows ON follows.followee_id = users.id
WHERE follows.follower_id = current_user
ORDER BY tweets.timestamp DESC
```

Why this broke at scale

300,000 timeline reads/sec means hammering this multi-join query constantly. Even with indexes, this gets expensive fast. That pushed Twitter to move work away from reads.
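To make Approach 1 concrete, here is a runnable sketch using SQLite and the sample rows from the tables above. SQLite is just an illustrative stand-in for Twitter's actual storage; the query is the same one shown earlier, computed on demand for user 100:

```python
import sqlite3

# In-memory database with the three tables from the schema above.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tweets  (id INTEGER PRIMARY KEY, sender_id INTEGER, text TEXT, timestamp INTEGER);
CREATE TABLE users   (id INTEGER PRIMARY KEY, screen_name TEXT, profile_image TEXT);
CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
""")
db.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?)", [
    (1, 12, "Excited to announce", 1000),
    (2, 5,  "Grateful. Humbled. Dehydrated.", 1001),
    (3, 12, "I didn't come this far to only come this far", 1002),
    (4, 8,  "Rejected 12 times. Hired once. Now I speak at conferences.", 1003),
])
db.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (5, "bob", "bob.jpg"), (8, "charlie", "charlie.jpg"), (12, "alice", "alice.jpg"),
])
db.executemany("INSERT INTO follows VALUES (?, ?)", [
    (100, 5), (100, 12), (101, 12),
])

# The timeline query from above, run at read time for user 100
# (who follows bob and alice).
rows = db.execute("""
    SELECT users.screen_name, tweets.text
    FROM tweets
      JOIN users   ON tweets.sender_id = users.id
      JOIN follows ON follows.followee_id = users.id
    WHERE follows.follower_id = ?
    ORDER BY tweets.timestamp DESC
""", (100,)).fetchall()

for screen_name, text in rows:
    print(screen_name, "-", text)
```

On a few rows this is instant; the problem is running it 300,000 times per second against a table with billions of tweets.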


Approach 2: Fan-Out on Write

The key idea: if reads outnumber writes by 65x, pay more at write time so reads are cheap.

Instead of building timelines on demand, build them when tweets are created. Each user gets a precomputed cached timeline (like a mailbox), ready to read.

The mailbox analogy

This is what happens when alice tweets:

  1. The tweet is saved to the global tweets table
  2. A background worker queries all of alice's followers
  3. For each follower, it prepends the new tweet to their cached timeline

When I open the app, the timeline is basically a cache fetch. No heavy joins in the hot read path.
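The three steps above can be sketched in a few lines of Python. Plain dicts stand in for the global tweets table and the per-user timeline caches (production Twitter used Redis lists for the latter), and names like `post_tweet` and `home_timeline` are illustrative, not Twitter's API:

```python
from collections import defaultdict, deque

tweets = {}                   # global tweets table: tweet_id -> tweet
followers = defaultdict(set)  # followee_id -> set of follower ids
# Per-user cached timeline, newest first. The maxlen bound mirrors the fact
# that real timeline caches keep only the most recent few hundred entries.
timelines = defaultdict(lambda: deque(maxlen=800))

def follow(follower_id, followee_id):
    followers[followee_id].add(follower_id)

def post_tweet(tweet_id, sender_id, text, timestamp):
    # 1. Save the tweet to the global tweets table.
    tweets[tweet_id] = {"sender_id": sender_id, "text": text, "timestamp": timestamp}
    # 2-3. Fan out: prepend the tweet id to every follower's cached timeline.
    #      In production this runs on a background worker, not inline.
    for follower_id in followers[sender_id]:
        timelines[follower_id].appendleft(tweet_id)

def home_timeline(user_id):
    # Reading is just a cache fetch: no joins in the hot path.
    return [tweets[tid] for tid in timelines[user_id]]

follow(100, 12)  # user 100 follows alice (id 12)
follow(101, 12)  # user 101 follows alice too
post_tweet(1, 12, "Excited to announce", 1000)
post_tweet(3, 12, "I didn't come this far to only come this far", 1002)
print(home_timeline(100))
```

Note where the cost moved: `post_tweet` loops over every follower, while `home_timeline` does no work beyond a lookup.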

The math

The average Twitter user has approximately 75 followers.

4,600 tweets/sec × 75 followers = 345,000 cache writes/sec

A cache write is a list prepend in Redis: microseconds, no disk I/O. A JOIN query involves scanning indexed B-trees, merging results, sorting: milliseconds, disk-bound.


The Full Data Pipeline

[Diagram: the full data pipeline, from a posted tweet through fan-out workers to per-user timeline caches]

Each follower has their own dedicated timeline cache (e.g., User 1: T7→T5→T3→T1, User 2: T8→T6→T5). The tweet IDs differ per user because each user follows a different set of people. When you request your timeline, it's served directly from your pre-built cache. Reads are fast because all the work happened at write time.

The tradeoff: one new tweet triggers N cache updates where N is the number of followers.


The Celebrity Problem and the Hybrid Solution

Fan-out on write has a ceiling. Celebrity accounts have tens of millions of followers. If one celebrity tweet triggers 30-80 million fan-out writes, queues back up and latency spikes.

So the production answer is a hybrid:

  • Normal users: fan-out on write (Approach 2)
  • Celebrities: no fan-out. Their tweets stay in the global tweets table.

When I request my home timeline, the system merges:

  1. My pre-built cache (tweets from the normal users I follow)
  2. A small real-time query for tweets from the celebrities I follow (Approach 1)

This stays manageable because most users follow only a small number of celebrity accounts.
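The merge step itself is simple: both sources can be treated as newest-first streams and interleaved by timestamp. A minimal sketch, with hypothetical sample data (the `cached` list stands in for the pre-built timeline cache, `celebrity` for the small read-time query):

```python
import heapq

# Both lists are (timestamp, text) pairs, already sorted newest first.
cached = [(1002, "alice: I didn't come this far to only come this far"),
          (1000, "alice: Excited to announce")]
celebrity = [(1003, "celeb: Album out now"),
             (1001, "celeb: Soundcheck")]

# Merge the two newest-first streams into one newest-first timeline.
timeline = list(heapq.merge(cached, celebrity,
                            key=lambda t: t[0], reverse=True))
for ts, text in timeline:
    print(ts, text)
```

Because each source is already sorted, the merge is linear in the number of tweets returned, regardless of how large the underlying tables are.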


The Core Principle

Do more work at write time so the common path is trivially cheap.

Twitter optimized for reads because reads dominated writes.

When I design systems now, I start with one question: what is my read/write ratio, and where should I pay cost? That one answer often determines the rest of the architecture.
