DEV Community

Cover image for LinkedIn Needed a Message Queue. They Built the One the Entire Internet Runs On.
TechLogStack
TechLogStack

Posted on • Originally published at techlogstack.com on

LinkedIn Needed a Message Queue. They Built the One the Entire Internet Runs On.

LinkedIn · Messaging · 19 May 2026

In 2010, LinkedIn was drowning in data it couldn't move. Every ML model, every recommendation engine, every real-time feature was starving because there was no reliable way to get activity data from the website into the systems that needed it. Jay Kreps, Jun Rao, and Neha Narkhede spent a year building a fix. They named it after Franz Kafka. The rest of the internet adopted it.

  • 1B events/day at launch (2011)
  • 1T messages/day by 2015
  • 7T messages/day by 2019
  • Named after Franz Kafka
  • Confluent founded 2014
  • 80%+ of Fortune 100 run it today

The Story

In 2010, LinkedIn was a growing professional network with a problem that every ambitious data-driven company eventually hits: a massive accumulation of valuable activity data that was effectively locked inside the systems that generated it. Every page view, every job click, every connection request, every search query was data. LinkedIn's ML engineers wanted that data to train recommendation models. LinkedIn's analytics engineers wanted it to understand user behavior. LinkedIn's search engineers needed it to keep the index fresh within seconds of updates. But the pipelines connecting these systems to their data sources were a fragile, inconsistent web of point-to-point integrations — each one custom-built, each one brittle, none of them sharing any infrastructure. Jay Kreps, who was leading data infrastructure engineering at LinkedIn, later described the root cause with characteristic directness: "Everyone wanted to build fancy machine-learning algorithms, but without the data, the algorithms were useless. Getting the data from source systems and reliably moving it around was very difficult."

📊

Before Kafka, LinkedIn's data architecture had an N×M integration problem : every data source needed a custom pipeline to every data destination. With dozens of source systems and dozens of consumer systems, engineers were maintaining hundreds of individual pipelines — each with its own error handling, schema management, and operational burden. Adding one new data source meant writing N new pipelines. Adding one new consumer meant updating M existing sources.

Kreps, alongside Jun Rao (who had joined from IBM's database group) and Neha Narkhede (who had come from Oracle), evaluated every existing solution. Traditional message queuesActiveMQ (an open-source message broker implementing the JMS specification, designed for reliable, ordered message delivery between enterprise applications), RabbitMQ (a message broker built around the AMQP protocol, designed for flexible routing, delivery guarantees, and complex messaging patterns) — were designed for a different problem. They offered rich delivery guarantees, complex routing, and transaction semantics, but they were built for the reliable delivery of individual task messages, not for the high-throughput streaming of millions of activity events. The broker in these systems tracked the delivery state of every message — consuming memory and CPU proportional to the number of outstanding messages. They were designed for near-immediate consumption. They could not handle the situation where a Hadoop job needed to replay yesterday's activity data. They could not scale to the throughput LinkedIn needed. Most critically: their per-message overhead was enormous. ActiveMQ's message format had 144 bytes of overhead per message. LinkedIn needed to process millions of messages per second.

THE INSIGHT: TREAT DATA MOVEMENT LIKE A LOG

The founding insight of Kafka was recognizing that LinkedIn's data movement problem was not a messaging problem — it was a log problem. Databases have used append-only logs for decades: the write-ahead log (a sequential record of all changes made to a database, written before the changes are applied — used for crash recovery, replication, and point-in-time restoration) is how MySQL, Postgres, and every serious database achieves durability and replication. Jay Kreps asked: what if the data pipeline itself was an append-only log? Producers append events. Consumers read them at their own pace. The log retains messages for a configured period. Any consumer can replay from any point. The broker doesn't track state. The simplicity unlocked everything.

Problem

LinkedIn's Data Was Locked in Silos

By 2010, LinkedIn had dozens of data source systems and dozens of consumer systems — ML models, analytics pipelines, search indexers, real-time features — all of which needed the same activity stream data. Point-to-point custom pipelines were the solution, but maintaining hundreds of them was unsustainable. The existing messaging systems (ActiveMQ, RabbitMQ) couldn't handle LinkedIn's throughput requirements and were designed for task queues, not event streams.


Cause

No Tool Existed for High-Throughput Real-Time Event Streaming

The problem in 2010 had two halves: batch systems (Hadoop) could handle large volumes but only hours later; traditional message queues could deliver in real-time but couldn't scale to LinkedIn's volume or support replay. There was no system that provided high throughput, low latency, durability, replayability, and horizontal scalability simultaneously. The three engineers concluded that the tool they needed did not exist.


Solution

One Year Building Kafka: The Append-Only Distributed Log

Kreps, Rao, and Narkhede spent approximately one year building the first version of Kafka. The core architectural decision was treating the message store as an append-only log rather than a queue. This single choice enabled sequential disk I/O (orders of magnitude faster than random I/O), stateless brokers (consumers track their own position), arbitrary replay (consumers read from any offset), and horizontal partitioning (each partition is an independent log that scales independently).


Result

1 Billion Events Per Day at Launch, 7 Trillion by 2019

Kafka went into production at LinkedIn in 2011 and immediately became the backbone of the company's real-time infrastructure. At launch it was ingesting over 1 billion events per day. LinkedIn open-sourced it in early 2011. It became an Apache Top-Level Project in October 2012. By 2015 it was processing 1 trillion messages per day. By 2019, 7 trillion. Kreps, Narkhede, and Rao left LinkedIn in November 2014 to found Confluent, building the commercial ecosystem around Kafka.


I thought that since Kafka was a system optimized for writing, using a writer's name would make sense. I had taken a lot of lit classes in college and liked Franz Kafka. Plus the name sounded cool for an open source project.

— — Jay Kreps — on naming Kafka after Franz Kafka, via Quora

The name was chosen when the project was being prepared for open-sourcing. Jay Kreps was inspired by Franz Kafka (the German-language novelist (1883–1924) known for works including The Metamorphosis, The Trial, and The Castle — exploring themes of alienation, bureaucracy, and transformation) — a writer whose work Kreps admired and whose name, he felt, suited a system built for writing. The practical truth is that the naming was an afterthought. The engineering came first. In the original academic paper published at the NetDB workshop in June 2011, the system is described without literary flourish: a distributed messaging system for log processing, designed for high throughput, low latency, and horizontal scalability. The paper's benchmarks were direct: Kafka produced messages at a rate that was orders of magnitude faster than ActiveMQ or RabbitMQ. The numbers were not close.

ℹ️

Why LinkedIn's Data Architecture Needed Real-Time

LinkedIn's core value proposition — showing you the right jobs, the right connections, the right content — required real-time signals. If you search for "software engineers in San Francisco" and connect with three of them, LinkedIn's recommendations should update within seconds to reflect what your new connections know and who they know. With Hadoop batch jobs, this update happened hours later. With Kafka feeding real-time stream processing, updates became searchable within seconds of being posted. The latency reduction from hours to seconds was not a technical nicety — it was the product feature that made LinkedIn's social graph feel alive.

⚠️

What Existing Systems Got Wrong at Scale

The original Kafka paper published the benchmark numbers without ceremony. LinkedIn configured a single producer to publish 10 million messages of 200 bytes each. Kafka with batch size 50: ~50MB/sec. ActiveMQ: ~2MB/sec. RabbitMQ: slightly better than ActiveMQ but far below Kafka. The gap was not 10% or even 2x — it was an order of magnitude. The performance difference traced directly to Kafka's design: sequential disk writes, zero per-message broker state, batched I/O, and a message format with 9 bytes of overhead versus ActiveMQ's 144 bytes.

Jay Kreps wrote one of the most-cited engineering essays of the last decade: "The Log: What Every Software Engineer Should Know About Real-Time Data's Unifying Abstraction" (LinkedIn Engineering, 2013). The essay argued that the append-only log was not just a Kafka implementation detail — it was a universal primitive for distributed systems. Databases use it for replication. Kafka uses it for messaging. Stream processors use it for state. The essay made the case that any system that needs to integrate data across multiple consumers should be built on a log, not on point-to-point integrations. At the time Kreps wrote the essay, LinkedIn was running over 60 billion unique message writes through Kafka per day — several hundred billion counting cross-datacenter replication. The argument was not theoretical.

ℹ️

LinkedIn's Real-Time Feature Hunger

LinkedIn's product ambitions in 2010 were fundamentally real-time: who viewed your profile in the last 24 hours? Which of your connections just updated their job title? When a recruiter posts a job matching your skills, how fast does it appear in your feed? These features required that activity data flowing from the website into the recommendation and notification systems be fresh — not hours old. The batch pipeline to Hadoop was adequate for weekly model training but useless for features that needed sub-minute freshness. Kafka was not just a performance improvement over existing infrastructure; it was the prerequisite for an entire class of real-time product features that didn't yet exist.

1 BILLION EVENTS PER DAY — IMMEDIATELY

When Kafka went into production at LinkedIn in 2011, it was immediately processing more than 1 billion events per day. This was not a gradual ramp — the scale was there from day one because LinkedIn's existing activity volume was already at that level; Kafka simply replaced the fragile point-to-point pipelines that had been handling it. The immediate billion-event scale validated the architecture under real production conditions within weeks of launch. It also meant the open-source release in mid-2011 came with a credibility that mattered: this was not a research prototype. It was a system already running at significant scale.


The Fix

Five Design Decisions That Made Kafka Fast

Kafka's performance advantage over existing systems was not the result of clever optimization of a standard architecture. It was the result of choosing a fundamentally different architecture, where every key design decision reinforced the same goal: maximize throughput for streaming event data. Five decisions stand out as architecturally defining — and each was a deliberate rejection of how existing messaging systems had been built.

  • ~50MB/s — Kafka producer throughput in the original 2011 benchmark — versus ~2MB/s for ActiveMQ at the same message size (200 bytes, 10M messages)
  • 9 bytes — Per-message overhead in Kafka — versus 144 bytes in ActiveMQ. The storage efficiency difference meant Kafka could handle 16x more messages in the same disk space
  • Stateless — Kafka brokers — consumer offset tracking is done by the consumer, not the broker, eliminating the broker memory pressure that crippled traditional queues at scale
  • Sequential — Disk access pattern for both writes and reads — append-only to the log means no random I/O, allowing Kafka to push disk throughput to near hardware limits
// The five key Kafka design decisions in code form

// DECISION 1: Append-only log storage (not a queue)
// Each partition is a directory of segment files, appended to sequentially
// /kafka-logs/my-topic-0/00000000000000000000.log
// /kafka-logs/my-topic-0/00000000000000100000.log (new segment at 100k messages)
// → Sequential writes: disk seeks are expensive; sequential I/O is 100x faster

// DECISION 2: Consumer tracks its own offset
// The broker doesn't care what consumers have read — it just serves bytes
long consumerOffset = consumer.position(topicPartition); // consumer owns this
// → Brokers are stateless: no per-consumer memory, no ack tracking overhead
// → Consumers can replay any time: just reset the offset
consumer.seek(topicPartition, 0); // replay from the beginning

// DECISION 3: Topics are partitioned for horizontal scale
// Each partition is an independent log — producers and consumers parallelise
ProducerRecord record =
    new ProducerRecord<>("user-activity",
        userId, // partition key: same user → same partition = ordered
        eventJson // the message
    );
// → Topic with N partitions can be consumed by N consumers in parallel
// → Add brokers, add partitions: linear throughput scaling

// DECISION 4: Batch I/O from client to broker
props.put("batch.size", 16384); // batch up to 16KB before sending
props.put("linger.ms", 5); // or wait up to 5ms for the batch
// The original paper: batch size 50 improved throughput by ~10x vs batch size 1

// DECISION 5: Zero-copy transfer using sendfile()
// When a consumer fetches data, Kafka uses OS sendfile() syscall:
// data goes directly disk → network socket, bypassing userspace entirely
// → No data copy into JVM heap → no GC pressure → consistent low latency
// This is why Kafka can deliver data nearly as fast as the network allows
Enter fullscreen mode Exit fullscreen mode

THE STATELESS BROKER: THE COUNTERINTUITIVE MASTERSTROKE

The most counterintuitive decision in Kafka's design is making the broker stateless — the broker doesn't track which consumers have read which messages. In ActiveMQ and RabbitMQ, the broker maintains delivery state for every message: who acknowledged it, who hasn't, what needs to be retried. At scale, this per-message state tracking consumes enormous memory and creates a bottleneck. Kafka's solution was radical: let consumers track their own position (their offset in each partition). The broker just stores bytes in a log. Consumers read at their own pace, commit their offset to Zookeeper (later to a Kafka topic itself), and can reset to any offset to replay. The broker's memory footprint is constant regardless of consumer count or message backlog.

Kafka vs Traditional Message Queues: Architectural Comparison (2011 original benchmarks and design properties)

Property ActiveMQ / RabbitMQ Kafka
Storage model Queue (messages deleted after ack) Append-only log (messages retained by time/size)
Broker state Tracks ack state per message per consumer Stateless — consumers track own offset
Producer throughput (bench) ~2 MB/sec (ActiveMQ) ~50 MB/sec (batch size 50)
Message overhead 144 bytes (ActiveMQ, JMS header) 9 bytes
Consumer replay Not supported (message gone after ack) Supported — seek to any offset
Horizontal scale Limited (complex cluster configs) Native — add partitions, add consumers
Use case fit Task queues, guaranteed delivery, routing Event streaming, log aggregation, activity tracking

What LinkedIn Actually Used Kafka For

By 2019, Kafka was the circulatory system of LinkedIn's infrastructure. Activity tracking (the original use case): every page view, search, ad impression fed to both Hadoop and real-time processors. Real-time search indexing : profile updates searchable within seconds. Database replication : Espresso CDC via Kafka replaced MySQL replication. Inter-service messaging : microservices decoupled through Kafka topics. Stream processing : Apache Samza (LinkedIn's open-source stream processor) used Kafka as both input and durable state store. Every part of LinkedIn's data plane ran on Kafka.

ℹ️

Zero-Copy: The OS Kernel Trick That Doubled Throughput

One of Kafka's most impactful performance optimizations is invisible to application code: zero-copy data transfer using the OS-level sendfile() syscall. In a traditional data transfer, data moves from disk to kernel buffer, kernel buffer to userspace, userspace to socket buffer, socket buffer to network. In Kafka's consumer path, the OS sendfile() call transfers data directly from the page cache (disk buffer) to the network socket, bypassing userspace entirely. This means no data is copied into the JVM heap — no GC pressure, no object allocation overhead. At LinkedIn's throughput rates, this optimization alone is responsible for significant throughput gains and, more importantly, for Kafka's consistent low latency even under high load.

The Open-Source Flywheel

LinkedIn open-sourced Kafka in early 2011 — before it was even an Apache project. The decision to share the core infrastructure was not philanthropic; it was strategic. LinkedIn's engineers knew that the data pipeline problem they had solved was universal. By open-sourcing Kafka, they attracted contributions from engineers at Netflix, Uber, Twitter, and hundreds of other companies — all of whom had the same problem. The community built tooling LinkedIn would never have had resources to build alone: Kafka Streams, Kafka Connect, ksqlDB, MirrorMaker, Schema Registry. The open-source flywheel turned a LinkedIn internal tool into the internet's standard real-time data infrastructure.


Architecture

Kafka's architecture has three layers. The storage layer is a set of partitioned, replicated append-only log files on disk — each partition is an independent, totally ordered sequence of records. The broker layer is a cluster of server processes that manage partition assignment, replication, and client connections — but hold no consumer state. The client layer is producers writing to partitions and consumer groups reading from them, each consumer group maintaining its own independent offset position per partition. Understanding why this architecture outperforms traditional queues requires visualizing the data flow.

Before Kafka: N×M Integration Spaghetti

View interactive diagram on TechLogStack →

Interactive diagram available on TechLogStack (link above).

After Kafka: The Centralized Log Hub

View interactive diagram on TechLogStack →

Interactive diagram available on TechLogStack (link above).

Inside Kafka: Topics, Partitions, Offsets, and Consumer Groups

View interactive diagram on TechLogStack →

Interactive diagram available on TechLogStack (link above).

THE LOG/TABLE DUALITY: JAY KREPS' DEEPER INSIGHT

In his 2013 essay "The Log," Kreps articulated a concept that went beyond Kafka's implementation: the log/table duality (a mathematical relationship where any database table can be derived by replaying a log of changes from the beginning, and any log can be materialized into a table by applying each event as a state update — they are two views of the same underlying truth). Every database table can be derived by replaying the change log from the beginning. Every stream of events can be materialized into a table by accumulating state. This duality means a Kafka topic is simultaneously a stream and a database — you can query it as a stream in motion (stream processing) or materialize it as a snapshot (a table). This insight later became the foundation for Kafka Streams, ksqlDB, and the entire stream-processing ecosystem.

ℹ️

LinkedIn's Kafka by 2019: The Numbers

By 2019, LinkedIn's Kafka deployment had become one of the largest publicly documented distributed systems in existence: 7 trillion messages per day , spread across 100+ clusters , 4,000+ brokers , 100,000+ topics , and 7 million partitions. Each message was consumed by approximately four consumer groups on average. The cross-datacenter replication system (Brooklin) was itself mirroring over 7 trillion messages per day. From 1 billion events per day at launch in 2011 to 7 trillion by 2019: a 7,000x growth in eight years on the same fundamental architecture.


Lessons

Kafka is fifteen years old and it powers a majority of the world's real-time data infrastructure. Its success is not luck — it emerged directly from the architectural decisions Jay Kreps, Jun Rao, and Neha Narkhede made in 2010. The lessons here are about identifying the right abstraction, challenging assumptions baked into existing tools, and the compounding returns of open-sourcing infrastructure.

  1. 01. Before building, verify no existing tool solves your problem at your scale. The Kafka team evaluated ActiveMQ, RabbitMQ, and existing log aggregation systems before building. Their conclusion — existing tools were designed for the wrong problem — was evidence-based. The benchmark comparison (50 MB/sec vs 2 MB/sec) made the decision concrete. Never rebuild what can be adopted; never adopt what demonstrably can't serve your workload.
  2. 02. The append-only log (a data structure where records are only ever added to the end, never modified in place — enabling sequential I/O, arbitrary consumer replay, and stateless brokers) is the universal data integration primitive. Any system that moves data between producers and consumers is implementing a log, whether it knows it or not. The explicit recognition of this pattern — and building directly on it — is what gave Kafka its performance advantage and its flexibility.
  3. 03. Stateless brokers make systems horizontally scalable in ways stateful brokers cannot match. When the broker tracks delivery state per consumer per message, broker memory scales with consumers × outstanding messages. When consumers track their own offsets, broker memory scales with partitions only. This seemingly small architectural choice is why Kafka can serve hundreds of consumer groups without broker degradation.
  4. 04. Sequential I/O is dramatically faster than random I/O on both HDDs and SSDs. An append-only log turns a bursty stream of writes into sequential disk operations , allowing Kafka to approach disk hardware throughput limits. Systems that update records in-place pay random I/O costs on every write. Kafka writes append-only and leverages the OS page cache for reads, achieving throughput that surprised the entire industry.
  5. 05. Open-sourcing infrastructure that solves a universal problem creates compounding returns. LinkedIn open-sourced Kafka in 2011 because the team recognized it solved a problem every data-intensive company had. The community contributions, ecosystem tools (Kafka Streams, Connect, ksqlDB), and widespread adoption that followed made Kafka better than LinkedIn could have built alone. Netflix, Uber, Goldman Sachs, and thousands of other companies now run Kafka — and improvements they contribute flow back to LinkedIn. The return on open-sourcing infrastructure is measured in ecosystem, not just code.

From 1 Billion to 7 Trillion: The Same Architecture

The most remarkable fact about Kafka's growth is that the core architecture described in the 2011 paper — append-only partitioned log, stateless brokers, consumer-side offsets — is still the architecture running at 7 trillion messages per day in 2019. The system scaled 7,000x on the same fundamental design. Operational complexity grew (Cruise Control for rebalancing, Burrow for consumer lag monitoring, Brooklin for cross-datacenter replication), but the append-only log at the center of it all never needed to be replaced. Good architecture ages well.

THE PLATFORM THAT MADE KAFKA A COMPANY

In November 2014, three years after Kafka's public launch, Jay Kreps, Neha Narkhede, and Jun Rao left LinkedIn to found Confluent — a company built to provide enterprise Kafka services, managed Kafka infrastructure, and the commercial ecosystem around the open-source project. Confluent went public in 2021 at a $4.5 billion valuation. The path from LinkedIn internal tool to billion-dollar company in thirteen years is one of the most compelling data points for the value of open-sourcing well-designed infrastructure. The tool built to solve LinkedIn's data pipeline problem had become the data pipeline solution for most of the internet.

Jay Kreps named Kafka after Franz Kafka because it was 'a system optimized for writing' — and then built something that the entire internet writes 7 trillion messages through per day, which is exactly the kind of outcome Franz Kafka would have found deeply, cosmically absurd.

TechLogStack — built at scale, broken in public, rebuilt by engineers


This case is a plain-English retelling of publicly available engineering material.

Read the full case on TechLogStack → (interactive diagrams, source links, and the full reader experience).

Top comments (0)