Ajinkya Singh

Kafka Architecture - The Complete Mental Model 🧠

How all the pieces fit together to create a powerful streaming platform


The Goal

Understand the "Big Picture" - How events, topics, partitions, producers, consumers, brokers, and consumer groups all work together as one cohesive system.

Think of this as getting a bird's eye view of the entire Kafka ecosystem! 🦅


Building Block #1: The Event (Foundation)

What It Is

The fundamental unit - an immutable fact representing something that happened.

┌─────────────────────────────────────┐
│           EVENT/RECORD              │
├─────────────────────────────────────┤
│ Key: user_456                       │
│ Value: {"action": "purchase"}       │
│ Timestamp: 2025-11-18 14:30:00      │
└─────────────────────────────────────┘

Everything in Kafka revolves around these!
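
As a rough illustration, an event can be modeled as an immutable record. This is a minimal Python sketch, not the Kafka client API: the `Event` class and its field names are assumptions made for this example.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen=True makes instances read-only after creation
class Event:
    """Hypothetical stand-in for a Kafka record (not a real client class)."""
    key: str
    value: str
    timestamp: str


event = Event(key="user_456",
              value='{"action": "purchase"}',
              timestamp="2025-11-18 14:30:00")

try:
    event.value = '{"action": "refund"}'  # attempt to rewrite history
    mutated = True
except FrozenInstanceError:
    mutated = False  # the fact stays exactly as it was recorded
```

If something changes later, the usual approach is to publish a new event rather than edit the old one.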

Building Block #2: The Kafka Cluster (Infrastructure)

What It Is

A collection of servers working together - NOT just one server!

        KAFKA CLUSTER
┌─────────────────────────────────┐
│                                 │
│  ┌────────┐  ┌────────┐         │
│  │Broker 1│  │Broker 2│  ...    │
│  │Server 1│  │Server 2│         │
│  └────────┘  └────────┘         │
│                                 │
│  ┌────────┐  ┌────────┐         │
│  │Broker 3│  │Broker 4│  ...    │
│  │Server 3│  │Server 4│         │
│  └────────┘  └────────┘         │
│                                 │
└─────────────────────────────────┘

Network of powerful servers!

What Brokers Do

  • Store your events
  • Handle requests from applications
  • Ensure the system stays available even if one fails

Why Multiple Brokers?

  1. Scalability → Handle massive amounts of data
  2. Fault Tolerance → Keep running even if servers fail

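Modern Kafka (4.0+)

  • Brokers are self-managing using the KRaft protocol (production-ready since Kafka 3.3)
  • A quorum of controller nodes stores the cluster metadata and coordinates the brokers internally
  • ZooKeeper support is removed entirely in 4.0, so no external ZooKeeper is needed! 🎉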

Visualize: A resilient network of powerful servers ready to handle your data streams.


Building Block #3: Topics (Organization)

What It Is

A logical name/category for a stream of related events.

KAFKA CLUSTER
├── Topic: "user-signups" 👤
├── Topic: "payment-transactions" 💰
├── Topic: "sensor-readings" 🌡️
└── Topic: "order-events" 📦

Key Characteristics

1. Distributed Across Brokers

A single topic doesn't live on just ONE broker:

Topic: "orders"
├── Partition 0 → Broker 1
├── Partition 1 → Broker 2
└── Partition 2 → Broker 3

This distribution = SCALE! 🚀

2. Durable Storage

  • Events are stored for a configurable retention period
  • They can be re-read multiple times
  • They are not deleted after consumption

Building Block #4: Partitions (Parallelism)

What It Is

Each topic is divided into ordered lanes called partitions.

The Multi-Lane Highway Analogy 🛣️

Topic: "orders" (3 partitions)

┌────────────────────────────────────────────────────┐
│               MULTI-LANE HIGHWAY                   │
├────────────────────────────────────────────────────┤
│                                                    │
│  Lane 0 (Partition 0): Order1 → Order2 → Order3    │
│  ══════════════════════════════════════════════►   │
│                                                    │
│  Lane 1 (Partition 1): Order4 → Order5 → Order6    │
│  ══════════════════════════════════════════════►   │
│                                                    │
│  Lane 2 (Partition 2): Order7 → Order8 → Order9    │
│  ══════════════════════════════════════════════►   │
│                                                    │
└────────────────────────────────────────────────────┘

Each lane (partition) processes traffic (events)
independently but IN ORDER within that lane!

Key Properties

1. Ordered Within Partition ✅

Partition 0:
Event A (offset 0) → Event B (offset 1) → Event C (offset 2)

Consumer always sees: A, then B, then C
ORDER GUARANTEED within the partition!

2. NO Order Across Partitions ❌

Partition 0: Event A (time: 10:00)
Partition 1: Event B (time: 09:59)

Consumer might see B before A
NO ORDER GUARANTEE across different partitions!
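
The two ordering rules follow from the fact that every partition is its own append-only log with its own offset counter. Here is a minimal sketch of that idea. `PartitionLog` is a hypothetical class written for this post, not something from a Kafka library.

```python
class PartitionLog:
    """Hypothetical append-only log for one partition (illustration only)."""

    def __init__(self):
        self._events = []

    def append(self, event):
        # The offset is simply the position of the event in this partition.
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset):
        # Events always come back in the order they were appended.
        return self._events[offset:]


p0, p1 = PartitionLog(), PartitionLog()
offsets = [p0.append("A"), p0.append("B"), p0.append("C")]
first_in_p1 = p1.append("X")  # offsets restart at 0 in every partition
```

Because offsets are only meaningful inside one partition, nothing in this structure says whether "X" happened before or after "A".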

3. Each Partition Lives on a Broker

Topic: "payments" (3 partitions)

Partition 0 → Broker 1 (Server 1)
Partition 1 → Broker 2 (Server 2)
Partition 2 → Broker 3 (Server 3)

Load is DISTRIBUTED across servers! ⚖️

Why Partitions?

  • Enable parallelism → Multiple producers/consumers work simultaneously
  • Distribute load → Spread data across multiple servers
  • Scale horizontally → Add more partitions = more throughput

Building Block #5: Producers (Data Writers)

What It Is

Your application code that sends/publishes events to Kafka topics.

         PRODUCERS (Entry Ramps)

Mobile App 📱 ──┐
                │
Web Server 🌐 ──┼──► Kafka Topic: "events"
                │      ├─► Partition 0
IoT Device 🌡️ ──┘      ├─► Partition 1
                       └─► Partition 2

How Producers Work

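Option 1: Automatic Partition Selection (No Key)

Producer sends events WITHOUT key:

Event 1 → Partition 0 (round-robin)
Event 2 → Partition 1 (round-robin)
Event 3 → Partition 2 (round-robin)
Event 4 → Partition 0 (round-robin)
...

Result: roughly EVEN DISTRIBUTION across partitions over time

Note: this per-event rotation is the simplified picture. Since Kafka 2.4 the
default partitioner is "sticky": it fills a whole batch for one partition
before moving on to the next, which spreads load evenly with fewer requests.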
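
The per-event rotation in the listing above can be sketched with an endless counter. This is only an illustration of the round-robin idea. `round_robin_partitions` is a made-up helper, and real clients may batch several events per partition before switching.

```python
from itertools import cycle


def round_robin_partitions(num_partitions: int):
    # Endless iterator: 0, 1, ..., num_partitions - 1, 0, 1, ...
    return cycle(range(num_partitions))


chooser = round_robin_partitions(3)
assignments = [next(chooser) for _ in range(4)]  # partitions for events 1 to 4
```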

Option 2: Key-Based Routing (With Key)

Producer sends events WITH key:

Event (key: user_123) → Partition 1
Event (key: user_123) → Partition 1 (SAME!)
Event (key: user_456) → Partition 2
Event (key: user_456) → Partition 2 (SAME!)
Event (key: user_123) → Partition 1 (SAME!)

Result: ALL events with SAME KEY go to SAME PARTITION
        This maintains ORDER for related events! 🎯

Visual Example: Key-Based Routing

Producer: E-commerce Website

Order from user_123:
┌──────────────────────┐
│ Key: user_123        │
│ Value: Order details │
└──────────────────────┘
         ↓
    Producer hashes key
         ↓
    Always → Partition 1

Another order from user_123:
┌──────────────────────┐
│ Key: user_123        │
│ Value: Order details │
└──────────────────────┘
         ↓
    Producer hashes key
         ↓
    Always → Partition 1 (SAME!)

✅ All user_123 orders processed IN ORDER!
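
Key-based routing boils down to "hash the key, then take the remainder by the number of partitions". The sketch below assumes a stand-in hash: the Java client's default partitioner uses murmur2 on the key bytes, while this example uses `zlib.crc32` only because it ships with Python and is deterministic. `partition_for_key` is a hypothetical helper name.

```python
import zlib


def partition_for_key(key: str, num_partitions: int) -> int:
    # Stand-in hash (assumption): crc32 instead of Kafka's murmur2.
    # The important property is determinism: same key, same result.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


first = partition_for_key("user_123", 3)
second = partition_for_key("user_123", 3)  # same key, so same partition
```

One consequence of the modulo step: if the number of partitions changes, the same key may map to a different partition, so the ordering guarantee holds only while the partition count stays fixed.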

Producer Behavior

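  • Asynchronous → Sending returns without waiting for any consumer; producers and consumers are fully decoupled
  • High throughput → Events are batched, so a producer can send thousands of events per second
  • Configurable acknowledgements → acks=0 is true fire and forget (fastest, but events can be lost); acks=all waits for the in-sync replicas (safest)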

Visualize: Entry ramps onto a highway, directing traffic into specific lanes.


Building Block #6: Consumers (Data Readers)

What It Is

Your application code that reads/subscribes to events from topics.

         Kafka Topic: "orders"
                 ↓
         ┌───────┴───────┐
         │               │
    Consumer A      Consumer B
         ↓               ↓
   Analytics App    Email Service

Each reads INDEPENDENTLY with its own position (offset)

Key Properties

1. Pull-Based Model

Traditional Systems:        Kafka:
Server → PUSHES → Client    Client ← PULLS ← Server

Benefits of Pull:
✅ Consumer controls pace
✅ Can process at own speed
✅ Can pause/resume

2. Independent Reading

Multiple consumers can read SAME topic:

Topic: "transactions"
     ↓
     ├──► Consumer A (reads everything)
     ├──► Consumer B (reads everything)
     └──► Consumer C (reads everything)

Each maintains its OWN offset (reading position)
Nobody affects anyone else! 🎭

3. Offset Tracking

Partition 0:
┌────┬────┬────┬────┬────┬────┐
│ 0  │ 1  │ 2  │ 3  │ 4  │ 5  │ ...
└────┴────┴────┴────┴────┴────┘
            ↑
         Consumer's
         current offset
         (remembers position)

If consumer stops and restarts:
✅ Resumes from its last committed offset (position 2)
✅ No messages skipped
⚠️ Events processed but not yet committed may be read again (at-least-once by default)
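
Here is a small sketch of the pull-and-remember loop described above, assuming a single partition held in a Python list. `SimpleConsumer`, `poll`, and `commit` are hypothetical names chosen for this illustration and only loosely mirror what a real consumer client does.

```python
class SimpleConsumer:
    """Hypothetical consumer for one partition (not a real client class)."""

    def __init__(self, log, committed=0):
        self.log = log              # list of events in the partition
        self.committed = committed  # next offset to read after a restart

    def poll(self, max_events=2):
        # Pull model: the consumer asks for events starting at its position.
        return self.log[self.committed:self.committed + max_events]

    def commit(self, count):
        # Record progress only after the events were processed.
        self.committed += count


log = ["e0", "e1", "e2", "e3", "e4", "e5"]
consumer = SimpleConsumer(log)
batch = consumer.poll()
consumer.commit(len(batch))  # offsets 0 and 1 are done

# Simulated restart: a new instance starts from the stored committed offset.
restarted = SimpleConsumer(log, committed=consumer.committed)
resumed = restarted.poll()
```

Note the order of operations: if the process crashed after handling `batch` but before `commit`, the restarted consumer would read the same events again.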

Building Block #7: Consumer Groups (Team Work)

What It Is

A collection of consumer instances working together as a team to process events.

The Team Analogy 👥

Team A (Consumer Group "analytics"):
Worker 1, Worker 2, Worker 3

Team B (Consumer Group "email"):
Worker 4, Worker 5

Team C (Consumer Group "archiving"):
Worker 6, Worker 7, Worker 8

Each TEAM gets its own FULL COPY of the event stream!

How Consumer Groups Work

Rule: One Partition = One Consumer (within group)

Topic: "orders" (3 partitions)

Consumer Group "order-processors" (3 consumers):

Partition 0 ──► Consumer A ┐
Partition 1 ──► Consumer B ├─ Group "order-processors"
Partition 2 ──► Consumer C ┘

✅ Each partition assigned to EXACTLY ONE consumer
✅ Work is DIVIDED among team members
✅ Parallel processing! ⚡

Example: Load Distribution

Scenario 1: More partitions than consumers

Topic: 4 partitions
Group: 2 consumers

Partition 0 ──┐
Partition 1 ──┴──► Consumer A

Partition 2 ──┐
Partition 3 ──┴──► Consumer B

Each consumer handles 2 partitions

Scenario 2: More consumers than partitions

Topic: 2 partitions
Group: 3 consumers

Partition 0 ──► Consumer A
Partition 1 ──► Consumer B
                Consumer C (IDLE - no partition assigned)

Extra consumers sit idle (but ready for failover!)

Scenario 3: Perfect match

Topic: 3 partitions
Group: 3 consumers

Partition 0 ──► Consumer A
Partition 1 ──► Consumer B
Partition 2 ──► Consumer C

Perfectly balanced! ⚖️
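
The three scenarios can be reproduced with a few lines of code. This sketch assumes a simple contiguous split, loosely in the spirit of a range-style assignment. `assign_partitions` is a hypothetical helper, and real Kafka assignment strategies are configurable and more involved.

```python
def assign_partitions(partitions, consumers):
    """Give each consumer a contiguous chunk of partitions (illustration only)."""
    base, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers each take one additional partition.
        size = base + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + size]
        start += size
    return assignment


more_partitions = assign_partitions([0, 1, 2, 3], ["A", "B"])  # Scenario 1
more_consumers = assign_partitions([0, 1], ["A", "B", "C"])    # Scenario 2
perfect_match = assign_partitions([0, 1, 2], ["A", "B", "C"])  # Scenario 3
```

In Scenario 2 the third consumer ends up with an empty list, which is the "idle" consumer from the diagram.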

Multiple Consumer Groups (Independent Processing)

Topic: "news-feed"
     │
     ├──► Group A "website-updates"
     │    ├─ Consumer 1 → Partition 0
     │    ├─ Consumer 2 → Partition 1
     │    └─ Consumer 3 → Partition 2
     │
     ├──► Group B "archiving"
     │    ├─ Consumer 1 → Partition 0
     │    ├─ Consumer 2 → Partition 1
     │    └─ Consumer 3 → Partition 2
     │
     └──► Group C "sentiment-analysis"
          └─ Consumer 1 → All partitions

✅ Each group processes SAME data INDEPENDENTLY
✅ Each group maintains its OWN offsets
✅ Groups don't affect each other
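
What keeps groups independent is that the offset is stored per group, not per topic. A minimal sketch, assuming one partition and two of the groups from the example above (the `consume` helper and the event names are made up for illustration):

```python
log = ["n0", "n1", "n2", "n3"]  # one partition of the "news-feed" topic

# Committed offset per consumer group.
group_offsets = {"website-updates": 0, "archiving": 0}


def consume(group, max_events):
    # Reading advances only this group's offset; the log itself is untouched.
    start = group_offsets[group]
    events = log[start:start + max_events]
    group_offsets[group] = start + len(events)
    return events


fast = consume("website-updates", 3)  # a fast group races ahead
slow = consume("archiving", 1)        # a slow group still starts at offset 0
```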

Automatic Failover (Self-Healing)

Before failure:
Partition 0 ──► Consumer A ✅
Partition 1 ──► Consumer B ✅
Partition 2 ──► Consumer C ✅

Consumer B fails! 💥

After automatic rebalancing (typically seconds, once the failure is detected):
Partition 0 ──► Consumer A ✅
Partition 1 ──► Consumer A ✅ (took over!)
Partition 2 ──► Consumer C ✅

Or:
Partition 0 ──► Consumer A ✅
Partition 1 ──► Consumer C ✅ (took over!)
Partition 2 ──► Consumer C ✅

✅ No data loss! (the new owner resumes from the last committed offset)
✅ Processing continues!
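
A rough sketch of the takeover step, assuming orphaned partitions simply go to whichever surviving consumer currently owns the fewest. `rebalance` is a hypothetical function; in real Kafka the group coordinator triggers a rebalance and the configured assignor recomputes the assignment.

```python
def rebalance(assignment, failed):
    """Hand the failed consumer's partitions to the least-loaded survivor."""
    survivors = {c: list(p) for c, p in assignment.items() if c != failed}
    for partition in assignment.get(failed, []):
        # min() picks the surviving consumer that owns the fewest partitions;
        # on a tie it returns the first one in dictionary order.
        target = min(survivors, key=lambda c: len(survivors[c]))
        survivors[target].append(partition)
    return survivors


before = {"A": [0], "B": [1], "C": [2]}
after = rebalance(before, failed="B")
```

With this tie-breaking rule Consumer A takes over Partition 1, which matches the first outcome in the diagram. Either outcome is valid.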

Visualize: Teams of workers where each team processes the full stream, but within each team, workers divide up the lanes (partitions) to work in parallel.


THE GRAND PICTURE: How Everything Works Together 🎯

Complete Data Flow

STEP 1: PRODUCERS CREATE EVENTS
┌────────────────────────────────────────┐
│ Mobile App, Website, IoT Devices, etc. │
└────────────────┬───────────────────────┘
                 ↓
          Generate Events

STEP 2: EVENTS SENT TO TOPICS
┌─────────────────────────────────────────┐
│  Event with key "user_123"              │
│  → Producer hashes key                  │
│  → Routes to specific partition         │
└────────────────┬────────────────────────┘
                 ↓
           Topic: "orders"

STEP 3: PARTITIONS STORE EVENTS
┌─────────────────────────────────────────┐
│ Partition 0 (Broker 1): [E1, E2, E3]    │
│ Partition 1 (Broker 2): [E4, E5, E6]    │
│ Partition 2 (Broker 3): [E7, E8, E9]    │
└────────────────┬────────────────────────┘
                 ↓
        Ordered, Immutable Log

STEP 4: CONSUMER GROUPS PULL EVENTS
┌─────────────────────────────────────────┐
│ Group "analytics":                      │
│   Consumer A reads Partition 0          │
│   Consumer B reads Partition 1          │
│   Consumer C reads Partition 2          │
│                                         │
│ Group "email":                          │
│   Consumer D reads Partition 0          │
│   Consumer E reads Partition 1, 2       │
└────────────────┬────────────────────────┘
                 ↓
         Process in parallel
         at their own pace

Visual: Complete System Architecture

┌─────────────────────────────────────────────────────────────┐
│                    KAFKA CLUSTER                            │
│                                                             │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐        │
│   │Broker 1 │  │Broker 2 │  │Broker 3 │  │Broker 4 │        │
│   ├─────────┤  ├─────────┤  ├─────────┤  ├─────────┤        │
│   │ P0 (L)  │  │ P1 (L)  │  │ P2 (L)  │  │ P3 (L)  │        │
│   │ P1 (F)  │  │ P2 (F)  │  │ P3 (F)  │  │ P0 (F)  │        │
│   │ P2 (F)  │  │ P3 (F)  │  │ P0 (F)  │  │ P1 (F)  │        │
│   └─────────┘  └─────────┘  └─────────┘  └─────────┘        │
│        ↑                           ↓                        │
│    WRITE                         READ                       │
└────────┼───────────────────────────┼────────────────────────┘
         │                           │
    ┌────┴────┐               ┌──────┴─────┐
    │PRODUCERS│               │CONSUMER    │
    │         │               │GROUPS      │
    │📱 App   │               │            │
    │🌐 Web   │               │Group A:    │
    │🌡️ IoT   │               │ C1, C2, C3 │
    │         │               │            │
    └─────────┘               │Group B:    │
                              │ C4, C5     │
                              └────────────┘

Legend:
P0 = Partition 0
(L) = Leader
(F) = Follower (replica)

Real-World Example: E-Commerce Order System

The Complete Flow

SCENARIO: Customer places an order on website

1️⃣ PRODUCER (Website) creates event:
┌──────────────────────────────────┐
│ Key: customer_789                │
│ Value: {                         │
│   order_id: "ORD-456",           │
│   items: ["laptop", "mouse"],    │
│   total: 1200                    │
│ }                                │
└──────────────────────────────────┘
         ↓

2️⃣ Kafka routes to TOPIC and PARTITION:
Topic: "orders"
Key "customer_789" → Partition 1 (always same partition!)
         ↓

3️⃣ BROKERS store in partition:
Broker 2 (Leader for Partition 1):
┌────────────────────────────────┐
│ Partition 1:                   │
│ Offset 100: ORD-453            │
│ Offset 101: ORD-454            │
│ Offset 102: ORD-456 ← NEW!     │
└────────────────────────────────┘

Broker 3 (Follower):           Broker 4 (Follower):
┌──────────────────────┐       ┌──────────────────────┐
│ Partition 1 (copy):  │       │ Partition 1 (copy):  │
│ Offset 102: ORD-456  │       │ Offset 102: ORD-456  │
└──────────────────────┘       └──────────────────────┘
         ↓                              ↓
    REPLICATED for durability!

4️⃣ MULTIPLE CONSUMER GROUPS process independently:

Group "payment-processing":
  Consumer A reads Partition 1 → Charges credit card

Group "inventory":
  Consumer B reads Partition 1 → Updates stock

Group "email":
  Consumer C reads Partition 1 → Sends confirmation

Group "analytics":
  Consumer D reads Partition 1 → Updates dashboard

✅ All process SAME order
✅ All work INDEPENDENTLY
✅ Each at their own pace

Key Principles That Make It All Work

1. Distribution

┌──────────────────────────────────┐
│ Work spread across many servers  │
│ ✅ Scalability                   │
│ ✅ Parallel processing           │
└──────────────────────────────────┘

2. Immutability

┌──────────────────────────────────┐
│ Events never change once written │
│ ✅ Can be replayed               │
│ ✅ Multiple consumers can read   │
└──────────────────────────────────┘

3. Parallelism

┌──────────────────────────────────┐
│ Multiple partitions processed    │
│ simultaneously                   │
│ ✅ High throughput               │
│ ✅ Efficient resource use        │
└──────────────────────────────────┘

Fault Tolerance in Action

When Broker Fails

Before:
Broker 1 (P0-Leader) ✅
Broker 2 (P0-Follower) ✅
Broker 3 (P0-Follower) ✅

Broker 1 fails! 💥

After (typically within seconds):
Broker 1 (P0-Leader) 💀
Broker 2 (P0-Leader) ⭐ Promoted!
Broker 3 (P0-Follower) ✅

✅ System keeps running
✅ No loss of acknowledged data (with acks=all and an in-sync follower)
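
A minimal sketch of the promotion step, assuming the new leader is simply the first surviving replica that is still in sync. `elect_leader` and its arguments are hypothetical names for this illustration. The real controller logic handles many more cases.

```python
def elect_leader(replicas, in_sync, failed_broker):
    """Pick the first surviving in-sync replica as the new leader (sketch)."""
    candidates = [b for b in replicas if b in in_sync and b != failed_broker]
    if not candidates:
        return None  # no in-sync replica left: the partition is unavailable
    return candidates[0]


replicas = [1, 2, 3]  # broker ids holding Partition 0; broker 1 is the leader
in_sync = {1, 2, 3}   # replicas that are fully caught up
new_leader = elect_leader(replicas, in_sync, failed_broker=1)
```

Restricting the choice to in-sync replicas is what makes the "no data loss" claim plausible: a follower that has fallen behind would be missing recent events.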

When Consumer Fails

Before:
Partition 0 → Consumer A ✅
Partition 1 → Consumer B ✅
Partition 2 → Consumer C ✅

Consumer B fails! 💥

After (seconds):
Partition 0 → Consumer A ✅
Partition 1 → Consumer A ✅ Took over!
Partition 2 → Consumer C ✅

✅ Processing continues
✅ No events missed

Summary: The Mental Model Checklist

The 7 Components

✅ Events - The data (immutable facts)
✅ Cluster and Brokers - A network of servers and the individual servers in it
✅ Topics - Categories for events
✅ Partitions - Ordered lanes within topics
✅ Producers - Apps that write events
✅ Consumers - Apps that read events
✅ Consumer Groups - Teams that work together

The Flow

Producers → Topics → Partitions → Brokers
                                    ↓
                            Consumer Groups

The Guarantees

  • ✅ Order within a partition
  • ✅ Scalability through distribution
  • ✅ Durability through replication
  • ✅ Fault tolerance through automatic failover
  • ✅ Parallel processing through partitions and consumer groups

Your Mental Model

Think of Kafka as:

🏭 A highly organized factory where:
   • Multiple assembly lines (partitions) run in parallel
   • Workers (producers) add items to lines
   • Quality checkers (consumers) inspect items
   • Teams (consumer groups) divide the work
   • Multiple facilities (brokers) ensure continuity
   • Everything is tracked and never lost

You now have a complete bird's eye view of Apache Kafka! 🦅

This mental model will be invaluable as you build applications and dive deeper into Kafka's capabilities. Every detail you learn will fit into this bigger picture! 🎯

Top comments (1)

Mateo Rodriguez

This post really solidified my understanding of Kafka as a system of cooperating pieces rather than just "a message queue." The highway and factory analogies clarified how partitions, consumer groups, and brokers interact. It reinforced my view of Kafka as log-centric, but shifted my thinking toward treating consumer groups as independent, parallel "teams" over the same stream.