DEV Community

Cover image for Apache Kafka Interview Questions for Data Engineers: Topics, Partitions, Consumer Groups & Exactly-Once Semantics
Gowtham Potureddi
Gowtham Potureddi

Posted on

Apache Kafka Interview Questions for Data Engineers: Topics, Partitions, Consumer Groups & Exactly-Once Semantics

apache kafka interview questions show up in nearly every senior data engineering loop because Kafka is the messaging substrate underneath modern streaming pipelines, change data capture, and real-time analytics. Interviewers don't stop at "what is Kafka?" — they probe whether you understand kafka partitions as the unit of parallelism, kafka isr as the durability gate, kafka consumer group rebalance as the operational pain point, and kafka exactly once semantics as the producer-to-consumer atomic contract.

This guide walks through the eight Kafka primitives that show up most often in data engineering interview questions at FAANG and streaming-heavy shops (Uber, Netflix, LinkedIn, Stripe, DoorDash). Each section pairs a teaching block with a Solution-Tail interview answer — code, a step-by-step trace, an output table, then a concept-by-concept breakdown of why it works. By the end you'll be able to defend acks=all with min.insync.replicas=2, trace a cooperative-sticky rebalance across four consumers, and explain transactional consume-transform-produce with isolation.level=read_committed — the exact shape kafka data engineer interview rounds reward when kraft kafka and kafka transactions come up.

PipeCode blog header for an Apache Kafka interview prep guide for data engineers — bold white headline 'Apache Kafka · Interview Questions' with subtitle 'topics · partitions · consumer groups · exactly-once' and a minimal Kafka topic/partition diagram on a dark gradient with purple, green, and orange accents and a small pipecode.ai attribution.

When you want hands-on reps immediately after reading, browse streaming practice library →, drill real-time analytics drills →, and rehearse on Python streaming problems →.


On this page


1. Why Kafka shows up in every senior data engineering interview

Kafka is the durable log between every producer and every consumer in a modern pipeline

The one-sentence invariant: Kafka is a distributed, partitioned, replicated commit log — producers append records, consumers read them at their own pace, and the log persists for a configurable retention window so multiple consumers can replay it independently. Once you internalise that mental model — log first, queue and pub-sub as derived behaviours — the rest of the apache kafka interview questions family becomes a sequence of trade-off discussions: durability vs latency, ordering vs throughput, exactly-once vs raw speed.

Topic map of the eight Apache Kafka primitives every data engineering interview probes — eight numbered cards laid out in two rows: Architecture and primitives, Partitions and ordering, Replication and ISR, Producer semantics, Consumer groups, Exactly-once semantics, Kafka Connect, Kafka Streams; each card has a tiny icon strip and a one-line teaching label; on a light PipeCode card.

Kafka vs traditional message queue in five bullets.

  • Lifecycle. RabbitMQ / SQS delete a message after it is acknowledged. Kafka keeps the message for the topic's retention window (hours, days, or forever) — multiple consumer groups read the same log independently.
  • Pull vs push. Kafka is pull-based — consumers fetch at their own pace. Traditional MQs push, which forces backpressure to be solved at the broker.
  • Throughput. Kafka is built around sequential disk writes and zero-copy network transfers — it sustains millions of writes per second on commodity hardware.
  • Replay. Resetting an offset is a first-class operation; replaying a day of events is one CLI command. RabbitMQ would require a re-publish.
  • Fanout patterns. One topic, N consumer groups → each group sees every message. Same topic, one consumer group with M consumers → partitions are sharded across M readers (queue semantics on a log substrate).

The three primitives every Kafka interview opens with.

  • Broker — a Kafka server. A cluster has 3+ brokers for production; one is elected controller for cluster-wide metadata operations.
  • Topic — a logical name for a stream of records. A topic is a directory of one or more partitions on disk.
  • Partition — an ordered, immutable, append-only log. Partitions are the unit of parallelism (consumer count is bounded by partition count) and the unit of ordering (records inside one partition are strictly ordered).

The 2026 reality (Kafka 4.x).

  • KRaft mode is the only mode. As of Kafka 4.0 ZooKeeper is fully removed. Metadata is managed by a Raft consensus quorum inside the brokers — KAFKA_PROCESS_ROLES=broker,controller for combined nodes, KAFKA_CONTROLLER_QUORUM_VOTERS for the voter set.
  • Cooperative-sticky rebalancing is the default consumer protocol. No more stop-the-world consumer pauses on every group change.
  • Share Groups (Kafka 4.2) introduce true queue semantics — partitions can be read by multiple consumers concurrently, no per-partition ownership. Different problem, different tool — covered briefly in §5.

What interviewers listen for.

  • Do you reach for acks=all + min.insync.replicas=2 the moment durability comes up? — senior signal.
  • Do you explain ordering as per-partition only, not topic-wide? — required answer.
  • Do you mention idempotent producer and transactions when "exactly-once" is asked? — required answer.
  • Do you reach for cooperative-sticky when explaining rebalancing? — current-default signal.

Worked example — design the ingest layer for a clickstream pipeline

Detailed explanation. Realistic Kafka design starts from the throughput target ("100k events/sec at peak"), the ordering requirement ("per-session ordering matters"), and the downstream consumer count ("two real-time consumers, one warehouse sink").

Question. Design the Kafka topic for a clickstream pipeline that ingests 100,000 events/sec at peak, requires per-session ordering, replicates across 3 brokers, and survives one broker failure without data loss. Sketch the CLI command, the producer config, and the partition-count math.

Code (CLI + producer config).

# Create the topic — 20 partitions, RF=3, min.insync.replicas=2
kafka-topics.sh --bootstrap-server kafka-1:9092 \
  --create --topic clickstream \
  --partitions 20 \
  --replication-factor 3 \
  --config min.insync.replicas=2 \
  --config retention.ms=604800000   # 7 days
Enter fullscreen mode Exit fullscreen mode
# producer.py (Python, confluent-kafka)
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092,kafka-3:9092",
    "acks": "all",                       # wait for full ISR
    "enable.idempotence": True,          # dedup producer retries
    "compression.type": "zstd",
    "linger.ms": 10,                     # batch for 10ms
    "batch.size": 65536,
})

def publish(event):
    key = str(event["session_id"]).encode()   # session_id keeps order per session
    producer.produce("clickstream", key=key, value=json.dumps(event).encode())
producer.flush()
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. Partition count math. Single consumer sustains ≈ 5,000 events/sec → 100,000 ÷ 5,000 = 20 partitions. Over-provision lightly so future growth doesn't force a partition-count change (which would re-shuffle every key's home).
  2. Replication factor 3 + min.insync.replicas 2. A write commits when 2 of the 3 replicas have it. Survives one broker loss without data loss; rejects writes if only 1 ISR remains (caller observes failure instead of silent loss).
  3. Partition key = session_id. Per-session ordering is preserved because hash(session_id) % 20 routes all events for one session to the same partition.
  4. acks=all + enable.idempotence=true. Strongest durability with retry safety — duplicate appends on retry are deduplicated by the broker using the producer's PID + sequence number.
  5. linger.ms=10 + batch.size=64KB. Producer batches up to 10ms or 64KB per partition before sending — trades 10ms of producer latency for ~5–10× throughput.

Output.

Decision Value Why
Partitions 20 100k/sec ÷ 5k per consumer
Replication factor 3 survive one broker loss
min.insync.replicas 2 block writes when < 2 ISR
Partition key session_id per-session ordering
acks all full-ISR durability
Idempotence on dedup producer retries
Compression zstd ~3× wire & disk savings

Rule of thumb. Start at partitions ≈ peak_throughput / single_consumer_throughput, then over-provision 2–3× for growth. Don't go wild — every partition costs file handles, controller metadata, and rebalance time.

Kafka interview question on ingest sizing

A senior interviewer often shapes this round as: "Walk me through how you'd size the topic, the producer, and the durability config for a clickstream feed at 100k events/sec." It blends partition-count math, replication semantics, and producer tuning — the three muscles every kafka data engineer interview opens with.

Solution Using acks=all + min.insync.replicas=2 + 20 partitions

# Create the topic
kafka-topics.sh --bootstrap-server kafka-1:9092 \
  --create --topic clickstream \
  --partitions 20 --replication-factor 3 \
  --config min.insync.replicas=2 --config retention.ms=604800000
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Event Producer action Broker action ISR after
session=42, click=home hash(42)%20 = P7 → send to P7 leader (Broker 2), acks=all Broker 2 writes, replicates to Broker 3, Broker 1 — all 3 acks come back {1,2,3}
session=42, click=cart same partition P7 → leader Broker 2 leader writes, replicates {1,2,3}
Broker 3 network blip — falls behind by replica.lag.time.max.ms producer continues with acks=all Broker 3 removed from ISR {1,2}
session=42, click=checkout same partition P7 leader Broker 2 + follower Broker 1 ack → commit (min.insync.replicas=2 satisfied) {1,2}
Broker 1 also drops out producer attempts send leader Broker 2 alone → < min.insync.replicas → broker rejects write with NotEnoughReplicasException → producer surfaces error to caller {2}

Output:

Metric Steady state After Broker 3 dropped After Broker 1 also dropped
Writable yes yes no (write rejected)
Reads up to HW latest latest latest
Data loss risk none none none — writes rejected, no silent loss

Why this works — concept by concept:

  • Partition as unit of parallelism — 20 partitions = up to 20 consumer threads doing parallel reads of clickstream. Adding a 21st consumer wouldn't help.
  • Partition key for per-session orderinghash(session_id) % 20 keeps every event for one session on the same partition, where order is strictly preserved.
  • acks=all — the producer waits for every ISR member to acknowledge the write before considering it committed. Combined with min.insync.replicas=2, this guarantees that any acknowledged write survives at least one broker loss.
  • min.insync.replicas as a durability gate — if the ISR shrinks below the threshold, the broker prefers to reject writes (surfacing the error upstream) rather than silently commit something that could be lost.
  • Idempotent producerenable.idempotence=true tags each produce request with a PID + sequence number; the broker rejects duplicates from retries, so the producer can retry safely without creating duplicates downstream.
  • Cost — write latency = O(max ISR replica RTT); cluster throughput = O(brokers × disk bandwidth × partition spread); storage = O(events × retention_ms × replication_factor).

Streaming
Topic — streaming pipelines
Streaming pipeline problems (Kafka, partitions, ordering)

Practice →


2. Topics, partitions, and ordering — the unit of parallelism

kafka partitions are the only thing Kafka guarantees ordering on — the rest is a routing decision

The mental model in one line: a topic is a logical name; a partition is the physical, ordered, append-only log; the partition key decides which partition every record lands on. Once you say "ordering is per-partition, never topic-wide," half of the kafka partitions consumer groups interview question family collapses to obvious answers.

The partition decisions every interview probes.

  • Partition count. Upper bound on consumer parallelism, lower bound on file-handle and metadata overhead. Typical range: 6–100 per topic; rarely 1, rarely 1,000.
  • Partition key. Default partitioner is hash(key) % numPartitions. Same key → same partition. No key → sticky round-robin (favours batching).
  • Hot-key risk. One key generates 80% of the traffic → that partition is a bottleneck. Mitigations: composite key (session_id + bucket), custom partitioner, salting.
  • Re-partitioning is expensive. Changing partition count rehomes every key; for ordered topics this is a major operation. Plan partition count up-front.

Three ordering questions interviewers always ask.

  • "How do I guarantee global ordering across a topic?" — One partition. Trade-off: throughput caps at single-partition speed.
  • "How do I preserve per-key ordering?" — Use the key as the partition key. All records with that key route to one partition.
  • "How do I handle hot keys?" — Composite key, salting, or custom partitioner that distributes the hot tenant across N partitions and reassembles downstream.

Worked example — choose partition key for an orders topic

Detailed explanation. Order events for the same customer must be processed in sequence; orders for different customers can flow in parallel. The natural key is customer_id. With 12 partitions, every event for one customer routes to exactly one partition, so the consumer for that partition sees them in produce order.

Question. Three orders arrive for customer_id=42. With 12 partitions and default partitioner, on which partition do they land, and what does the consumer for that partition observe?

Code (Java-style pseudocode).

def partition_for(key, num_partitions):
    return hash(key) & 0x7fffffff % num_partitions

partition_for("42", 12)   # → P3 (illustrative)
Enter fullscreen mode Exit fullscreen mode
producer.produce("orders", key=b"42", value=b'{"order_id":1001,"status":"placed"}')
producer.produce("orders", key=b"42", value=b'{"order_id":1001,"status":"paid"}')
producer.produce("orders", key=b"42", value=b'{"order_id":1001,"status":"shipped"}')
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. The producer hashes the key "42" once per record. Result is deterministic — same key → same partition.
  2. All three records route to partition P3.
  3. The consumer assigned to P3 reads them in offset order, which is the produce order: placed → paid → shipped.
  4. Consumers for other partitions never see these records — they don't need to, because per-customer ordering is local to one partition.

Output.

order_id status partition offset
1001 placed P3 4711
1001 paid P3 4712
1001 shipped P3 4713

Rule of thumb. When in doubt about ordering, ask the interviewer: "Is per-key ordering enough, or do we need global ordering?" The answer is almost always per-key.

Kafka interview question on partition key choice

The probe usually sounds like: "How would you choose the partition key for a payment event stream where the same merchant_id may issue thousands of events per second?" — testing whether the candidate recognises a hot-key situation and reaches for a composite key or salting.

Solution Using composite key merchant_id + bucket

import hashlib, json

NUM_BUCKETS = 16   # increases parallelism for hot merchants

def composite_key(merchant_id, payment_id):
    bucket = int(hashlib.md5(payment_id.encode()).hexdigest(), 16) % NUM_BUCKETS
    return f"{merchant_id}:{bucket}".encode()

for event in payment_stream:
    producer.produce(
        "payments",
        key=composite_key(event["merchant_id"], event["payment_id"]),
        value=json.dumps(event).encode(),
    )
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Event merchant_id payment_id bucket composite key partition
1 7 pay_001 4 "7:4" P5
2 7 pay_002 11 "7:11" P9
3 7 pay_003 4 "7:4" P5
4 7 pay_004 0 "7:0" P2

Output:

Partition Count of merchant=7 events
P2 1
P5 2
P9 1

Compared to a naive key=merchant_id strategy, where all four events would have landed on a single partition, the load is now spread across three partitions while still preserving ordering per (merchant_id, bucket) slice.

Why this works — concept by concept:

  • Hot-key dilution — adding a bucket suffix to the key splits one merchant's traffic across NUM_BUCKETS partitions instead of one. The hottest partition's load drops by ≈ 1 / NUM_BUCKETS.
  • Local ordering preserved — events with the same payment_id always hash to the same bucket, so per-payment ordering is still strict.
  • Downstream re-assembly — when a consumer needs all events for merchant_id=7, it reads from all partitions and groups by merchant_id. Slightly more code, vastly more headroom.
  • Cost — partition spread = O(NUM_BUCKETS); ordering scope shrinks from "per merchant" to "per (merchant_id, bucket)" — usually an acceptable trade-off.

Streaming
Topic — real-time analytics
Real-time analytics problems (hot keys, partitioning)

Practice →


3. Replication, ISR, and the durability gate

kafka isr is the contract — only in-sync replicas can become leaders and only ISR-acked writes are committed

For each partition Kafka stores R copies — one leader, R−1 followers. A follower is in-sync if it has caught up with the leader within replica.lag.time.max.ms (default 30 seconds). The In-Sync Replicas (ISR) is the set of replicas eligible to become leader if the current leader fails, and the set whose acknowledgements acks=all waits for.

Diagram of a Kafka topic with 3 partitions × replication factor 3, showing leader/follower assignments and the In-Sync Replicas (ISR) set for each partition — a 3-row table where each row is a partition and each cell shows the broker holding the replica with badges for leader (purple) and follower (green) and an ISR membership column to the right; below the table a small min.insync.replicas + acks=all annotation explains the durability gate; on a light PipeCode card.

Replication concepts in five bullets.

  • Replication factor (RF). Number of copies per partition. Typical production value: 3.
  • Leader / follower. Producers and consumers only talk to the leader. Followers replicate the log asynchronously, ack to the leader once they've persisted each record.
  • High Watermark (HW). The highest offset replicated to all ISR members. Consumers can only read up to HW — guaranteeing they only see committed data.
  • Log End Offset (LEO). The highest offset written to a specific replica (may be > HW for the leader before followers catch up).
  • min.insync.replicas. Lower bound on the ISR size for acks=all writes to succeed. With RF=3 and min.insync.replicas=2, a single broker outage is tolerated; two outages reject new writes (visible failure beats silent loss).

When does a replica leave the ISR?

  • Falls behind. Lags the leader by more than replica.lag.time.max.ms of replication.
  • Network partition or broker crash. The follower stops fetching for a sustained period.
  • Manual reassignment. Administrative operation moves the replica off the broker.

Unclean leader election (the dangerous knob).

  • unclean.leader.election.enable=false (default since Kafka 0.11) — Kafka refuses to elect a non-ISR replica even if the leader is dead and no ISR is left. Prefers downtime over silent loss.
  • Set it to true only when availability strictly trumps durability (non-critical telemetry, idempotent downstream sinks). For ledgers, payments, anything compliance-bound — leave it off.

Worked example — what happens when an ISR shrinks below min.insync.replicas

Detailed explanation. A common interview probe is: "Broker B3 died — what happens to writes?" The answer depends on the original ISR, acks, and min.insync.replicas.

Question. A topic has RF=3 with brokers {B1, B2, B3}, min.insync.replicas=2, and acks=all. Initially ISR = {B1, B2, B3} and B1 is the leader. B3 crashes, then 40 seconds later B2 also experiences a network blip. What does each producer write observe?

Input (timeline).

t (s) Event ISR Leader
0 All healthy {1,2,3} B1
5 B3 crashes {1,2,3} → {1,2} after replica.lag.time.max.ms B1
35 ISR collapses to {1,2} {1,2} B1
70 B2 network blip {1,2} → {1} B1

Code (producer behaviour).

producer = Producer({
    "bootstrap.servers": "B1:9092,B2:9092,B3:9092",
    "acks": "all",
    "enable.idempotence": True,
    "delivery.timeout.ms": 120000,
})

# At t=10 — write succeeds
producer.produce("orders", key=b"42", value=b'{"o":1}')   # ✓
# At t=40 (ISR={1,2}, still ≥ min.insync.replicas) — write succeeds
producer.produce("orders", key=b"42", value=b'{"o":2}')   # ✓
# At t=75 (ISR={1}, below min.insync.replicas=2) — write FAILS
producer.produce("orders", key=b"42", value=b'{"o":3}')   # ✗ NotEnoughReplicasException
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. t=10 — ISR is full. acks=all waits for B1 + B2 + B3 ack; all three deliver; offset is committed; HW advances.
  2. t=40 — B3 has dropped out of ISR, ISR is now {B1, B2}. acks=all waits for B1 + B2 ack — both deliver. ISR size (2) ≥ min.insync.replicas (2), so the write commits.
  3. t=75 — B2 has also dropped. ISR is now {B1}. ISR size (1) < min.insync.replicas (2). Broker rejects the produce request with NotEnoughReplicasException. Producer surfaces the error to the caller; the application must handle it (retry with backoff, alert, dead-letter, etc.).
  4. Why is this the correct behaviour? Silent acceptance with only one replica would mean a subsequent B1 crash erases the write. The system prefers a visible failure to a hidden loss.

Output.

Write t (s) Result Reason
{"o":1} 10 committed ISR=3, full ack
{"o":2} 40 committed ISR=2 = min.insync.replicas
{"o":3} 75 rejected ISR=1 < min.insync.replicas

Rule of thumb. With RF=3, set min.insync.replicas=2 and acks=all. You trade some availability (writes can be rejected when 2 brokers are simultaneously unhealthy) for the guarantee that any acknowledged write survives a single broker loss.

Kafka interview question on durability tuning

A senior interviewer might frame this as: "Your team wants acks=all + min.insync.replicas=2 to satisfy the ledger SLO, but pages every time a single broker is replaced. How would you reason about availability vs durability here?"

Solution Using durability-first config with operational guardrails

kafka-topics.sh --bootstrap-server kafka-1:9092 \
  --alter --topic ledger \
  --config min.insync.replicas=2 \
  --config unclean.leader.election.enable=false
Enter fullscreen mode Exit fullscreen mode
producer = Producer({
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092,kafka-3:9092",
    "acks": "all",
    "enable.idempotence": True,
    "retries": 2147483647,
    "max.in.flight.requests.per.connection": 5,
    "delivery.timeout.ms": 120000,
})
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Step Action Effect
1 Schedule rolling broker replacements one at a time ISR briefly drops to {2}, then re-grows to {3} as replica catches up
2 min.insync.replicas=2 + acks=all accepts writes throughout the single-broker swap
3 enable.idempotence=true + high retries producer transparently retries transient errors without creating duplicates
4 unclean.leader.election=false refuses to promote a non-ISR replica during the swap — no silent loss

Output:

Operation Producer-side observation Cluster-side state
Steady state writes succeed ISR = {1, 2, 3}
Broker B3 replaced writes succeed ISR = {1, 2} for ~30s, then {1, 2, 3}
Two brokers down (rare) writes throw NotEnoughReplicasException ISR = {1} — broker prefers reject over silent loss
Single broker permanent loss writes succeed ISR = {1, 2}; replacement rebuilds the third replica

Why this works — concept by concept:

  • ISR as the durability set — only ISR members can become leader, so any acks=all-committed write survives the loss of any single non-leader ISR member.
  • min.insync.replicas as the lower bound — by refusing writes when ISR drops to 1, the broker eliminates the window in which a single subsequent failure could erase a freshly-committed record.
  • Idempotent producer — combined with large retries, transient broker errors (leader election, ISR shuffle) heal automatically without duplicate appends.
  • unclean.leader.election=false — refuses to elect a divergent replica as leader, which would otherwise serve as a backdoor for data loss.
  • Cost — write latency = O(slowest ISR replica RTT); rejected-write rate during incidents ≈ small but visible — a deliberate trade for "no silent loss."

Streaming
Topic — streaming pipelines
Replication and durability problems

Practice →


4. Producer semantics — acks, idempotence, batching, compression

Producer config is a four-axis trade-off — durability, latency, throughput, and dedup

The producer config is where senior kafka data engineer interview rounds spend the most time, because every knob has a sharp trade-off. Once you can articulate the relationship between acks, enable.idempotence, linger.ms/batch.size, and compression.type, you can defend any production producer profile.

The four axes.

  • acks — durability.
    • acks=0 — fire-and-forget. Lowest latency, weakest durability. Use only for telemetry where loss is acceptable.
    • acks=1 — leader ack. Loses any write where the leader crashed before followers replicated.
    • acks=all — full ISR ack. The default for production data.
  • enable.idempotence — duplicates.
    • true (default in modern clients) — the broker tags each record with PID + sequence number; retries don't create duplicates.
    • false — retries CAN double-append. Almost never the right answer.
  • Batching — throughput vs latency.
    • batch.size (bytes per partition batch) and linger.ms (max wait time before send).
    • Bigger batches → higher throughput, more producer-side memory, higher latency.
  • Compression — bandwidth and disk.
    • compression.type=snappy — balanced. Default-good.
    • zstd — best ratio, more CPU.
    • lz4 — fastest decompress.
    • gzip — densest historically, more CPU.

The five most-asked producer interview questions.

  • "What's the difference between acks=1 and acks=all?" — leader-only vs full-ISR durability.
  • "How does idempotence prevent duplicates?" — PID + sequence number deduplication at the broker.
  • "What's the trade-off of increasing batch.size?" — throughput up, latency up, memory up.
  • "When is compression.type=zstd worth it?" — when bandwidth or disk dominates and CPU has headroom.
  • "What's a QueueFullException?" — the producer accumulator filled because the broker can't keep up; back off or scale brokers.

Worked example — producer ships duplicate records, debug it

Detailed explanation. A team reports duplicate orders in the downstream warehouse. The producer is sending to Kafka with retries on and idempotence off — a classic interview trap.

Question. Given a producer with acks=all, retries=10, enable.idempotence=false, and max.in.flight.requests.per.connection=5, explain how a transient network error can produce duplicate records and how to fix it.

Code (the broken producer).

producer = Producer({
    "bootstrap.servers": "kafka-1:9092",
    "acks": "all",
    "retries": 10,
    "enable.idempotence": False,          # ⚠️ trap
    "max.in.flight.requests.per.connection": 5,
})
producer.produce("orders", key=b"42", value=b'{"order_id":1001}')
producer.flush()
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. The producer sends record R1 to the leader.
  2. The leader writes R1, replicates to ISR, and sends an ack back.
  3. The ack is lost in the network (TCP RST, broker restart at the wrong moment).
  4. The producer doesn't see the ack, so it retries with a fresh record. The broker sees this as a new record — there's no PID/sequence number to dedupe on.
  5. R1 is now committed twice. Two records, same payload, different offsets.

Output (before fix).

Offset Record Notes
4711 {"order_id":1001} original send
4712 {"order_id":1001} duplicate from retry

The fix.

producer = Producer({
    "bootstrap.servers": "kafka-1:9092",
    "acks": "all",
    "enable.idempotence": True,           # ✓ dedup retries
    "max.in.flight.requests.per.connection": 5,
    "retries": 2147483647,
})
Enter fullscreen mode Exit fullscreen mode

Output (after fix).

Offset Record Notes
4711 {"order_id":1001} only one append — broker detected duplicate PID+seq

Rule of thumb. enable.idempotence=true should be the default for every production producer. The cost is negligible (a producer ID handshake at startup); the safety is large.

Kafka interview question on tuning producer throughput

A scaling-focused round often asks: "Throughput is at 60% of broker capacity but producers complain about latency. How would you tune?"

Solution Using linger.ms + batch.size + compression.type=zstd

producer = Producer({
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092,kafka-3:9092",
    "acks": "all",
    "enable.idempotence": True,
    "compression.type": "zstd",
    "linger.ms": 20,                      # batch up to 20ms
    "batch.size": 131072,                 # 128KB per-partition batch
    "buffer.memory": 67108864,            # 64MB accumulator
})
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Step Knob Effect
1 linger.ms=20 producer waits up to 20ms after the first record before sending the batch
2 batch.size=128KB once the batch reaches 128KB, send immediately even if linger.ms hasn't elapsed
3 compression.type=zstd the entire batch is compressed once; broker stores it compressed; consumers decompress on read
4 acks=all latency = max RTT to the slowest ISR replica; for the batch as a whole, not per-record
5 enable.idempotence=true PID + sequence number per partition prevent duplicates on retry

Output:

Metric Before After
Producer throughput 60% capacity 95%+ capacity
Per-record latency p50 8 ms 24 ms
Per-record latency p99 80 ms 120 ms
Wire bytes 1.0× ~0.35× (zstd)

Why this works — concept by concept:

  • Batchinglinger.ms + batch.size group records destined for the same partition into one network call. Fewer round-trips per record → broker handles more work per CPU cycle.
  • Compression — zstd compresses the whole batch once; producer pays a small CPU cost, network bytes drop 3×+, broker disk usage drops correspondingly.
  • Idempotent producer — preserves at-most-once-per-record semantics under retry; required to safely run with retries cranked up.
  • acks=all — committed bytes must reach the full ISR; not loosened to gain the throughput.
  • Cost — latency rises ~3× at p50; throughput rises ~60%. Acceptable when bytes/sec is the bottleneck, not p99 latency.

Streaming
Topic — streaming pipelines
Producer tuning and dedup problems

Practice →


5. Consumer groups, offsets, and cooperative-sticky rebalancing

kafka consumer group rebalance is the operational story senior interviewers probe most

A consumer group is a set of consumers that share the partitions of a topic. Kafka assigns each partition to exactly one consumer in the group, so the maximum parallelism equals the partition count. Two consumer groups reading the same topic each get the full message stream independently — that's how Kafka delivers both queue and pub-sub semantics from one substrate.

Before/after trace of a Kafka cooperative-sticky rebalance — a 4-consumer × 8-partition assignment table on the left labelled 'Before' showing P0–P1 → C1, P2–P3 → C2, P4–P5 → C3, P6–P7 → C4; on the right an 'After' table where C3 has died and its partitions P4–P5 have been redistributed to C1 and C2 while C4's assignment is untouched, demonstrating the sticky property; below the tables a small heartbeat-timeout annotation explains the trigger; on a light PipeCode card.

The four concepts every consumer-group interview probes.

  • Offset. Position in a partition. Stored in the internal __consumer_offsets topic, partitioned by group id.
  • Heartbeat + session.timeout.ms. Consumer sends heartbeats to the group coordinator; missing them for session.timeout.ms (default 45s in Kafka 4.x) marks the consumer dead.
  • max.poll.interval.ms. Covers processing time between polls. If poll() isn't called within this window (default 5 min), the consumer is considered stuck and is kicked out.
  • Rebalance. Triggered by consumer join/leave or topic metadata change. The group coordinator redistributes partitions.

Assignment strategies.

  • RangeAssignor — contiguous ranges per consumer.
  • RoundRobinAssignor — even distribution across all consumers.
  • StickyAssignor — even + minimal movement.
  • CooperativeStickyAssignor (default, Kafka 4.x) — sticky + only the affected partitions revoke during rebalance (no stop-the-world).

Manual offset commit (the durability story).

  • enable.auto.commit=true — periodic commits in the background. Easy, but at-least-once is harder to reason about.
  • enable.auto.commit=false + consumer.commit() after processing succeeds — the common production choice. Commit only after the side effect (warehouse insert, downstream produce, etc.) completes.

Worked example — trace a cooperative-sticky rebalance

Detailed explanation. Four consumers (C1..C4) share an 8-partition topic. C3 dies. The default cooperative-sticky assignor moves only C3's partitions, leaving the others untouched — every other consumer keeps processing.

Question. Group has 4 consumers and the topic has 8 partitions. Walk the assignment before and after C3 dies under cooperative-sticky.

Input (before).

Consumer Partitions
C1 P0, P1
C2 P2, P3
C3 P4, P5
C4 P6, P7

Code (consumer config).

consumer = Consumer({
    "bootstrap.servers": "kafka-1:9092",
    "group.id": "order-pipeline",
    "enable.auto.commit": False,
    "isolation.level": "read_committed",
    "partition.assignment.strategy": "cooperative-sticky",
    "session.timeout.ms": 45000,
    "max.poll.interval.ms": 300000,
})
consumer.subscribe(["orders"])
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. t=0 — Steady state. Each consumer polls every 1–2 seconds; heartbeats arrive every 3 seconds.
  2. t=10C3 crashes (process killed). Heartbeats stop.
  3. t=10 + session.timeout.ms — Group coordinator marks C3 as dead and triggers a rebalance.
  4. Cooperative-sticky decisionC1, C2, C4 do not revoke their partitions; only C3's partitions need a new home. The coordinator redistributes {P4, P5} to the surviving consumers (one each).
  5. t=10 + ~5sC1 is now assigned {P0, P1, P4} and C2 is now assigned {P2, P3, P5}. C4 stays at {P6, P7} — completely unaffected.
  6. t=10 + ~5s — All three surviving consumers continue polling; total downtime ≈ a few seconds for the affected partitions only.

Output.

Consumer Partitions before Partitions after
C1 P0, P1 P0, P1, P4
C2 P2, P3 P2, P3, P5
C3 P4, P5 (dead)
C4 P6, P7 P6, P7 (untouched)

Rule of thumb. Under cooperative-sticky, a single-consumer failure only pauses the 2 partitions the dead consumer owned, for a few seconds. Under the older Eager protocol, all 8 partitions would have paused for the rebalance — multi-second stop-the-world on the whole group.

Kafka interview question on at-least-once consumer

A common probe: "Walk me through an at-least-once consumer that writes to BigQuery. Where does the commit go?"

Solution Using manual commitSync() after successful write

consumer = Consumer({
    "bootstrap.servers": "kafka-1:9092",
    "group.id": "orders-to-bq",
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
    "isolation.level": "read_committed",
})
consumer.subscribe(["orders"])

BATCH = []
BATCH_SIZE = 500

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    BATCH.append(json.loads(msg.value()))
    if len(BATCH) >= BATCH_SIZE:
        write_to_bigquery(BATCH)         # ① side effect first
        consumer.commit(asynchronous=False)   # ② then commit offsets
        BATCH.clear()
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Event Action Committed offset Notes
poll batch 1 (500 records) write to BigQuery BQ write succeeds
commit commitSync offset advanced by 500 safe checkpoint
poll batch 2, BQ write fails (no commit) unchanged retry on next poll cycle
poll batch 2 again (BQ healthy) write to BigQuery BQ write succeeds
commit commitSync offset advanced by 500 next 500 records check-pointed
consumer crash mid-batch (no commit) unchanged restart re-reads from last commit

Output:

Metric Value
Delivery semantics at-least-once (duplicates possible if BQ retry resends)
Required BQ guard idempotent UPSERT on order_id
Pause-on-failure behaviour yes — consumer retries the same batch until BQ accepts

Why this works — concept by concept:

  • Process-then-commit ordering — committing after the side effect succeeds means a crash between the two operations re-delivers the batch on restart. No data loss.
  • Idempotent sink — BigQuery MERGE keyed on order_id deduplicates any re-delivered batch. The pipeline is end-to-end at-least-once with the warehouse handling the dedup.
  • cooperative-sticky — a single consumer crash only re-distributes its partitions; the rest of the group keeps moving.
  • isolation.level=read_committed — if upstream producers use transactions, the consumer skips uncommitted records (no half-written reads).
  • Cost — checkpoint frequency = 500 records / batch → bounded re-processing on crash; commit latency on commitSync is one broker RTT.

Streaming
Topic — streaming (Python)
Consumer group rebalance + offset problems

Practice →


6. Exactly-once semantics and Kafka transactions

kafka exactly once semantics is idempotent producer + transactions + read_committed — all three

Exactly-once isn't magic — it's the composition of three orthogonal features. The senior apache kafka interview questions litmus test is "can you name all three and explain what each one solves?"

Diagram of a Kafka exactly-once consume-transform-produce loop — an input topic 'raw-events' on the left, a transactional worker box in the middle (showing 'producer.beginTransaction()', 'producer.produce(enriched-events)', 'producer.sendOffsetsToTransaction()', 'producer.commitTransaction()'), an output topic 'enriched-events' on the right, and the internal '__consumer_offsets' + '__transaction_state' topics drawn as small ledger icons underneath; arrows show that all three writes commit atomically inside the same transaction; on a light PipeCode card.

The three components of EOS.

  • Idempotent producer (enable.idempotence=true) — eliminates producer-side duplicates from retries.
  • Transactions (transactional.id=…) — atomically commit writes to multiple partitions plus consumer offsets in one all-or-nothing operation.
  • isolation.level=read_committed on consumers — only see records from committed transactions; skip the in-flight ones.

The transactions API in five calls.

  • producer.init_transactions() — register the transactional.id with the transactional coordinator (a broker).
  • producer.begin_transaction() — open a new transaction.
  • producer.produce(...) — buffer writes to the participating partitions.
  • producer.send_offsets_to_transaction(...) — pass the consumer offsets into the transaction.
  • producer.commit_transaction() (or abort_transaction()) — atomically commit (or roll back) every write plus the offset advance.

Zombie fencing.

  • Each transactional.id has a producer epoch. When a new instance starts with the same transactional.id, the coordinator increments the epoch and the previous instance's writes are rejected. Even if a stuck old worker comes back, it cannot produce duplicates.

Where EOS doesn't extend.

  • External sinks (BigQuery, S3, Snowflake) are outside the Kafka boundary. EOS guarantees end at the output Kafka topic.
  • For end-to-end exactly-once into an external warehouse, you still need an idempotent UPSERT keyed on a business id.

Worked example — transactional consume-transform-produce

Detailed explanation. An ETL worker reads from raw-events, enriches each record, and writes to enriched-events. Without EOS, a crash between the produce and the offset commit would re-deliver the same record after restart and the enriched output topic would contain duplicates. With EOS, the produce and the offset advance commit atomically.

Question. Three input records (k=42, v={"a":1}), (k=42, v={"a":2}), (k=42, v={"a":3}) arrive on raw-events. Walk through a transactional consume-transform-produce loop showing what lands on enriched-events, what is recorded in __consumer_offsets, and what is recorded in __transaction_state.

Code.

from confluent_kafka import Consumer, Producer
import json

consumer = Consumer({
    "bootstrap.servers": "kafka-1:9092",
    "group.id": "etl-transformer",
    "enable.auto.commit": False,
    "isolation.level": "read_committed",
})

producer = Producer({
    "bootstrap.servers": "kafka-1:9092",
    "transactional.id": "etl-transformer-001",
    "acks": "all",
    "enable.idempotence": True,
})

producer.init_transactions()
consumer.subscribe(["raw-events"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue

    producer.begin_transaction()
    try:
        raw = json.loads(msg.value())
        enriched = {**raw, "enriched": True, "ts": "2026-05-27T00:00:00Z"}

        producer.produce(
            "enriched-events",
            key=msg.key(),
            value=json.dumps(enriched).encode(),
        )

        producer.send_offsets_to_transaction(
            consumer.position(consumer.assignment()),
            consumer.consumer_group_metadata(),
        )
        producer.commit_transaction()
    except Exception:
        producer.abort_transaction()
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Msg Step enriched-events (P0) __consumer_offsets (etl-transformer) __transaction_state
1 begin txn txn=etl-001-7, status=PREPARED
1 produce + sendOffsets {"a":1,"enriched":true} (uncommitted marker) offset=101 (uncommitted) (still PREPARED)
1 commit txn mark COMMITTED mark COMMITTED txn=etl-001-7, status=COMMITTED
2 full loop {"a":2,"enriched":true} COMMITTED offset=102 COMMITTED txn=etl-001-8, status=COMMITTED
3 full loop {"a":3,"enriched":true} COMMITTED offset=103 COMMITTED txn=etl-001-9, status=COMMITTED

Output:

Topic / Internal Final state
raw-events (input) unchanged — 3 records
enriched-events (output) 3 COMMITTED records, each with "enriched":true
__consumer_offsets etl-transformer.raw-events.P0 → offset 103
__transaction_state three COMMITTED markers (epoch 7, 8, 9)

Why this works — concept by concept:

  • Idempotent producer — PID + sequence number means a retry of the same produce is deduplicated at the broker. No duplicate appends from network blips.
  • Atomic transaction — the produce to enriched-events and the offset commit to __consumer_offsets succeed together or roll back together. There is no window where the output exists but the offset is unmoved (which would cause re-delivery on restart and duplicates downstream).
  • read_committed consumers — anyone downstream of enriched-events only sees committed records, not the in-flight ones. The transaction marker is what flips them from "visible" to "skip."
  • Producer epoch / zombie fencing — if the worker crashes and a replacement starts with the same transactional.id, the coordinator bumps the epoch and the old worker's pending writes are rejected. No "I came back to life" duplicates.
  • Cost — one transactional coordinator RTT per transaction; throughput remains high if each transaction batches many records, which the loop above does implicitly via poll batching.

Streaming
Topic — streaming (Python)
Exactly-once transactional pipelines (Python)

Practice →


7. Kafka Connect, Schema Registry, and Kafka Streams — the ecosystem

The integration surface — Connect for source/sink, Schema Registry for evolution, Streams for in-Kafka processing

Most Kafka interview rounds end with one ecosystem question — "When would you reach for Kafka Connect vs writing a custom consumer?" or "Streams or Flink?" The right answer is a one-line decision tree.

Kafka Connect (read this section as a deferral pointer too).

  • A standalone framework + cluster for moving data between Kafka and external systems.
  • Source connectors — JDBC (PostgreSQL, MySQL), Debezium (log-based CDC), MongoDB, S3 (file).
  • Sink connectors — BigQuery, Snowflake, S3, Elasticsearch.
  • Single Message Transforms (SMTs) for inline routing, casting, masking.
  • Detailed Debezium CDC patterns are covered in the dedicated CDC for Data Engineering Interviews guide — here we use Connect as the substrate only.

Schema Registry.

  • Stores Avro / JSON Schema / Protobuf schemas keyed by subject (typically topic-value or topic-key).
  • Producer attaches a 4-byte schema id to each message; consumer fetches the schema by id.
  • Compatibility modesBACKWARD (consumers handle new + old; default), FORWARD, FULL, NONE. Pick BACKWARD for log-style topics where consumers upgrade after producers.

Kafka Streams.

  • JVM library (Java / Scala / Kotlin) for stateful stream processing on top of Kafka.
  • Stateful operators — count, reduce, aggregate, windowed joins.
  • KStream (stream of changes) vs KTable (latest value per key) duality.
  • Runs inside your application — no extra cluster.
  • For richer event-time semantics, watermarks, and out-of-order handling at scale, reach for Apache Flink (covered in the dedicated Flink interview guide).

ksqlDB.

  • A standalone server that exposes Kafka Streams primitives as SQL. Same engine, easier surface for SQL-only teams.

Worked example — choose Connect vs custom consumer for a sink

Detailed explanation. A team needs to land every event from enriched-events into Snowflake. Options: write a custom consumer that batches and uses Snowflake's COPY, or run the Confluent Snowflake Sink Connector.

Question. Compare a custom Python consumer to Kafka Connect for a sink to Snowflake on five axes: code, offset management, retries, schema evolution, observability.

Code (the Connect config — no custom code).

{
  "name": "enriched-to-snowflake",
  "config": {
    "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
    "topics": "enriched-events",
    "snowflake.url.name": "myorg-myaccount.snowflakecomputing.com",
    "snowflake.user.name": "kafka_loader",
    "snowflake.private.key": "${file:/secrets/sf.key:key}",
    "snowflake.database.name": "PROD",
    "snowflake.schema.name": "RAW",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "tasks.max": "4",
    "buffer.flush.time": "60",
    "buffer.count.records": "10000",
    "buffer.size.bytes": "100000000"
  }
}
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. The connector is a long-running task inside the Connect worker cluster.
  2. It consumes from enriched-events via Kafka's normal consumer-group mechanism (group id is the connector name).
  3. Records are buffered up to buffer.count.records=10000 or buffer.flush.time=60s, whichever comes first.
  4. Buffered records are written to a Snowflake-managed internal stage, then COPYd into the target table.
  5. Connect manages offsets — commits only after each COPY completes successfully.
  6. Schema evolution: when a producer publishes a new schema id, the connector fetches it from the Schema Registry and propagates the column changes (additive) to Snowflake.

Output (comparison).

Axis Custom consumer Kafka Connect
Code several hundred lines of Python zero, just JSON config
Offset management manual commitSync built-in
Retries / dead-letter manual built-in DLQ topic
Schema evolution manual automatic via Schema Registry
Observability custom metrics JMX + REST metrics out of the box

Rule of thumb. Reach for Kafka Connect first for source-to-Kafka and Kafka-to-sink. Write custom consumer code only when you need something Connect's connectors don't cover — bespoke business logic, exotic protocols, or unusual transaction boundaries.

Kafka interview question on choosing Streams vs Flink

A senior probe: "We need windowed rolling counts of clickstream events per user, joined against a user-profile changelog. Streams or Flink?"

Solution Using Kafka Streams DSL when state fits in-app

// orders-per-user-windowed.java (Kafka Streams DSL — illustrative)
StreamsBuilder builder = new StreamsBuilder();

KStream<String, ClickEvent> clicks = builder.stream("clicks",
    Consumed.with(Serdes.String(), clickSerde));

KTable<String, UserProfile> profiles = builder.table("user-profiles",
    Consumed.with(Serdes.String(), profileSerde));

KStream<String, EnrichedClick> enriched =
    clicks.join(profiles, EnrichedClick::new,
        Joined.with(Serdes.String(), clickSerde, profileSerde));

enriched
    .groupByKey()
    .windowedBy(TimeWindows.of(Duration.ofMinutes(5)).advanceBy(Duration.ofMinutes(1)))
    .count(Materialized.as("clicks-per-user-5m"))
    .toStream()
    .to("clicks-per-user-5m-output");
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

Stage Operator What it does
1 stream("clicks") reads click events from Kafka
2 table("user-profiles") reads profile changelog as a KTable (latest value per user_id)
3 clicks.join(profiles) stream-table join — enriches each click with the current profile
4 groupByKey().windowedBy(5m advance 1m) hopping window — 5-minute window every 1 minute
5 count() maintains a state store of rolling counts per (user_id, window)
6 to("output") publishes the windowed counts as a Kafka topic

Output:

user_id window_start window_end count
u_42 12:00 12:05 17
u_42 12:01 12:06 19
u_99 12:00 12:05 4

Why this works — concept by concept:

  • KStream + KTable duality — the changelog table is naturally modelled as a KTable; joining it with a stream gives "click events enriched with the user's profile at the time of join."
  • Hopping windowwindowedBy(5m, advanceBy=1m) gives a smoother rolling-count series than tumbling windows; useful for dashboards.
  • State store + changelog topic — Streams persists the rolling count locally (RocksDB) and to a compacted changelog topic in Kafka; on restart the local store is rebuilt from the changelog.
  • No extra cluster — Streams runs as threads inside your application; one fewer system to operate.
  • When to switch to Flink — if you need rich event-time semantics, very large state (TB-scale), or watermark-driven late-event handling, Flink is the right next step (covered in the Flink interview guide).
  • Cost — local state size = O(active_window_keys × window_count); compute = O(events) per join.

Streaming
Topic — real-time analytics
Windowed analytics on streams (Kafka Streams / Flink)

Practice →


Choosing the right Kafka primitive (cheat sheet)

  • Need ordering across many readers? Use a partition key that matches the ordering scope (per-user, per-session, per-tenant). Global ordering = one partition (and cap on throughput).
  • Need durability against one broker loss? replication.factor=3 + min.insync.replicas=2 + acks=all + unclean.leader.election=false.
  • Need to safely retry producer sends? enable.idempotence=true — turn it on and forget it.
  • Need to maximise producer throughput? Increase linger.ms and batch.size; enable compression.type=zstd.
  • Need to recover an at-least-once consumer cleanly? enable.auto.commit=false; commitSync() after the downstream side effect succeeds.
  • Need exactly-once across topics? Idempotent producer + transactions + isolation.level=read_committed. Don't skip any of the three.
  • Need to land Kafka data in a warehouse? Use Kafka Connect with a sink connector before writing a custom consumer.
  • Need windowed aggregations on streams? Kafka Streams if state fits in-app; Flink if state is huge or you need rich event-time semantics.

Frequently asked questions

How is Apache Kafka different from a traditional message queue like RabbitMQ?

Kafka is a distributed commit log; RabbitMQ is a queue with delete-on-consume semantics. Kafka persists messages for the entire retention window, so multiple consumer groups can read the same data independently and replay history is a first-class operation. Kafka is pull-based and uses sequential disk writes + zero-copy network transfers — it sustains millions of writes per second on commodity hardware. RabbitMQ shines for low-latency task queues and RPC patterns; Kafka shines for event streaming, CDC, and real-time analytics.

How does Kafka achieve exactly-once semantics, and what does it not cover?

Exactly-once in Kafka is the composition of three features: idempotent producer (enable.idempotence=true) deduplicates producer retries via a producer ID and sequence number; transactions (transactional.id) atomically commit writes to multiple partitions plus consumer offsets in one all-or-nothing operation; isolation.level=read_committed on consumers makes them skip in-flight transaction records. EOS guarantees end at the boundary of the output Kafka topic — for external sinks like Snowflake or BigQuery you still need an idempotent UPSERT keyed on a business id.

What is an In-Sync Replica (ISR) and why does min.insync.replicas matter?

The ISR is the set of replicas that have caught up with the leader within replica.lag.time.max.ms. Only ISR members can be elected leader, and acks=all waits for every ISR member to acknowledge the write. min.insync.replicas=2 (with RF=3) tells the broker to reject writes when the ISR shrinks to 1 — this is a deliberate trade: you'd rather see a visible write failure than have a single subsequent broker loss erase recently-committed data.

What triggers a Kafka consumer group rebalance and how does cooperative-sticky help?

A rebalance is triggered when a consumer joins or leaves the group, or when topic metadata changes (e.g. partitions added). Kafka 4.x defaults to cooperative-sticky rebalancing — only the partitions belonging to the affected consumer are reassigned; surviving consumers keep their existing partitions and don't pause. This is dramatically less disruptive than the older eager protocol, which paused every consumer in the group during every rebalance.

When should I use Kafka Connect instead of a custom consumer?

Reach for Kafka Connect first when you're moving data between Kafka and any common external system (databases via JDBC or Debezium, S3, Snowflake, BigQuery, Elasticsearch). You get offset management, retries with dead-letter topics, parallelism, schema evolution via the Schema Registry, and JMX/REST observability without writing code. Write a custom consumer only when you need business logic that doesn't fit a connector's SMT model or you need exotic transaction boundaries.

Should I use Kafka Streams or Apache Flink for stream processing?

Use Kafka Streams when your state fits comfortably in-app (low GBs), you're already on the JVM, and you want one fewer cluster to operate — it runs as threads inside your application and persists state to RocksDB + a Kafka changelog topic. Use Apache Flink when you need very large state (TB-scale), rich event-time semantics with watermarks, late-event handling, or you want stream processing decoupled from the producer applications. Flink is more powerful; Streams is simpler to deploy.

Practice on PipeCode

Pipecode.ai is Leetcode for Data Engineering — every Kafka concept above ships with hands-on practice rooms where you partition real data, configure real producers, and trace real rebalances. Start with the streaming library and work outward; PipeCode pairs every reading with 450+ DE-focused problems and a real-time scoring engine.

Practice streaming now →
Real-time analytics drills →

Top comments (0)