Once you know the basics of Kafka, partitioning is where the real design work starts. Topics, consumers, offsets, and producer APIs are straightforward enough. Partitioning is where correctness, throughput, and operational behavior meet. It is also where teams make decisions that look harmless early on and become expensive later.
The first thing to keep in mind is that partitioning is not just a scaling mechanism. It is also Kafka’s ordering model. Kafka only guarantees order within a single partition. That means every decision about how records are assigned to partitions is also a decision about which records are ordered relative to each other, and which are not.
How records get routed to partitions
When a producer writes a record to Kafka, the partition is chosen in one of a few ways.
The common case is keyed routing. If the record has a key, the producer hashes that key and uses the result to choose a partition. The exact hashing details depend on the client and partitioner implementation, but the general idea is stable: the same key maps to the same partition, as long as the topic’s partition count stays the same. That is what gives you per-key ordering. All events for account-123 or order-987 land in one partition, so consumers see them in the order they were written.
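The routing rule can be sketched in a few lines. This is not Kafka's actual implementation — the Java client hashes key bytes with murmur2 — but crc32 stands in here to show the stable "hash the key, mod the partition count" shape:

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Illustrative stand-in for Kafka's default keyed routing: a stable
    # hash of the key bytes, modulo the topic's partition count.
    return (zlib.crc32(key) & 0x7FFFFFFF) % num_partitions

# Same key, same partition -- as long as num_partitions stays the same.
assert partition_for(b"account-123", 12) == partition_for(b"account-123", 12)
```

The important property is not which hash is used, but that it is deterministic: every producer computes the same partition for the same key.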
If the record has a null key, Kafka does not have any domain signal to preserve. In that case producers typically distribute records across partitions in a round-robin or sticky way, depending on the client implementation. The practical outcome is the same for design purposes: records are spread out to balance load, but there is no ordering guarantee between related records because Kafka has no notion of which records are related.
There is also the option of a custom partitioner. This lets you override the default routing logic and choose partitions yourself. That can be useful when the built-in behavior does not match your needs, such as routing based on a derived field, adding affinity rules, or implementing a load-spreading strategy for skewed keys. But a custom partitioner is not a free optimization knob. It becomes part of your correctness model. Once you own partition routing, you also own the consequences.
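A sketch of what owning the routing looks like. The names here (`HOT_KEYS`, `route`) are hypothetical, and the strategy — rotating one known-hot key across partitions while keeping stable hashed routing for everything else — is just one example of the kind of logic a custom partitioner might implement:

```python
import itertools
import zlib

# Hypothetical custom routing: known-hot keys rotate across partitions
# to spread load; all other keys keep stable hashed routing.
HOT_KEYS = {b"merchant-123"}
_rotation = itertools.count()

def route(key: bytes, num_partitions: int) -> int:
    if key in HOT_KEYS:
        # Spreads the hot key's load, but silently gives up per-key
        # ordering for it. That tradeoff is now yours to own.
        return next(_rotation) % num_partitions
    return (zlib.crc32(key) & 0x7FFFFFFF) % num_partitions
```

Notice the consequence baked into the first branch: the moment the hot key stops mapping to one partition, its events stop being ordered relative to each other.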
The partition key is really a domain decision
It is easy to talk about partition keys as a technical setting. In practice, key choice is a domain modeling decision disguised as infrastructure.
The key defines your unit of ordering. That is the real rule. If two events must be observed in order relative to each other, they need to land in the same partition. If they do not need a shared order, they can be partitioned independently.
Take bank account events. Suppose you emit AccountDebited, AccountCredited, and AccountClosed. If a consumer builds account balances or checks business invariants, order matters per account. A debit followed by a close is different from a close followed by a debit. In that case, the obvious partition key is account ID. That keeps all events for a single account in one partition and gives you a clean per-account sequence.
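A toy simulation of that keying (the hash is a stand-in, not Kafka's own) shows the property the consumer relies on: every event for account-123 lands in one partition, in write order.

```python
import zlib
from collections import defaultdict

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for keyed routing: stable hash of the key, mod partition count.
    return (zlib.crc32(key.encode()) & 0x7FFFFFFF) % num_partitions

events = [
    ("account-123", "AccountDebited"),
    ("account-456", "AccountCredited"),
    ("account-123", "AccountCredited"),
    ("account-123", "AccountClosed"),
]

partitions = defaultdict(list)
for account_id, event_type in events:
    partitions[partition_for(account_id, 12)].append((account_id, event_type))

# All of account-123's events share one partition, in the order written.
target = partition_for("account-123", 12)
seq = [e for a, e in partitions[target] if a == "account-123"]
print(seq)  # ['AccountDebited', 'AccountCredited', 'AccountClosed']
```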
Now imagine choosing customer region as the partition key instead. That may look reasonable from a load distribution perspective, but it breaks the ordering model you actually need. The ordering unit is now the region, not the account: one partition interleaves events from many unrelated accounts, and if an account's routing ever stops tracking its identity, say because the customer's region changes, that account's events split across partitions entirely. Consumers no longer have a reliable per-account event stream. The problem is not that Kafka failed. The problem is that the topic's partitioning scheme encoded the wrong domain boundary.
The same thing shows up with order processing. If the business rule is that all events for one order must remain ordered, key by order ID. If what actually needs ordering is the aggregate of all orders for a warehouse or a merchant, then one of those may be the correct key instead. There is no universally correct answer. The correct answer is whichever entity defines the ordering boundary your system depends on.
That is why “just use user ID” or “just use tenant ID” is not a design principle. It is a guess. Sometimes it is the right guess. Often it is not.
Bad key choices fail in subtle ways
The tricky part is that a poor partition key does not always fail loudly. The system still runs. Records still arrive. Consumers still process them. The failure shows up as rare inconsistencies, impossible state transitions, or compensating logic that seems to grow for no clear reason.
A classic symptom is when a downstream service starts needing extra reads, locks, or deduplication tricks to recover order that Kafka never promised in the first place. Another is when teams start saying things like “this event is usually earlier, but sometimes it arrives later.” That is often not a transport delay problem. It is a partitioning problem.
If your correctness depends on a particular order, that order must be encoded in the partition key. There is no middleware setting later that will restore it.
Hotspots and disproportionately popular keys
Even with a correct key, load distribution may still be poor. Some keys are much hotter than others.
Suppose you key events by merchant ID and one merchant handles half the traffic in the system. All of that merchant’s records map to one partition. That partition becomes a hotspot. Producer throughput for that key is bounded by one partition’s write path, and consumer throughput for that key is bounded by one active consumer in the group. The rest of the partitions may be idle while one partition is overloaded.
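The skew is easy to see in a toy model: one merchant with half the traffic pins half the records to a single partition, no matter how many partitions the topic has.

```python
import zlib
from collections import Counter

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in stable hash, modulo partition count.
    return (zlib.crc32(key.encode()) & 0x7FFFFFFF) % num_partitions

# 10,000 events: half from one hot merchant, the rest spread over 100 others.
keys = ["merchant-hot"] * 5000
keys += [f"merchant-{i}" for i in range(100) for _ in range(50)]

load = Counter(partition_for(k, 24) for k in keys)
hot_partition = partition_for("merchant-hot", 24)

# The hot merchant's partition carries at least half of all records,
# while the other 23 partitions split the remainder.
print(load[hot_partition], max(load.values()) == load[hot_partition])
```

Adding partitions does not help here: the hot key still hashes to exactly one of them.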
This is the main tradeoff in Kafka partitioning. Ordering and even distribution often pull in opposite directions. The more you concentrate related records to preserve order, the more you risk skew.
There is no universal fix. Sometimes the hotspot is acceptable because correctness matters more than balanced throughput. Sometimes the domain lets you shard further. If order only matters per account, and a merchant contains many independent accounts, then account ID may distribute better than merchant ID. In other cases teams deliberately split a hot key into subkeys such as merchant-123#0, merchant-123#1, and so on. That can spread load, but it also weakens ordering from “per merchant” to “per merchant shard.” If you do that, you are changing the contract, not just tuning performance.
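A sketch of the subkey approach (`shard_key` is a hypothetical helper): the shard is derived deterministically from some secondary key, so a given event key still routes consistently, but the ordering contract has narrowed from per merchant to per merchant shard.

```python
import zlib

def shard_key(merchant_id: str, event_key: str, num_shards: int) -> str:
    # Deterministic subkey: the same event_key always lands in the same
    # shard, so per-shard ordering holds -- per-merchant ordering does not.
    shard = (zlib.crc32(event_key.encode()) & 0x7FFFFFFF) % num_shards
    return f"{merchant_id}#{shard}"

# Deriving the shard from the order ID keeps each order's events together,
# while spreading the merchant's overall load across num_shards keys.
key = shard_key("merchant-123", "order-42", 4)
print(key)
```

Deriving the shard from something like an order ID is one way to keep a weaker but still useful guarantee: events for one order stay ordered even though events for one merchant no longer do.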
Another mitigation is separating workloads. A topic used for strict transactional processing may need one key strategy, while an analytics-oriented topic can use a different one because its ordering requirements are weaker. That is often cleaner than trying to make one topic satisfy incompatible requirements.
Partition count is a mostly one-way door
Partition count looks like a capacity knob, but it is closer to a design commitment.
You can increase the number of partitions later, and teams often do. The problem is that changing partition count changes the key-to-partition mapping for hashed keys. A key that used to land in partition 3 may now land in partition 11. From that point onward, new records for the same key go to a different partition than older records.
That matters because ordering is only meaningful within a partition. Once the mapping changes, you no longer have one continuous ordered stream for that key across old and new data. During migration windows or with in-flight data, consumers may observe sequences that are ordered within each partition but not globally across the old and new placement. If your application assumes a single uninterrupted per-key stream, increasing partitions is not operationally neutral.
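The remapping is easy to demonstrate with a stand-in stable hash. Growing a topic from 8 to 12 partitions changes the partition for most keys — under modulo routing, a key keeps its partition only when the two moduli happen to agree.

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in stable hash, modulo partition count.
    return (zlib.crc32(key) & 0x7FFFFFFF) % num_partitions

keys = [f"account-{i}".encode() for i in range(1000)]
moved = sum(1 for k in keys if partition_for(k, 8) != partition_for(k, 12))

# Most keys change partition when the count grows from 8 to 12.
print(f"{moved} of {len(keys)} keys map to a different partition")
```

For this hash-mod scheme, a key stays put only when `hash % 24 < 8`, so roughly two-thirds of keys move. The exact fraction depends on the hash and the counts involved, but the point stands: after the change, new records for a key land in a different partition than its history.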
This is why partition count is a mostly one-way door. You can raise it, but the consequences are real. Reducing partitions is even harder and generally treated as impractical. So plan ahead. Pick a partition count that gives you room for growth, not just today’s traffic.
Partitions are also your unit of parallelism
Kafka scales consumers at the partition level. Within one consumer group, a partition is assigned to only one active consumer at a time. That means partitions define the maximum parallelism of that group.
If a topic has twelve partitions, the most useful number of active consumers in one group is twelve. If you run fewer, some consumers will process multiple partitions. If you run more, the extras sit idle because there is no partition left to own. Adding consumers beyond the partition count does not increase throughput. It just increases noise, resource use, and group management churn.
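The arithmetic is simple enough to simulate. This round-robin dealing is a simplification of Kafka's real assignors, but it shows the constraint: with fifteen consumers and twelve partitions, three consumers own nothing.

```python
def assign(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    # Simplified round-robin assignment: each partition has exactly one
    # owner within the group; surplus consumers end up owning nothing.
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

group = assign(12, [f"c{i}" for i in range(15)])
idle = [c for c, ps in group.items() if not ps]
print(len(idle))  # 3 consumers have no partition to own
```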
This is worth keeping in mind when people say they want “more consumer scalability.” Often what they really need is more partitions, assuming the key design can tolerate it. But as already noted, more partitions are not a free change. Parallelism, ordering, and future flexibility are all tied together.
Rebalancing and the pause you eventually notice
Consumer groups are dynamic. Instances join, leave, crash, restart, and deploy. Whenever group membership changes, Kafka reassigns partitions among consumers. That process is called rebalancing.
A rebalance also happens in other situations, such as topic metadata changes or partition count changes. During a rebalance, consumers briefly stop normal processing while the group agrees on a new assignment. In healthy systems this is usually short, but it is still visible. You may see brief processing pauses, lag spikes, or latency jumps around deployments and failures.
This is one reason noisy consumer fleets can hurt stability. Frequent restarts mean frequent rebalances. Kafka has improved this over time, and cooperative rebalancing reduces disruption by avoiding full stop-the-world reassignment in many cases. It helps, but it does not eliminate the fact that partition ownership is a coordinated group activity with short pauses built into the model.
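The difference between the two styles can be sketched with a toy model — this is not the actual assignor protocol, just the sticky idea. Under eager rebalancing, adding a fourth consumer to a twelve-partition group revokes all twelve partitions before reassigning; a sticky, cooperative-style reassignment moves only the partitions that must change owner.

```python
def sticky_add_consumer(assignment: dict[str, list[int]], new_consumer: str) -> list[int]:
    # Toy sticky rebalance: keep existing ownership and hand the newcomer
    # just enough partitions to even out the load.
    total = sum(len(ps) for ps in assignment.values())
    assignment[new_consumer] = []
    target = total // len(assignment)
    moved = []
    for consumer, partitions in assignment.items():
        while len(partitions) > target and len(moved) < target:
            moved.append(partitions.pop())
    assignment[new_consumer] = moved
    return moved

before = {"c1": [0, 3, 6, 9], "c2": [1, 4, 7, 10], "c3": [2, 5, 8, 11]}
moved = sticky_add_consumer(before, "c4")
# Eager would revoke all 12 partitions; here only 3 change owner.
print(len(moved))  # 3
```

Fewer ownership changes means fewer partitions whose processing pauses, which is the practical benefit cooperative rebalancing delivers.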
The design question that matters
Partitioning is where Kafka stops being generic infrastructure and starts reflecting your domain. The partition key decides which records stay together, which ones can be processed in parallel, where hotspots appear, and what kind of ordering your consumers can rely on. Partition count decides how much parallelism you can get and how much future flexibility you have without remapping keys and complicating ordering.
Before creating a topic, the most important question is simple: what is my unit of ordering, and does my partition key reflect that?