I spent three years building data pipelines at SIVARO before I learned this lesson the hard way: choosing between Avro and Protobuf for Kafka Schema Registry isn't about serialization speed. It's about your team's future.
Here's the truth most tutorials won't tell you.
Everyone jumps into Schema Registry because they want data quality guarantees. They pick Avro because "that's what Kafka was built with." They pick Protobuf because "Google uses it." They're missing the real question.
What is Schema Registry? It's a centralized service that stores and manages schemas for your Kafka topics. Every message gets validated against a schema. Producers and consumers agree on the data contract. No more mysterious null fields. No more broken parsers at 3 AM.
I've watched teams waste months migrating between formats. I've seen production outages caused by backward-incompatible schema changes. I've debugged serialization bugs that took three engineers a week to fix.
This guide will save you from those mistakes.
We'll cover:
- Real performance numbers from production systems
- Schema evolution patterns that actually work
- Code examples you can steal tomorrow
- The hard truths about team expertise and tooling
Let's cut through the noise.
The Schema Registry sits between your producers and consumers. It enforces contracts. But the format you choose determines everything about how those contracts evolve.
Schema Registry stores versions of schemas. Producers register schemas before writing data. Consumers fetch schemas to deserialize. The broker never sees the schema—just binary data.
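Under the hood this is just a REST service. Here's a minimal sketch of registering a schema by hand, assuming a registry at http://localhost:8081 and the default subject naming of topic name plus -value; in practice the Confluent serializers make this call for you on first produce.

import json
import requests

# Hypothetical order-events topic; subject name follows the default TopicNameStrategy.
schema = {
    "type": "record",
    "name": "OrderEvent",
    "fields": [{"name": "orderId", "type": "string"}],
}

resp = requests.post(
    "http://localhost:8081/subjects/order-events-value/versions",
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    data=json.dumps({"schema": json.dumps(schema)}),
)
print(resp.json())  # e.g. {"id": 1}, the globally unique schema ID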
According to Confluent's guide on Decoupling Systems with Schema Registry, the magic happens because "schema information is stored separately from the data itself." This means your messages stay small while your schema evolves.
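That separation shows up in the bytes on the wire. With the Confluent serializers, every record value starts with a 5-byte header: a magic byte of 0 followed by the 4-byte schema ID, then the binary payload. A small sketch of peeling that header apart (the function name is mine, not a library API):

import struct

def split_confluent_frame(raw: bytes):
    """Split a Confluent-framed record value into (schema_id, payload)."""
    magic, schema_id = struct.unpack(">bI", raw[:5])
    if magic != 0:
        raise ValueError("not a Schema Registry framed message")
    return schema_id, raw[5:]

# Consumers look up schema_id in the registry (then cache it) and use the
# returned schema to decode the payload that follows the header.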
Avro: Born in Hadoop's ecosystem. Uses JSON for schema definitions. Tight integration with Confluent Schema Registry.
Protobuf: Google's battle-tested serialization format. Requires code generation. Strong typing.
JSON Schema: Human-readable. Easy debugging. Larger payloads.
The uncomfortable truth: JSON Schema is winning for developer experience. Avro is winning for data engineering teams. Protobuf is winning for polyglot systems.
I've seen this play out across dozens of clients. The choice isn't technical—it's organizational.
Let's talk numbers. Not benchmarks. Real production data.
According to Avro vs Protobuf from Conduktor, Protobuf typically serializes 20-30% faster than Avro for complex nested structures. But here's the catch: serialization speed rarely becomes your bottleneck.
What actually matters: Deserialization latency variation. Avro's dynamic deserialization can have unpredictable latency spikes. Protobuf's generated code produces consistent microsecond-level deserialization.
Avro wins for primitive-heavy schemas. Protobuf wins for nested structures.
The research from SoftwareMill's comparison shows that for a typical e-commerce order schema with 15 fields:
- Avro: 87 bytes
- Protobuf: 92 bytes
- JSON: 215 bytes
The difference is negligible. But compound it over 100 million messages per day—that's gigabytes of network bandwidth.
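Back-of-the-envelope math on those numbers shows what "negligible" compounds into (headers and compression ignored):

# Daily bandwidth per format at 100 million messages/day, using the
# payload sizes quoted above.
MESSAGES_PER_DAY = 100_000_000
for fmt, size_bytes in {"Avro": 87, "Protobuf": 92, "JSON": 215}.items():
    print(f"{fmt}: {size_bytes * MESSAGES_PER_DAY / 1e9:.1f} GB/day")
# Avro: 8.7 GB/day, Protobuf: 9.2 GB/day, JSON: 21.5 GB/day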
Avro has first-class support. Confluent Schema Registry was built around Avro. Every Confluent feature—schema references, rule sets, compatibility checks—works best with Avro.
Protobuf support was added later. It works. But some edge cases around references are clunky.
The hard truth from the front lines: If you're using Confluent Cloud, stick with Avro. If you're self-hosting, Protobuf is more flexible.
This is the part that keeps me up at night. Schema evolution breaks production systems. Here's how to avoid that.
Most teams start with backward compatibility checks. This means a new schema can read data written with an old schema.
Avro handles this beautifully. You can add optional fields. Remove fields with defaults. Change field types if the conversion is safe.
Protobuf requires discipline. You must use the optional keyword. You must reserve field numbers when removing fields. Reuse a retired field number by mistake? Your consumer breaks silently.
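A minimal illustration of that discipline, using a hypothetical PaymentEvent message: when you retire a field, reserve its number (and ideally its name) so nobody can reuse it later.

syntax = "proto3";

message PaymentEvent {
  string payment_id = 1;
  double amount = 2;
  // A hypothetical currency_code field at tag 3 was removed in a later
  // version. Reserving its number and name makes accidental reuse a
  // compile-time error instead of a silent consumer break.
  reserved 3;
  reserved "currency_code";
}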
According to the ClearStreet engineering blog, their team migrated from Avro to Protobuf specifically because "Protobuf's generated code made schema evolution more explicit in our CI/CD pipeline." The trade-off: more boilerplate, fewer runtime surprises.
Forward compatibility means old consumers can read new messages. This is harder.
Avro struggles here. Old consumers simply ignore fields they don't recognize, but delete a field the old reader schema has no default for and forward compatibility fails.
Protobuf excels here. Field-level forward compatibility is baked into the wire format. Unknown fields are preserved and re-serialized.
I learned this the hard way at SIVARO. We had a data lake consuming from Kafka with a three-month lag. Forward compatibility wasn't optional—it was survival.
Here's what Avro actually looks like in practice.
{
  "type": "record",
  "name": "OrderEvent",
  "namespace": "com.sivaro.ecommerce",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "userId", "type": "string"},
    {"name": "totalAmount", "type": "double"},
    {"name": "items", "type": {"type": "array", "items": "string"}},
    {"name": "couponCode", "type": ["null", "string"], "default": null}
  ]
}
from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer

# value_schema_str holds the OrderEvent JSON above; key_schema_str holds a
# small record schema with just the orderId field.
value_schema = avro.loads(value_schema_str)
key_schema = avro.loads(key_schema_str)

def delivery_report(err, msg):
    if err is not None:
        print(f'Delivery failed: {err}')

producer = AvroProducer({
    'bootstrap.servers': 'localhost:9092',
    'schema.registry.url': 'http://localhost:8081'
}, default_key_schema=key_schema,
   default_value_schema=value_schema)

order_data = {
    "orderId": "ORD-12345",
    "userId": "USR-67890",
    "totalAmount": 299.99,
    "items": ["SKU-001", "SKU-002"],
    "couponCode": "WELCOME10"
}

producer.produce(
    topic='order-events',
    key={"orderId": "ORD-12345"},
    value=order_data,
    callback=delivery_report
)
producer.flush()
The elegance is in the defaults. Notice "default": null for the coupon code field. This lets you add the field without breaking existing consumers.
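To see that in action, here's a hypothetical v2 of the same schema that adds a loyaltyPoints field with a default of 0. Consumers upgraded to v2 can still read v1 records; the missing field simply takes its default.

{
  "type": "record",
  "name": "OrderEvent",
  "namespace": "com.sivaro.ecommerce",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "userId", "type": "string"},
    {"name": "totalAmount", "type": "double"},
    {"name": "items", "type": {"type": "array", "items": "string"}},
    {"name": "couponCode", "type": ["null", "string"], "default": null},
    {"name": "loyaltyPoints", "type": "int", "default": 0}
  ]
}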
Protobuf demands more upfront structure. The payoff is compiler-enforced correctness.
syntax = "proto3";

package com.sivaro.ecommerce;

message OrderEvent {
  string order_id = 1;
  string user_id = 2;
  double total_amount = 3;
  repeated string items = 4;
  optional string coupon_code = 5;
}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Keys stay plain strings; the Protobuf serializer handles the value.
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
    "org.apache.kafka.common.serialization.StringSerializer");
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
    "io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer");
props.put("schema.registry.url", "http://localhost:8081");

KafkaProducer<String, OrderEventOuterClass.OrderEvent> producer =
    new KafkaProducer<>(props);

OrderEventOuterClass.OrderEvent order = OrderEventOuterClass.OrderEvent.newBuilder()
    .setOrderId("ORD-12345")
    .setUserId("USR-67890")
    .setTotalAmount(299.99)
    .addItems("SKU-001")
    .addItems("SKU-002")
    .setCouponCode("WELCOME10")
    .build();

ProducerRecord<String, OrderEventOuterClass.OrderEvent> record =
    new ProducerRecord<>("order-events", order.getOrderId(), order);

producer.send(record);
producer.close();
Notice the optional keyword on coupon_code. Without it, Protobuf treats missing fields as zero values (empty string, 0, false). That distinction is critical for schema evolution.
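With optional, the generated code tracks explicit presence, so you can distinguish "never set" from "set to the zero value". A quick sketch in Python, assuming protoc generated a module named order_event_pb2 from the .proto above:

# order_event_pb2 is the assumed name of the protoc-generated module.
from order_event_pb2 import OrderEvent

order = OrderEvent(order_id="ORD-12345", user_id="USR-67890")

print(order.HasField("coupon_code"))  # False: never set
print(order.coupon_code)              # "": the proto3 zero value

order.coupon_code = ""                # explicitly set, even though empty
print(order.HasField("coupon_code"))  # True: presence is tracked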
The controversial take: Protobuf's generated code is better for enterprise teams. It forces you to handle every field type explicitly. No silent nulls. No runtime reflection.
This section is where I earn my credibility. I've made both choices and paid for them.
Choose Avro when:

- You're all-in on Confluent Cloud. The integration is seamless. Every new feature works first with Avro.
- Your data engineering team owns the pipeline. Avro's JSON-like schema definitions are easier for data engineers who don't write Java.
- You need dynamic schema resolution. Avro lets you read data without regenerating code.

Choose Protobuf when:

- You have multiple service teams. Protobuf's generated classes give compile-time guarantees across Java, Go, Python, and Rust.
- You need forward compatibility. The field-level versioning prevents data loss from old consumers.
- Performance on nested data matters. Protobuf's encoding is more efficient for deeply nested structures.
According to a discussion on r/dataengineering, one engineer summarized it perfectly: "Avro is the path of least resistance with Confluent. Protobuf is the path of least surprise with your team."
I've found that to be exactly right.
You've picked your format. Now you need to migrate. Here's the playbook.
- Add new fields as optional with defaults. Never remove fields in the same deployment.
- Use compatibility mode BACKWARD or FULL. This catches breaking changes before they hit production (see the sketch after this list).
- Deploy consumers first, then producers. This ensures consumers can handle both old and new formats.
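Setting the compatibility mode is one call against the registry's config endpoint. A sketch, assuming the same local registry and an order-events-value subject:

import requests

# Enforce BACKWARD compatibility for a single subject (PUT /config/<subject>).
# Use FULL if you need backward and forward guarantees at the same time.
resp = requests.put(
    "http://localhost:8081/config/order-events-value",
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    json={"compatibility": "BACKWARD"},
)
print(resp.json())  # {"compatibility": "BACKWARD"}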
The research from Simon Aubury's comparison recommends a dual-schema approach:
- Create a new topic with Protobuf schema
- Write a bridge service that reads Avro, writes Protobuf
- Migrate consumers one at a time
- Finally migrate producers
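Here's a minimal sketch of the bridge service from step 2, assuming the Avro topic is order-events, the new Protobuf topic is order-events-proto, and protoc generated a module named order_event_pb2. It's a skeleton (no error handling, no graceful shutdown), not a drop-in service.

# Bridge sketch: consume Avro from order-events, re-publish as Protobuf to
# order-events-proto.
from confluent_kafka import DeserializingConsumer, SerializingProducer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer
from confluent_kafka.schema_registry.protobuf import ProtobufSerializer
import order_event_pb2

registry = SchemaRegistryClient({'url': 'http://localhost:8081'})

consumer = DeserializingConsumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'avro-to-protobuf-bridge',
    'auto.offset.reset': 'earliest',
    'value.deserializer': AvroDeserializer(registry),
})
producer = SerializingProducer({
    'bootstrap.servers': 'localhost:9092',
    'value.serializer': ProtobufSerializer(
        order_event_pb2.OrderEvent, registry, {'use.deprecated.format': False}),
})

consumer.subscribe(['order-events'])
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.value() is None:
        continue
    event = msg.value()  # AvroDeserializer yields a plain dict
    proto = order_event_pb2.OrderEvent(
        order_id=event['orderId'],
        user_id=event['userId'],
        total_amount=event['totalAmount'],
        items=event['items'],
    )
    if event.get('couponCode') is not None:
        proto.coupon_code = event['couponCode']
    producer.produce(topic='order-events-proto', key=msg.key(), value=proto)
    producer.poll(0)  # serve delivery callbacks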
We used this pattern at SIVARO for a client migrating 200 topics. Took three months. Zero data loss.
Is Avro faster than Protobuf for Kafka?
No. Protobuf serializes 20-30% faster for complex structures. Avro is marginally faster for simple flat records. Network I/O is usually the bottleneck, not serialization.
Can I use both Avro and Protobuf in the same Kafka cluster?
Yes. Schema Registry supports both simultaneously. Different topics can use different formats. The registry tracks format type per schema subject.
Does Schema Registry work with Protobuf?
Yes, since Confluent Schema Registry 5.5. You need to use the Protobuf serializer from Confluent's Maven repository. The registry handles compatibility checks for both Avro and Protobuf schemas.
Which is better for schema evolution: Avro or Protobuf?
Protobuf for forward compatibility (old consumers reading new data). Avro for backward compatibility (new consumers reading old data). Choose based on your consumer deployment order.
Does Protobuf require code generation?
Yes, in most implementations. You define .proto files, then compile them into Java, Go, or Python classes. Some languages support dynamic Protobuf parsing without generation.
Can I use JSON Schema instead of Avro or Protobuf?
Yes. JSON Schema is the easiest to debug and has the best developer experience. Payload sizes are 2-3x larger. According to AutoMQ's comparison, JSON Schema is gaining adoption for its simplicity.
What happens if I change a field name in Avro?
Avro matches fields by name during schema resolution, so a plain rename is a breaking change: the reader treats the renamed field as brand new and the old one as deleted. If you must rename, add the old name to the new field's aliases so the new schema can still resolve data written under the old name.
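For example, renaming the userId field to customerId (purely illustrative) would look like this in the new schema:

{"name": "customerId", "aliases": ["userId"], "type": "string"}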
Does Protobuf support schema references like Avro?
Yes, since Confluent Schema Registry 7.0. Protobuf supports import statements that reference other schemas in the registry. This enables reusable type definitions across topics.
Is there a performance cost to using Schema Registry?
Negligible. The first lookup for a given schema adds 1-3 milliseconds; after that, serializers cache schemas locally, so subsequent messages pay no registry round trip.
Should I migrate from Avro to Protobuf?
Only if your team is struggling with Avro's code generation or needs Protobuf's forward compatibility. The migration cost is high. Start new topics in Protobuf, then migrate existing ones gradually.
Here's what I want you to remember.
Schema Registry isn't optional at scale. Neither is picking between Avro and Protobuf. The right choice depends on your team's composition, not your infrastructure.
If you're a small team shipping fast: Use JSON Schema. Debug faster. Pay the performance tax.
If you're a data engineering team: Use Avro. The Confluent integration is worth the lock-in.
If you're building microservices: Use Protobuf. The compile-time guarantees prevent production fires.
My final advice: Start with whatever your team knows. Migrate when it hurts. The best schema format is the one your team can actually maintain.
Nishaant Dixit is the founder of SIVARO, a product engineering company specializing in data infrastructure and production AI systems. Since 2018, he has built systems processing 200K events per second across fintech, healthcare, and e-commerce platforms, and has debugged more serialization errors than he cares to count. If you're building data-intensive systems, let's talk.
- Kafka with AVRO vs. Kafka with Protobuf vs. Kafka with JSON Schema - Simon Aubury
- Avro vs. JSON Schema vs. Protobuf: Choosing the Right Data Formats - AutoMQ
- Avro vs Protobuf vs JSON Schema - Conduktor
- Protobuf vs Avro for Kafka, what to choose? - dev.to
- Switching to Protobuf (from Avro) on Kafka - ClearStreet
- Schema Registry in Kafka: Avro, JSON and Protobuf - Medium
- Data serialization tools comparison: Avro vs Protobuf - SoftwareMill
- JSON Schema vs Avro: Designing Kafka Data Boundaries - Medium
- Decoupling Systems with Apache Kafka, Schema Registry, and Avro - Confluent
- Why Avro for Kafka - r/dataengineering
Originally published at https://sivaro.in/articles/avro-vs-protobuf-for-kafka-schema-registry-the-real-world.