zaina ahmed

Getting Started with Apache Kafka: What I Learned Building Event-Driven Microservices at Ericsson

When I first heard the word "Kafka" in a technical meeting at Ericsson, I nodded confidently while quietly Googling it under the table. A few months later, I was designing Avro schemas, building consumer groups, and debugging lag metrics in production.

This is everything I wish someone had told me when I started.


🎯 What Is Apache Kafka?

Imagine a post office. Instead of letters, services send events: things that happened, like "user logged in" or "notification dispatched". Instead of being delivered directly to recipients, every event goes into a central post box (Kafka). Any service that cares about that event can pick it up whenever it's ready.

This is event-driven architecture. And Kafka is the most battle-tested way to build it.

The technical version:
Apache Kafka is a distributed event streaming platform that lets you:

  • Publish events from producer services
  • Store events reliably and durably
  • Subscribe to events from consumer services
  • Process streams of events in real time

At Ericsson, every customer notification (SMS, push, email, in-app) flows through a Kafka-based pipeline. Getting this right was critical; getting it wrong meant missed messages at scale.


🏗️ Core Concepts You Must Understand

Topics

A topic is a named category for events. Think of it like a database table, but append-only.

notification-events     ← all notification events
user-activity-events    ← all user activity events
payment-events          ← all payment events
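
Once a broker is running, you can create a topic with the kafka-topics CLI that ships with Kafka (the broker address here assumes the local Docker setup described later in this post):

```shell
# Create the notifications topic with 3 partitions
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic notification-events \
  --partitions 3 --replication-factor 1
```

The partition count is fixed at creation time, so choose it with your expected parallelism in mind; you can add partitions later, but never remove them.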

Producers

A producer is any service that writes events to a topic.

@Service
public class NotificationProducer {

    private static final Logger log = LoggerFactory.getLogger(NotificationProducer.class);
    private static final String TOPIC = "notification-events";

    private final KafkaTemplate<String, NotificationEvent> kafkaTemplate;

    public NotificationProducer(KafkaTemplate<String, NotificationEvent> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void sendNotification(NotificationEvent event) {
        // Keyed by userId so all events for a user land on the same partition
        kafkaTemplate.send(TOPIC, event.getUserId(), event);
        log.info("Notification event sent for userId: {}", event.getUserId());
    }
}

Consumers

A consumer is any service that reads events from a topic.

@Service
public class NotificationConsumer {

    private static final Logger log = LoggerFactory.getLogger(NotificationConsumer.class);

    private final NotificationProcessor processor;

    public NotificationConsumer(NotificationProcessor processor) {
        this.processor = processor;
    }

    @KafkaListener(topics = "notification-events", groupId = "notification-service")
    public void consume(ConsumerRecord<String, NotificationEvent> record) {
        log.info("Received event for userId: {}", record.key());
        processor.process(record.value());
    }
}

Consumer Groups

Multiple consumers can form a group to share the work of processing events. Kafka automatically distributes partitions across group members; this is how you scale horizontally.

notification-events (3 partitions)
                |
Consumer Group: notification-service
                |
      ┌─────────┼─────────┐
  Consumer 1 Consumer 2 Consumer 3
 (partition 0)(partition 1)(partition 2)
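
To see how that distribution works, here is a tiny round-robin sketch. It is a simplification: the real logic lives in Kafka's partition assignors (RangeAssignor, CooperativeStickyAssignor, etc.), and the consumer names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AssignmentSketch {

    // Distribute partition ids round-robin across consumer names,
    // roughly what Kafka's round-robin assignor does for a single topic.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitions) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String c : consumers) {
            result.put(c, new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            result.get(consumers.get(p % consumers.size())).add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // 3 partitions shared by 2 consumers: one consumer gets two partitions
        System.out.println(assign(List.of("consumer-1", "consumer-2"), 3));
        // {consumer-1=[0, 2], consumer-2=[1]}
    }
}
```

Note that a partition is owned by exactly one consumer in a group, so running more consumers than partitions leaves the extras idle.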

Partitions

Topics are split into partitions; this is what enables parallelism. Events with the same key always go to the same partition, preserving order.

// Events for the same userId always land on the same partition
kafkaTemplate.send(TOPIC, event.getUserId(), event);
//                         ↑ this is the partition key
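
The key-to-partition mapping is just a hash modulo the partition count. Kafka's default partitioner actually applies a murmur2 hash to the serialized key bytes; the sketch below uses String.hashCode() only to show the idea, not the real algorithm.

```java
public class PartitionerSketch {

    // Illustrative only: Kafka's default partitioner murmur2-hashes the
    // serialized key; floorMod keeps the result non-negative.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition
        System.out.println(
            partitionFor("user-42", 3) == partitionFor("user-42", 3)); // true
    }
}
```

This is also why changing the partition count of a live topic is dangerous: the modulus changes, so existing keys start landing on different partitions and per-key ordering breaks.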

📦 Avro Schemas — Why They Matter

Raw JSON messages in Kafka are flexible but dangerous at scale: one typo in a field name can silently break every downstream consumer.

Avro schemas solve this by defining the exact structure of every message, enforced at the Schema Registry level before a message is even sent.

Define your schema (notification-event.avsc):

{
  "type": "record",
  "name": "NotificationEvent",
  "namespace": "com.ericsson.notifications",
  "fields": [
    {
      "name": "eventId",
      "type": "string",
      "doc": "Unique identifier for this event"
    },
    {
      "name": "userId",
      "type": "string",
      "doc": "Target user identifier"
    },
    {
      "name": "channel",
      "type": {
        "type": "enum",
        "name": "Channel",
        "symbols": ["EMAIL", "SMS", "PUSH", "IN_APP"]
      }
    },
    {
      "name": "message",
      "type": "string"
    },
    {
      "name": "timestamp",
      "type": {
        "type": "long",
        "logicalType": "timestamp-millis"
      }
    },
    {
      "name": "priority",
      "type": ["null", "string"],
      "default": null,
      "doc": "Optional priority level — backward compatible field"
    }
  ]
}

⚡ Setting Up Kafka Locally

The fastest way to run Kafka locally for development:

docker-compose.yml:

version: '3.8'
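
Filling that file in, a minimal single-broker setup using the Confluent community images might look like the following. The image tags, ports, and listener layout here are assumptions; adjust them for your environment.

```yaml
version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-kafka:7.4.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Two listeners: one for containers (kafka:29092), one for the host (localhost:9092)
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:7.4.0
    depends_on:
      - kafka
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:29092
```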

Start everything:

docker-compose up -d

Verify Kafka is running:

docker ps
# You should see zookeeper, kafka, and schema-registry all running

🔍 Monitoring Kafka Lag

Consumer lag is the number of unprocessed messages sitting in a partition. High lag = your consumers are falling behind.

At Ericsson we monitored this constantly.

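The numbers below come from the consumer-groups CLI that ships with Kafka (assuming the broker runs on localhost:9092, as in the local setup above):

```shell
# Describe current offsets and lag for the notification-service group
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --describe --group notification-service
```
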
Output:

GROUP                TOPIC                PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
notification-service notification-events  0          1250            1250            0
notification-service notification-events  1          980             985             5
notification-service notification-events  2          1100            1100            0

Partition 1 has a lag of 5: minor, but worth watching. If lag grows consistently, you need more consumer instances.

Add a second consumer instance by simply running another instance of your service; Kafka automatically rebalances partitions across the group.


🐛 Common Mistakes I Made (So You Don't Have To)

1. Not setting a partition key

// ❌ Wrong — no key, random partition assignment
kafkaTemplate.send(TOPIC, event);

// ✅ Correct — keyed by userId, preserves order per user
kafkaTemplate.send(TOPIC, event.getUserId(), event);

2. Creating ObjectMapper inside the consumer loop

// ❌ Wrong — new ObjectMapper on every message = slow
public void consume(String message) {
    ObjectMapper mapper = new ObjectMapper();
    Event event = mapper.readValue(message, Event.class);
}

// ✅ Correct — inject as Spring Bean
@Autowired
private ObjectMapper mapper;

3. Not handling deserialization errors

// ✅ Always configure an error handler
@Bean
public DefaultErrorHandler errorHandler() {
    return new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate),
        new FixedBackOff(1000L, 3)
    );
}
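
Conceptually, that error handler retries each failed record a few times with a pause, then publishes it to a dead-letter topic. A plain-Java sketch of the same idea (the handler and deadLetter parameters are hypothetical stand-ins; Spring Kafka's DefaultErrorHandler does this for you):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class RetrySketch {

    // Retry a handler up to maxAttempts with a fixed pause between tries;
    // if every attempt fails, hand the message to the dead-letter consumer.
    static void processWithRetry(String message,
                                 Consumer<String> handler,
                                 Consumer<String> deadLetter,
                                 int maxAttempts,
                                 long backoffMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return; // success, stop retrying
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) {
                    deadLetter.accept(message); // give up, park it for inspection
                    return;
                }
                Thread.sleep(backoffMillis);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> dlq = new ArrayList<>();
        // A handler that always fails: after 3 attempts the message hits the DLQ
        processWithRetry("bad-event",
                m -> { throw new RuntimeException("boom"); },
                dlq::add, 3, 10L);
        System.out.println(dlq); // [bad-event]
    }
}
```

The important property is that a poison message never blocks the partition forever; it gets parked where you can inspect and replay it.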

4. Forgetting idempotency

Kafka guarantees at-least-once delivery, meaning your consumer may receive the same message twice. Always make processing idempotent:

public void process(NotificationEvent event) {
    // Check if already processed before doing work
    if (processedEventRepository.exists(event.getEventId())) {
        log.warn("Duplicate event ignored: {}", event.getEventId());
        return;
    }
    // Process and mark as done
    notificationService.send(event);
    processedEventRepository.save(event.getEventId());
}
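
The ProcessedEventRepository can be as simple as a table keyed by eventId. For illustration, here is an in-memory sketch (this class is hypothetical; a real service would back it with a database so deduplication survives restarts):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ProcessedEventStore {

    // Set.add returns false when the id is already present, giving an
    // atomic "check and mark" in a single step.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    /** Returns true the first time an eventId is seen, false on duplicates. */
    public boolean markIfFirst(String eventId) {
        return processed.add(eventId);
    }
}
```

A consumer would call markIfFirst(event.getEventId()) and skip processing when it returns false; doing the check-and-mark atomically avoids the race where two threads both see "not processed yet".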

📊 What I Learned in Production

After months of running Kafka in production at Ericsson:

  • Partition key matters: always key by the entity that needs ordering (userId, orderId)
  • Monitor lag daily: lag growth is an early warning sign
  • Dead letter queues: always have a DLQ for failed messages
  • Schema evolution: add fields as nullable with defaults — never remove fields
  • Consumer group naming: use descriptive group IDs — notification-service, not group1
  • Replication factor: always 3 in production for fault tolerance

🚀 Getting Started Checklist

If you're setting up Kafka for the first time:

  • [ ] Run Kafka locally with Docker Compose (see above)
  • [ ] Create your first topic
  • [ ] Write a simple producer in Spring Boot
  • [ ] Write a simple consumer with @KafkaListener
  • [ ] Define an Avro schema for your messages
  • [ ] Add error handling and a dead letter queue
  • [ ] Monitor consumer lag
  • [ ] Test with multiple consumer instances

🔮 What's Next

Kafka is one of those technologies where the basics are approachable but the depth is enormous. Once you're comfortable with the fundamentals, explore:

  • Kafka Streams — real-time stream processing
  • KSQL — SQL-like queries over Kafka streams
  • Exactly-once semantics — guaranteeing no duplicates
  • Kafka Connect — integrating Kafka with databases and external systems

The investment in learning Kafka properly pays dividends across your entire engineering career. Event-driven architecture is how modern distributed systems are built, and Kafka is at the center of it.


Thanks for reading! I'm Zaina, a Software Engineer based in Perth, Australia, working with Java microservices, Apache Kafka, and cloud-native technologies at Ericsson. Connect with me on LinkedIn or check out my portfolio.

Found this useful? Drop a ❤️ and share it with a fellow engineer who's just getting started with Kafka!
