
Apache Kafka with .NET Core: A Comprehensive Guide


What is Kafka?
Apache Kafka is a distributed event streaming platform designed to handle high-throughput, fault-tolerant, and scalable data pipelines. It excels in real-time data streaming, enabling developers to build robust data pipelines and streaming applications.

Core Concepts

  1. Topics: Categories for streaming data, where each topic is a logical channel to which data is sent and from which data is received.

  2. Partitions: Distributed, ordered logs within topics that allow Kafka to scale horizontally and provide fault tolerance.

  3. Producers: Applications that send data to Kafka topics.

  4. Consumers: Applications that read data from Kafka topics.

  5. Brokers: Servers that host topics and partitions, managing the storage and retrieval of data.

  6. Consumer Groups: Groups of consumers that cooperate to read a topic's partitions, providing scalability and fault tolerance.
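
To make a few of these concepts concrete, here is a minimal sketch using the AdminClient from the Confluent.Kafka package (the same package the examples later in this guide use). It assumes a broker at localhost:9092; the topic name, partition count, and replication factor are illustrative.

using System;
using System.Threading.Tasks;
using Confluent.Kafka;
using Confluent.Kafka.Admin;

class TopicSetup
{
    public static async Task Main()
    {
        // Assumed local broker address.
        var config = new AdminClientConfig { BootstrapServers = "localhost:9092" };

        using var adminClient = new AdminClientBuilder(config).Build();

        // A topic is split into partitions; each partition is an ordered log
        // hosted on a broker, and consumers in one group divide the partitions
        // among themselves.
        await adminClient.CreateTopicsAsync(new[]
        {
            new TopicSpecification
            {
                Name = "orders",        // hypothetical topic name
                NumPartitions = 3,      // up to 3 consumers in a group can read in parallel
                ReplicationFactor = 1   // fine locally; use 3 or more in production
            }
        });

        Console.WriteLine("Topic created.");
    }
}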

Key APIs

  • Producer API: Used to publish streams of records to Kafka topics.
  • Consumer API: Used to subscribe to topics and process streams of records.
  • Streams API: Used for building stream processing applications that transform input streams into output streams.
  • Connect API: Used to build and run reusable producers or consumers that connect Kafka topics to existing applications or data systems.

Advanced Features

  • Exactly-once semantics: Ensures that data is processed exactly once, preventing data duplication.
  • Transactional writes: Allows for atomic writes to multiple Kafka partitions.
  • Idempotent producers: Prevents duplicate records during network retries.
  • KRaft (Kafka Raft): A consensus protocol replacing ZooKeeper for metadata management, enhancing Kafka's reliability and scalability.
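
As a sketch of how exactly-once-style delivery, transactional writes, and idempotent producers surface in the .NET client: enabling idempotence and a transactional id on the ProducerConfig unlocks atomic multi-partition writes. This assumes a local broker and the test-topic created later in this guide; the transactional id is illustrative.

using System;
using Confluent.Kafka;

class TransactionalProducer
{
    public static void Main()
    {
        var config = new ProducerConfig
        {
            BootstrapServers = "localhost:9092",        // assumed local broker
            EnableIdempotence = true,                   // de-duplicates on network retries
            TransactionalId = "demo-transactional-id"   // hypothetical id; enables transactions
        };

        using var producer = new ProducerBuilder<Null, string>(config).Build();

        producer.InitTransactions(TimeSpan.FromSeconds(10));
        producer.BeginTransaction();
        try
        {
            producer.Produce("test-topic", new Message<Null, string> { Value = "first" });
            producer.Produce("test-topic", new Message<Null, string> { Value = "second" });
            producer.CommitTransaction();   // both records become visible atomically
        }
        catch
        {
            producer.AbortTransaction();    // neither record is exposed to consumers
            throw;
        }
    }
}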

When to Use Kafka

  • Real-time data pipelines
  • Activity tracking
  • Metrics collection
  • Log aggregation
  • Stream processing
  • Event-sourcing
  • Commit log service

Alternatives

  • Apache Pulsar: Offers multi-tenancy and geo-replication. Ideal when tiered storage and multiple namespaces are needed.
  • RabbitMQ: Supports complex routing and low latency. Best for priority queues and request-reply patterns.
  • Amazon Kinesis: Fully managed service, perfect for integration with the AWS ecosystem.
  • Google Pub/Sub: A globally distributed message bus, suitable for multi-region deployments, with optional exactly-once delivery.

Kafka Shines In

  • Large-scale data pipelines
  • Microservices event backbone
  • Real-time analytics and ML feature stores

When to Reconsider Kafka

  • Small-scale applications (the overhead may outweigh the benefits)
  • Strict ordering requirements across all messages
  • Need for complex message routing

Setting Up Kafka with ZooKeeper

Step 1: Download and Extract Kafka
Download Kafka from the official website. Extract the downloaded archive to your desired directory.

Step 2: Start ZooKeeper
Kafka traditionally uses ZooKeeper to coordinate brokers and store cluster metadata (newer versions can run in KRaft mode instead, as noted above). Start ZooKeeper using the following command:

bin/zookeeper-server-start.sh config/zookeeper.properties


Step 3: Start Kafka Server
Once ZooKeeper is running, start the Kafka server:

bin/kafka-server-start.sh config/server.properties

Step 4: Create a Topic
Create a Kafka topic named "test-topic":

bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
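
You can verify the topic and inspect its partition layout with:

bin/kafka-topics.sh --describe --topic test-topic --bootstrap-server localhost:9092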

Building a .NET Core Application with Kafka

Setting Up .NET Core
Create a .NET Core Application: Use the .NET CLI to create a new console application.

dotnet new console -n KafkaDemo
cd KafkaDemo

Add Kafka NuGet Package: Add the Confluent.Kafka package to your project.

dotnet add package Confluent.Kafka

Producer Example

using Confluent.Kafka;
using System;
using System.Threading.Tasks;

class Producer
{
    public static async Task Main(string[] args)
    {
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

        // Build a producer that sends keyless string messages.
        using (var producer = new ProducerBuilder<Null, string>(config).Build())
        {
            try
            {
                // ProduceAsync completes once the broker acknowledges the message.
                var deliveryResult = await producer.ProduceAsync("test-topic", new Message<Null, string> { Value = "Hello Kafka" });
                Console.WriteLine($"Delivered '{deliveryResult.Value}' to '{deliveryResult.TopicPartitionOffset}'");
            }
            catch (ProduceException<Null, string> e)
            {
                Console.WriteLine($"Delivery failed: {e.Error.Reason}");
            }
        }
    }
}

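The awaited ProduceAsync call above round-trips to the broker per message, which is simple but slow in a loop. For throughput-sensitive code, a common pattern with the same client is the fire-and-forget Produce overload plus a delivery handler, with a Flush before exiting; a minimal sketch:

using System;
using Confluent.Kafka;

class BatchProducer
{
    public static void Main()
    {
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

        using var producer = new ProducerBuilder<Null, string>(config).Build();

        for (var i = 0; i < 100; i++)
        {
            // Produce returns immediately; the handler runs when the broker acknowledges.
            producer.Produce("test-topic", new Message<Null, string> { Value = $"message {i}" },
                report => Console.WriteLine(report.Error.IsError
                    ? $"Delivery failed: {report.Error.Reason}"
                    : $"Delivered to {report.TopicPartitionOffset}"));
        }

        // Wait (up to 10 seconds) for outstanding deliveries before exiting.
        producer.Flush(TimeSpan.FromSeconds(10));
    }
}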

Consumer Example

using Confluent.Kafka;
using System;
using System.Threading;

class Consumer
{
    public static void Main(string[] args)
    {
        var config = new ConsumerConfig
        {
            GroupId = "test-consumer-group",
            BootstrapServers = "localhost:9092",
            // Start from the earliest offset when no committed offset exists.
            AutoOffsetReset = AutoOffsetReset.Earliest
        };

        // Allow Ctrl+C to break out of the consume loop cleanly.
        using var cts = new CancellationTokenSource();
        Console.CancelKeyPress += (_, e) =>
        {
            e.Cancel = true; // keep the process alive so the consumer can close
            cts.Cancel();
        };

        using (var consumer = new ConsumerBuilder<Ignore, string>(config).Build())
        {
            consumer.Subscribe("test-topic");

            try
            {
                while (true)
                {
                    // Consume blocks until a message arrives or the token is cancelled.
                    var cr = consumer.Consume(cts.Token);
                    Console.WriteLine($"Consumed message '{cr.Message.Value}' at: '{cr.TopicPartitionOffset}'.");
                }
            }
            catch (OperationCanceledException)
            {
                // Commit final offsets and leave the group cleanly.
                consumer.Close();
            }
        }
    }
}


Schema Registry with .NET

Setting Up Schema Registry
Schema Registry is a tool that provides a serving layer for your metadata, exposing a RESTful interface for storing and managing Avro schemas. You need to download and run the Schema Registry provided by Confluent.

Step 1: Download and Extract Schema Registry
Download the Schema Registry from the Confluent website. Extract the downloaded archive to your desired directory.

Step 2: Start Schema Registry
Start the Schema Registry using the following command:

bin/schema-registry-start config/schema-registry.properties

Using Schema Registry in .NET Core

Schema Registry in Kafka is a component that provides a centralized repository for managing and validating schemas for data produced and consumed via Kafka. Here's a brief overview of its uses:

Schema Management: Stores Avro, JSON, and Protobuf schemas for Kafka topics, ensuring consistent data structure across producers and consumers.

Version Control: Maintains a version history of schemas, allowing for schema evolution and backward/forward compatibility checks.

Validation: Ensures that data written to Kafka topics adheres to predefined schemas, preventing schema mismatches and data corruption.

Interoperability: Facilitates smooth integration between different applications and services by standardizing data formats.

Decoupling: Separates schema management from application logic, simplifying the development and maintenance process.

By using Schema Registry, developers can ensure data consistency, streamline data processing, and improve overall data quality within their Kafka-based data pipelines.

Set up Schema Registry:
Ensure you have Confluent's Schema Registry running. You can use Docker for this. Note that the image requires SCHEMA_REGISTRY_HOST_NAME, and localhost inside the container refers to the container itself, so point the bootstrap servers at an address the container can reach (for example, host.docker.internal when Kafka runs on the Docker host):

docker run -d --name schema-registry -p 8081:8081 \
  -e SCHEMA_REGISTRY_HOST_NAME=schema-registry \
  -e SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=PLAINTEXT://host.docker.internal:9092 \
  confluentinc/cp-schema-registry:latest


Create an Avro Schema:
Define your Avro schema and register it in the Schema Registry.

{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"}
  ]
}


Optionally, compile the schema: instead of the GenericRecord approach used below, you can generate strongly typed C# classes from the schema with the avrogen tool, as shown next.
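
For example, assuming the schema above is saved as User.avsc (the avrogen CLI comes from the Apache.Avro.Tools dotnet tool):

dotnet tool install --global Apache.Avro.Tools
avrogen -s User.avsc .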

You can register this schema using the Schema Registry API:

curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema": "{\"type\":\"record\",\"name\":\"User\",\"namespace\":\"com.example\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"age\",\"type\":\"int\"}]}"}' \
http://localhost:8081/subjects/user-value/versions

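If registration succeeds, the registry responds with a schema ID. You can sanity-check what is registered through the same REST interface:

curl http://localhost:8081/subjects
curl http://localhost:8081/subjects/user-value/versions/latest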

Add NuGet Packages: Add the Schema Registry client and Avro serde packages to your project (Confluent.Kafka was added earlier and is listed here for completeness).

dotnet add package Confluent.Kafka
dotnet add package Confluent.SchemaRegistry
dotnet add package Confluent.SchemaRegistry.Serdes.Avro

Producer Example with Avro

using System;
using System.Threading.Tasks;
using Confluent.Kafka;
using Confluent.SchemaRegistry;
using Confluent.SchemaRegistry.Serdes;
using Avro.Generic;

class Program
{
    static async Task Main(string[] args)
    {
        var schemaRegistryConfig = new SchemaRegistryConfig { Url = "http://localhost:8081" };
        var producerConfig = new ProducerConfig { BootstrapServers = "localhost:9092" };

        using var schemaRegistry = new CachedSchemaRegistryClient(schemaRegistryConfig);
        using var producer = new ProducerBuilder<string, GenericRecord>(producerConfig)
            // AvroSerializer resolves and registers schemas via Schema Registry.
            .SetValueSerializer(new AvroSerializer<GenericRecord>(schemaRegistry))
            .Build();

        // Fetch the registered schema and parse it with the Avro library.
        // (Avro.Schema is fully qualified to avoid clashing with Confluent.SchemaRegistry.Schema.)
        var schemaString = (await schemaRegistry.GetLatestSchemaAsync("user-value")).SchemaString;
        var avroSchema = (Avro.RecordSchema)Avro.Schema.Parse(schemaString);

        var user = new GenericRecord(avroSchema);
        user.Add("name", "John Doe");
        user.Add("age", 30);

        var message = new Message<string, GenericRecord> { Key = "user1", Value = user };

        var deliveryResult = await producer.ProduceAsync("users", message);

        Console.WriteLine($"Delivered '{deliveryResult.Value}' to '{deliveryResult.TopicPartitionOffset}'");
    }
}



Consumer Example with Avro

using System;
using Confluent.Kafka;
using Confluent.Kafka.SyncOverAsync;
using Confluent.SchemaRegistry;
using Confluent.SchemaRegistry.Serdes;
using Avro.Generic;

class Program
{
    static void Main(string[] args)
    {
        var schemaRegistryConfig = new SchemaRegistryConfig { Url = "http://localhost:8081" };
        var consumerConfig = new ConsumerConfig
        {
            BootstrapServers = "localhost:9092",
            GroupId = "test-consumer-group",
            AutoOffsetReset = AutoOffsetReset.Earliest
        };

        using var schemaRegistry = new CachedSchemaRegistryClient(schemaRegistryConfig);
        using var consumer = new ConsumerBuilder<string, GenericRecord>(consumerConfig)
            // The Avro deserializer is async; AsSyncOverAsync (from
            // Confluent.Kafka.SyncOverAsync) adapts it for the synchronous consumer.
            .SetValueDeserializer(new AvroDeserializer<GenericRecord>(schemaRegistry).AsSyncOverAsync())
            .Build();

        consumer.Subscribe("users");

        while (true)
        {
            var consumeResult = consumer.Consume();
            var user = consumeResult.Message.Value;

            Console.WriteLine($"Consumed record with name: {user["name"]}, age: {user["age"]}");
        }
    }
}


Happy coding!!!
