Satyam Shree

🌀 Building a Mini Kafka in Go — My Journey Creating go-pub-sub

A hands-on dive into distributed messaging, gRPC streams, and pluggable storage — all in Go.


👋 Intro

Over the past few weeks, I’ve been exploring how systems like Kafka and Pulsar handle messaging, partitioning, and consumer groups. Instead of just reading papers, I wanted to build one myself — from scratch — to truly understand how Pub/Sub systems work under the hood.

That’s how go-pub-sub was born — a lightweight, gRPC-based Publish–Subscribe broker written entirely in Go.

It started as a learning project but grew into a deployable, extensible system that can run locally or in containers.


⚙️ What is go-pub-sub?

go-pub-sub is a mini distributed broker that lets you:

  • Publish messages over gRPC
  • Stream them to consumers (with acknowledgments)
  • Organize them by topics, partitions, and consumer groups
  • Plug in different storage layers — in-memory, Redis, or Postgres
  • Expose Admin APIs to manage topics dynamically

It’s meant to be small, clear, and hackable — a Kafka-like learning platform.


🧩 Architecture Overview

```
Publisher(s) → gRPC (Publish)
Consumer(s)  ← gRPC Stream (Subscribe)
  ↳ Ack → gRPC (Ack)
  ↳ Admin APIs → CreateTopic, ListTopics, TopicInfo
```

The broker exposes two main services:

  1. PubSub — handles Publish, Subscribe, Ack
  2. Admin — topic creation, inspection, listing
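
To make that concrete, here's a rough sketch of what the two service definitions look like in proto form. The RPC names come straight from the diagram above; the message shapes are illustrative and may not match the repo exactly.

```proto
// Illustrative sketch of the two services; the repo's actual field and
// message names may differ.
syntax = "proto3";

package pubsub;

service PubSub {
  rpc Publish(PublishRequest) returns (PublishResponse);
  rpc Subscribe(SubscribeRequest) returns (stream Message); // long-lived server stream
  rpc Ack(AckRequest) returns (AckResponse);
}

service Admin {
  rpc CreateTopic(CreateTopicRequest) returns (CreateTopicResponse);
  rpc ListTopics(ListTopicsRequest) returns (ListTopicsResponse);
  rpc TopicInfo(TopicInfoRequest) returns (TopicInfoResponse);
}

message PublishRequest  { string topic = 1; bytes value = 2; }
message PublishResponse { int32 partition = 1; int64 offset = 2; }

message SubscribeRequest { string topic = 1; string group = 2; }
message Message { string topic = 1; int32 partition = 2; int64 offset = 3; bytes value = 4; }

message AckRequest  { string topic = 1; string group = 2; int32 partition = 3; int64 offset = 4; }
message AckResponse {}

message CreateTopicRequest  { string topic = 1; int32 partitions = 2; }
message CreateTopicResponse {}
message ListTopicsRequest   {}
message ListTopicsResponse  { repeated string topics = 1; }
message TopicInfoRequest    { string topic = 1; }
message TopicInfoResponse   { string topic = 1; int32 partitions = 2; }
```

Subscribe is the interesting one: it's a server-streaming RPC, which is what gives each consumer its long-lived connection to the broker.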

Internally, the broker consists of:

  • Broker layer → core logic for pub/sub, offsets, consumer groups
  • Storage layer → in-memory / Redis / Postgres adapters
  • Middleware → logging, recovery, optional auth
  • Config layer → YAML-based configuration
  • Clients → simple Go-based producer and consumer binaries
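
The storage layer is the part that makes the backends swappable. I won't claim this is the exact interface in the repo, but conceptually it's something like this (method names are illustrative):

```go
package storage

import "context"

// Message is the unit the broker stores and streams.
type Message struct {
    Topic     string
    Partition int
    Offset    int64
    Value     []byte
}

// Storage is the contract every backend (memory, Redis, Postgres) implements.
type Storage interface {
    // Append stores a message in a topic/partition and returns its offset.
    Append(ctx context.Context, topic string, partition int, value []byte) (int64, error)

    // Read returns up to max messages starting at the given offset.
    Read(ctx context.Context, topic string, partition int, from int64, max int) ([]Message, error)

    // CommitOffset records the last acknowledged offset for a consumer group.
    CommitOffset(ctx context.Context, group, topic string, partition int, offset int64) error

    // CommittedOffset returns where a consumer group should resume from.
    CommittedOffset(ctx context.Context, group, topic string, partition int) (int64, error)
}
```

The broker only ever talks to this interface, so the memory, Redis, and Postgres adapters can be swapped without touching the pub/sub logic.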

💡 Why I Built It

Like most Go devs, I’d read plenty about channels, goroutines, and concurrency patterns — but I hadn’t put them to work in a distributed system of my own.

Building a Pub/Sub broker was the perfect playground:

  • gRPC streams → long-lived connections
  • Consumer groups → concurrency and coordination
  • Storage adapters → interfaces and dependency inversion
  • Message acknowledgment → reliable delivery patterns
  • Docker-based testing → reproducibility

And honestly, nothing teaches you distributed thinking like debugging your own broker 😅.


🧱 Core Design Concepts

Topics and Partitions

Each topic is split into one or more partitions for parallelism.

Messages are appended to a partition and assigned an incremental offset.
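
In the in-memory backend that idea reduces to very little code. A minimal sketch (not the repo's actual implementation):

```go
package broker

import "sync"

// partition is a minimal sketch of an in-memory log: an append-only slice
// where the slice index doubles as the message offset.
type partition struct {
    mu       sync.Mutex
    messages [][]byte
}

// Append stores a message and returns the offset it was assigned.
func (p *partition) Append(value []byte) int64 {
    p.mu.Lock()
    defer p.mu.Unlock()

    offset := int64(len(p.messages)) // offsets are just incremental positions
    p.messages = append(p.messages, value)
    return offset
}
```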

Consumer Groups

Multiple consumers in the same group share partitions.

Each group tracks committed offsets per partition, ensuring at-least-once delivery.
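
The bookkeeping behind that is essentially a map from (topic, partition) to the group's progress, updated on every Ack. A rough sketch, again with illustrative names:

```go
package broker

import "sync"

// groupOffsets sketches per-group progress tracking: for each topic/partition
// it remembers the next offset to deliver. A message can be redelivered until
// its offset is committed, which is what gives at-least-once delivery.
type groupOffsets struct {
    mu   sync.Mutex
    next map[string]map[int]int64 // topic -> partition -> next offset to deliver
}

// Commit records that everything up to and including offset has been acked.
func (g *groupOffsets) Commit(topic string, partition int, offset int64) {
    g.mu.Lock()
    defer g.mu.Unlock()
    if g.next == nil {
        g.next = make(map[string]map[int]int64)
    }
    if g.next[topic] == nil {
        g.next[topic] = make(map[int]int64)
    }
    if offset+1 > g.next[topic][partition] {
        g.next[topic][partition] = offset + 1
    }
}

// Next returns the offset a consumer in this group should resume from
// (0 when nothing has been committed yet).
func (g *groupOffsets) Next(topic string, partition int) int64 {
    g.mu.Lock()
    defer g.mu.Unlock()
    return g.next[topic][partition]
}
```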

Storage Backends

  • Memory: great for tests and local development
  • Redis: fast ephemeral storage (supports consumer state tracking)
  • Postgres: durable persistence with offset commits and retention
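
Which backend you get is selected in the YAML config. The keys below are illustrative rather than the repo's exact schema, but the shape is roughly:

```yaml
# Illustrative config shape; the real config/config.yaml keys may differ.
server:
  addr: ":50051"

storage:
  backend: memory        # memory | redis | postgres
  redis:
    addr: "localhost:6379"
  postgres:
    dsn: "postgres://pubsub:pubsub@localhost:5432/pubsub?sslmode=disable"
```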

🧰 Running It

1️⃣ Run the broker (in-memory)

```bash
go run ./cmd/server --config=config/config.yaml
```

2️⃣ Start a consumer

```bash
go run ./client/consumer --addr=localhost:50051 --topic=events --group=g1
```

3️⃣ Publish messages

```bash
go run ./client/producer --addr=localhost:50051 --topic=events --msg="hello world"
```

Output:

```
INFO  2025/11/10 22:31:12 📤 published topic=events partition=0 offset=42
INFO  2025/11/10 22:31:12 ➡ received topic=events partition=0 offset=42 value=hello world
INFO  2025/11/10 22:31:12 ✅ acked offset=42
```

🧪 Smoke Testing (Docker Style)

I wrote a full bash smoke test that spins up a broker container, waits for readiness using logs, and runs a real consumer + producer pair locally.

```bash
./scripts/smoke_with_docker.sh -n 100 -i go-pub-sub:local -p 50051
```

It verifies that all messages are received and acknowledged, even across partitioned topics.

When it prints ✅ SUCCESS, you know your broker is behaving like a pro.


🗄️ Production-Ready Direction

Right now, go-pub-sub supports three storage types:
| Backend | Durability | Use Case |
|----------|------------|----------|
| memory | None (in-process) | Tests and local development |
| redis | Ephemeral | Fast shared state and consumer tracking |
| postgres | Durable | Production use (in progress) |

I’m currently building a Postgres-backed store that supports:

  • atomic offset assignment via SQL sequences
  • durable message and offset persistence
  • configurable retention cleanup
  • per-topic partition tables for scalability

This will make the broker durable and crash-safe while keeping it minimal.
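
To give a feel for what that means at the SQL level, here is one possible shape — the table and column names are illustrative, not the final schema: one message table per topic/partition whose offsets come from a sequence, plus a shared committed-offsets table.

```sql
-- Illustrative sketch, not the final schema. One message table per
-- topic/partition (here: topic "events", partition 0), with offsets
-- assigned atomically by a sequence.
CREATE SEQUENCE IF NOT EXISTS events_p0_offset_seq MINVALUE 0 START WITH 0;

CREATE TABLE IF NOT EXISTS events_p0_messages (
    msg_offset BIGINT      PRIMARY KEY DEFAULT nextval('events_p0_offset_seq'),
    value      BYTEA       NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()  -- used by retention cleanup
);

-- Committed offsets per consumer group, topic and partition.
CREATE TABLE IF NOT EXISTS consumer_offsets (
    group_id     TEXT   NOT NULL,
    topic        TEXT   NOT NULL,
    partition_id INT    NOT NULL,
    committed    BIGINT NOT NULL,
    PRIMARY KEY (group_id, topic, partition_id)
);
```

Retention cleanup then becomes a periodic DELETE of rows older than the configured window.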


🔍 Lessons Learned

  1. gRPC streams are powerful but tricky — they demand careful connection management.
  2. Consumer group coordination can get complex even with small datasets.
  3. Interfaces are your friend — swapping backends was easy once interfaces were clean.
  4. Dockerized smoke testing saved hours of debugging inconsistencies.
  5. Go’s concurrency model shines for pub/sub patterns.

🚀 What’s Next

  • 🗄️ Complete Postgres backend (durable persistence)
  • 🔎 Add /health and /metrics endpoints
  • 🧹 Implement retention cleanup goroutine
  • 🌐 Add gRPC-Web / WebSocket gateway for browser-based apps
  • 🧰 Build a simple chat app demo using go-pub-sub as backend

📚 Tech Stack

  • Language: Go 1.25
  • Transport: gRPC
  • Storage: In-memory / Redis / Postgres
  • Orchestration: Docker Compose
  • Testing: grpcurl, smoke scripts, Go integration tests

💬 Wrap-up

go-pub-sub started as a weekend curiosity and ended up teaching me more about distributed coordination than any book could.

It’s small, educational, and — once the Postgres backend is complete — production-capable for internal systems or side projects.

If you’re learning gRPC, Go interfaces, or concurrency patterns, this is one of those projects that will click everything into place.


🧠 Repo: github.com/satya-sudo/go-pub-sub

📦 Docker image (coming soon): docker pull satya-sudo/go-pub-sub


Built with ❤️ in Go — because understanding systems is better than just using them.


💡 Tags

#go #grpc #opensource #pubsub


If you’ve ever wanted to understand Kafka’s internals by building one, check out the repo and try running the smoke test!

Drop your thoughts, feature ideas, or PRs in the comments 👇
