c0d3l0v3r

From a Simple Auth Service to a Distributed Authentication Platform with Kafka, Debezium, and Observability

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

I built a Distributed Authentication System as a long-term learning project to explore real-world backend and distributed systems concepts through a familiar use case: user authentication.

The project started as a simple authentication service with signup and login functionality. Over time, it evolved into a distributed architecture that incorporates event-driven communication, change data capture (CDC), observability, horizontal scaling, and performance testing.

The current system consists of:

  • PostgreSQL as the primary source of truth for user credentials
  • Redis for refresh token storage and session management
  • Kafka as the event streaming platform
  • Debezium for Change Data Capture (CDC) from PostgreSQL
  • MongoDB for materialized user profile documents
  • Nginx for load balancing across multiple service instances
  • Prometheus and Grafana for monitoring and observability
  • k6 for load testing and performance analysis

When a user signs up, the authentication service writes user data to PostgreSQL. Debezium captures the database change and publishes it to Kafka. A separate profile service consumes the event and creates a corresponding profile document in MongoDB. This allows services to communicate asynchronously while keeping the architecture loosely coupled.
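
To make the consuming side of that flow concrete, here is a minimal Python sketch of how a profile service might turn one Debezium change event into a MongoDB-style document. It is not taken from the repository: the function name, the `id` and `email` columns, and the output field names are assumptions for illustration. The envelope fields it reads (`payload`, `op`, `after`, `source.ts_ms`) are standard Debezium fields. Kafka and MongoDB clients are deliberately left out, so the sketch covers only the transformation step.

```python
import json
from typing import Any, Optional


def profile_from_change_event(raw_value: Optional[bytes]) -> Optional[dict[str, Any]]:
    """Turn one Debezium change event into a profile document, or None to skip it."""
    if raw_value is None:
        # Tombstone record that Debezium can emit after a delete; nothing to build.
        return None

    event = json.loads(raw_value)
    # With the JSON converter and schemas enabled, the change data sits under
    # "payload"; without schemas, the envelope is the top-level object.
    payload = event.get("payload", event)

    # "c" = insert, "r" = row read during the initial snapshot.
    if payload.get("op") not in ("c", "r"):
        return None

    row = payload["after"]  # row state after the change
    return {
        # Hypothetical columns: reuse the PostgreSQL key so replays overwrite
        # the same document instead of creating a duplicate.
        "_id": row["id"],
        "email": row["email"],
        # Credentials such as password hashes are deliberately not copied.
        "created_from_ts_ms": payload["source"]["ts_ms"],
    }
```

Reusing the PostgreSQL primary key as `_id` is one way to keep the write idempotent: if Kafka redelivers an event, an upsert on `_id` overwrites the same document rather than adding a second one.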

Beyond implementing features, the main goal of this project was to understand the trade-offs involved in building distributed systems. Throughout development, I ran load-testing experiments, measured replication lag, analyzed bottlenecks, and documented the architectural decisions that shaped the system.
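
One way to measure replication lag in a pipeline like this is to compare the timestamp Debezium attaches to each change event with the time the consumer handles it. The sketch below assumes that approach; it may differ from the method used in the repository, and the function names are hypothetical. It reads `source.ts_ms` from the event payload, which Debezium sets to the time the change was made in the database, and it assumes the database and consumer clocks are reasonably in sync.

```python
import time
from statistics import quantiles
from typing import Any, Optional


def replication_lag_ms(payload: dict[str, Any], now_ms: Optional[int] = None) -> int:
    """Milliseconds between the database change and this consumer handling it."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # source.ts_ms is the time the change was made in the database.
    return now_ms - payload["source"]["ts_ms"]


def summarize(lags_ms: list[int]) -> dict[str, float]:
    """Median, 95th percentile, and maximum of lag samples (needs >= 2 samples)."""
    # n=100 yields 99 cut points; index 49 is the 50th percentile, 94 the 95th.
    cuts = quantiles(lags_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "max": float(max(lags_ms))}
```

Calling `replication_lag_ms` right after the MongoDB write, rather than on receipt of the event, would measure end-to-end lag including the write itself.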

This project has become my personal distributed systems playground where I can experiment with new ideas, evaluate design decisions, and learn how real systems behave under load.

Demo

GitHub Repository

Repository:
https://github.com/c0d3l0v3r-HeHe/distributed-auth-system

Signup Flow

[Diagram: signup flow in the distributed system]

Login Flow

[Diagram: login flow architecture]

The repository contains additional architecture diagrams, performance experiments, load-testing results, observability metrics, and detailed documentation covering the design decisions and trade-offs explored throughout the project.

Rather than reproducing all of those results here, I've kept this submission focused on the project's journey and evolution. If you're interested in the deeper technical details, bottleneck analysis, replication lag measurements, scaling experiments, or observability setup, you'll find them documented in the repository.

The Comeback Story

This project was not originally intended to be a standalone distributed authentication platform.

In May, I started building it as the authentication backend for a larger job portal project. The initial goal was fairly straightforward: implement signup, login, token management, and the supporting infrastructure needed for user authentication.

After building the core authentication service and setting up the initial infrastructure, I shifted my focus to other work, and the project was left unfinished. While the foundation existed, many of the ideas I wanted to explore (distributed systems patterns, observability, scalability, and performance analysis) were still missing.

A few weeks later, I came across the GitHub Finish-Up-A-Thon Challenge and decided to revisit the project instead of letting it remain another abandoned repository.

Rather than simply cleaning up old code, I used the opportunity to significantly expand the project and turn it into a distributed systems playground.

During the revival, I:

  • Added Redis-based refresh token management
  • Implemented a CDC pipeline using PostgreSQL, Debezium, and Kafka
  • Added a dedicated profile service backed by MongoDB
  • Introduced Prometheus metrics and Grafana dashboards
  • Added k6 load-testing infrastructure
  • Scaled the authentication service horizontally behind Nginx
  • Conducted multiple performance experiments and documented the results
  • Created architecture diagrams and expanded the project documentation
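
As an illustration of the first item, here is a minimal sketch of refresh token rotation. It is an assumption about how such a scheme could look, not the repository's implementation: the `TokenStore` class is an in-memory stand-in for Redis, and the key prefix, seven-day lifetime, and function names are hypothetical. With a real Redis client, the two store methods would correspond to `SET` with an expiry and `GETDEL`.

```python
import hashlib
import secrets
import time
from typing import Optional


class TokenStore:
    """In-memory stand-in for Redis: every key expires after a TTL."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def getdel(self, key: str) -> Optional[str]:
        """Return and remove the value in one step; None if missing or expired."""
        entry = self._data.pop(key, None)
        if entry is None or entry[1] <= time.monotonic():
            return None
        return entry[0]


REFRESH_TTL_SECONDS = 7 * 24 * 3600  # assumed lifetime, not taken from the project


def _key(token: str) -> str:
    # Store a hash so a leaked datastore dump does not expose usable tokens.
    return "refresh:" + hashlib.sha256(token.encode()).hexdigest()


def issue_refresh_token(store: TokenStore, user_id: str) -> str:
    """Create a random token and remember which user it belongs to."""
    token = secrets.token_urlsafe(32)
    store.set(_key(token), user_id, REFRESH_TTL_SECONDS)
    return token


def rotate_refresh_token(store: TokenStore, old_token: str) -> Optional[tuple[str, str]]:
    """Exchange a valid token for a new one; each token works at most once."""
    user_id = store.getdel(_key(old_token))
    if user_id is None:
        return None  # unknown, expired, or already used
    return user_id, issue_refresh_token(store, user_id)
```

With Redis, `GETDEL` reads and removes the key atomically, so a replayed token is rejected even if two refresh requests arrive at nearly the same time.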

One of the most valuable outcomes of revisiting the project was discovering bottlenecks that only became visible under load. Scaling the authentication service horizontally improved its CPU utilization, but load testing revealed that MongoDB and the CDC pipeline then became the primary bottlenecks. Investigating these trade-offs taught me far more than building the original authentication service did.

By the end of the challenge, the project had evolved from an unfinished backend component into a fully documented distributed authentication system that I can continue using to explore distributed systems concepts, scalability patterns, and performance engineering.

My Experience with GitHub Copilot

GitHub Copilot acted as an implementation and exploration partner throughout the revival of this project.

For many components, I first designed the architecture and wrote the interfaces, function signatures, and high-level implementation plan. I then used Copilot to generate boilerplate code, suggest implementations, and accelerate repetitive development tasks.

Copilot was particularly useful for:

  • Generating service scaffolding and repetitive CRUD logic
  • Assisting with Docker Compose configuration
  • Helping configure Prometheus metrics collection
  • Suggesting Kafka consumer and producer implementations
  • Writing integration and infrastructure tests
  • Explaining configuration options for Debezium and Kafka Connect
  • Speeding up refactoring and cleanup work

One workflow I found especially effective was defining the architecture and API contracts myself, leaving implementation placeholders, and then using Copilot to generate an initial implementation. This allowed me to focus more on system design decisions and less on repetitive coding.

I also used Copilot while setting up and validating the distributed architecture. It helped me troubleshoot configuration issues, understand service interactions, and quickly iterate on infrastructure changes during development.

The biggest benefit wasn't code generation itself; it was the ability to move from an idea to an experiment much faster. Since this project is intended as a distributed systems learning playground, that rapid feedback loop allowed me to spend more time investigating architectural trade-offs, performance bottlenecks, and scalability challenges.

Feedback Welcome

This project started as a learning exercise and has evolved into my personal distributed systems playground.

If you're an experienced backend or distributed systems engineer and happen to read this submission, I'd genuinely appreciate any feedback on the architecture, design decisions, bottlenecks, or trade-offs discussed throughout the project.

Some areas I'm currently thinking about include:

  • Improving the CDC pipeline under higher load
  • Reducing replication lag and read amplification
  • Better approaches to profile materialization
  • Scaling strategies beyond the current setup
  • Observability improvements and production-readiness considerations

Constructive criticism, alternative approaches, and architecture suggestions are all welcome. One of the main goals of this project is to learn from engineers who have solved these problems in real systems.
