DEV Community

Shreyas Hosadurga

Building a Scalable, Real-Time Tech News System with Python, RabbitMQ, and Kubernetes

Hey,

I wanted to share a project I've been working on: a Distributed Tech News Delivery System. The goal was to build a scalable, fault-tolerant system that collects tech news from various sources, processes it, and delivers it to users in real time based on the topics they subscribe to.

You can check out the full source code on my GitHub:
https://github.com/Shreyas2409/Distributed-Tech-news-delivery-system

The Core Architecture

The system is built using a microservices architecture, which keeps everything decoupled and scalable. The two main services are the Publisher and the Subscriber, and RabbitMQ sits between them as the message broker.

  • Publisher Service: This service is responsible for fetching news articles (using the NewsAPI) and publishing them into a message queue. For fault tolerance, it uses a leader election mechanism to ensure only one instance is actively fetching news at a time.
  • Subscriber Service: This service manages user subscriptions to different topics (e.g., "AI", "kubernetes", "python"). It provides a REST API for clients to subscribe, unsubscribe, and retrieve news.
  • RabbitMQ: This is the backbone of the whole system. It acts as the message broker, decoupling the publishers from the subscribers. I used a fanout exchange, which copies every message to each queue bound to it, so all active subscribers receive every article.
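
To make the fanout behavior concrete, here is a minimal in-memory sketch of what a fanout exchange does. It is a toy model, not RabbitMQ's API and not the code from the repo: the `FanoutExchange` class, the queue names, and the sample article are all made up for illustration.

```python
from collections import deque


class FanoutExchange:
    """Toy in-memory stand-in for a fanout exchange (hypothetical, not the RabbitMQ API)."""

    def __init__(self):
        self._queues = {}  # queue name -> deque of pending messages

    def bind(self, queue_name):
        """Create (or reuse) a queue bound to this exchange and return it."""
        return self._queues.setdefault(queue_name, deque())

    def publish(self, message):
        """Copy the message to every bound queue; there is no routing key to match."""
        for queue in self._queues.values():
            queue.append(dict(message))  # each queue gets its own copy


exchange = FanoutExchange()
subscriber_a = exchange.bind("subscriber-a")  # hypothetical queue names
subscriber_b = exchange.bind("subscriber-b")

# Made-up sample article; both queues should receive it.
exchange.publish({"title": "Example headline", "topic": "kubernetes"})
print(len(subscriber_a), len(subscriber_b))
```

With a real broker and the pika client, the exchange would typically be declared with `channel.exchange_declare(exchange=..., exchange_type="fanout")`, and each subscriber would bind its own queue to it.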

The Tech Stack

Here are the main technologies I used to bring this project to life:

  • Python: The core language for both microservices.
  • Flask: Used to create the simple REST API for the Subscriber service.
  • RabbitMQ: The message broker for handling asynchronous communication.
  • Docker: To containerize both the Publisher and Subscriber services.
  • Kubernetes: To orchestrate the containerized services, making the system scalable and resilient.
  • NewsAPI: The external data source for fetching tech news articles.
  • Locust: For load testing the system to see how it performs under pressure.

Key Features

Building this as a distributed system allowed me to incorporate some cool features:

  • Fault Tolerance: Using a leader election algorithm and heartbeat monitoring, the system can detect a node that has gone down and elect a new leader to take over.
  • Scalability: Both the publisher and subscriber services can be scaled horizontally using Kubernetes to handle increased load.
  • Data Consistency: The system uses a version-based conflict resolution mechanism to manage news articles.
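
As a rough illustration of how heartbeats and leader election can fit together, here is a small sketch. It assumes one simple rule, "the lowest ID among nodes with a recent heartbeat is the leader", which may differ from the algorithm used in the repo. The `HeartbeatRegistry` and `FakeClock` classes, the node names, and the 5-second timeout are all hypothetical.

```python
import time


class HeartbeatRegistry:
    """Tracks the last heartbeat per node and picks a leader among live nodes."""

    def __init__(self, timeout_seconds=5.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_seen = {}  # node id -> time of last heartbeat

    def heartbeat(self, node_id):
        """Record that node_id is alive right now."""
        self._last_seen[node_id] = self._clock()

    def live_nodes(self):
        """Return sorted ids of nodes whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(
            node_id
            for node_id, seen in self._last_seen.items()
            if now - seen <= self.timeout_seconds
        )

    def leader(self):
        """Assumed rule: the lowest live node id leads; None if nobody is alive."""
        live = self.live_nodes()
        return live[0] if live else None


class FakeClock:
    """Controllable clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
registry = HeartbeatRegistry(timeout_seconds=5.0, clock=clock)
registry.heartbeat("publisher-1")
registry.heartbeat("publisher-2")
first_leader = registry.leader()

clock.now = 4.0
registry.heartbeat("publisher-2")  # only publisher-2 keeps reporting in
clock.now = 8.0                    # publisher-1 has now been silent for 8s
second_leader = registry.leader()
print(first_leader, second_leader)
```

In a real deployment the heartbeat state would have to live somewhere every instance can reach; this sketch keeps it in one process only to show the timeout and election logic.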

How It Works (Simplified)

  1. A Publisher instance (the elected leader) fetches new articles from NewsAPI.
  2. It publishes these articles to a RabbitMQ fanout exchange.
  3. RabbitMQ broadcasts the articles to all connected Subscriber services.
  4. Subscribers receive the articles and store them.
  5. A user can then call the Subscriber's API, for example POST /subscribe to follow a topic or GET /news to retrieve the articles they care about.
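
Steps 4 and 5 can be sketched with a small in-memory store. This is an illustration under assumptions, not the repo's implementation: it reads "version-based conflict resolution" as "the copy with the higher version number wins", and the `SubscriberStore` class and the `id`, `version`, `topic`, and `title` fields are hypothetical names.

```python
class SubscriberStore:
    """In-memory sketch of article storage plus per-user topic subscriptions."""

    def __init__(self):
        self._articles = {}       # article id -> article dict
        self._subscriptions = {}  # user id -> set of lowercase topics

    def subscribe(self, user_id, topic):
        self._subscriptions.setdefault(user_id, set()).add(topic.lower())

    def unsubscribe(self, user_id, topic):
        self._subscriptions.get(user_id, set()).discard(topic.lower())

    def store(self, article):
        """Keep the article only if it is new or has a higher version (assumed rule)."""
        current = self._articles.get(article["id"])
        if current is None or article["version"] > current["version"]:
            self._articles[article["id"]] = article
            return True
        return False  # stale or duplicate copy is ignored

    def news_for(self, user_id):
        """Return stored articles whose topic matches the user's subscriptions."""
        topics = self._subscriptions.get(user_id, set())
        return [a for a in self._articles.values() if a["topic"].lower() in topics]


store = SubscriberStore()
store.subscribe("alice", "AI")

# Made-up articles: two versions of the same item, then a stale redelivery.
store.store({"id": "a1", "version": 1, "topic": "ai", "title": "First draft"})
store.store({"id": "a1", "version": 2, "topic": "ai", "title": "Updated"})
stale_accepted = store.store({"id": "a1", "version": 1, "topic": "ai", "title": "First draft"})
store.store({"id": "b1", "version": 1, "topic": "python", "title": "Other"})

alice_news = store.news_for("alice")
print(stale_accepted, [a["title"] for a in alice_news])
```

In the actual service, `subscribe` and `news_for` would sit behind the Flask routes, and `store` would be called from the RabbitMQ consumer callback.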

This was a fantastic project for learning the ins and outs of distributed systems, message queues, and container orchestration.

I'd love for you to check out the repo on GitHub! Let me know what you think.
