DEV Community

Aman Kumar
Aman Kumar

Posted on

Building an Enterprise-Grade Real-Time Analytics Pipeline with FastAPI and TimescaleDB

I just released v1.0.0 of an open-source, production-ready real-time analytics pipeline built with Python. Here's what it does and why you might care.

The Problem

Every SaaS product needs analytics — event tracking, real-time dashboards, time-series aggregations. Most teams either pay for Segment/RudderStack or build their own from scratch. This project is the "build your own" done right.

Architecture

Client → FastAPI → Redis Streams/Kafka → Event Processor → TimescaleDB → WebSocket → Dashboard
Enter fullscreen mode Exit fullscreen mode

Tech Stack

Component Technology
API Layer FastAPI (async, auto-docs, WebSocket native)
Event Queue Redis Streams or Apache Kafka
Storage TimescaleDB (PostgreSQL extension for time-series)
Real-time WebSocket with JWT auth + auto-reconnect
Metrics Prometheus + OpenTelemetry
Logging Structured JSON with correlation IDs
Deployment Docker Compose + Kubernetes

Features

  • Async Event Ingestion — REST API + batch endpoints, buffered via Redis Streams or Kafka
  • Adaptive Sampling — configurable rate-based sampling per event type
  • Data Retention — TTL-based policies with automatic partition management
  • Enterprise Security — JWT auth, RBAC (admin/editor/viewer), rate limiting, security headers, correlation IDs
  • Live Dashboards — WebSocket push with auto-reconnect
  • Observability — Prometheus metrics, OpenTelemetry traces, structured JSON logs
  • Production Ready — Docker multi-stage build, Kubernetes manifests, health checks

Quick Start

git clone https://github.com/aman179102/real-time-analytics-pipeline
cd real-time-analytics-pipeline
make install
docker compose up -d postgres redis
make migrate
make run-dev
Enter fullscreen mode Exit fullscreen mode

Testing

150 out of 152 unit tests pass. The 2 excluded tests are pre-existing async timing issues in process_loop — zero regressions introduced.

Enterprise Middleware Pipeline

Every request flows through:

CorrelationMiddleware → AuthMiddleware → SecurityHeadersMiddleware → RateLimitMiddleware → SizeLimiterMiddleware → Router
Enter fullscreen mode Exit fullscreen mode

Try It Out

GitHub: https://github.com/aman179102/real-time-analytics-pipeline

Star the repo if you find it useful! Contributions and feedback are always welcome.

Top comments (0)