
Rocio Baigorria

From Kafka to the Cloud: Designing a Real-Time Event-Driven Data Pipeline on AWS

Modern data platforms are increasingly built around event-driven architectures. Instead of systems constantly polling databases or relying on synchronous APIs, services react to events as they happen.

In this article I’ll walk through the design of a real-time streaming pipeline capable of processing 15,000+ events per second with sub-50ms latency.

The project started as a distributed system built with open-source technologies and later evolved into a cloud-native architecture on AWS.

The key idea is simple:

Understand the fundamentals first, then move the architecture to managed cloud services.

The Original Architecture (Local Distributed System)

The first version of the project was implemented using the following stack:

  • Apache Kafka for event streaming
  • Kafka Streams for real-time processing
  • Spring Boot for the processing services
  • PostgreSQL for durable storage
  • Redis for low-latency read projections
  • Prometheus and Grafana for monitoring

Event Flow

The pipeline follows a typical streaming architecture.

Producer → Schema Registry → Kafka → Stream Processing → Storage → Analytics

  1. A producer publishes transaction events to Kafka
  2. Each event is serialized using Avro and validated against Schema Registry
  3. Kafka partitions allow parallel consumption
  4. A streaming service processes events using Kafka Streams
  5. Results are stored in PostgreSQL and Redis

This architecture enables real-time anomaly detection by applying sliding-window aggregations to the event stream.
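
The windowing logic itself is simple. Here is a minimal sketch in plain Python (the real pipeline uses Kafka Streams windowed aggregations; the window size, threshold, and mean-based rule below are illustrative assumptions):

```python
from collections import deque
import time

class SlidingWindowDetector:
    """Flags a transaction as anomalous when its amount deviates
    too far from the mean of the current time window."""

    def __init__(self, window_seconds=60, threshold=3.0):
        self.window_seconds = window_seconds
        self.threshold = threshold   # multiples of the window mean
        self.window = deque()        # (timestamp, amount) pairs

    def process(self, amount, now=None):
        now = now if now is not None else time.time()
        # Evict events that have fallen out of the window.
        while self.window and now - self.window[0][0] > self.window_seconds:
            self.window.popleft()
        mean = (sum(a for _, a in self.window) / len(self.window)) if self.window else amount
        self.window.append((now, amount))
        return amount > self.threshold * mean

detector = SlidingWindowDetector(window_seconds=60, threshold=3.0)
print(detector.process(100, now=0))    # False: first event is its own baseline
print(detector.process(105, now=10))   # False: close to the window mean
print(detector.process(900, now=20))   # True: far above the window mean
```

Kafka Streams expresses the same idea declaratively with `windowedBy` over a keyed stream; the eviction loop above is what the library's window retention does for you.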

Performance Benchmarks

The system was designed with performance and reliability in mind.

Metric        Result
------------  ----------------------------
Throughput    15K+ events/sec
P99 Latency   <50ms
Availability  99.95%
Data Loss     0% (exactly-once processing)

Several optimizations helped achieve these results:

  • Producer batching (32KB batch size)
  • Snappy compression
  • Parallel consumers
  • Connection pooling
  • Transactional event processing
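
Most of these optimizations are plain Kafka producer properties. A sketch of the configuration, with values mirroring the tuning above (`linger.ms` and the transactional id are illustrative additions, not values from the project):

```python
# Standard Kafka producer properties; values mirror the tuning above.
producer_config = {
    "batch.size": 32768,                   # 32KB batches before a send is forced
    "linger.ms": 10,                       # wait briefly so batches can fill
    "compression.type": "snappy",          # cheap CPU, decent ratio for Avro
    "enable.idempotence": True,            # no duplicates on broker retries
    "acks": "all",                         # wait for all in-sync replicas
    "transactional.id": "txn-pipeline-1",  # enables exactly-once transactions
}
```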

Distributed Systems Patterns Implemented

This project demonstrates several architectural patterns commonly used in modern data platforms.

Event Sourcing

Kafka acts as the immutable event log. Every state change is stored as an event.
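
Event sourcing in miniature (a toy account domain, not the pipeline's actual schema):

```python
# State is never stored directly; it is derived by replaying
# the immutable, append-only event log.
event_log = [
    {"type": "AccountOpened"},
    {"type": "Deposited", "amount": 150},
    {"type": "Withdrawn", "amount": 40},
]

def replay(events):
    balance = 0
    for e in events:
        if e["type"] == "Deposited":
            balance += e["amount"]
        elif e["type"] == "Withdrawn":
            balance -= e["amount"]
    return balance

print(replay(event_log))  # 110
```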

CQRS

Writes append events to the log, while Redis maintains denormalized read models optimized for queries.
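
A sketch of the split (a plain dict stands in for Redis here; the account naming is illustrative):

```python
# CQRS in miniature: commands append to the event log (write side),
# and a projection keeps a query-optimized read model in sync.
events = []       # write model: append-only event log
read_model = {}   # read model: current balance per account (Redis in the pipeline)

def handle_command(account, amount):
    events.append({"account": account, "amount": amount})      # write side
    read_model[account] = read_model.get(account, 0) + amount  # projection

handle_command("acc-1", 100)
handle_command("acc-1", -30)
print(read_model["acc-1"])  # 70: queries never touch the event log
```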

Outbox Pattern

Events are written to an outbox table in the same database transaction as the state change, guaranteeing that no event is lost between the database and Kafka.
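
The core of the pattern fits in a few lines, sketched here with SQLite standing in for PostgreSQL (table names are illustrative; the actual Kafka send inside the relay is omitted):

```python
import sqlite3

# Outbox pattern in miniature: the business write and the outbox row
# commit in one local transaction, so an event is recorded if and
# only if the state change is.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, published INTEGER)")

def place_order(order_id):
    with db:  # one atomic transaction for both writes
        db.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        db.execute("INSERT INTO outbox (payload, published) VALUES (?, 0)",
                   (f'{{"order_id": "{order_id}"}}',))

def relay():
    # A separate poller reads unpublished rows, sends them to Kafka
    # (omitted), then marks them published.
    rows = db.execute("SELECT id FROM outbox WHERE published = 0").fetchall()
    for (row_id,) in rows:
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)

place_order("o-1")
print(relay())  # 1 event relayed
```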

Saga Pattern

Coordinates distributed workflows without synchronous transactions.
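
A minimal sketch of the idea: each step carries a compensating action, and a failure rolls completed steps back in reverse order instead of holding a distributed transaction open (the step names are illustrative):

```python
# Saga in miniature: run steps in order; on failure, execute the
# compensations of the already-completed steps in reverse.
def run_saga(steps):
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for comp in reversed(done):
                comp()
            return False
    return True

log = []

def fail():
    raise RuntimeError("charge failed")

steps = [
    (lambda: log.append("reserve"), lambda: log.append("release")),
    (fail,                          lambda: log.append("refund")),
]
print(run_saga(steps))  # False: second step failed
print(log)              # ['reserve', 'release']: first step was compensated
```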

Circuit Breaker

Improves resilience by isolating failing components.
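
A minimal sketch (no half-open state or reset timeout, which a production breaker such as Resilience4j would add):

```python
# Circuit breaker in miniature: after N consecutive failures the
# breaker opens and calls fail fast instead of hammering a broken
# downstream service.
class CircuitBreaker:
    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    def call(self, fn):
        if self.failures >= self.max_failures:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
            self.failures = 0  # any success closes the circuit again
            return result
        except Exception:
            self.failures += 1
            raise

breaker = CircuitBreaker(max_failures=2)

def flaky():
    raise TimeoutError("downstream unavailable")

for _ in range(3):
    try:
        breaker.call(flaky)
    except TimeoutError:
        print("downstream error")  # first two attempts reach the service
    except RuntimeError as e:
        print(e)                   # third attempt fails fast
```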

Moving the Architecture to AWS

After implementing the pipeline locally, the next step was mapping the same design to managed cloud services on AWS.

The goal was not to redesign the system, but to replace infrastructure with managed services.

Cloud Architecture

Producer → EventBridge / MSK → Lambda processing → Step Functions orchestration → DynamoDB / RDS → CloudWatch monitoring

Event Ingestion

Events can be published to:

  • Amazon EventBridge for event routing
  • Amazon MSK for managed Kafka streaming
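
With EventBridge, routing is declarative: a rule's event pattern decides which events reach each target. A sketch of a pattern for the transaction events above (the `source`, `detail-type`, and amount filter values are illustrative):

```json
{
  "source": ["pipeline.transactions"],
  "detail-type": ["TransactionCreated"],
  "detail": {
    "amount": [{ "numeric": [">", 1000] }]
  }
}
```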

Processing Layer

Events are processed by AWS Lambda, which allows the pipeline to scale automatically based on event volume.
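
A sketch of such a handler for records arriving through an MSK trigger (the event shape mirrors the documented MSK trigger format, with records grouped by topic-partition and base64-encoded values; the processing body is illustrative):

```python
import base64
import json

def handler(event, context):
    """Process a batch of Kafka records delivered by an MSK trigger."""
    processed = 0
    for records in event.get("records", {}).values():
        for record in records:
            payload = json.loads(base64.b64decode(record["value"]))
            # ... validate / enrich / persist the transaction here ...
            processed += 1
    return {"processed": processed}

# Local invocation with a minimal fake MSK event:
fake = {"records": {"transactions-0": [
    {"value": base64.b64encode(b'{"amount": 42}').decode()}
]}}
print(handler(fake, None))  # {'processed': 1}
```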

Workflow Orchestration

Complex workflows are coordinated using AWS Step Functions, which define the pipeline as a series of steps such as:

  • event validation
  • enrichment
  • anomaly detection
  • persistence
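
These steps map naturally onto an Amazon States Language definition. A sketch (state names and Lambda ARNs are placeholders):

```json
{
  "StartAt": "ValidateEvent",
  "States": {
    "ValidateEvent": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:validate",
      "Next": "EnrichEvent"
    },
    "EnrichEvent": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:enrich",
      "Next": "DetectAnomaly"
    },
    "DetectAnomaly": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:detect",
      "Next": "PersistEvent"
    },
    "PersistEvent": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:persist",
      "End": true
    }
  }
}
```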

Storage

Data is stored according to the access pattern:

  • DynamoDB for high-scale key-value access
  • Amazon RDS for relational workloads

Observability

Monitoring and logs are handled by Amazon CloudWatch, allowing engineers to track:

  • throughput
  • errors
  • latency
  • workflow executions

The Key Insight

The most important lesson from this project is that the architecture itself does not change when moving to the cloud.

The same principles remain:

  • events are immutable
  • services react asynchronously
  • systems scale through partitioned streams
  • state is derived from event logs

Cloud services simply remove the burden of managing infrastructure.

Final Thoughts

Understanding how streaming systems work internally makes it much easier to design reliable cloud-native data platforms.

Instead of thinking only in terms of tools, focus on the system flow:

Event → Stream → Process → Persist → Observe

Once those fundamentals are clear, migrating the system to cloud platforms like AWS becomes a natural evolution.

Design, therefore I exist.
