Table of Contents
- Purpose
- Event Lifecycle
- Event Streams (Topics)
  - events.raw (ingestion layer)
  - events.cleaned (standardized stream)
  - trip.events (business-critical stream)
  - driver.location (high-frequency updates)
  - dispatch.commands
  - billing.events
  - analytics.events (optional fan-out)
- Processing Patterns
- Fault Tolerance & Reliability
- Trade-Offs
- Summary Flow Example
Uber-Style Trip Event Pipeline (Detailed)
Purpose
- Handle millions of trip lifecycle events per second.
- Support real-time matching, billing, surge pricing, ETA calculations, fraud detection, and analytics.
- Guarantee low latency, reliability, and correct billing even with failures.
Event Lifecycle
Imagine a rider books a trip → driver accepts → trip starts → trip ends → payment → receipt.
Each step generates events, and Kafka is the backbone that transports and processes them.
Event Streams (Topics)
1. `events.raw` (ingestion layer)
- What it is:
  - All raw events (user requests, driver pings, trip updates, payment events, logs) land here first.
  - Similar to a staging area.
- Why:
  - Raw events may be incomplete, malformed, duplicated, or carry noise (e.g., multiple GPS pings per second).
  - By first writing to `events.raw`, you don't lose anything.
  - Acts as a data lake inside Kafka: good for replay, debugging, reprocessing.
- Partitioning:
  - Usually partition by `eventType` (trip, location, billing).
  - Keeps ingestion scalable across multiple consumers.
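
As an illustration of the ingestion layer, here is a minimal producer sketch keyed by event type. The broker address, JSON payload shape, and error handling are assumptions for the sketch, not a prescribed format.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RawEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = eventType: with the default partitioner, all events carrying the same
            // type key land in the same partition, so each category is consumed independently.
            String eventType = "trip";
            String payload = "{\"eventType\":\"trip\",\"tripId\":\"t-123\",\"status\":\"requested\"}";
            producer.send(new ProducerRecord<>("events.raw", eventType, payload),
                    (metadata, e) -> {
                        if (e != null) {
                            e.printStackTrace(); // in production: retry or route to a dead-letter path
                        } else {
                            System.out.printf("wrote to %s-%d@%d%n",
                                    metadata.topic(), metadata.partition(), metadata.offset());
                        }
                    });
        }
    }
}
```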
2. `events.cleaned` (standardized stream)
- What it is:
  - Events from `events.raw` get schema-validated, deduplicated, and enriched, then written here.
  - Example:
    - GPS events consolidated to 1 ping per X seconds.
    - Trip request events checked for valid rider/driver IDs.
    - Payment events tagged with transaction IDs.
- Why:
  - Ensures downstream systems get consistent, structured data.
  - Clean separation of dirty-data handling vs. business logic.
- Partitioning:
  - By entity ID (e.g., `tripId`, `driverId`), which preserves ordering for that entity.
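
A minimal Kafka Streams sketch of the raw-to-cleaned hop, assuming plain JSON string payloads and a hypothetical `isValid`/`extractEntityId` pair; a real pipeline would use Avro or Protobuf with a schema registry and stateful deduplication.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class CleaningTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-cleaner");     // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> raw =
                builder.stream("events.raw", Consumed.with(Serdes.String(), Serdes.String()));

        raw.filter((key, json) -> isValid(json))             // drop malformed / incomplete events
           .selectKey((key, json) -> extractEntityId(json))  // re-key by tripId/driverId to preserve per-entity ordering
           .to("events.cleaned", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }

    // Hypothetical helpers: stand-ins for schema validation and entity-ID extraction.
    static boolean isValid(String json) {
        return json != null && json.contains("\"tripId\"");
    }
    static String extractEntityId(String json) {
        int i = json.indexOf("\"tripId\":\"") + 10;
        return json.substring(i, json.indexOf('"', i));
    }
}
```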
3. `trip.events` (business-critical stream)
- What it is:
  - Cleaned trip lifecycle events only (request → accept → start → end → cancel).
  - Every trip's state transitions must be ordered correctly.
- Why:
  - The dispatch engine uses this for matching riders to drivers.
  - The billing service uses this to compute fares.
  - Analytics uses this to understand demand/supply.
- Partitioning:
  - By `tripId`: all trip updates land in the same partition, preserving ordering.
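
To see why keying by `tripId` preserves ordering: with the default partitioner, the same key always hashes to the same partition, and Kafka guarantees order within a partition. The sketch below mimics that key-to-partition mapping using Kafka's own murmur2 utility; the partition count is an assumption.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

public class TripPartitionDemo {
    public static void main(String[] args) {
        int numPartitions = 12; // assumed partition count for trip.events

        // Same computation the default partitioner applies to keyed records:
        // murmur2 hash of the key bytes, made positive, modulo the partition count.
        String tripId = "t-123";
        for (String event : new String[]{"trip.requested", "trip.accepted", "trip.started", "trip.ended"}) {
            int partition = Utils.toPositive(Utils.murmur2(tripId.getBytes(StandardCharsets.UTF_8))) % numPartitions;
            System.out.printf("%s for %s -> partition %d%n", event, tripId, partition);
        }
        // Every lifecycle event for t-123 maps to the same partition,
        // so a consumer sees request -> accept -> start -> end in order.
    }
}
```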
4. `driver.location` (high-frequency updates)
- What it is:
  - Continuous driver GPS updates (every few seconds).
- Why:
  - Needed for:
    - Nearest-driver search.
    - ETA calculation.
    - Surge pricing (supply/demand ratio).
- Partitioning:
  - By `driverId` (to keep updates for one driver ordered).
  - Or by geohash region (group drivers by location cells, which helps load balancing).
- Additional handling:
  - Raw driver pings are cleaned and then aggregated (e.g., one location every 5 seconds instead of every second), as in the sketch below.
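
A sketch of that downsampling step using a Kafka Streams (3.x API) windowed reduce that keeps only the latest ping per driver per 5-second window. The output topic name `driver.location.sampled` and string payloads are assumptions, and with default caching some intermediate window updates may still be emitted.

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class LocationDownsampler {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "location-downsampler"); // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // assumed broker address

        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("driver.location", Consumed.with(Serdes.String(), Serdes.String())) // key = driverId
               .groupByKey()
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(5)))
               .reduce((previousPing, latestPing) -> latestPing)    // keep the most recent ping in each window
               .toStream((windowedKey, ping) -> windowedKey.key())  // drop the window wrapper, key by driverId again
               .to("driver.location.sampled", Produced.with(Serdes.String(), Serdes.String())); // assumed topic

        new KafkaStreams(builder.build(), props).start();
    }
}
```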
5. `dispatch.commands`
- What it is:
  - Commands from the dispatch/matching system to drivers.
  - Example: "Driver 123, go pick up Rider 456."
- Why:
  - Ensures low-latency, reliable delivery of matching decisions.
  - Acts as the control plane of the system.
- Partitioning:
  - By `driverId`: ensures commands to a driver arrive in the correct order.
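
On the driver-gateway side, a minimal consumer sketch that processes commands in order and commits offsets only after a command has been handled. The group id and the `pushToDriverApp` helper are assumptions standing in for the real delivery layer.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DispatchCommandConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "driver-gateway");          // assumed group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit only after handling

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("dispatch.commands"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Key = driverId, so all commands for one driver arrive in order
                    // from the same partition.
                    pushToDriverApp(record.key(), record.value());
                }
                consumer.commitSync(); // at-least-once: re-delivered (not lost) if we crash before the commit
            }
        }
    }

    // Hypothetical stand-in for the push-notification / socket layer.
    static void pushToDriverApp(String driverId, String command) {
        System.out.printf("-> driver %s: %s%n", driverId, command);
    }
}
```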
6. `billing.events`
- What it is:
  - Financially critical events: trip started, trip ended, payment initiated, payment confirmed.
- Why:
  - Exactly-once processing required here (no double charges).
  - Must be atomic: offset commit + DB update + event publishing.
- Partitioning:
  - By `tripId` or `paymentId`.
  - Keeps billing updates for one trip ordered.
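
A sketch of the exactly-once read-process-write loop for billing, using Kafka transactions so the produced billing record and the consumed offsets commit atomically. The DB side is represented by a hypothetical `chargeRider` call; Kafka transactions do not span external databases, so in practice that step must be idempotent or go through a transactional-outbox pattern. Group id, transactional id, and payload format are assumptions.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class BillingProcessor {
    public static void main(String[] args) {
        Properties cp = new Properties();
        cp.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address
        cp.put(ConsumerConfig.GROUP_ID_CONFIG, "billing-service");           // assumed group id
        cp.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cp.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cp.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        cp.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");     // skip records from aborted transactions

        Properties pp = new Properties();
        pp.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pp.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pp.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pp.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        pp.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "billing-processor-1"); // assumed transactional id

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cp);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
            producer.initTransactions();
            consumer.subscribe(Collections.singletonList("trip.events"));

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                if (records.isEmpty()) continue;

                producer.beginTransaction();
                try {
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<String, String> r : records) {
                        if (r.value().contains("\"status\":\"trip.ended\"")) {
                            String billingEvent = chargeRider(r.key(), r.value()); // hypothetical, must be idempotent
                            producer.send(new ProducerRecord<>("billing.events", r.key(), billingEvent));
                        }
                        offsets.put(new TopicPartition(r.topic(), r.partition()),
                                    new OffsetAndMetadata(r.offset() + 1));
                    }
                    // Offsets commit in the same transaction as the billing.events writes.
                    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                    producer.commitTransaction();
                } catch (Exception e) {
                    producer.abortTransaction(); // nothing is exposed downstream; the batch is reprocessed
                }
            }
        }
    }

    // Hypothetical stand-in for the fare-calculation / payment step.
    static String chargeRider(String tripId, String tripEndedEvent) {
        return "{\"tripId\":\"" + tripId + "\",\"status\":\"payment.initiated\"}";
    }
}
```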
7. `analytics.events` (optional fan-out)
- What it is:
  - Non-critical stream of enriched events for BI dashboards and ML pipelines.
  - Example: demand trends, heat maps, revenue reports.
- Why:
  - Offloads heavy analytical processing from core dispatch/billing topics.
  - Can tolerate slight delays.
Processing Patterns
- Validation & Cleaning
  - `events.raw` → `events.cleaned`.
  - Deduplication, schema checks, enrichment.
- Matching & Dispatch
  - `events.cleaned` + `driver.location` → dispatch system.
  - Kafka Streams KTable to store active drivers (see the matching sketch after this list).
  - Rider requests matched to the nearest available driver.
- Trip Lifecycle Management
  - `trip.events` consumed by the trip state machine (see the state-machine sketch after this list).
  - Ensures transitions are valid (no "trip.end" without "trip.start").
- Billing & Payments
  - `trip.events` + `billing.events`.
  - Kafka transactions ensure exactly-once writes to the DB + Kafka (see the transactional example under `billing.events` above).
- Analytics & ML
  - `analytics.events` used for demand prediction, surge pricing, fraud detection.
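
As referenced above, here is a hedged sketch of the matching wiring: driver locations, assumed to be re-keyed by geo cell into a hypothetical compacted topic `drivers.by.cell`, become a GlobalKTable, and each ride request is joined against the drivers in its cell. Real nearest-driver search needs a proper geo index; the stub helpers (`extractGeoCell`, `pickDriver`, `extractDriverId`) only stand in for that logic.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class MatchingTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dispatch-matcher");  // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        StreamsBuilder builder = new StreamsBuilder();

        // Latest known drivers per geo cell (hypothetical compacted topic keyed by geo cell).
        GlobalKTable<String, String> driversByCell =
                builder.globalTable("drivers.by.cell", Consumed.with(Serdes.String(), Serdes.String()));

        // Ride requests from the cleaned trip stream, keyed by tripId.
        KStream<String, String> requests =
                builder.stream("trip.events", Consumed.with(Serdes.String(), Serdes.String()))
                       .filter((tripId, event) -> event.contains("\"status\":\"trip.requested\""));

        requests
            .join(driversByCell,
                  (tripId, event) -> extractGeoCell(event),                     // look up the request's geo cell
                  (event, driversInCell) -> pickDriver(event, driversInCell))   // hypothetical driver selection
            .selectKey((tripId, command) -> extractDriverId(command))           // dispatch.commands is keyed by driverId
            .to("dispatch.commands", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }

    // Hypothetical helpers standing in for geo indexing and driver-selection logic.
    static String extractGeoCell(String event) { return "cell-9q8y"; }
    static String pickDriver(String event, String driversInCell) { return "{\"driverId\":\"d-42\",\"tripId\":\"t-123\"}"; }
    static String extractDriverId(String command) { return "d-42"; }
}
```

And a minimal sketch of the trip state machine that rejects invalid transitions; the transition table is illustrative, since the real lifecycle has more states (no-shows, reassignment, and so on).

```java
import java.util.Map;
import java.util.Set;

public class TripStateMachine {
    // Allowed transitions: e.g., "trip.end" is only reachable from "trip.start".
    private static final Map<String, Set<String>> VALID_NEXT = Map.of(
            "trip.request", Set.of("trip.accept", "trip.cancel"),
            "trip.accept",  Set.of("trip.start", "trip.cancel"),
            "trip.start",   Set.of("trip.end"),
            "trip.end",     Set.of(),
            "trip.cancel",  Set.of()
    );

    public static boolean isValidTransition(String currentState, String nextEvent) {
        return VALID_NEXT.getOrDefault(currentState, Set.of()).contains(nextEvent);
    }

    public static void main(String[] args) {
        System.out.println(isValidTransition("trip.start", "trip.end"));   // true
        System.out.println(isValidTransition("trip.request", "trip.end")); // false: no "trip.end" without "trip.start"
    }
}
```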
Fault Tolerance & Reliability
- Replication factor = 3 for critical topics (`trip.events`, `billing.events`).
- Idempotent producers to avoid duplicates.
- Kafka transactions for atomic billing writes.
- Multi-region replication via MirrorMaker 2 for cross-region failover.
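
A sketch of how the critical topics might be created with replication factor 3 and a matching `min.insync.replicas`. Partition counts and config values are assumptions; combined with producer `acks=all`, this setup tolerates one broker failure without losing acknowledged writes.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CriticalTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 replicas + min.insync.replicas=2: acks=all writes survive one broker failure.
            NewTopic tripEvents = new NewTopic("trip.events", 48, (short) 3)       // assumed partition count
                    .configs(Map.of("min.insync.replicas", "2"));
            NewTopic billingEvents = new NewTopic("billing.events", 24, (short) 3) // assumed partition count
                    .configs(Map.of("min.insync.replicas", "2"));

            admin.createTopics(List.of(tripEvents, billingEvents)).all().get(); // block until creation completes
        }
    }
}
```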
Trade-Offs
- Low latency vs. cost: more partitions and brokers give speed but increase infrastructure cost.
- Exactly-once for billing only; the rest can tolerate at-least-once.
- Driver location noise must be downsampled to avoid flooding Kafka.
Summary Flow Example
- Rider requests → `events.raw` → cleaned → `trip.events`.
- Driver sends GPS → `events.raw` → cleaned → `driver.location`.
- Matching system consumes both → produces `dispatch.commands`.
- When a trip ends → `billing.events` → payment service.
- All events mirrored into `analytics.events` for BI/ML.
Please comment if you want the Netflix version next:
Design Netflix's recommendation system with Kafka?
More details: get all articles related to system design.
Hashtags: SystemDesignWithZeeshanAli, systemdesignwithzeeshanali
GitHub: https://github.com/ZeeshanAli-0704/SystemDesignWithZeeshanAli