DEV Community

Cover image for Designing the Spotify Top K
ZeeshanAli-0704
ZeeshanAli-0704

Posted on

Designing the Spotify Top K

Table of Contents

  1. System Overview
  2. Key Components
  3. High-Level Architecture
  4. Detailed Flow
  5. Flow Diagram
  6. Optimizations
  7. Scaling for Global Use
  8. Example API Request

System Overview

We need a scalable system to identify the Top K Heavy Hitters (most-viewed or most-played posts/songs) from a continuous stream of data. This system must:

  • Efficiently aggregate and rank data by views.
  • Support regional and global Top K queries.
  • Provide high availability, low latency, and scalability.

Key Components

  1. Database:
    • Cassandra/MongoDB: Optimized for writes, distributed, and highly available.
  2. Stream Processing:
    • Apache Kafka: Handles the real-time stream of view data.
    • Stream Processor: Aggregates views and calculates the top K.
  3. Cache:
    • Redis: Caches the most recent Top K results for quick access.
  4. API Layer:
    • Exposes endpoints for querying Top K posts (regionally and globally).
  5. Global Aggregator:
    • Aggregates regional data for global Top K computation.

High-Level Architecture


Detailed Flow

1. Data Ingestion

  • Scenario: Each view event streams into the system.
  • Components Involved:
    • Kafka:
    • Handles the continuous stream of data.
    • Each event contains: post_id, region_id, and timestamp.
    • Producers:
    • The frontend (Spotify clients) sends view data via Kafka Producers.
    • Partitioning:
    • Kafka topics are partitioned by region to enable parallel processing.

2. Stream Processing

  • Goal: Aggregate view counts for each post.
  • Components Involved:
    • Stream Processor (Apache Flink/Samza/Spark Streaming):
    • Aggregates views using windowing (time buckets).
    • Example: Count views per post_id for 1-minute windows.
    • Updates Cassandra in real-time.
    • Intermediate Results:
    • Stores partial aggregates (per post, per region) in Cassandra.

3. Database Storage

  • Cassandra/MongoDB Schema:
    • Posts Table:
    • post_id (Primary Key)
    • content, author, timestamp, view_count.
    • Views Table:
    • post_id (Partition Key)
    • region_id, timestamp, view_count.
    • Regions Table:
    • region_id, region_name.

4. Caching

  • Purpose: Reduce database load for repeated queries.
  • Implementation:
    • Cache the top K posts for each region (1-minute TTL).
    • Use Redis Sorted Sets for efficient ranking:
    • Key: region:<region_id>:top_k
    • Sorted by view_count.

5. Regional Aggregation

  • Goal: Calculate regional Top K posts.
  • Steps:
    • Query Cassandra for the latest view_count of all posts in the region.
    • Sort posts in descending order by view_count.
    • Store Top K results in Redis.

6. Global Aggregation

  • Goal: Calculate global Top K posts.
  • Steps:
    • Periodically fetch the regional Top K posts.
    • Aggregate them using a global aggregator.
    • Calculate global Top K using min-heaps or Redis Sorted Sets.
    • Store results in a global cache.

7. API Layer

  • Endpoint: GET /top-posts
    • Query Parameters:
    • region: Filter by region.
    • timeframe: Filter by timeframe.
    • k: Number of posts to retrieve.
    • Flow:
    • Query Redis for cached results.
    • If a cache miss occurs, query Cassandra for real-time computation.

Flow Diagram

1. User Interaction
   └──> API Service (`GET /top-posts?region=<region>&k=<value>`)
        ├──> Cache (Redis)
        │     ├──> Hit: Return Top K Results
        │     └──> Miss: Query Cassandra
        └──> Cassandra (Region/Global Tables)
              ├──> Stream Processor Updates
              └──> Kafka Streams (Real-time Views)
Enter fullscreen mode Exit fullscreen mode

Optimizations

1. Rate Limiting

  • Use API Gateway (e.g., Kong, AWS API Gateway) to throttle requests.
  • Apply per-user rate limits.

2. Caching

  • Use Redis for:
    • Regional Top K: region:<region_id>:top_k
    • Global Top K: global:top_k.

3. Partitioning

  • Partition Cassandra tables by:
    • Region: Distribute data geographically.
    • Timeframe: Separate tables for hourly/daily/weekly data.

4. Fault Tolerance

  • Kafka: Retains unprocessed events for retries.
  • Cassandra: Replicates data across nodes.

Scaling for Global Use

  1. Distributed Databases:
    • Use region-specific Cassandra clusters.
  2. Message Streaming:
    • Use Kafka MirrorMaker to replicate streams across regions.
  3. Global Aggregator:
    • Use Apache Flink for cross-region data aggregation.

Example API Request

Request:

GET /top-posts?region=NA&timeframe=24h&k=10
Enter fullscreen mode Exit fullscreen mode

Response:

{
  "region": "NA",
  "timeframe": "24h",
  "top_posts": [
    {
      "post_id": "123",
      "author": "Artist1",
      "view_count": 120000,
      "content": "Post Content 1"
    },
    {
      "post_id": "456",
      "author": "Artist2",
      "view_count": 110000,
      "content": "Post Content 2"
    }
  ]
}
Enter fullscreen mode Exit fullscreen mode

Notes:
Windowing (time buckets) is separate design pattern. If you can explain them in details that will be gr8 in interview.

rate-limiting-global-tumbling-window-and-sliding-window

More Details:

Get all articles related to system design
Hastag: SystemDesignWithZeeshanAli

systemdesignwithzeeshanali

Git: https://github.com/ZeeshanAli-0704/SystemDesignWithZeeshanAli

Image of Timescale

Timescale – the developer's data platform for modern apps, built on PostgreSQL

Timescale Cloud is PostgreSQL optimized for speed, scale, and performance. Over 3 million IoT, AI, crypto, and dev tool apps are powered by Timescale. Try it free today! No credit card required.

Try free

Top comments (0)

A Workflow Copilot. Tailored to You.

Pieces.app image

Our desktop app, with its intelligent copilot, streamlines coding by generating snippets, extracting code from screenshots, and accelerating problem-solving.

Read the docs

👋 Kindness is contagious

Immerse yourself in a wealth of knowledge with this piece, supported by the inclusive DEV Community—every developer, no matter where they are in their journey, is invited to contribute to our collective wisdom.

A simple “thank you” goes a long way—express your gratitude below in the comments!

Gathering insights enriches our journey on DEV and fortifies our community ties. Did you find this article valuable? Taking a moment to thank the author can have a significant impact.

Okay