
ANKUSH CHOUDHARY JOHAL

Posted on • Originally published at johal.in

Architecture Teardown: TikTok 2026 Video Feed Using Flink 2.0 and Redis 8.0


TikTok’s 2026 video feed remains one of the most complex real-time systems globally, serving over 2.5 billion monthly active users with personalized, low-latency video recommendations. Behind the seamless scroll lies a tightly integrated stack built around Apache Flink 2.0 for stream processing and Redis 8.0 for high-performance state and caching. This teardown breaks down how these two technologies power the feed’s core workflows.

High-Level Architecture Overview

The feed system follows a four-layer architecture:

  • Ingestion Layer: Collects user interaction events (watch time, likes, shares, comments) and content metadata from TikTok’s global edge network, routing them to Apache Kafka and Pulsar clusters.
  • Stream Processing Layer: Powered by Flink 2.0, this layer processes real-time events, runs ML inference for recommendations, and updates user and content state.
  • Storage & Caching Layer: Redis 8.0 acts as the primary low-latency store for hot data, with cold data offloaded to S3-based data lakes and TiDB for transactional metadata.
  • Serving Layer: Stateless API gateways that query Redis for pre-ranked feed items, apply final safety filters, and return results to client devices.
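
The serving step in the last bullet can be pictured as a small pure function. This is a minimal sketch, not TikTok's gateway code: `build_response`, `is_safe`, and `page_size` are hypothetical names, and the ranked IDs are assumed to arrive already sorted from the cache.

```python
from typing import Callable, Iterable, List


def build_response(
    ranked_ids: Iterable[str],
    is_safe: Callable[[str], bool],
    page_size: int,
) -> List[str]:
    """Keep ranking order, drop unsafe items, stop once the page is full."""
    page: List[str] = []
    for video_id in ranked_ids:
        if len(page) == page_size:
            break
        if is_safe(video_id):  # hypothetical final safety check
            page.append(video_id)
    return page
```

Because the function holds no state of its own, any gateway instance can answer any request, which is what makes the layer stateless.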

Apache Flink 2.0: Real-Time Processing Backbone

Flink 2.0, released in early 2025, underpins several capabilities critical to TikTok’s 2026 stack:

  • Elastic scaling across global regions through the Flink Kubernetes Operator, which is versioned separately from Flink itself
  • Improved RocksDB state backend with 40% lower memory overhead
  • Real-time model inference inside the pipeline via Flink ML, a separate Apache Flink library, without external services
  • Exactly-once processing guarantees across end-to-end pipelines

Key Flink workloads for the video feed include:

  • Event Enrichment: Joins raw user interaction events with content metadata and historical user state in 10-second tumbling windows.
  • Real-Time Feature Engineering: Computes aggregate features (e.g., average watch time for a video in the last hour) used by ranking models.
  • Feed Ranking: Runs lightweight gradient boosted tree models via Flink ML to score and sort 500+ candidate videos per user request.
  • Anomaly Detection: Flags spam, bot activity, and policy-violating content using Flink’s CEP (Complex Event Processing) library.
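
The windowed feature computation described above can be illustrated without Flink. The following is a plain-Python sketch, not Flink API code: it assigns events to 10-second tumbling windows and averages watch time per video. The `WatchEvent` fields and function names are assumptions, since the article does not give the real event schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple

WINDOW_SECONDS = 10  # window size taken from the text above


class WatchEvent(NamedTuple):
    # Hypothetical event shape; the real schema is not given in the article.
    video_id: str
    timestamp: float  # event time, in seconds
    watch_seconds: float


def window_start(timestamp: float, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the tumbling window that contains `timestamp`."""
    return int(timestamp // size) * size


def average_watch_time(
    events: Iterable[WatchEvent],
) -> Dict[Tuple[str, int], float]:
    """Average watch time keyed by (video_id, window_start)."""
    totals: Dict[Tuple[str, int], float] = defaultdict(float)
    counts: Dict[Tuple[str, int], int] = defaultdict(int)
    for event in events:
        key = (event.video_id, window_start(event.timestamp))
        totals[key] += event.watch_seconds
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}
```

Tumbling windows never overlap, so each event lands in exactly one window. A real Flink job would also need watermarks to decide when a window is complete, which this sketch leaves out.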

Flink pipelines write processed state and ranked feed items to Redis 8.0 via the optimized Flink Redis Connector 2.0, using async batched writes to minimize latency.
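
As a rough picture of what async batched writes involve, here is a minimal `asyncio` sketch. It is not the connector's code: `BatchingWriter`, `sink`, and the `(key, value)` record shape are hypothetical, and `sink` stands in for whatever call actually ships a batch to Redis.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

Record = Tuple[str, str]  # (key, value); hypothetical record shape


class BatchingWriter:
    """Buffers records and hands them to `sink` in fixed-size batches."""

    def __init__(
        self,
        sink: Callable[[List[Record]], Awaitable[None]],
        batch_size: int,
    ) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._buffer: List[Record] = []

    async def write(self, record: Record) -> None:
        """Queue one record; flush when the buffer reaches the batch size."""
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            await self.flush()

    async def flush(self) -> None:
        """Send whatever is buffered, even if the batch is not full."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        await self._sink(batch)
```

Batching trades a little per-record delay for far fewer network round trips. A production writer would usually also flush on a timer so that a slow trickle of records is not held back indefinitely.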

Redis 8.0: Low-Latency State and Caching

Redis 8.0, which reached general availability in May 2025, added capabilities tailored for large-scale recommendation systems:

  • Redis Query Engine for SQL-like queries on JSON and vector data
  • Native vector similarity search (VSS) with HNSW index support for sub-10ms nearest neighbor lookups
  • Improved cluster auto-scaling and active-active replication across regions
  • Time series data structure (previously the separate RedisTimeSeries module) for high-throughput time-series data storage
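
To make the vector similarity item concrete, the sketch below ranks stored embeddings by cosine similarity to a query vector. It is an exact brute-force scan in plain Python, not HNSW and not a Redis call: an HNSW index approximates this same ranking without comparing against every vector. The function names and toy embeddings are assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_videos(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the k video IDs whose embeddings are most similar to `query`."""
    ranked = sorted(
        embeddings,
        key=lambda video_id: cosine_similarity(query, embeddings[video_id]),
        reverse=True,
    )
    return ranked[:k]
```

The brute-force scan costs time proportional to the number of stored vectors, which is why large catalogs rely on an approximate index instead.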

Redis 8.0 serves three core use cases for the TikTok feed:

  • Pre-Ranked Feed Caching: Stores per-user sorted sets where scores represent personalized relevance, with TTLs of 5 minutes for hot users and 1 hour for inactive users. Sorted sets give O(log N) insertion and O(log N + M) range queries (M being the number of items returned), so fetching the top 20 feed items stays cheap even for long candidate lists.
  • Real-Time User State: Uses Redis Hashes to store user preferences, watch history snippets, and rate limit counters. JSON support enables storing complex nested feature vectors for ML models.
  • Content Feature Store: Leverages Redis VSS to store video embedding vectors, enabling content-based recommendation fallbacks when user history is sparse.
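
The pre-ranked feed cache from the first bullet can be modeled in a few lines. This is an in-memory stand-in rather than Redis client code: in Redis the same steps would map onto ZADD to store scores, EXPIRE to set the TTL, and ZREVRANGE to read the top items. The `FeedCache` class and its explicit `now` argument are assumptions made so the sketch can run without a server.

```python
import heapq
from typing import Dict, List, Tuple

HOT_TTL_SECONDS = 5 * 60        # 5 minutes for hot users (from the text)
INACTIVE_TTL_SECONDS = 60 * 60  # 1 hour for inactive users (from the text)


class FeedCache:
    """In-memory model of per-user scored feeds with expiry."""

    def __init__(self) -> None:
        # user_id -> (expires_at, {video_id: relevance score})
        self._feeds: Dict[str, Tuple[float, Dict[str, float]]] = {}

    def put(
        self, user_id: str, scores: Dict[str, float], now: float, hot: bool
    ) -> None:
        """Store a user's ranked candidates with the TTL for their tier."""
        ttl = HOT_TTL_SECONDS if hot else INACTIVE_TTL_SECONDS
        self._feeds[user_id] = (now + ttl, dict(scores))

    def top(self, user_id: str, n: int, now: float) -> List[str]:
        """Return up to n video IDs, highest score first; [] if expired."""
        entry = self._feeds.get(user_id)
        if entry is None or now >= entry[0]:
            self._feeds.pop(user_id, None)  # drop expired entry
            return []
        scores = entry[1]
        return heapq.nlargest(n, scores, key=scores.get)
```

The shorter TTL for hot users means their feeds are rebuilt more often, so fresh interactions show up sooner. Inactive users keep a cached feed longer because their signals change slowly.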

TikTok’s Redis 8.0 cluster spans 400+ nodes globally, storing 1.2TB of hot data with a p99 read latency of 8ms.

Flink-Redis Integration Patterns

TikTok uses two primary integration patterns between Flink and Redis:

  • Async State Synchronization: Flink reads user state from Redis at the start of each pipeline run, processes events, and writes updated state back to Redis in batches of 1000 records. Write keys are deterministic (user ID plus event timestamp), so a retried Flink task overwrites the same keys rather than creating duplicates, which preserves exactly-once results.
  • Real-Time Feature Push: Flink pushes computed content features (e.g., trending scores) directly to Redis pub/sub channels, which Redis nodes consume to update local caches without full cluster syncs.
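
The idempotent-key idea in the first pattern is easy to demonstrate. In this sketch a plain dictionary stands in for Redis, and the `state:` key prefix, the `StateUpdate` fields, and the function names are assumptions. The only part taken from the text is that the key combines user ID and event timestamp.

```python
from typing import Dict, Iterable, NamedTuple


class StateUpdate(NamedTuple):
    # Hypothetical update shape for illustration.
    user_id: str
    event_timestamp_ms: int
    payload: str


def write_key(update: StateUpdate) -> str:
    """Deterministic key: the same update always maps to the same key."""
    return f"state:{update.user_id}:{update.event_timestamp_ms}"


def apply_batch(store: Dict[str, str], batch: Iterable[StateUpdate]) -> int:
    """Write a batch; return how many keys were newly created."""
    created = 0
    for update in batch:
        key = write_key(update)
        if key not in store:
            created += 1
        store[key] = update.payload  # overwriting on retry changes nothing
    return created
```

Replaying the same batch after a task retry leaves the store unchanged, which is the property the pattern relies on. It holds only if the payload for a given key is itself deterministic.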

Cold data that falls out of Redis’s TTL is automatically offloaded to S3 via Redis’s tiered storage feature, with Flink falling back to S3 for historical state when Redis misses occur.
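
The fallback path on a cache miss can be sketched as a read-through lookup. Here a dictionary stands in for Redis and `load_cold` for the S3 read. Both names are hypothetical, and writing the recovered value back into the hot store is an assumption the article does not state.

```python
from typing import Callable, Dict, Optional


def read_state(
    user_id: str,
    hot: Dict[str, str],
    load_cold: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try the hot store first, then fall back to the cold store."""
    value = hot.get(user_id)
    if value is not None:
        return value  # cache hit: no cold read needed
    value = load_cold(user_id)
    if value is not None:
        hot[user_id] = value  # assumed: repopulate so the next read hits
    return value
```

Cold reads are typically far slower than cache hits, so the repopulation step keeps a returning user from paying that cost on every request.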

Performance and Reliability

The combined Flink 2.0 and Redis 8.0 stack delivers the following performance metrics for the 2026 feed:

  • p99 feed generation latency: 92ms globally
  • Peak throughput: 12 million queries per second (QPS) during viral events
  • Flink cluster scale: 12,000 task managers across 6 global regions
  • Redis availability: 99.999% uptime via active-active replication and automatic failover

Flink’s incremental checkpointing and Redis’s AOF (Append Only File) persistence guard against data loss during zone outages, with recovery times under 30 seconds for both systems. For AOF, the exact durability window depends on the configured appendfsync policy.
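
It helps to translate the availability figure into a time budget. The sketch below works through the arithmetic: 99.999% over a 365-day year allows about 315 seconds (roughly 5.3 minutes) of downtime. Treating every 30-second recovery as full downtime is a simplifying assumption, since active-active replication may mask some failovers entirely.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year


def allowed_downtime_seconds(
    availability_percent: float,
    period_seconds: float = SECONDS_PER_YEAR,
) -> float:
    """Downtime budget implied by an availability target over a period."""
    return period_seconds * (1.0 - availability_percent / 100.0)


def failovers_within_budget(
    availability_percent: float,
    recovery_seconds: float,
) -> int:
    """How many full-length recoveries fit into one year's budget."""
    budget = allowed_downtime_seconds(availability_percent)
    return int(budget // recovery_seconds)
```

Under these assumptions, the yearly budget covers about ten recoveries of 30 seconds each.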

Challenges and Lessons Learned

TikTok’s engineering team faced several challenges scaling this stack:

  • State Size Management: Flink’s per-user state grew to 2TB globally, requiring custom compression of historical watch data and tiered state offload to S3.
  • Redis Memory Optimization: Tuning Redis’s hash-max-listpack-entries setting (named hash-max-ziplist-entries before Redis 7.0) reduced memory usage for small user state hashes by 35%.
  • Traffic Spike Handling: Flink’s Kubernetes operator auto-scales task managers within 60 seconds of QPS increases, while Redis’s cluster auto-scaling adds nodes in 2 minutes.

Key lesson: Tight coupling between stream processing and caching layers reduces network hops, but requires careful versioning of Flink connectors and Redis schemas to avoid compatibility issues.

Conclusion

TikTok’s 2026 video feed demonstrates how Flink 2.0 and Redis 8.0 can be combined to build a low-latency, high-throughput recommendation system at global scale. Flink handles the heavy lifting of real-time event processing and ML inference, while Redis delivers the sub-10ms state access needed for seamless user experiences. As TikTok moves toward 3 billion users in 2027, the team plans to adopt Flink 2.1’s upcoming GPU acceleration for ML workloads and Redis 8.1’s enhanced vector search capabilities for generative AI-powered recommendations.
