Instagram Stories and WhatsApp Status look simple: upload media, show it to followers, and delete it after 24 hours. Under the hood, however, they are classic examples of ephemeral, large-scale distributed systems.
In this article, we’ll walk through a high-level and low-level system design of a Stories/Status system, focusing on architecture, key components, and lifecycle management.
1. Core Requirements
1.1 Functional requirements:
- Users can post photo or video stories
- Stories expire automatically after 24 hours
- Visibility is restricted (followers, close friends, contacts)
- Viewers can see and react to stories
1.2 Non-functional requirements:
- Low latency for feed loading
- High availability
- Horizontal scalability
- Eventual consistency is acceptable
2. High-Level Architecture (HLD)
At a high level, the system is split into independent services, each responsible for a single concern:
- API Gateway – authentication, authorization, routing, rate limiting
- Story Service – story creation and lifecycle management
- Content Service – media, text, and link handling
- Feed Service – story feed generation
- Visibility Service – privacy and audience enforcement
- Expiration Service – 24-hour TTL handling
- Kafka & Background Workers – asynchronous processing
- Analytics & Notification Services – engagement insights and alerts
3. Story Creation Flow (Write Path)
When a user posts a story:
- The client sends a request through the API Gateway
- The Content Service returns a pre-signed upload URL
- The client uploads media directly to object storage (e.g., S3)
- The Story Service stores metadata with a 24-hour expiration timestamp
Key design decision:
Media bytes never flow through the backend services: the client uploads directly to object storage, and viewers download from the CDN. The backend stores only lightweight metadata.
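The write path above can be sketched roughly as follows. This is a minimal in-memory illustration, not a production implementation: `ContentService`, `StoryService`, `StoryMetadata`, and the `uploads.example.com` URL shape are all hypothetical names, and a real Content Service would ask the object store to sign the upload URL rather than generate a random token.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

STORY_TTL = timedelta(hours=24)


@dataclass
class StoryMetadata:
    story_id: str
    author_id: str
    media_key: str          # location of the media in object storage
    created_at: datetime
    expires_at: datetime    # created_at + 24 hours


class ContentService:
    """Hypothetical stand-in: hands out an upload target, never touches media bytes."""

    def create_upload_url(self, author_id: str) -> Tuple[str, str]:
        media_key = f"stories/{author_id}/{secrets.token_hex(8)}"
        # Assumption: the token stands in for a real storage-signed signature.
        signature = secrets.token_hex(16)
        upload_url = f"https://uploads.example.com/{media_key}?signature={signature}"
        return media_key, upload_url


class StoryService:
    """Hypothetical stand-in: stores metadata only, with an expiration timestamp."""

    def __init__(self) -> None:
        self._stories: Dict[str, StoryMetadata] = {}

    def create_story(
        self, author_id: str, media_key: str, now: Optional[datetime] = None
    ) -> StoryMetadata:
        now = now or datetime.now(timezone.utc)
        story = StoryMetadata(
            story_id=secrets.token_hex(8),
            author_id=author_id,
            media_key=media_key,
            created_at=now,
            expires_at=now + STORY_TTL,
        )
        self._stories[story.story_id] = story
        return story

    def get(self, story_id: str) -> Optional[StoryMetadata]:
        return self._stories.get(story_id)
```

In this sketch the client would call `create_upload_url`, upload the file to the returned URL, and only then call `create_story` with the `media_key`.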
4. Story Consumption Flow (Read Path)
When a user opens the stories tray:
- Feed Service fetches active stories from followed users
- Visibility Service filters stories based on privacy rules
- Expired stories are ignored
- Media URLs are returned to the client
- The client streams media directly from the CDN
This design is optimized for read-heavy traffic, which dominates story usage.
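As a rough sketch of the read path, the function below assembles a stories tray from already-fetched metadata. The names `Story`, `build_story_tray`, and the `can_view` callback are assumptions made for illustration; in the design above, the callback would be a call to the Visibility Service and the story lookup would hit the Feed Service's store.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Story:
    story_id: str
    author_id: str
    media_url: str        # CDN URL the client streams from
    expires_at: datetime


def build_story_tray(
    viewer_id: str,
    followed_ids: List[str],
    stories_by_author: Dict[str, List[Story]],
    can_view: Callable[[str, Story], bool],
    now: datetime,
) -> List[str]:
    """Return CDN media URLs for stories the viewer may see right now."""
    tray: List[str] = []
    for author_id in followed_ids:
        for story in stories_by_author.get(author_id, []):
            if story.expires_at <= now:
                continue  # expired stories are ignored, even if not yet cleaned up
            if not can_view(viewer_id, story):
                continue  # privacy rules filter the rest
            tray.append(story.media_url)
    return tray
```

Only URLs are returned; the media itself is never read by this code, which keeps the read path cheap.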
5. Visibility and Privacy Rules
Stories support multiple visibility modes:
- Followers only
- Close friends
- Contact-based visibility (WhatsApp Status)
- Blocked users (always excluded, regardless of the mode above)
A dedicated Visibility Service enforces these rules using:
- Followers / contacts graph
- Redis caching for fast permission checks
By isolating visibility logic, privacy rules remain consistent and easy to evolve.
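One way to picture the permission check is the sketch below. `Audience`, `SocialGraph`, and `can_view` are hypothetical names, the graph is a set of in-memory dictionaries rather than a real graph store, and the Redis cache is left out to keep the logic visible.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class Audience(Enum):
    FOLLOWERS = "followers"
    CLOSE_FRIENDS = "close_friends"
    CONTACTS = "contacts"


@dataclass
class SocialGraph:
    # Each mapping goes from an author id to the set of user ids in that relation.
    followers: Dict[str, Set[str]] = field(default_factory=dict)
    close_friends: Dict[str, Set[str]] = field(default_factory=dict)
    contacts: Dict[str, Set[str]] = field(default_factory=dict)
    blocked: Dict[str, Set[str]] = field(default_factory=dict)  # users the author blocked


def can_view(graph: SocialGraph, viewer_id: str, author_id: str, audience: Audience) -> bool:
    """Decide whether viewer_id may see a story posted by author_id."""
    if viewer_id == author_id:
        return True  # authors can always see their own stories
    if viewer_id in graph.blocked.get(author_id, set()):
        return False  # a block overrides every audience setting
    allowed_by_audience = {
        Audience.FOLLOWERS: graph.followers,
        Audience.CLOSE_FRIENDS: graph.close_friends,
        Audience.CONTACTS: graph.contacts,
    }
    return viewer_id in allowed_by_audience[audience].get(author_id, set())
```

Checking the block list before the audience list means a blocked follower is rejected even when the story is visible to all followers.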
6. Expiration and Lifecycle Management
Ephemeral content is treated as a first-class concern:
- Each story has a strict 24-hour TTL
- Expiration Service monitors story timestamps
- Expired stories trigger lifecycle events
- Expired content is no longer served
Because the read path checks each story's expiration timestamp, an expired story is never served, even under high traffic or when cleanup lags behind.
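A minimal sketch of how an Expiration Service might monitor timestamps is shown below. It assumes a periodic sweep over a min-heap ordered by expiry time, which is only one possible approach; `ExpirationService`, `track`, `sweep`, and the event field names are illustrative, and `publish` stands in for whatever sends lifecycle events downstream.

```python
import heapq
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List, Tuple


class ExpirationService:
    """Tracks story expiry times and emits one story_expired event per story."""

    def __init__(self, publish: Callable[[Dict[str, str]], None]) -> None:
        self._heap: List[Tuple[datetime, str]] = []  # (expires_at, story_id)
        self._publish = publish

    def track(self, story_id: str, expires_at: datetime) -> None:
        heapq.heappush(self._heap, (expires_at, story_id))

    def sweep(self, now: datetime) -> int:
        """Emit events for every story whose expiry time has passed; return the count."""
        expired = 0
        # The heap keeps the earliest expiry on top, so the sweep stops at the
        # first story that is still live.
        while self._heap and self._heap[0][0] <= now:
            expires_at, story_id = heapq.heappop(self._heap)
            self._publish(
                {
                    "type": "story_expired",
                    "story_id": story_id,
                    "expired_at": expires_at.isoformat(),
                }
            )
            expired += 1
        return expired
```

A scheduler would call `sweep` at a fixed interval; the sweep only drives cleanup, since serving decisions rely on the timestamp check at read time.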
7. Event-Driven Cleanup Using Kafka
Kafka is used to decouple lifecycle events from cleanup logic.
Typical events include:
- story_created
- story_expired
- story_viewed
- story_reacted
A Media Cleanup Worker consumes expiration events and:
- Deletes media from object storage
- Invalidates cached copies on the CDN
Cleanup happens asynchronously, keeping user-facing APIs fast.
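The cleanup worker could look something like the sketch below. To stay self-contained it uses a `deque` in place of a Kafka topic and an in-memory dictionary in place of object storage; `InMemoryObjectStore`, `MediaCleanupWorker`, `drain`, and the `purged_cdn_paths` list are all hypothetical. The point it illustrates is idempotence: handling the same `story_expired` event twice must be harmless, because an event stream may redeliver messages.

```python
from collections import deque
from typing import Deque, Dict, List


class InMemoryObjectStore:
    """Stand-in for object storage such as S3."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def delete(self, key: str) -> bool:
        """Delete the object if present; deleting a missing key is a no-op."""
        if key in self.objects:
            del self.objects[key]
            return True
        return False


class MediaCleanupWorker:
    def __init__(self, store: InMemoryObjectStore, media_keys: Dict[str, str]) -> None:
        self._store = store
        self._media_keys = media_keys          # story_id -> media key
        self.purged_cdn_paths: List[str] = []  # records CDN purge requests

    def handle(self, event: Dict[str, str]) -> bool:
        """Process one event; return True if it was an expiration we acted on."""
        if event.get("type") != "story_expired":
            return False  # other event types belong to other consumers
        media_key = self._media_keys.get(event["story_id"])
        if media_key is None:
            return False
        self._store.delete(media_key)  # safe to repeat on redelivery
        self.purged_cdn_paths.append("/" + media_key)
        return True


def drain(events: Deque[Dict[str, str]], worker: MediaCleanupWorker) -> int:
    """Consume all queued events and return how many expirations were handled."""
    handled = 0
    while events:
        if worker.handle(events.popleft()):
            handled += 1
    return handled
```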
8. Engagement Tracking (Views & Reactions)
User engagement is handled by separate services:
- View Tracking Service – tracks story views
- Reaction Service – likes, emojis, and replies
These services:
- Handle extremely high write throughput
- Are eventually consistent
- Do not impact feed read performance
Engagement data is later aggregated for analytics.
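To make the eventual-consistency point concrete, here is a small sketch of a view tracker that buffers writes and applies them in batches. `ViewTracker`, `record_view`, and `flush` are hypothetical names, and the in-memory set and counters stand in for whatever store a real View Tracking Service would use.

```python
from collections import Counter
from typing import Set, Tuple


class ViewTracker:
    """Deduplicates views per (story, viewer) and aggregates counts in batches."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()
        self._pending: Counter = Counter()     # buffered, not yet visible to readers
        self.view_counts: Counter = Counter()  # what analytics and the UI would read

    def record_view(self, story_id: str, viewer_id: str) -> bool:
        """Record a view; return False if this viewer already viewed the story."""
        key = (story_id, viewer_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._pending[story_id] += 1
        return True

    def flush(self) -> int:
        """Apply buffered views to the visible counts; return how many were applied."""
        flushed = sum(self._pending.values())
        self.view_counts.update(self._pending)
        self._pending.clear()
        return flushed
```

Between flushes the visible count lags behind the true number of views, which is the trade-off accepted in exchange for cheap writes that stay off the feed read path.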
Final Thoughts
Stories systems are deceptively complex. By designing explicitly for expiration, visibility, and scale, we can build systems that are resilient, efficient, and easy to evolve.
This design follows the same broad pattern that platforms like Instagram and WhatsApp are generally understood to use for ephemeral content at massive scale.
This is my first system design, so feedback, reviews, and alternative approaches are very welcome.
