A notification system is responsible for sending alerts to users, such as breaking news, product updates, event reminders, and promotions.
Notifications are not limited to mobile push messages. In practice, there are three major notification channels: push notifications, SMS messages, and emails.
Defining the Scope and Requirements
Before designing a notification system, we must first understand the problem and clarify the system requirements. System design problems are often open-ended, so asking the right questions is essential.
Key questions to ask include:
• What types of notifications are supported?
• Is the system real-time?
• Which devices should receive notifications?
• What triggers notifications?
• Can users opt out?
• What is the expected scale?
In our design, the system supports push notifications, SMS, and email. Notifications should be delivered as quickly as possible, with small delays acceptable under heavy load.
The system must work across iOS devices, Android devices, and desktop or laptop computers.
Notifications can be triggered by user actions through client applications, or by server-side scheduled jobs such as reminders and marketing campaigns.
Users must also be able to opt out, and the system must respect their notification preferences.
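As a rough illustration of respecting preferences, the sketch below filters the requested channels against a per-user opt-out record before anything is sent. The `OPT_OUTS` store, the user IDs, and the `allowed_channels` helper are all hypothetical names chosen for this example; a real system would read preferences from its own database.

```python
from typing import Dict, List, Set

# Hypothetical in-memory preference store: user_id -> channels opted out of.
OPT_OUTS: Dict[str, Set[str]] = {
    "u1": {"sms"},
    "u2": {"push", "sms", "email"},
}


def allowed_channels(
    user_id: str, requested: List[str], opt_outs: Dict[str, Set[str]]
) -> List[str]:
    """Return the requested channels the user has not opted out of."""
    blocked = opt_outs.get(user_id, set())
    return [channel for channel in requested if channel not in blocked]


# A user with no record has opted out of nothing, so every channel is allowed.
print(allowed_channels("u1", ["push", "sms", "email"], OPT_OUTS))
```

Running the check before a job is queued means opted-out messages never reach a provider.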
At scale, the system sends around 10 million push notifications, 1 million SMS messages, and 5 million emails every day.
This makes it a large-scale system that must be fast, reliable, and scalable, and that must support multiple delivery channels.
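To put the daily volumes in perspective, the short calculation below converts them into average per-second rates. It assumes traffic is spread evenly across the day, which is a simplification: real traffic is bursty, so peak rates will be higher than these averages.

```python
# Daily volumes quoted in the requirements above.
DAILY_VOLUME = {"push": 10_000_000, "sms": 1_000_000, "email": 5_000_000}
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(daily_count: int) -> float:
    """Average messages per second if traffic were spread evenly over a day."""
    return daily_count / SECONDS_PER_DAY


total_per_day = sum(DAILY_VOLUME.values())  # 16,000,000

for channel, count in DAILY_VOLUME.items():
    print(f"{channel}: {average_rate_per_second(count):.1f} per second")
print(f"total: {average_rate_per_second(total_per_day):.1f} per second")
```

The combined average works out to roughly 185 notifications per second, with push alone accounting for about 116 of them.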
High-Level Notification System Architecture
At a high level, services in our system do not communicate directly with external providers like Apple or Google. Instead, they send requests to a centralized Notification Service API.
This notification service determines the type of message (push, SMS, or email), builds the correct payload, and routes it through the appropriate delivery channel.
Finally, third-party providers handle the actual delivery to user devices.
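The routing step described above can be sketched as a lookup from channel name to a payload builder. This is a minimal sketch, not a definitive design: `NotificationRequest`, the builder functions, and the payload field names are assumptions made for illustration and do not match any particular provider's format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class NotificationRequest:
    channel: str    # "push", "sms", or "email"
    recipient: str  # device token, phone number, or email address
    title: str
    body: str


# Hypothetical per-channel payload builders (field names are illustrative).
def build_push(req: NotificationRequest) -> dict:
    return {"token": req.recipient, "title": req.title, "body": req.body}


def build_sms(req: NotificationRequest) -> dict:
    return {"to": req.recipient, "text": f"{req.title}: {req.body}"}


def build_email(req: NotificationRequest) -> dict:
    return {"to": req.recipient, "subject": req.title, "content": req.body}


PAYLOAD_BUILDERS: Dict[str, Callable[[NotificationRequest], dict]] = {
    "push": build_push,
    "sms": build_sms,
    "email": build_email,
}


def route(req: NotificationRequest) -> Tuple[str, dict]:
    """Pick the delivery channel and build the payload for it."""
    builder = PAYLOAD_BUILDERS.get(req.channel)
    if builder is None:
        raise ValueError(f"unsupported channel: {req.channel!r}")
    return req.channel, builder(req)


channel, payload = route(
    NotificationRequest("sms", "+15550100", "Reminder", "Event starts at 6 pm")
)
print(channel, payload)
```

Because callers only see `route`, a new channel can be added by registering one more builder without touching the services that trigger notifications.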
Notification Delivery Channels
A modern notification system must support multiple channels, each with its own external provider.
iOS Push Notifications (APNs)
For iOS push notifications, the flow includes three components:
• Provider: our server that builds the notification request
• APNs: Apple Push Notification Service that delivers the message
• iOS Device: the client that receives and displays the alert
The provider requires:
• A device token (a unique identifier that APNs issues for the app on a specific device)
• A payload (JSON containing the title, body, and metadata)
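As an illustration of the payload, the sketch below builds the JSON body for a simple alert. The `aps` dictionary with its `alert` and `badge` keys follows Apple's documented payload layout, and app-specific metadata goes in top-level keys next to `aps`. The helper name `build_apns_payload` and the example values are hypothetical. The device token is not part of the JSON; the provider passes it separately when addressing the request to APNs.

```python
import json
from typing import Any, Dict, Optional


def build_apns_payload(
    title: str,
    body: str,
    badge: Optional[int] = None,
    custom: Optional[Dict[str, Any]] = None,
) -> str:
    """Build the JSON payload for a basic alert notification."""
    aps: Dict[str, Any] = {"alert": {"title": title, "body": body}}
    if badge is not None:
        aps["badge"] = badge

    # Custom metadata sits beside "aps"; "aps" is set last so it cannot be overwritten.
    payload: Dict[str, Any] = dict(custom or {})
    payload["aps"] = aps
    return json.dumps(payload)


print(build_apns_payload("Flash sale", "50% off today", badge=1,
                         custom={"campaign_id": "c-123"}))
```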
Android Push Notifications (FCM)
Android push notifications follow a similar flow, except that Firebase Cloud Messaging (FCM) is used instead of APNs.
SMS Notifications
SMS messages are typically delivered through third-party providers such as Twilio or Nexmo (now part of Vonage), rather than being sent directly from internal servers.
Email Notifications
Most companies rely on third-party email services like SendGrid or Mailchimp due to their high delivery rates, reliability, and analytics support.
Collecting User Contact Information
To send notifications, the system must store the user’s contact details. When a user signs up or installs the application, our API servers collect this information and store it in the database.
Typical stored data includes:
• Device tokens for push notifications
• Phone numbers for SMS
• Email addresses for email delivery
A simple database structure includes:
• User Table: profile information, email, phone number
• Device Table: device tokens linked to the user (supporting multiple devices per user)
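The two tables above could look something like the following, shown here with SQLite from the Python standard library so the example runs without setup. The table names, column names, and the `tokens_for_user` helper are assumptions for illustration; the article does not specify a schema.

```python
import sqlite3
from typing import List

# Hypothetical schema: one user row, many device rows per user.
SCHEMA = """
CREATE TABLE users (
    user_id      INTEGER PRIMARY KEY,
    email        TEXT,
    phone_number TEXT
);
CREATE TABLE devices (
    device_id    INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL REFERENCES users(user_id),
    platform     TEXT NOT NULL,          -- e.g. "ios" or "android"
    device_token TEXT NOT NULL UNIQUE
);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def tokens_for_user(conn: sqlite3.Connection, user_id: int) -> List[str]:
    """Return every device token registered for a user, oldest first."""
    rows = conn.execute(
        "SELECT device_token FROM devices WHERE user_id = ? ORDER BY device_id",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = open_store()
conn.execute("INSERT INTO users VALUES (1, 'ana@example.com', '+15550100')")
conn.execute("INSERT INTO devices VALUES (1, 1, 'ios', 'tok-ios')")
conn.execute("INSERT INTO devices VALUES (2, 1, 'android', 'tok-android')")
print(tokens_for_user(conn, 1))
```

Keeping tokens in their own table is what allows a push notification to fan out to every device a user has registered.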
Core System Components
The key building blocks of the notification architecture include:
• Services that trigger notifications (microservices, cron jobs, distributed systems)
• A central Notification System that receives requests and builds payloads
• Third-party providers responsible for delivering notifications
• User devices that ultimately receive the alerts
Third-Party Provider Considerations
Integrating external providers introduces two important requirements:
• Extensibility: adding or replacing providers should require minimal changes
• Provider Availability: some providers may not work in certain regions (e.g., FCM in China), requiring alternatives like JPush
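Both requirements can be met by hiding providers behind a common interface and choosing one per region. The sketch below is one possible shape, under stated assumptions: the `PushProvider` interface, the region table, and the `send` bodies are hypothetical placeholders that make no network calls and do not reflect the real FCM or JPush client APIs.

```python
from abc import ABC, abstractmethod
from typing import Dict


class PushProvider(ABC):
    """Common interface, so providers can be added or swapped in one place."""

    name: str

    @abstractmethod
    def send(self, device_token: str, payload: dict) -> str:
        """Deliver the payload and return a provider-specific message id."""


class FcmProvider(PushProvider):
    name = "fcm"

    def send(self, device_token: str, payload: dict) -> str:
        return f"fcm:{device_token}"  # placeholder: a real client call goes here


class JPushProvider(PushProvider):
    name = "jpush"

    def send(self, device_token: str, payload: dict) -> str:
        return f"jpush:{device_token}"  # placeholder: a real client call goes here


DEFAULT_PROVIDER: PushProvider = FcmProvider()
# Hypothetical region table: regions where the default provider is unavailable.
REGION_OVERRIDES: Dict[str, PushProvider] = {"CN": JPushProvider()}


def provider_for_region(region: str) -> PushProvider:
    """Use the regional alternative when one is configured, else the default."""
    return REGION_OVERRIDES.get(region.upper(), DEFAULT_PROVIDER)


print(provider_for_region("cn").send("tok-1", {"title": "Hello"}))
```

Replacing a provider then means writing one new subclass and changing one table entry, rather than editing every caller.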
Problems with the Initial Single-Server Design
A simple architecture with only one notification server leads to three major challenges:
• Single Point of Failure: if the server fails, all notifications stop
• Hard to Scale: the database, cache, and notification processing all share one server, so individual components cannot scale independently
• Performance Bottlenecks: slow tasks and API delays can overwhelm the system
Improving Scalability and Reliability
To evolve the design into a production-ready system, we introduce three key improvements:
• Separate the database and cache into independent services for better scalability
• Add multiple notification servers behind a load balancer for horizontal scaling
• Introduce message queues to decouple components and enable asynchronous processing
With message queues, services simply enqueue notification jobs, and worker servers process them asynchronously. This removes bottlenecks, improves resilience, and supports high traffic efficiently.
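The enqueue-and-process pattern can be sketched with Python's standard `queue` and `threading` modules standing in for a real message broker and worker fleet. The `run_workers` function, the job shape, and the `handler` callback are assumptions made for this example; a production system would use a durable queue with retries rather than an in-process one.

```python
import queue
import threading
from typing import Any, Callable, List, Tuple


def run_workers(
    jobs: List[dict],
    handler: Callable[[dict], Any],
    worker_count: int = 3,
) -> List[Tuple[str, Any]]:
    """Enqueue jobs, let worker threads process them, and collect the outcomes."""
    job_queue: "queue.Queue" = queue.Queue()
    results: List[Tuple[str, Any]] = []
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            job = job_queue.get()
            if job is None:  # sentinel: no more work for this worker
                job_queue.task_done()
                return
            try:
                outcome = ("sent", handler(job))
            except Exception as exc:  # one failing job must not stop the worker
                outcome = ("failed", repr(exc))
            with results_lock:
                results.append(outcome)
            job_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()

    # Producers only enqueue; they never wait on a provider call.
    for job in jobs:
        job_queue.put(job)
    for _ in threads:
        job_queue.put(None)

    job_queue.join()
    for thread in threads:
        thread.join()
    return results


outcomes = run_workers([{"id": i} for i in range(5)], lambda job: job["id"])
print(sorted(outcomes))
```

Because completion order depends on thread scheduling, the example sorts the outcomes before printing them.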
Conclusion
By combining multiple delivery channels, scalable infrastructure, and asynchronous message queues, we can build a modern notification system that is reliable, extensible, and capable of handling millions of notifications per day.