Rizwan Saleem

Posted on May 30

How to design systems for your actual scale — a practical system design tutorial

#webdev #frontend #ai

System Design for Everyday Applications: Designing for Your Actual Traffic

Most system design content assumes you're building for millions of users from day one. But if you're building a SaaS for 500 customers, an internal tool for your team, or a startup MVP, FAANG-scale patterns are overengineering that wastes time and money. This tutorial covers practical system design for real-world traffic, with capacity decisions sized to the load you actually have.

Start With Your Actual Numbers

Before choosing any technology, calculate your actual load:

Metric	Small App (1K users)	Medium App (50K users)	When to Worry
Requests/day	~10K-50K	~500K-2M	>5M/day
RPS (peak)	1-5	20-50	>100 RPS
Database size	<10GB	10-100GB	>500GB
Bandwidth	<100GB/month	100-500GB/month	>1TB/month

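For 10K users who each view 5 pages per day at 50KB/page (an upper bound if these are monthly active users, since not all of them visit every day), assuming one request per pageview:

Daily requests: 10K × 5 = 50K/day
Daily bandwidth: 10K × 5 × 50KB = 2.5GB/day
Average RPS: 50K ÷ 86,400 seconds ≈ 0.6 requests/second
Peak RPS: ~1-2 requests/second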

You don't need Kubernetes, sharding, or distributed caches at this scale.
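The arithmetic above can be wrapped in a small helper so you can plug in your own numbers. This is a minimal sketch: the function name `estimate_load`, the one-request-per-pageview simplification, and the `peak_factor` of 3 are assumptions for illustration, not measured values.

```python
def estimate_load(users, pageviews_per_user_per_day, page_kb, peak_factor=3.0):
    """Back-of-envelope load estimate.

    Assumes one request per pageview and that peak traffic is
    `peak_factor` times the daily average (both are assumptions).
    """
    pageviews_per_day = users * pageviews_per_user_per_day
    bandwidth_gb_per_day = pageviews_per_day * page_kb / 1_000_000  # KB -> GB (decimal)
    average_rps = pageviews_per_day / 86_400  # seconds in a day
    return {
        "pageviews_per_day": pageviews_per_day,
        "bandwidth_gb_per_day": bandwidth_gb_per_day,
        "average_rps": average_rps,
        "peak_rps": average_rps * peak_factor,
    }


estimate = estimate_load(users=10_000, pageviews_per_user_per_day=5, page_kb=50)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

With the numbers from this section, the estimate lands at 2.5GB/day and a peak between 1 and 2 requests per second. Replace `peak_factor` with a ratio taken from your own traffic graphs once you have them.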

Database Choices: Pick Simple First

SQL Is Your Default

Use PostgreSQL or MySQL when:

You need transactions (payments, inventory, orders)
Data has clear relationships (users → orders → items)
You need complex queries with joins
Data consistency matters more than write speed

Real example: A 50-customer e-commerce store uses PostgreSQL. A single instance on a $20/month DigitalOcean droplet handles 100 orders/day with sub-100ms queries. Adding read replicas or sharding would be wasted money.

When to Consider NoSQL

Document databases (MongoDB) when:

Schema changes frequently (product catalogs with varying attributes)
You store nested JSON data (user preferences, logs)
You are prototyping rapidly and want to avoid migrations

In-memory databases (Redis) when:

You need sub-10ms latency (session storage, real-time leaderboards)
You want a cache layer in front of your primary storage, not a replacement for it

Avoid NoSQL when:

You only need simple key-based lookups on data you already keep in SQL (a primary-key query or a cache covers that)
You need strong consistency for financial data

The Index Before You Cache Rule

Before adding Redis, optimize your database query:

-- Slow query (full table scan)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Add index first
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

A proper index often gives 10-100× speedup without adding infrastructure complexity. Only add caching if:

The query is still slow after indexing
The same data is read hundreds of times per second
Slight staleness (seconds to minutes) is acceptable
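To check whether an index is actually being used, ask the database for its query plan before and after creating it. The sketch below uses Python's built-in `sqlite3` module as a stand-in so it runs anywhere; the table contents are synthetic, and the helper name `query_plan` is made up for this example. On PostgreSQL the equivalent check is `EXPLAIN` or `EXPLAIN ANALYZE` in front of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
)
# Synthetic data: 20,000 orders spread across 500 users.
conn.executemany(
    "INSERT INTO orders (user_id, status) VALUES (?, ?)",
    [
        (i % 500, "pending" if (i // 500) % 10 == 0 else "shipped")
        for i in range(20_000)
    ],
)

QUERY = "SELECT * FROM orders WHERE user_id = ? AND status = ?"


def query_plan(connection):
    """Return SQLite's plan description for QUERY as a single string."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, (123, "pending")
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_orders_user_status ON orders(user_id, status)")
plan_after = query_plan(conn)

print("before index:", plan_before)
print("after index: ", plan_after)
```

Before the index, the plan describes a scan of the whole table; afterwards it names `idx_orders_user_status`. The exact wording of the plan text varies between SQLite versions, so look for the index name rather than a specific string.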

Caching Strategies That Actually Help

When to Cache

Cache when data is read frequently but modified infrequently:

User profiles (read 100×, write 1× per day)
Product catalogs (read 1000×, write 1× per hour)
Dashboard aggregations (read 50×, recalculate every 5 minutes)

Don't cache:

Real-time inventory counts (stale data causes overselling)
User-specific data that changes every request
Tables with fewer than ~100 rows (the database is fast enough)

Cache Patterns for Small Apps

1. Cache-Aside (Simplest, Most Common)

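def get_user(user_id):
    # `cache` and `db` stand for your app's cache client and database handle.
    cache_key = f"user:{user_id}"
    user = cache.get(cache_key)
    if user is None:  # cache miss: fall back to the database
        # Placeholder style ("?" vs "%s") depends on your database driver.
        user = db.query("SELECT * FROM users WHERE id = ?", user_id)
        if user is not None:  # don't cache a missing user
            cache.set(cache_key, user, ttl=300)  # 5 minutes
    return user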

Cache misses hit the database
Cache entries expire after TTL
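For a version of the cache-aside pattern you can run without Redis or a database, here is a self-contained sketch. Everything in it is a stand-in: `TTLCache` is a tiny in-process cache written for this example, `fake_db` is a dict playing the role of the users table, and `rename_user` shows one common way to handle writes (update the database, then delete the cache entry), which the pattern above leaves implicit.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (a stand-in for Redis)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._entries[key] = (self._clock() + ttl, value)

    def delete(self, key):
        self._entries.pop(key, None)


# Hypothetical "database": a dict, plus a counter so reads are observable.
fake_db = {1: {"id": 1, "name": "Ada"}}
stats = {"db_reads": 0}
cache = TTLCache()


def get_user(user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:  # miss: fall through to the database
        stats["db_reads"] += 1
        row = fake_db.get(user_id)
        if row is not None:
            user = dict(row)  # copy so callers cannot mutate the "database"
            cache.set(key, user, ttl=300)
    return user


def rename_user(user_id, name):
    fake_db[user_id]["name"] = name   # write to the source of truth first
    cache.delete(f"user:{user_id}")   # then invalidate the cached copy


get_user(1)
get_user(1)                               # served from the cache
reads_before_write = stats["db_reads"]
rename_user(1, "Grace")
name_after_write = get_user(1)["name"]    # miss again, so the new name is read
reads_after_write = stats["db_reads"]
print(reads_before_write, name_after_write, reads_after_write)
```

Without the `cache.delete` call, readers would keep seeing the old name until the TTL expired. Whether that is acceptable depends on the data, which is the staleness trade-off discussed in this section.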

2. CDN for Static Assets

Images, CSS, JavaScript → Cloudflare (free tier)
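Can reduce server load by 60-80% on asset-heavy sites
No application code changes needed: proxy your domain through Cloudflare, or update asset URLs if you serve assets from a separate CDN hostname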

TTL Guidelines

Data Type	Recommended TTL
User session	30-60 minutes
Product info	5-15 minutes
Blog posts	1-24 hours
Analytics dashboard	5-30 minutes

Shorter TTLs mean fresher data but more cache misses. Longer TTLs mean fewer misses but staler data.
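To get a feel for the trade-off, you can replay a simple read pattern against different TTLs. The sketch below is a toy model, not a benchmark: it assumes a single hot key read at a perfectly even rate, and the function name `simulate_hit_ratio` and the rate of 2 reads per second are illustrative choices.

```python
def simulate_hit_ratio(reads_per_second, ttl_seconds, duration_seconds=3600):
    """Replay evenly spaced reads of one key and return the cache hit ratio."""
    total_reads = int(reads_per_second * duration_seconds)
    expires_at = -1.0  # nothing cached yet
    hits = 0
    for i in range(total_reads):
        now = i / reads_per_second
        if now < expires_at:
            hits += 1
        else:
            # Miss: reload from the database and restart the TTL window.
            expires_at = now + ttl_seconds
    return hits / total_reads


for ttl in (5, 60, 300):
    ratio = simulate_hit_ratio(reads_per_second=2, ttl_seconds=ttl)
    print(f"TTL {ttl:>3}s -> hit ratio {ratio:.3f}, data up to {ttl}s stale")
```

In this model a longer TTL always raises the hit ratio, but the gain flattens quickly while the worst-case staleness keeps growing with the TTL. That is why the table above uses short TTLs for data users expect to be current.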

When to Use a Queue vs. Just Optimize

Use a Queue When

Tasks are not part of the user's immediate response:

Sending emails after signup
Processing image uploads (resize, generate thumbnails)
Generating PDF reports
Webhook delivery to third parties

You see timeout errors from too many simultaneous requests:

A queue absorbs bursts and smooths traffic to downstream services

Real example: A photo app with 5K users uses Redis Queue for image processing. The upload returns immediately and processing happens in the background. Without the queue, 30-second uploads time out during peak hours.
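The shape of that flow can be sketched with nothing but the standard library. This is not how Redis Queue is wired up; `queue.Queue` and a thread stand in for the broker and the worker process, and `handle_upload`, `process_image`, and the `results` dict are hypothetical names for this example. An in-process queue like this loses its jobs when the app restarts, which is one reason real deployments use an external queue.

```python
import queue
import threading
import uuid

jobs = queue.Queue()
results = {}  # job_id -> status or output; a real app would persist this


def process_image(filename):
    """Placeholder for slow work such as resizing (hypothetical)."""
    return f"thumb_{filename}"


def worker():
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut the worker down
            jobs.task_done()
            break
        job_id, filename = job
        results[job_id] = process_image(filename)
        jobs.task_done()


def handle_upload(filename):
    """Request handler: record the job, enqueue it, and return right away."""
    job_id = str(uuid.uuid4())
    results[job_id] = "processing"
    jobs.put((job_id, filename))
    return job_id


thread = threading.Thread(target=worker, daemon=True)
thread.start()

job_id = handle_upload("cat.jpg")  # returns without waiting for the resize
jobs.join()                        # demo only: wait until the worker is done
jobs.put(None)                     # stop the worker
thread.join()
print(results[job_id])
```

The request handler only does the cheap part (generate an id, enqueue), so its response time no longer depends on how long the image processing takes.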

Don't Use a Queue When

The task completes in <1 second (just do it synchronously)
You need the result immediately to show the user
You're adding a queue just to be "async" (premature optimization)

Queues add operational complexity: monitoring, retry logic, dead-letter queues, message ordering. If your sync code works fine, keep it sync.

Queue vs Database: Key Difference

Queues optimize for	Databases optimize for
Throughput	Durability
Availability	Correctness
Flow control	Integrity

Rule of thumb: If losing it would hurt, don't trust a queue with it. Persist state in a database; use queues to move events.
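One way to apply this rule is to write the record to the database first and put only its id on the queue. If the queue's contents are lost, the pending rows are still there and can be re-enqueued. The sketch below illustrates the idea with `sqlite3` and `queue.Queue` as stand-ins; the `emails` table and the function names are assumptions made for this example.

```python
import queue
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, recipient TEXT, status TEXT)"
)
events = queue.Queue()


def request_email(recipient):
    """Record the intent durably first, then enqueue only the row id."""
    cur = db.execute(
        "INSERT INTO emails (recipient, status) VALUES (?, 'pending')",
        (recipient,),
    )
    db.commit()
    events.put(cur.lastrowid)
    return cur.lastrowid


def process_next():
    """Handle one queued id; the database row stays the record of truth."""
    email_id = events.get_nowait()
    # (the actual email send would happen here)
    db.execute("UPDATE emails SET status = 'sent' WHERE id = ?", (email_id,))
    db.commit()


def requeue_pending():
    """Recovery path: rebuild the queue from rows still marked pending."""
    rows = db.execute(
        "SELECT id FROM emails WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for (email_id,) in rows:
        events.put(email_id)
    return len(rows)


request_email("a@example.com")
request_email("b@example.com")
process_next()                    # the first email is sent

while not events.empty():         # simulate the queue losing its contents
    events.get_nowait()

recovered = requeue_pending()     # the second email is found via the database
process_next()
pending_left = db.execute(
    "SELECT COUNT(*) FROM emails WHERE status = 'pending'"
).fetchone()[0]
print(recovered, pending_left)
```

A real system also has to cope with a job running twice after a requeue, so the worker should be safe to repeat (for example by checking the row's status before sending).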

Capacity Planning: Reasonable Decisions

Start Small, Scale When Measured

Day 1 architecture for most apps:

Single application server (2-4 CPU, 4-8GB RAM)
Managed database (PostgreSQL on AWS RDS, DigitalOcean, Supabase)
Redis for caching (optional, $5-10/month)
Cloudflare free tier for CDN + DDoS protection

Cost: ~$30-60/month for 10K monthly users

Scale Vertically Before Horizontally

Vertical scaling (upgrade hardware):

Fastest path: upgrade from 4GB → 16GB RAM
No code changes needed
Works up to ~100K users for most apps

Horizontal scaling (add more servers):

Requires stateless application design
Adds load balancer complexity
Only needed when vertical scaling hits limits

Monitoring Triggers for Scaling

Add capacity when you consistently hit:

CPU >70% for >5 minutes during peak
Database connection pool >80% full
Response time >500ms for >10% of requests
Error rate >1%

Don't scale based on "what if." Scale based on actual metrics.
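The thresholds above are easy to encode as a check you run against your monitoring numbers. This is a minimal sketch: the function name `scaling_triggers`, its parameter names, and the sample values are all made up for illustration, and you would feed it figures from whatever monitoring tool you already use.

```python
def scaling_triggers(cpu_percent, db_pool_used, slow_request_share, error_rate):
    """Return the names of the thresholds that are currently exceeded.

    cpu_percent:        CPU sustained for more than 5 minutes at peak (0-100)
    db_pool_used:       fraction of the connection pool in use (0.0-1.0)
    slow_request_share: fraction of requests slower than 500 ms (0.0-1.0)
    error_rate:         fraction of requests that fail (0.0-1.0)
    """
    checks = {
        "cpu": cpu_percent > 70,
        "db_connections": db_pool_used > 0.80,
        "latency": slow_request_share > 0.10,
        "errors": error_rate > 0.01,
    }
    return [name for name, tripped in checks.items() if tripped]


quiet_week = scaling_triggers(
    cpu_percent=35, db_pool_used=0.40, slow_request_share=0.02, error_rate=0.001
)
busy_week = scaling_triggers(
    cpu_percent=82, db_pool_used=0.85, slow_request_share=0.04, error_rate=0.002
)
print("quiet week:", quiet_week)
print("busy week: ", busy_week)
```

An empty list means none of the triggers fired and there is nothing to do. A non-empty list tells you which resource to look at first, which is usually more useful than a general sense that the app feels slow.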

Avoiding Overengineering: Real Examples

Example 1: Blog Platform (10K monthly readers)

Overengineered:

Kubernetes cluster
Redis cluster with 3 nodes
PostgreSQL with read replicas
RabbitMQ for "async comments"
Cost: $400/month, 2 weeks setup

Actually needed:

Single $20/month VPS (DigitalOcean/Linode)
SQLite or single PostgreSQL instance
No cache (database handles 100 reads/sec easily)
Synchronous comments
Cost: $20/month, 1 day setup

Example 2: Internal Dashboard (50 users)

Overengineered:

Microservices architecture
GraphQL API layer
Multiple databases (PostgreSQL + MongoDB)
Cost: Complex to maintain, slower development

Actually needed:

Single Flask/Django/Express app
PostgreSQL with proper indexes
Server-side caching with Flask-Caching or similar
Result: 10× faster development, easier debugging

Example 3: SaaS with 500 Customers

When to add a queue:

Email notifications take 2-3 seconds synchronously
Users complain about slow page loads during peak
Fix: add Redis Queue or BullMQ for email only

When NOT to add a queue:

Page loads are 200ms (fast enough)
Background jobs complete in <500ms
Team has no experience with queue monitoring

Practical Decision Checklist

Before adding complexity, ask:

What's my actual traffic? Measure before planning
Can I optimize the query first? Add indexes before caching
Does the user need the result right away? If no and the task is slow, queue it. If yes, keep it sync
What's the cost of failure? High-stakes data → database, not queue
Can I scale vertically first? Upgrade RAM/CPU before adding servers
Does my team know how to maintain this? Complexity = maintenance cost

The "Boring Technology" Stack That Works

For 90% of everyday applications (up to 100K users):

Layer	Technology	Why
Backend	Python (Django/Flask), Node.js, Ruby on Rails	Fast development, huge ecosystem
Database	PostgreSQL	Best balance of features, performance, reliability
Cache	Redis (single instance)	Simple, fast, works for most caching needs
Queue	Redis Queue / BullMQ	Uses existing Redis, no new infrastructure
CDN	Cloudflare (free)	CDN + DDoS protection + SSL
Hosting	Single VPS or managed platform (Railway, Render)	$20-50/month, no ops overhead

This stack handles 50K+ monthly users with 1-2 engineers maintaining it.

Final Rule: Design for Your Next Milestone, Not WWDC

Design for 10× your current traffic, not 1000×. If you have 1K users, design for 10K. If you hit 10K, you'll have revenue to hire engineers and refactor then. Most startups die from moving too slowly, not from scaling too late.

The best system design is the simplest one that works for your actual users today, with a clear path to scale when you measure you need it.

Rizwan Saleem — https://rizwansaleem.co
