Rizwan Saleem

How to design systems for your actual scale — a practical system design tutorial

System Design for Everyday Applications: Designing for Your Actual Traffic

Most system design content assumes you're building for millions of users on day one. But if you're building a SaaS for 500 customers, an internal tool for your team, or a startup MVP, FAANG-scale patterns are overengineering that wastes time and money. This tutorial covers practical system design for real-world traffic, with capacity decisions sized to the load you actually have.

Start With Your Actual Numbers

Before choosing any technology, calculate your actual load:

| Metric | Small App (1K users) | Medium App (50K users) | When to Worry |
| --- | --- | --- | --- |
| Requests/day | ~10K-50K | ~500K-2M | >5M/day |
| RPS (peak) | 1-5 | 20-50 | >100 RPS |
| Database size | <10GB | 10-100GB | >500GB |
| Bandwidth | <100GB/month | 100-500GB/month | >1TB/month |

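For 10K monthly active users at 5 pageviews/user/day and 50KB/page (assuming, as an upper bound, that every user visits every day):

  • Requests: 10K × 5 = 50K pageviews/day, about 0.6/second averaged over 24 hours
  • Daily bandwidth: 50K × 50KB = 2.5GB/day
  • Peak RPS: ~1-2 requests/second (roughly 2-3× the average)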

You don't need Kubernetes, sharding, or distributed caches at this scale.
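
The back-of-envelope estimate above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `estimate_load` and the `peak_factor` of 3 are assumptions for illustration (the article only gives the resulting 1-2 RPS peak), and it treats each pageview as a single request.

```python
SECONDS_PER_DAY = 86_400

def estimate_load(users, pageviews_per_user_per_day, page_kb, peak_factor=3.0):
    """Rough daily load estimate; peak_factor is an assumed peak-to-average ratio."""
    requests_per_day = users * pageviews_per_user_per_day
    # Decimal units: 1 GB = 1,000,000 KB.
    daily_gb = requests_per_day * page_kb / 1_000_000
    avg_rps = requests_per_day / SECONDS_PER_DAY
    return {
        "requests_per_day": requests_per_day,
        "daily_gb": daily_gb,
        "avg_rps": avg_rps,
        "peak_rps": avg_rps * peak_factor,
    }

estimate = estimate_load(users=10_000, pageviews_per_user_per_day=5, page_kb=50)
print(estimate)
```

Swap in your own numbers from analytics or access logs; if real pages trigger several API calls each, multiply the request count accordingly.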

Database Choices: Pick Simple First

SQL Is Your Default

Use PostgreSQL or MySQL when:

  • You need transactions (payments, inventory, orders)
  • Data has clear relationships (users → orders → items)
  • You need complex queries with joins
  • Data consistency matters more than write speed

Real example: A 50-customer e-commerce store uses PostgreSQL. A single instance on a $20/month DigitalOcean droplet handles 100 orders/day with sub-100ms queries. Adding read replicas or sharding would be wasted money.

When to Consider NoSQL

Document databases (MongoDB) when:

  • Schema changes frequently (product catalogs with varying attributes)
  • You store nested JSON data (user preferences, logs)
  • Rapid prototyping without migrations

In-memory databases (Redis) when:

  • You need sub-10ms latency (session storage, real-time leaderboards)
  • As a cache layer, not primary storage

Avoid NoSQL when:

  • You primarily need simple key-based lookups (use a cache instead)
  • You need strong consistency for financial data

The Index Before You Cache Rule

Before adding Redis, optimize your database query:

-- Slow query (full table scan)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Add index first
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

A proper index often gives 10-100× speedup without adding infrastructure complexity. Only add caching if:

  • The query is still slow after indexing
  • The same data is read hundreds of times per second
  • Slight staleness (seconds to minutes) is acceptable
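
To see whether an index is actually being used, you can ask the database for its query plan before and after creating it. The sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in so it runs without a server; in PostgreSQL the equivalent check is `EXPLAIN ANALYZE`. The table contents are made-up sample data, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
)
# Made-up sample rows, only so the planner has a table to reason about.
conn.executemany(
    "INSERT INTO orders (user_id, status) VALUES (?, ?)",
    [(i % 500, "pending" if i % 3 == 0 else "shipped") for i in range(10_000)],
)

QUERY = "SELECT * FROM orders WHERE user_id = 123 AND status = 'pending'"

def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)

plan_before = query_plan(QUERY)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_orders_user_status ON orders(user_id, status)")
plan_after = query_plan(QUERY)   # expected to mention the new index

print("before:", plan_before)
print("after: ", plan_after)
```

If the plan still shows a scan after adding the index, the column order or the query's filter probably does not match the index, which is worth fixing before reaching for a cache.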

Caching Strategies That Actually Help

When to Cache

Cache when data is read frequently but modified infrequently:

  • User profiles (read 100×, write 1× per day)
  • Product catalogs (read 1000×, write 1× per hour)
  • Dashboard aggregations (read 50×, recalculate every 5 minutes)

Don't cache:

  • Real-time inventory counts (stale data causes overselling)
  • User-specific data that changes on every request
  • Tables with fewer than 100 rows (the database is fast enough)

Cache Patterns for Small Apps

1. Cache-Aside (Simplest, Most Common)

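# `cache` and `db` stand in for your cache client and database handle.
def get_user(user_id):
    cache_key = f"user:{user_id}"
    user = cache.get(cache_key)
    if user is None:
        # Cache miss: fall through to the database.
        user = db.query("SELECT * FROM users WHERE id = ?", user_id)
        if user is not None:
            # Only cache rows that exist; expire after 5 minutes.
            cache.set(cache_key, user, ttl=300)
    return user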
  • Cache misses hit the database
  • Cache entries expire after TTL

2. CDN for Static Assets

  • Images, CSS, JavaScript → Cloudflare (free tier)
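  • Can reduce server load by 60-80% on asset-heavy pages; the actual figure depends on your cache hit ratio
  • No application logic changes needed: proxy the domain through the CDN or update asset URLs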

TTL Guidelines

| Data Type | Recommended TTL |
| --- | --- |
| User session | 30-60 minutes |
| Product info | 5-15 minutes |
| Blog posts | 1-24 hours |
| Analytics dashboard | 5-30 minutes |

Shorter TTLs mean fresher data but more cache misses. Longer TTLs mean more cache hits but staler data.

When to Use a Queue vs. Just Optimize

Use a Queue When

Tasks are not part of the user's immediate response:

  • Sending emails after signup
  • Processing image uploads (resize, generate thumbnails)
  • Generating PDF reports
  • Webhook delivery to third parties

You have timeout errors from too many simultaneous requests:

  • A queue absorbs bursts and smooths traffic to downstream services

Real example: A photo app with 5K users uses Redis Queue for image processing. The upload returns immediately and processing happens in the background. Without the queue, 30-second uploads time out during peak hours.
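
The shape of that upload flow can be sketched with Python's standard library alone. This is an in-process stand-in, not Redis Queue: `queue.Queue` lives in memory and loses its jobs if the process dies. The names `handle_upload`, `worker`, and the `thumb-` result strings are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
processed = []  # results of background work, for demonstration only

def worker():
    """Pull upload IDs off the queue and process them one at a time."""
    while True:
        upload_id = jobs.get()
        try:
            if upload_id is None:  # sentinel: shut the worker down
                break
            # Stand-in for slow work such as resizing or thumbnailing.
            processed.append(f"thumb-{upload_id}")
        finally:
            jobs.task_done()

def handle_upload(upload_id):
    """Request handler: enqueue the work and respond right away."""
    jobs.put(upload_id)
    return {"status": "accepted", "upload_id": upload_id}

threading.Thread(target=worker, daemon=True).start()

response = handle_upload(1)
handle_upload(2)
jobs.join()  # only for the demo: wait until the background work finishes
print(response, processed)
```

The handler's response does not depend on the slow step, which is the property that keeps uploads from timing out under load.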

Don't Use a Queue When

  • The task completes in <1 second (just do it synchronously)
  • You need the result immediately to show the user
  • You're adding a queue just to be "async" (premature optimization)

Queues add operational complexity: monitoring, retry logic, dead-letter queues, message ordering. If your sync code works fine, keep it sync.

Queue vs Database: Key Difference

| Queues optimize for | Databases optimize for |
| --- | --- |
| Throughput | Durability |
| Availability | Correctness |
| Flow control | Integrity |

Rule of thumb: If losing it would hurt, don't trust a queue with it. Persist state in a database; use queues to move events.
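
One way to apply this rule is to write the record to the database first and put only its ID on the queue. The sketch below assumes a hypothetical `emails` table and uses `sqlite3` plus an in-memory `queue.Queue` as stand-ins; `request_email`, `process_next`, and `requeue_pending` are illustrative names.

```python
import queue
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, recipient TEXT, status TEXT)"
)
events = queue.Queue()  # carries IDs only; the database holds the state

def request_email(recipient):
    """Persist the intent first, then announce it on the queue."""
    cur = db.execute(
        "INSERT INTO emails (recipient, status) VALUES (?, 'pending')",
        (recipient,),
    )
    db.commit()
    events.put(cur.lastrowid)
    return cur.lastrowid

def process_next():
    """Handle one queued event and record the outcome in the database."""
    email_id = events.get_nowait()
    # Actual sending is omitted; this only marks the row as done.
    db.execute("UPDATE emails SET status = 'sent' WHERE id = ?", (email_id,))
    db.commit()

def requeue_pending():
    """Recovery path: rebuild the queue from rows still marked pending."""
    rows = db.execute(
        "SELECT id FROM emails WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for (email_id,) in rows:
        events.put(email_id)
    return len(rows)
```

If the queue's contents are lost, `requeue_pending()` can recreate them from the database, so nothing that matters exists only in the queue.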

Capacity Planning: Reasonable Decisions

Start Small, Scale When Measured

Day 1 architecture for most apps:

  • Single application server (2-4 CPU, 4-8GB RAM)
  • Managed database (PostgreSQL on AWS RDS, DigitalOcean, Supabase)
  • Redis for caching (optional, $5-10/month)
  • Cloudflare free tier for CDN + DDoS protection

Cost: ~$30-60/month for 10K monthly users

Scale Vertically Before Horizontally

Vertical scaling (upgrade hardware):

  • Fastest path: upgrade from 4GB → 16GB RAM
  • No code changes needed
  • Works up to ~100K users for most apps

Horizontal scaling (add more servers):

  • Requires stateless application design
  • Adds load balancer complexity
  • Only needed when vertical scaling hits limits

Monitoring Triggers for Scaling

Add capacity when you consistently hit:

  • CPU >70% for >5 minutes during peak
  • Database connection pool >80% full
  • Response time >500ms for >10% of requests
  • Error rate >1%

Don't scale based on "what if." Scale based on actual metrics.
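
The triggers above are easy to encode as a check you run against whatever your monitoring exports. This is a sketch only: the `PeakMetrics` fields are hypothetical names, and you would fill them from your own metrics source.

```python
from dataclasses import dataclass

@dataclass
class PeakMetrics:
    """Hypothetical snapshot of peak-hour metrics."""
    cpu_pct: float                # CPU utilisation during the high period
    cpu_high_minutes: float       # how long CPU stayed at that level
    db_pool_used: int             # connections currently in use
    db_pool_size: int             # maximum connections in the pool
    slow_request_fraction: float  # share of requests slower than 500ms
    error_rate: float             # share of requests that failed

def scaling_signals(m):
    """Return the names of the thresholds from the list above that are exceeded."""
    signals = []
    if m.cpu_pct > 70 and m.cpu_high_minutes > 5:
        signals.append("cpu")
    if m.db_pool_used / m.db_pool_size > 0.80:
        signals.append("db_pool")
    if m.slow_request_fraction > 0.10:
        signals.append("latency")
    if m.error_rate > 0.01:
        signals.append("errors")
    return signals

quiet = PeakMetrics(35, 0, 4, 20, 0.01, 0.001)
busy = PeakMetrics(85, 12, 18, 20, 0.15, 0.002)
print(scaling_signals(quiet), scaling_signals(busy))
```

An empty list on most days is the expected result for a small app; act when the same signal shows up consistently, not on a single spike.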

Avoiding Overengineering: Real Examples

Example 1: Blog Platform (10K monthly readers)

Overengineered:

  • Kubernetes cluster
  • Redis cluster with 3 nodes
  • PostgreSQL with read replicas
  • RabbitMQ for "async comments"
  • Cost: $400/month, 2 weeks setup

Actually needed:

  • Single $20/month VPS (DigitalOcean/Linode)
  • SQLite or single PostgreSQL instance
  • No cache (database handles 100 reads/sec easily)
  • Synchronous comments
  • Cost: $20/month, 1 day setup

Example 2: Internal Dashboard (50 users)

Overengineered:

  • Microservices architecture
  • GraphQL API layer
  • Multiple databases (PostgreSQL + MongoDB)
  • Cost: Complex to maintain, slower development

Actually needed:

  • Single Flask/Django/Express app
  • PostgreSQL with proper indexes
  • Server-side caching with Flask-Caching or similar
  • Result: 10× faster development, easier debugging

Example 3: SaaS with 500 Customers

When to add a queue:

  • Email notifications take 2-3 seconds synchronously
  • Users complain about slow page loads during peak
  • Add Redis Queue or BullMQ for email only

When NOT to add a queue:

  • Page loads are 200ms (fast enough)
  • Background jobs complete in <500ms
  • Team has no experience with queue monitoring

Practical Decision Checklist

Before adding complexity, ask:

  1. What's my actual traffic? Measure before planning
  2. Can I optimize the query first? Add indexes before caching
  3. Does the user need the result right away? If yes, keep it sync. If no and it takes more than about a second, queue it
  4. What's the cost of failure? High-stakes data → database, not queue
  5. Can I scale vertically first? Upgrade RAM/CPU before adding servers
  6. Does my team know how to maintain this? Complexity = maintenance cost

The "Boring Technology" Stack That Works

For 90% of everyday applications (up to 100K users):

| Layer | Technology | Why |
| --- | --- | --- |
| Backend | Python (Django/Flask), Node.js, Ruby on Rails | Fast development, huge ecosystem |
| Database | PostgreSQL | Best balance of features, performance, reliability |
| Cache | Redis (single instance) | Simple, fast, works for most caching needs |
| Queue | Redis Queue / BullMQ | Uses existing Redis, no new infrastructure |
| CDN | Cloudflare (free) | CDN + DDoS protection + SSL |
| Hosting | Single VPS or managed platform (Railway, Render) | $20-50/month, no ops overhead |

This stack handles 50K+ monthly users with 1-2 engineers maintaining it.

Final Rule: Design for Your Next Milestone, Not FAANG Scale

Design for 10× your current traffic, not 1000×. If you have 1K users, design for 10K. If you hit 10K, you'll likely have the revenue to hire engineers and refactor then. Most startups die from moving too slowly, not from scaling too late.

The best system design is the simplest one that works for your actual users today, with a clear path to scale when you measure you need it.


Rizwan Saleem — https://rizwansaleem.co
