How to design systems for your actual scale — a practical system design tutorial
System Design for Everyday Applications: Designing for Your Actual Traffic
Most system design content assumes you're building for millions of users on day one. But if you're building a SaaS for 500 customers, an internal tool for your team, or a startup MVP, FAANG-scale patterns are overengineering that wastes time and money. This tutorial covers practical system design for real-world traffic, with capacity decisions sized to the load you actually have.
Start With Your Actual Numbers
Before choosing any technology, calculate your actual load:
| Metric | Small App (1K users) | Medium App (50K users) | When to Worry |
|---|---|---|---|
| Requests/day | ~10K-50K | ~500K-2M | >5M/day |
| RPS (peak) | 1-5 | 20-50 | >100 RPS |
| Database size | <10GB | 10-100GB | >500GB |
| Bandwidth | <100GB/month | 100-500GB/month | >1TB/month |
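For 10K monthly active users with 5 pageviews/user/day at 50KB/page (treating every user as active each day, which is an upper bound):
- Daily bandwidth: 10K × 5 × 50KB = 2.5GB/day
- Peak RPS: ~1-2 requests/second (50K pageviews/day averages about 0.6 per second; peaks run a few times higher)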
You don't need Kubernetes, sharding, or distributed caches at this scale.
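The arithmetic above can be sketched as a small helper so you can plug in your own numbers. This is a rough model, not a benchmark: the function name `estimate_load` and the `peak_factor` of 3 (peak traffic assumed to be three times the daily average) are assumptions introduced for illustration, and each pageview is counted as one request.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_load(users, pageviews_per_user_per_day, page_kb, peak_factor=3.0):
    """Return (daily_gb, average_rps, peak_rps) for a simple traffic model.

    peak_factor is an assumed ratio of peak to average traffic, not a
    measured value; replace it with a figure from your own logs.
    """
    requests_per_day = users * pageviews_per_user_per_day
    # Decimal units: 1 GB = 1,000,000 KB
    daily_gb = requests_per_day * page_kb / 1_000_000
    average_rps = requests_per_day / SECONDS_PER_DAY
    peak_rps = average_rps * peak_factor
    return daily_gb, average_rps, peak_rps


daily_gb, average_rps, peak_rps = estimate_load(10_000, 5, 50)
print(f"{daily_gb:.1f} GB/day, {average_rps:.2f} avg RPS, {peak_rps:.2f} peak RPS")
# prints: 2.5 GB/day, 0.58 avg RPS, 1.74 peak RPS
```

If your real peak-to-average ratio is higher (for example, traffic concentrated in business hours), raise `peak_factor` and see whether the result still lands comfortably below the "When to Worry" column.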
Database Choices: Pick Simple First
SQL Is Your Default
Use PostgreSQL or MySQL when:
- You need transactions (payments, inventory, orders)
- Data has clear relationships (users → orders → items)
- You need complex queries with joins
- Data consistency matters more than write speed
Real example: A 50-customer e-commerce store uses PostgreSQL. A single instance on a $20/month DigitalOcean droplet handles 100 orders/day with sub-100ms queries. Adding read replicas or sharding would be wasted money.
When to Consider NoSQL
Document databases (MongoDB) when:
- Schema changes frequently (product catalogs with varying attributes)
- You store nested JSON data (user preferences, logs)
- You are prototyping rapidly and want to avoid migrations
In-memory databases (Redis) when:
- You need sub-10ms latency (session storage, real-time leaderboards)
- You want a cache layer in front of your primary storage, not a replacement for it
Avoid NoSQL when:
- You primarily need simple key-based lookups (use a cache instead)
- You need strong consistency for financial data
The Index Before You Cache Rule
Before adding Redis, optimize your database query:
-- Slow query (full table scan)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- Add index first
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
A proper index often gives 10-100× speedup without adding infrastructure complexity. Only add caching if:
- The query is still slow after indexing
- The same data is read hundreds of times per second
- Slight staleness (seconds to minutes) is acceptable
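One way to check whether an index is actually used is to compare the query plan before and after creating it. The sketch below uses Python's built-in `sqlite3` module as a stand-in for your real database (an assumption for the sake of a self-contained example; PostgreSQL and MySQL have their own `EXPLAIN` output), with a made-up `orders` table matching the query above.

```python
import sqlite3

QUERY = "SELECT * FROM orders WHERE user_id = ? AND status = ?"


def query_plan(conn):
    """Return SQLite's plan description for QUERY as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (123, "pending")).fetchall()
    # Column 3 of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders (user_id, status) VALUES (?, ?)",
    [(i % 500, "pending" if i % 3 == 0 else "shipped") for i in range(5000)],
)

plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_orders_user_status ON orders(user_id, status)")
plan_after = query_plan(conn)

print("before:", plan_before)  # a full-table SCAN of orders
print("after: ", plan_after)   # a SEARCH that names idx_orders_user_status
```

The exact wording of the plan varies between SQLite versions, so look for the change from a scan to an index search rather than a specific string.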
Caching Strategies That Actually Help
When to Cache
Cache when data is read frequently but modified infrequently:
- User profiles (read 100×, write 1× per day)
- Product catalogs (read 1000×, write 1× per hour)
- Dashboard aggregations (read 50×, recalculate every 5 minutes)
Don't cache:
- Real-time inventory counts (stale data causes overselling)
- User-specific data that changes every request
- Data smaller than 100 rows (database is fast enough)
Cache Patterns for Small Apps
1. Cache-Aside (Simplest, Most Common)
def get_user(user_id):
    cache_key = f"user:{user_id}"
    user = cache.get(cache_key)
    if user is None:
        user = db.query("SELECT * FROM users WHERE id = ?", user_id)
        cache.set(cache_key, user, ttl=300)  # 5 minutes
    return user
- Cache misses hit the database
- Cache entries expire after TTL
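For readers who want to run the pattern end to end, here is a self-contained sketch. It is not the author's implementation: `TTLCache` is a hypothetical in-process stand-in for Redis, the `users` table lives in an in-memory SQLite database, and the `stats` counter exists only to make cache hits and misses visible.

```python
import sqlite3
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}
        self._clock = clock

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._entries[key] = (value, self._clock() + ttl)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")

cache = TTLCache()
stats = {"db_reads": 0}  # counts how often we fall through to the database


def get_user(user_id):
    cache_key = f"user:{user_id}"
    user = cache.get(cache_key)
    if user is None:
        stats["db_reads"] += 1
        user = db.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if user is not None:
            cache.set(cache_key, user, ttl=300)  # 5 minutes
    return user


first = get_user(1)   # miss: reads the database, then fills the cache
second = get_user(1)  # hit: served from the cache
print(first, second, stats)
```

Passing the clock in as a parameter is a design choice that makes expiry testable without sleeping; with Redis you would rely on its own key expiry instead.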
2. CDN for Static Assets
- Images, CSS, JavaScript → Cloudflare (free tier)
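- Can cut origin server load substantially (roughly 60-80% when most requests are for static assets)
- No application logic changes needed: proxy your domain through the CDN or point asset URLs at it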
TTL Guidelines
| Data Type | Recommended TTL |
|---|---|
| User session | 30-60 minutes |
| Product info | 5-15 minutes |
| Blog posts | 1-24 hours |
| Analytics dashboard | 5-30 minutes |
Shorter TTLs = fresher data but more cache misses. Longer TTLs = fewer misses but staler data.
When to Use a Queue vs. Just Optimize
Use a Queue When
Tasks are not part of the user's immediate response:
- Sending emails after signup
- Processing image uploads (resize, generate thumbnails)
- Generating PDF reports
- Webhook delivery to third parties
You have timeout errors from too many simultaneous requests:
- A queue absorbs bursts and smooths traffic to downstream services
Real example: A photo app with 5K users uses Redis Queue for image processing. The upload returns immediately and processing happens in the background. Without the queue, 30-second uploads time out during peak hours.
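The shape of that flow can be sketched with Python's standard library alone. This is an illustration under stated assumptions, not the photo app's code: `queue.Queue` and a thread stand in for Redis Queue and a separate worker process, and `process_image` and `handle_upload` are hypothetical names.

```python
import queue
import threading

jobs = queue.Queue()
processed = []  # stands in for thumbnails written to storage


def process_image(filename):
    """Placeholder for slow work such as resizing (hypothetical)."""
    return f"thumb_{filename}"


def worker():
    while True:
        filename = jobs.get()
        if filename is None:  # sentinel: shut the worker down
            jobs.task_done()
            break
        try:
            processed.append(process_image(filename))
        finally:
            jobs.task_done()


def handle_upload(filename):
    """Request handler: enqueue the slow work and respond right away."""
    jobs.put(filename)
    return {"status": "accepted", "file": filename}


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_upload(name) for name in ("a.jpg", "b.jpg", "c.jpg")]
jobs.join()     # demo only: wait until the worker has drained the queue
jobs.put(None)  # stop the worker
thread.join()
print(responses[0], processed)
```

An in-process queue like this loses its jobs if the process restarts, which is one reason a real deployment would use an external queue and a separate worker.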
Don't Use a Queue When
- The task completes in <1 second (just do it synchronously)
- You need the result immediately to show the user
- You're adding a queue just to be "async" (premature optimization)
Queues add operational complexity: monitoring, retry logic, dead-letter queues, message ordering. If your sync code works fine, keep it sync.
Queue vs Database: Key Difference
| Queues optimize for | Databases optimize for |
|---|---|
| Throughput | Durability |
| Availability | Correctness |
| Flow control | Integrity |
Rule of thumb: If losing it would hurt, don't trust a queue with it. Persist state in a database; use queues to move events.
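A minimal sketch of that rule of thumb: write the job to the database first, put only its id on the queue, and rebuild the queue from the database if messages are lost. The `emails` table, the function names, and the use of `queue.Queue` in place of Redis Queue or BullMQ are all assumptions made for illustration.

```python
import queue
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE emails ("
    "id INTEGER PRIMARY KEY, recipient TEXT, status TEXT DEFAULT 'pending')"
)
events = queue.Queue()  # stand-in for Redis Queue / BullMQ


def request_email(recipient):
    """Record the job durably first, then enqueue only its id."""
    cursor = db.execute("INSERT INTO emails (recipient) VALUES (?)", (recipient,))
    db.commit()
    events.put(cursor.lastrowid)
    return cursor.lastrowid


def requeue_pending():
    """Recovery path: rebuild the queue from rows still marked pending."""
    rows = db.execute(
        "SELECT id FROM emails WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for (email_id,) in rows:
        events.put(email_id)


def drain():
    """Process every queued id; the row, not the message, is the source of truth."""
    while not events.empty():
        email_id = events.get()
        db.execute(
            "UPDATE emails SET status = 'sent' WHERE id = ? AND status = 'pending'",
            (email_id,),
        )
    db.commit()


request_email("a@example.com")
request_email("b@example.com")

while not events.empty():  # simulate the queue losing its contents
    events.get()

requeue_pending()  # nothing is lost: the database still knows what is pending
drain()
```

The `AND status = 'pending'` guard makes the update safe to repeat, which matters because recovery can enqueue an id that was already delivered once.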
Capacity Planning: Reasonable Decisions
Start Small, Scale When Measured
Day 1 architecture for most apps:
- Single application server (2-4 CPU, 4-8GB RAM)
- Managed database (PostgreSQL on AWS RDS, DigitalOcean, Supabase)
- Redis for caching (optional, $5-10/month)
- Cloudflare free tier for CDN + DDoS protection
Cost: ~$30-60/month for 10K monthly users
Scale Vertically Before Horizontally
Vertical scaling (upgrade hardware):
- Fastest path: upgrade from 4GB → 16GB RAM
- No code changes needed
- Works up to ~100K users for most apps
Horizontal scaling (add more servers):
- Requires stateless application design
- Adds load balancer complexity
- Only needed when vertical scaling hits limits
Monitoring Triggers for Scaling
Add capacity when you consistently hit:
- CPU >70% for >5 minutes during peak
- Database connection pool >80% full
- Response time >500ms for >10% of requests
- Error rate >1%
Don't scale based on "what if." Scale based on actual metrics.
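The triggers above translate directly into a check you could run against peak-hour metrics. This is a sketch only: the `PeakMetrics` fields and the `scaling_reasons` function are hypothetical names, and how you collect the numbers depends on your monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class PeakMetrics:
    """Measurements sampled during peak hours (field names are hypothetical)."""
    cpu_percent: float            # sustained CPU utilization
    cpu_sustained_minutes: float  # how long CPU stayed at that level
    pool_utilization: float       # fraction of DB connections in use, 0.0-1.0
    slow_request_fraction: float  # fraction of requests slower than 500 ms
    error_rate: float             # fraction of requests that failed


def scaling_reasons(m):
    """Return the thresholds from the list above that the metrics exceed."""
    reasons = []
    if m.cpu_percent > 70 and m.cpu_sustained_minutes > 5:
        reasons.append("CPU >70% for >5 minutes")
    if m.pool_utilization > 0.80:
        reasons.append("connection pool >80% full")
    if m.slow_request_fraction > 0.10:
        reasons.append(">10% of requests slower than 500ms")
    if m.error_rate > 0.01:
        reasons.append("error rate >1%")
    return reasons


quiet = PeakMetrics(45, 0, 0.30, 0.02, 0.001)
busy = PeakMetrics(85, 12, 0.90, 0.04, 0.002)
print(scaling_reasons(quiet))  # []
print(scaling_reasons(busy))   # ['CPU >70% for >5 minutes', 'connection pool >80% full']
```

An empty list means there is no measured reason to add capacity yet.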
Avoiding Overengineering: Real Examples
Example 1: Blog Platform (10K monthly readers)
Overengineered:
- Kubernetes cluster
- Redis cluster with 3 nodes
- PostgreSQL with read replicas
- RabbitMQ for "async comments"
- Cost: $400/month, 2 weeks setup
Actually needed:
- Single $20/month VPS (DigitalOcean/Linode)
- SQLite or single PostgreSQL instance
- No cache (database handles 100 reads/sec easily)
- Synchronous comments
- Cost: $20/month, 1 day setup
Example 2: Internal Dashboard (50 users)
Overengineered:
- Microservices architecture
- GraphQL API layer
- Multiple databases (PostgreSQL + MongoDB)
- Cost: Complex to maintain, slower development
Actually needed:
- Single Flask/Django/Express app
- PostgreSQL with proper indexes
- Server-side caching with Flask-Caching or similar
- Result: 10× faster development, easier debugging
Example 3: SaaS with 500 Customers
When to add a queue:
- Email notifications take 2-3 seconds synchronously
- Users complain about slow page loads during peak
- Fix: add Redis Queue or BullMQ for email only
When NOT to add a queue:
- Page loads are 200ms (fast enough)
- Background jobs complete in <500ms
- Team has no experience with queue monitoring
Practical Decision Checklist
Before adding complexity, ask:
- What's my actual traffic? Measure before planning
- Can I optimize the query first? Add indexes before caching
- Does the user need the result right away? If yes, keep it sync. If no and the task takes more than about a second, queue it
- What's the cost of failure? High-stakes data → database, not queue
- Can I scale vertically first? Upgrade RAM/CPU before adding servers
- Does my team know how to maintain this? Complexity = maintenance cost
The "Boring Technology" Stack That Works
For 90% of everyday applications (up to 100K users):
| Layer | Technology | Why |
|---|---|---|
| Backend | Python (Django/Flask), Node.js, Ruby on Rails | Fast development, huge ecosystem |
| Database | PostgreSQL | Best balance of features, performance, reliability |
| Cache | Redis (single instance) | Simple, fast, works for most caching needs |
| Queue | Redis Queue / BullMQ | Uses existing Redis, no new infrastructure |
| CDN | Cloudflare (free) | CDN + DDoS protection + SSL |
| Hosting | Single VPS or managed platform (Railway, Render) | $20-50/month, no ops overhead |
This stack handles 50K+ monthly users with 1-2 engineers maintaining it.
Final Rule: Design for Your Next Milestone, Not WWDC
Design for 10× your current traffic, not 1000×. If you have 1K users, design for 10K. If you hit 10K, you'll have revenue to hire engineers and refactor then. Most startups die from moving too slowly, not from scaling too late.
The best system design is the simplest one that works for your actual users today, with a clear path to scale when you measure you need it.
Rizwan Saleem — https://rizwansaleem.co