binadit

Posted on • Originally published at binadit.com

How a SaaS platform cut infrastructure costs by 40% while improving response times

Database queries eating your budget? How we fixed N+1 problems and cut costs 40%

Ever seen your infrastructure costs climb while performance tanks? A European CRM platform with 50K users was burning €18,000 monthly on oversized servers while their API responses crawled at 2.8 seconds p95. Spoiler: throwing more CPU at the problem wasn't the answer.

The problem: classic over-provisioning meets query hell

Their setup screamed "we scaled vertically and hoped for the best":

  • Three oversized VMs running 24/7 at 12% CPU utilization
  • Database server with 32 cores mostly waiting on inefficient queries
  • 2.3M daily API requests, with database queries averaging 450ms each
  • Cache hit rate sitting at a painful 23%

The real kicker? 78% of their traffic happened during European business hours, but they were paying for peak capacity around the clock.

What the audit revealed

We dug into their query patterns and found the usual suspects:

N+1 query nightmare: A single dashboard load triggered 47 separate database queries. Classic lazy-loading problem.

Synchronous external API calls: Adding 200-800ms to user requests for work that should have run as background jobs.

Default Redis config: Treating it like basic key-value storage instead of a proper caching layer.

Even load balancing: Distributing traffic equally regardless of actual server load or time patterns.

The fix: optimize first, scale second

Instead of adding more resources, we focused on making existing ones work smarter.

Database optimization

Replaced N+1 patterns with batch queries and eager loading:

-- Before: 47 queries for user dashboard
SELECT * FROM users WHERE id = ?;
SELECT * FROM projects WHERE user_id = ?;
SELECT * FROM tasks WHERE project_id = ?;
-- ...repeated for each project

-- After: a single JOIN (one of 3 optimized queries)
SELECT u.*, p.*, t.* 
FROM users u 
LEFT JOIN projects p ON u.id = p.user_id 
LEFT JOIN tasks t ON p.id = t.project_id 
WHERE u.id = ?;
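The before/after query counts can be sketched end-to-end in Python with an in-memory SQLite database. The schema and data here are hypothetical stand-ins for the platform's actual tables; the point is that the per-project loop issues 1 + N queries while the JOIN needs one round trip:

```python
import sqlite3

# Hypothetical schema mirroring the SQL example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE projects (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, title TEXT);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO projects VALUES (10, 1, 'crm'), (11, 1, 'billing');
INSERT INTO tasks VALUES (100, 10, 'import'), (101, 11, 'invoice');
""")

def dashboard_n_plus_1(user_id):
    """Before: one extra query per project -> 2 + N queries total."""
    queries = 1
    conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()
    projects = conn.execute(
        "SELECT * FROM projects WHERE user_id = ?", (user_id,)).fetchall()
    queries += 1
    tasks = []
    for p in projects:  # the N+1 loop: one query per project
        tasks += conn.execute(
            "SELECT * FROM tasks WHERE project_id = ?", (p[0],)).fetchall()
        queries += 1
    return queries, tasks

def dashboard_joined(user_id):
    """After: one JOIN fetches the same rows in a single round trip."""
    rows = conn.execute("""
        SELECT u.id, p.id, t.id FROM users u
        LEFT JOIN projects p ON u.id = p.user_id
        LEFT JOIN tasks t ON p.id = t.project_id
        WHERE u.id = ?""", (user_id,)).fetchall()
    return 1, rows
```

With 2 projects the loop version already costs 4 queries; at dashboard scale (dozens of projects) that is how a single page load reaches 47 queries.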

Redis caching strategy

Implemented multi-tier caching:

  • Query results: 15min TTL for user data, 1hr for reference data
  • External API responses: 30min TTL with async refresh
  • Session data cached to eliminate auth lookups
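A minimal in-process sketch of the TTL tiers above (a real deployment would set these via Redis `EXPIRE`; the `TieredCache` class and tier names here are illustrative, not the platform's actual code):

```python
import time

# TTL tiers from the strategy above, in seconds (names are hypothetical).
TTL = {"user": 15 * 60, "reference": 60 * 60, "external_api": 30 * 60}

class TieredCache:
    def __init__(self, clock=time.monotonic):
        self._store = {}    # key -> (value, expires_at)
        self._clock = clock # injectable for testing

    def set(self, key, value, tier):
        """Store a value with the TTL of its tier."""
        self._store[key] = (value, self._clock() + TTL[tier])

    def get(self, key):
        """Return the value, or None on a miss or expired entry."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # expired: evict and report a miss
            del self._store[key]
            return None
        return value
```

Separating tiers matters because reference data changes rarely (so a long TTL is safe) while user data goes stale quickly; one global TTL forces you to pick the worst of both.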

Auto-scaling configuration

min_instances: 1
max_instances: 4
scale_up_trigger: cpu > 70% for 5min
scale_down_trigger: cpu < 30% for 15min
instance_type: c5.large  # down from c5.2xlarge
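The decision logic that config implies can be sketched in a few lines of Python. The key design point is the sustained-duration windows: requiring CPU to stay past the threshold for several minutes prevents flapping on short spikes. (Constant names are made up for the sketch; the real scaler is whatever the cloud provider runs.)

```python
# Thresholds and windows from the YAML above.
SCALE_UP_CPU, SCALE_UP_MIN = 70, 5        # cpu > 70% for 5 minutes
SCALE_DOWN_CPU, SCALE_DOWN_MIN = 30, 15   # cpu < 30% for 15 minutes
MIN_INSTANCES, MAX_INSTANCES = 1, 4

def decide(cpu_per_minute, instances):
    """Return the new instance count given recent per-minute CPU averages."""
    recent_up = cpu_per_minute[-SCALE_UP_MIN:]
    recent_down = cpu_per_minute[-SCALE_DOWN_MIN:]
    if len(recent_up) >= SCALE_UP_MIN and all(c > SCALE_UP_CPU for c in recent_up):
        return min(instances + 1, MAX_INSTANCES)
    if len(recent_down) >= SCALE_DOWN_MIN and all(c < SCALE_DOWN_CPU for c in recent_down):
        return max(instances - 1, MIN_INSTANCES)
    return instances  # inside the 30-70% dead band: hold steady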

Background job processing

Moved external API calls, reports, and emails to async workers:

# Before: blocking user request
response = requests.get(external_api_url)
return render_template('dashboard.html', data=response.json())

# After: cached data with async refresh
data = cache.get(cache_key) or get_cached_default()
refresh_data_async.delay(cache_key)  # Celery task
return render_template('dashboard.html', data=data)
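The same stale-while-revalidate pattern can be shown without Celery, using a plain queue and a worker thread — a minimal stdlib sketch, not the platform's actual code: the request path returns whatever is cached immediately, and the slow fetch happens off to the side.

```python
import queue
import threading

cache = {}
refresh_queue = queue.Queue()

def get_dashboard_data(cache_key, default=None):
    """Serve cached (possibly stale) data; never block on the external API."""
    data = cache.get(cache_key, default)
    refresh_queue.put(cache_key)  # analogous to refresh_data_async.delay(...)
    return data

def refresh_worker(fetch):
    """Background worker: pull keys and repopulate the cache via a slow fetch."""
    while True:
        key = refresh_queue.get()
        if key is None:           # sentinel to shut the worker down
            break
        cache[key] = fetch(key)   # e.g. the external API call
        refresh_queue.task_done()
```

The first request for a key serves a default/placeholder; every request after the refresh completes gets fresh data, and no user request ever waits the 200-800ms the external call costs.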

Results that actually matter

Performance improvements:

  • Database queries: 450ms → 89ms (80% faster)
  • p95 response times: 2.8s → 1.1s (61% improvement)
  • Cache hit rate: 23% → 87%
  • Page load times: 4.2s → 2.1s average

Cost reductions:

  • Total monthly spend: €18,000 → €11,000 (39% reduction)
  • App server costs: 64% reduction through rightsizing
  • Database costs: 47% reduction after optimization
  • CPU utilization: 12% → 45-65% (properly sized resources)

Key takeaways for your next optimization

  1. Profile before you scale: Query analysis revealed the real bottlenecks weren't resource constraints
  2. Cache aggressively: 87% hit rates eliminated most database calls
  3. Scale horizontally with auto-scaling: Match your infrastructure to actual usage patterns
  4. Async everything external: Keep third-party API calls out of user request paths
  5. Monitor utilization: 12% CPU usage on oversized instances is just burning money

The platform now handles 3.7M daily requests (60% more than before) with better performance and 39% lower costs. Sometimes the best scaling strategy is optimizing what you already have.

