binadit

Posted on • Originally published at binadit.com

How to optimize costs without adding servers: a cloud cost optimization guide

Infrastructure bottlenecks are killing your budget: here's how to fix them

Before you spin up another server instance, pause. The performance problem eating your cloud budget is probably an efficiency problem, not a capacity problem. Most infrastructure struggles stem from poorly utilized resources, not from too few of them.

I've seen teams cut infrastructure costs by 40-50% while improving performance simply by optimizing what they already have. Here's the systematic approach that works.

The real problem with "just add more servers"

When response times spike or databases slow down, the knee-jerk reaction is scaling horizontally. But this approach masks underlying inefficiencies and compounds costs. A misconfigured database will perform poorly whether it's running on one server or ten.

Start with baseline measurement

Optimization without measurement is guesswork. Install monitoring tools and capture current performance data before changing anything.

# Install essential monitoring tools
sudo apt update && sudo apt install htop iotop nethogs sysstat

# Enable system statistics
sudo systemctl enable sysstat && sudo systemctl start sysstat

Create a simple monitoring script to track key metrics:

#!/bin/bash
# monitor.sh - run every minute via cron, as a user allowed to write to /var/log
LOG=/var/log/performance.log
{
    echo "$(date): $(uptime)"
    echo "Memory: $(free -h | grep Mem)"
    # -y skips the since-boot averages so the sample reflects the last second
    echo "Disk I/O: $(iostat -xy 1 1 | tail -n +4)"
    echo "---"
} >> "$LOG"
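
Once the log has collected a few hours of samples, a small helper can turn it into a baseline number. This is a minimal sketch, assuming the log lines written by the script above and the Linux `uptime` format, where each sample contains `load average: ` followed by three comma-separated values. The name `summarize_load` is my own, not part of any tool.

```shell
# summarize_load: read monitor.sh output on stdin and print the sample count,
# the mean and the peak of the 1-minute load average.
summarize_load() {
    awk -F'load average: ' '
        NF > 1 {
            split($2, la, ", ")   # la[1] holds the 1-minute load average
            v = la[1] + 0
            sum += v
            n++
            if (v > max) max = v
        }
        END {
            if (n == 0) { print "samples=0"; exit }
            printf "samples=%d mean=%.2f peak=%.2f\n", n, sum / n, max
        }'
}
```

Run it as `summarize_load < /var/log/performance.log` before you change anything, and keep the output to compare against later.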

Find the real bottlenecks

Most performance issues fall into four categories. Use these commands to identify which resources are actually constrained:

CPU usage patterns:

sar -u 1 60  # Monitor CPU for 60 seconds
top -o %CPU  # Find CPU-hungry processes

Memory analysis:

free -h
ps aux --sort=-%mem | head -20  # Top memory consumers

Disk I/O bottlenecks:

iostat -x 1 10  # Look for >90% utilization or high await times

Network utilization:

sudo nethogs -d 5  # Monitor network usage by process (needs root to capture traffic)
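
Reading ten `iostat` reports by eye gets tedious, so the >90% utilization rule can be applied with a filter. The sketch below assumes the sysstat layout in which each device table starts with a `Device` header row containing a `%util` column. It looks the column up by name because its position differs between sysstat versions. `busy_disks` is a hypothetical helper name.

```shell
# busy_disks [THRESHOLD]: read `iostat -x` output on stdin and print every
# device row whose %util is at or above THRESHOLD (default 90).
busy_disks() {
    awk -v limit="${1:-90}" '
        $1 ~ /^Device/ {
            # Find the %util column in this header row
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}
```

For example, `iostat -x 1 10 | busy_disks 90` prints one line per report for each saturated device and nothing when the disks are idle.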

Database optimization delivers the biggest wins

Slow database queries are one of the most common causes of web application bottlenecks, so start optimization here.

Enable slow query logging to identify problematic queries:

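-- Log every statement that runs longer than 2 seconds
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;
-- SET GLOBAL does not survive a restart; add both settings under [mysqld] to keep them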

Analyze slow queries after 24 hours:

# -s t sorts by query time, -t 10 keeps the top 10; adjust the path to match slow_query_log_file
sudo mysqldumpslow -s t -t 10 /var/lib/mysql/slow.log

Add strategic indexes for common query patterns:

-- For ecommerce platforms
ALTER TABLE orders ADD INDEX idx_created_status (created_at, status);
ALTER TABLE products ADD INDEX idx_category_price (category_id, price);

Optimize MySQL memory settings based on available RAM:

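# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
innodb_buffer_pool_size = 5G  # ~60% of available RAM
# query_cache_size = 512M  # MySQL 5.7 and earlier only: the query cache was removed in 8.0, where this line stops the server from starting
tmp_table_size = 256M
max_heap_table_size = 256M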
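
The ~60% figure in the config comment is easy to work out for any server size. A minimal sketch, assuming a Linux host where `/proc/meminfo` reports `MemTotal` in kB and a machine that mostly runs MySQL. Both function names are my own.

```shell
# buffer_pool_mb TOTAL_MB [PERCENT]: print PERCENT (default 60) of TOTAL_MB,
# rounded down to a whole megabyte.
buffer_pool_mb() {
    echo $(( $1 * ${2:-60} / 100 ))
}

# total_ram_mb: read MemTotal from /proc/meminfo (Linux only) and convert kB to MB.
total_ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}
```

On an 8 GB server, `echo "innodb_buffer_pool_size = $(buffer_pool_mb 8192)M"` prints `innodb_buffer_pool_size = 4915M`, which is where a setting of roughly 5G comes from. Use `buffer_pool_mb "$(total_ram_mb)"` to size it for the current host, and pass a lower percentage if the application shares the machine with the database.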

Implement smart caching

Caching often reduces database load more cheaply than adding database servers. Install and configure Redis:

sudo apt install redis-server
sudo systemctl enable redis-server

Configure Redis memory settings:

# /etc/redis/redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
save 900 1

Implement query caching in your application:

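// Assumes the phpredis extension. $database stands for your own data-access
// object; its query() method is a placeholder, not a specific library API.
function getCachedProducts($database, $categoryId) {
    $redis = new Redis();
    $redis->connect('127.0.0.1', 6379);

    $cacheKey = "products_category_" . $categoryId;
    $cached = $redis->get($cacheKey);

    // get() returns false on a cache miss
    if ($cached !== false) {
        return json_decode($cached, true);
    }

    $products = $database->query(
        "SELECT * FROM products WHERE category_id = ?",
        [$categoryId]
    );

    // Keep the result for one hour
    $redis->setex($cacheKey, 3600, json_encode($products));
    return $products;
}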
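
Whether the cache is actually taking load off the database shows up in Redis's own counters. The sketch below reads the `keyspace_hits` and `keyspace_misses` fields from `redis-cli info stats` and prints the hit ratio. `cache_hit_ratio` is a hypothetical helper name, and the carriage returns are stripped because Redis ends its INFO lines with CRLF.

```shell
# cache_hit_ratio: read `redis-cli info stats` output on stdin and print the
# share of key lookups that were served from the cache.
cache_hit_ratio() {
    tr -d '\r' | awk -F: '
        $1 == "keyspace_hits"   { hits = $2 }
        $1 == "keyspace_misses" { misses = $2 }
        END {
            total = hits + misses
            if (total == 0) { print "no lookups yet"; exit }
            printf "%.1f%%\n", 100 * hits / total
        }'
}
```

Run `redis-cli info stats | cache_hit_ratio` after the cache has been live for a while. A ratio that stays low usually means the keys expire too quickly or the cached queries are not the ones your traffic repeats.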

Web server configuration matters

Optimize Nginx based on your actual traffic patterns:

# /etc/nginx/nginx.conf
worker_processes auto;

# worker_connections is only valid inside the events block
events {
    worker_connections 1024;
}

http {
    keepalive_timeout 65;
    gzip on;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/javascript;

    # location blocks must sit inside a server block
    server {
        # Static file caching
        location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
            access_log off;
        }
    }
}

Configure the PHP-FPM process pool:

; /etc/php/8.1/fpm/pool.d/www.conf
pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 35
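
A `pm.max_children` of 50 only makes sense if 50 workers fit in memory. A common rule of thumb, which is my assumption here and not something the pool file encodes, is to divide the memory you can give PHP by the average size of one worker. Both helper names below are hypothetical.

```shell
# avg_rss_mb: read resident set sizes in kB on stdin (one per line, as printed
# by `ps -o rss=`) and print the average in whole MB.
avg_rss_mb() {
    awk '{ sum += $1; n++ } END { if (n) print int(sum / n / 1024); else print 0 }'
}

# fpm_max_children BUDGET_MB WORKER_MB: print how many workers fit in the budget.
# WORKER_MB must be greater than zero.
fpm_max_children() {
    echo $(( $1 / $2 ))
}
```

With a 2000 MB budget and 40 MB workers, `fpm_max_children 2000 40` prints `50`. To measure the worker size on a running server, something like `ps -o rss= -C php-fpm8.1 | avg_rss_mb` should work, assuming the process is named `php-fpm8.1` as on Debian and Ubuntu packages.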

Measure success with numbers

After implementing optimizations, measure improvements using the same baseline metrics:

# Compare CPU utilization (replace XX with the two-digit day of the month)
sar -u -f /var/log/sysstat/saXX | grep Average

# Check memory improvement
free -h

# Test response times (ab ships in the apache2-utils package)
ab -n 1000 -c 10 http://yoursite.com/
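
To turn two `ab` runs into a single number, pull the mean request time out of each report and compute the change. A minimal sketch, assuming the standard ApacheBench report line `Time per request: ... [ms] (mean)`. Both function names are my own.

```shell
# ab_mean_ms: read ApacheBench output on stdin and print the mean time per
# request in milliseconds (the line ending in "(mean)", not the per-concurrency one).
ab_mean_ms() {
    awk '/^Time per request:/ && /\(mean\)[ \t]*$/ { print $4; exit }'
}

# improvement_pct BEFORE AFTER: print how much faster AFTER is, in percent.
improvement_pct() {
    awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f\n", 100 * (b - a) / b }'
}
```

Save the output of each run to a file and extract the mean with `ab_mean_ms < before.txt`. If the mean dropped from 250 ms to 175 ms, `improvement_pct 250 175` prints `30.0`.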

Successful optimization typically shows:

  • 20-50% faster response times
  • Reduced database queries per page
  • Stable memory usage
  • Lower CPU peaks

Avoid these optimization traps

  1. Optimizing everything at once - Implement changes incrementally to isolate their impact
  2. Optimizing without profiling - Measure first; don't guess what needs optimization
  3. Not monitoring during changes - An improvement in one area may degrade another

The long-term strategy

Effective cost optimization requires ongoing attention to infrastructure efficiency. The goal isn't just reducing immediate costs, but building systems that scale efficiently.

Most performance problems that seem to require additional servers actually indicate inefficient resource usage. Focus on building optimization into your deployment pipeline and monitoring strategy.

Set up automated alerts for key performance metrics to catch issues before they require emergency scaling. Plan regular optimization reviews as your application grows and usage patterns evolve.
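
An alert does not need a full monitoring stack to be useful. The sketch below flags a host whose 1-minute load per CPU core crosses a limit. The 0.8 default is an arbitrary starting point, and `load_alert` is a hypothetical name.

```shell
# load_alert LOAD CORES [LIMIT]: print the load per core, and return 1 with an
# ALERT line when it exceeds LIMIT (default 0.8).
load_alert() {
    awk -v load="$1" -v cores="$2" -v limit="${3:-0.8}" 'BEGIN {
        per_core = load / cores
        if (per_core > limit) { printf "ALERT load/core=%.2f\n", per_core; exit 1 }
        printf "OK load/core=%.2f\n", per_core
    }'
}
```

On Linux, run it from cron as `load_alert "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)" || echo "high load on $(hostname)"` and send that message to whatever notification channel you already use.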
