
Scaling Rails Background Jobs in Kubernetes: From Queue to HPA

Ever tried processing a million records in a Rails controller action? Yeah, that's not going to end well. Your users will be staring at a spinning wheel, your server will be gasping for resources, and your ops team will be giving you that "we need to talk" look.

The Problem: Long-Running Requests

Picture this: Your Rails app needs to:

  • Generate complex reports from millions of records
  • Process large file uploads
  • Send thousands of notifications
  • Sync data with external systems

[Image: Long-running requests handled inline in Rails]

Doing any of these in a controller action means:

  • Timeout issues (Nginx, Rails, Load Balancer)
  • Blocked server resources
  • Poor user experience
  • Potential data inconsistency if the request fails

Step 1: Moving to Background Processing

First, let's move these long-running tasks to background jobs:

# app/controllers/reports_controller.rb
class ReportsController < ApplicationController
  def create
    report_id = SecureRandom.uuid
    ReportGenerationJob.perform_later(
      user_id: current_user.id,
      report_id: report_id,
      parameters: report_params
    )

    render json: { 
      report_id: report_id,
      status: 'processing',
      status_url: report_status_path(report_id)
    }
  end
end

# app/jobs/report_generation_job.rb
class ReportGenerationJob < ApplicationJob
  queue_as :reports

  def perform(user_id:, report_id:, parameters:)
    # Process report
    report_data = generate_report(parameters)

    # Store results
    store_report(report_id, report_data)

    # Notify user
    ReportMailer.completed(user_id, report_id).deliver_now
  end
end

[Image: Long-running requests in Rails, offloaded to async jobs]

Great! Now our users get immediate feedback, and our server isn't blocked. But we've just moved the problem - now it's in our job queue.
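
One loose end: the status_url returned by the controller needs an endpoint the client can poll. Here's a minimal sketch, assuming a `resources :report_statuses, only: :show` route and a hypothetical Report record that ReportGenerationJob updates as it runs:

# app/controllers/report_statuses_controller.rb (sketch)
class ReportStatusesController < ApplicationController
  def show
    # Assumes the job persists status/results on a Report record
    report = current_user.reports.find_by!(report_id: params[:id])

    render json: {
      report_id: report.report_id,
      status: report.status,  # e.g. 'processing' or 'completed'
      download_url: report.completed? ? report.download_url : nil
    }
  end
end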

The Scaling Challenge

A single Rails worker instance with Sidekiq needs proper configuration for queues and concurrency. Here's a basic setup:

# config/initializers/sidekiq.rb
# Note: This is a simplified example to demonstrate the concept.
# Actual syntax might vary based on your Sidekiq version and requirements.

Sidekiq.configure_server do |config|
  # Configure Redis connection
  config.redis = { url: ENV.fetch('REDIS_URL', 'redis://localhost:6379/0') }

  # Configure concurrency based on environment
  # (newer Sidekiq versions drop config.options; prefer config/sidekiq.yml below)
  config.options[:concurrency] =
    case Rails.env
    when 'production'
      ENV.fetch('SIDEKIQ_CONCURRENCY', 25).to_i
    else
      10
    end

  # Queue configuration with weights for priority
  # (must live inside the configure_server block, where `config` is in scope)
  config.options[:queues] = [
    ['critical', 5],      # Higher weight = higher priority
    ['sequential', 3],
    ['default', 2],
    ['low', 1]
  ]
end

And in your config/sidekiq.yml:

# Note: This is a simplified example. Adjust based on your needs
:verbose: false
:concurrency: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 25) %>
:timeout: 25

# Environment-specific configurations
production:
  :concurrency: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 25) %>
  :queues:
    - [critical, 5]
    - [sequential, 3]
    - [default, 2]
    - [low, 1]
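Sidekiq reads config/sidekiq.yml automatically on boot, running it through ERB first (which is why the <%= %> tags above work). You can also point it at a file explicitly:

bundle exec sidekiq -C config/sidekiq.yml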

This gives us 25 concurrent jobs in production, but what happens when:

  • We have 1000 reports queued up
  • Some jobs need to run sequentially (like financial transactions)
  • Different jobs need different resources
  • We have mixed workloads (quick jobs vs long-running jobs)

Queue Strategy: Not All Jobs Are Equal

Let's organize our jobs based on their processing requirements:

# Note: `sidekiq_options` is Sidekiq-specific, and its support on ActiveJob
# classes varies by Sidekiq version. ActiveJob's own retry API is the
# portable choice for ApplicationJob subclasses.

class FinancialTransactionJob < ApplicationJob
  queue_as :sequential
  retry_on StandardError, attempts: 3

  def perform(transaction_id)
    # Must process one at a time (see the dedicated worker note below)
    process_transaction(transaction_id)
  end
end

class ReportGenerationJob < ApplicationJob
  queue_as :default
  retry_on StandardError, attempts: 5

  def perform(report_id)
    # Can process many simultaneously
    generate_report(report_id)
  end
end

class NotificationJob < ApplicationJob
  queue_as :low
  retry_on StandardError, attempts: 3

  def perform(user_ids)
    # Quick jobs, high volume
    send_notifications(user_ids)
  end
end
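A note on that sequential queue: Sidekiq itself doesn't serialize jobs within a queue; every thread on every worker pulls from it concurrently. One common approach (a sketch, not the only option; locking gems like sidekiq-unique-jobs are another) is to run a dedicated single-threaded Sidekiq process for that queue:

# Dedicated process: only the sequential queue, one thread
bundle exec sidekiq -q sequential -c 1

In Kubernetes, this becomes its own Deployment with replicas: 1, kept separate from the autoscaled worker pool described next.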

Enter Kubernetes HPA: Dynamic Worker Scaling

[Image: Long-running requests in Rails, with jobs on Kubernetes HPA]

Now we can set up our worker deployment and HPA:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rails-workers
spec:
  selector:
    matchLabels:
      app: rails-workers
  template:
    metadata:
      labels:
        app: rails-workers
    spec:
      containers:
      - name: sidekiq
        image: myapp/rails:latest
        command: ["bundle", "exec", "sidekiq"]
        env:
        - name: RAILS_ENV
          value: "production"
        - name: SIDEKIQ_CONCURRENCY
          value: "25"
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "2Gi"
            cpu: "2"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rails-workers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rails-workers
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: sidekiq_queue_depth
      target:
        type: AverageValue
        averageValue: 100
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60

This setup gives us:

  • Minimum 2 worker pods (50 concurrent jobs)
  • Maximum 10 worker pods (250 concurrent jobs)
  • Automatic scaling based on queue depth
  • Conservative scale-down to prevent thrashing
  • Resource limits to protect our cluster
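
It helps to know the math the HPA runs: with a Pods metric, desiredReplicas = ceil(currentReplicas × currentAverageValue / targetAverageValue). So if 1,000 reports pile up across 2 pods (an average of 500 against our target of 100), the HPA wants ceil(2 × 500 / 100) = 10 pods, and the scaleUp policy above throttles the ramp to 2 new pods per minute.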

Monitoring and Fine-Tuning

To make this work smoothly, monitor:

  1. Queue depths by queue type
  2. Job processing times
  3. Error rates
  4. Resource utilization

Add Prometheus metrics:

# config/initializers/sidekiq.rb
# Note: a sketch using the prometheus-client gem (2.x API). The registry
# still needs to be exposed over HTTP (e.g. via Prometheus::Middleware::Exporter)
# for Prometheus to scrape it.
require 'sidekiq/api'

Sidekiq.configure_server do |config|
  config.on(:startup) do
    queue_depth_gauge = Prometheus::Client.registry.gauge(
      :sidekiq_queue_depth,
      docstring: 'Sidekiq queue depth',
      labels: [:queue]
    )

    # Refresh queue sizes periodically in a background thread
    Thread.new do
      loop do
        Sidekiq::Queue.all.each do |queue|
          queue_depth_gauge.set(queue.size, labels: { queue: queue.name })
        end
        sleep 30
      end
    end
  end
end
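One more piece before the HPA can act on this: Kubernetes doesn't query Prometheus directly, so a metrics adapter (prometheus-adapter, or alternatively KEDA) has to map the Prometheus series onto the custom metrics API. A rough prometheus-adapter rule might look like this; treat it as a sketch, since series and label names depend on your scrape configuration:

# prometheus-adapter rule (sketch)
rules:
- seriesQuery: 'sidekiq_queue_depth'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "sidekiq_queue_depth"
    as: "sidekiq_queue_depth"
  metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'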

Best Practices and Gotchas

  1. Queue Isolation

    • Separate queues for different job types
    • Consider dedicated workers for critical queues
    • Use queue priorities effectively
  2. Resource Management

    • Set appropriate memory/CPU limits
    • Monitor job memory usage
    • Use batch processing for large datasets
  3. Error Handling

    • Implement retry strategies
    • Set up dead-letter handling (see the sketch after this list)
    • Monitor failed jobs
  4. Scaling Behavior

    • Set appropriate scaling thresholds
    • Use stabilization windows
    • Consider time-of-day patterns
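
On the dead-letter point above: when a Sidekiq job exhausts its retries it lands in Sidekiq's Dead set, and a death handler is a convenient hook for alerting. A minimal sketch:

# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  # Called once per job when it has exhausted all retries
  config.death_handlers << ->(job, exception) do
    Rails.logger.error(
      "Job died: #{job['class']} (jid=#{job['jid']}): #{exception.message}"
    )
    # Wire up your alerting of choice here (Slack, PagerDuty, ...)
  end
end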

Conclusion

By combining Rails' background job capabilities with Kubernetes' scaling features, we can build a robust, scalable system for processing long-running tasks. The key is to:

  1. Move long-running tasks to background jobs
  2. Organize queues based on job characteristics
  3. Configure worker processes appropriately
  4. Use HPA for dynamic scaling
  5. Monitor and adjust based on real-world usage

Remember: The goal isn't just to scale - it's to provide a reliable, responsive system that efficiently processes work while maintaining data consistency and user experience.
