Kav Pather for Air Pipe

Posted on • Originally published at blog.airpipe.io on

Latency based container scaling with Orbit

Haven't read Part 1? Start with Building Orbit: A Lightweight Container Orchestrator in Rust to learn about our journey's beginning.

In our previous article, we introduced Orbit, our lightweight container orchestrator built in Rust. Since then, we've made significant improvements driven by both community feedback and production requirements. Let's dive into the technical evolution that's making Orbit even more powerful and efficient.

Community-Driven Development

One of the most exciting aspects of Orbit's development has been the community engagement. A perfect example is our implementation of CoDel (Controlled Delay) for scaling decisions, which came directly from a community member's suggestion on Medium. We're also grateful to community members like Josselin Chevalay who contributed the pull_policy feature in our latest release, allowing control over container image pulling behavior. This collaborative approach will continue to help shape Orbit's feature set and technical direction.

Technical Evolution: Key Improvements

1. CoDel (Controlled Delay)-Inspired Scaling: Latency-Driven Container Orchestration

Unlike traditional orchestrators that rely solely on CPU and memory metrics, we've implemented CoDel-inspired scaling - a feature not natively available in Kubernetes or other major orchestrators. Here's how it works:

pub struct CoDelMetrics {
    service_name: String,
    sojourn_times: VecDeque<(Instant, Duration)>,
    first_above_time: Option<Instant>,
    last_scale_time: Instant,
    config: CoDelConfig,
}
This metric tracker is driven by a per-service configuration:
name: adaptive-scaling
instance_count:
  min: 2
  max: 10

# CoDel-inspired adaptive scaling based on request latency
codel:
  target: 100ms                  # Target latency threshold
  interval: 1s                   # Interval for checking delays
  consecutive_intervals: 3       # Number of intervals above target before scaling
  max_scale_step: 1              # Maximum instances to scale up at once
  scale_cooldown: 30s            # Minimum time between scaling actions
  overload_status_code: 503     # Return 503 when overloaded

# Fine-tune scaling behavior
scaling_policy:
  cooldown_duration: 60s         # Wait time between scaling actions
  scale_down_threshold_percentage: 50.0  # Scale down if usage below 50%

spec:
  containers:
    - name: main
      image: airpipeio/infoapp:latest
      ports:
        - port: 80
          node_port: 4335

The CoDel-inspired implementation monitors request latency and makes scaling decisions based on both immediate and historical performance data. Benefits include:

  • More responsive scaling based on actual service performance

  • Better handling of latency spikes

  • Prevention of unnecessary scale-ups during temporary load increases

Note that this is just our initial implementation; we'll continue to improve it where possible, and may rename it when appropriate.
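To make the control loop concrete, here's a minimal sketch of how a CoDel-style scale-up decision could work, mapping `target`, `consecutive_intervals`, and `scale_cooldown` from the config above onto a small state machine. The type and method names here are hypothetical illustrations, not Orbit's actual internals:

```rust
use std::time::{Duration, Instant};

/// Simplified CoDel-style gate: scale up only after latency has stayed
/// above `target` for `consecutive_intervals` intervals in a row, and
/// at most once per `scale_cooldown`.
struct CoDelGate {
    target: Duration,
    consecutive_intervals: u32,
    scale_cooldown: Duration,
    intervals_above: u32,
    last_scale: Option<Instant>,
}

impl CoDelGate {
    fn new(target: Duration, consecutive_intervals: u32, scale_cooldown: Duration) -> Self {
        Self { target, consecutive_intervals, scale_cooldown, intervals_above: 0, last_scale: None }
    }

    /// Called once per interval with that interval's worst observed latency.
    /// Returns true when the service should scale up (by `max_scale_step`).
    fn observe(&mut self, worst_latency: Duration, now: Instant) -> bool {
        if worst_latency > self.target {
            self.intervals_above += 1;
        } else {
            self.intervals_above = 0; // any good interval resets the streak
        }
        let cooled_down = self
            .last_scale
            .map_or(true, |t| now.duration_since(t) >= self.scale_cooldown);
        if self.intervals_above >= self.consecutive_intervals && cooled_down {
            self.intervals_above = 0;
            self.last_scale = Some(now);
            return true;
        }
        false
    }
}
```

With `target: 100ms` and `consecutive_intervals: 3`, a single 150 ms spike does nothing, but three bad intervals in a row trigger one scale-up, after which the cooldown suppresses further actions.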

Key Differences from Traditional CoDel:

  • Service-Level Application:

    • Our implementation applies CoDel principles at the service level rather than packet level
    • Uses request latency instead of packet sojourn time
    • Focuses on scaling rather than packet dropping
  • State Management:

    • This is simpler than traditional CoDel's state machine.
pub struct CoDelMetrics {
    sojourn_times: VecDeque<(Instant, Duration)>,
    first_above_time: Option<Instant>,
    last_scale_time: Instant,
}

2. Health Monitoring

We've added health monitoring with TCP health checks:

pub struct HealthCheckConfig {
    pub startup_timeout: Duration,
    pub startup_failure_threshold: u32,
    pub liveness_period: Duration,
    pub liveness_failure_threshold: u32,
    pub tcp_check: Option<TcpHealthCheck>,
}

This system provides:

  • Configurable health check parameters

  • TCP-level connectivity verification

  • Granular control over failure thresholds

  • Separate startup and liveness checks
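As a rough illustration of the TCP check (hypothetical names, not Orbit's implementation): an instance counts as healthy if a TCP connection to its address succeeds within a timeout, and is flagged for restart after `liveness_failure_threshold` consecutive failures:

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::Duration;

/// TCP-level connectivity check: healthy if a connection succeeds in time.
fn tcp_health_check(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Tracks consecutive failures against a threshold, mirroring
/// `liveness_failure_threshold` in the config struct above.
struct LivenessTracker {
    failures: u32,
    threshold: u32,
}

impl LivenessTracker {
    fn new(threshold: u32) -> Self {
        Self { failures: 0, threshold }
    }

    /// Records one probe result; returns true when the instance
    /// has failed enough times in a row to warrant a restart.
    fn record(&mut self, healthy: bool) -> bool {
        if healthy {
            self.failures = 0;
            false
        } else {
            self.failures += 1;
            self.failures >= self.threshold
        }
    }
}

/// Self-contained demo: bind an ephemeral local port and probe it.
fn demo_local_check() -> bool {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind failed");
    let addr = listener.local_addr().expect("no local addr");
    tcp_health_check(addr, Duration::from_millis(500))
}
```

Keeping startup and liveness as separate threshold counters (as in `HealthCheckConfig`) lets slow-starting containers get a longer grace period without loosening the steady-state check.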

3. Performance Optimizations

We've made several low-level optimizations to improve performance:

Switching to FxHashMap/FxHashSet

use rustc_hash::{FxHashMap, FxHashSet};

pub static INSTANCE_STORE: OnceLock<
    Arc<RwLock<FxHashMap<String, FxHashMap<Uuid, InstanceMetadata>>>>
> = OnceLock::new();

By replacing standard HashMap with FxHashMap:

  • Reduced memory overhead

  • Faster hash computation

  • Better performance for string keys

  • Lower collision rates in our specific use cases

4. Improved Resource Management

We've implemented a more sophisticated resource management system:

pub struct ResourceThresholds {
    pub cpu_percentage: Option<u8>,
    pub cpu_percentage_relative: Option<u8>,
    pub memory_percentage: Option<u8>,
    pub metrics_strategy: PodMetricsStrategy,
}

This allows for:

  • Fine-grained control over resource utilization

  • Better handling of CPU quota management

  • More accurate memory tracking

  • Customizable metrics aggregation strategies
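For illustration, here is a simplified take on how such thresholds might gate a scaling decision. The field names mirror the struct above, but the evaluation logic is our own sketch, not Orbit's actual code; unset (`None`) thresholds are simply ignored:

```rust
/// Subset of the threshold config shown above (illustrative only).
struct ResourceThresholds {
    cpu_percentage: Option<u8>,
    memory_percentage: Option<u8>,
}

/// True if observed usage exceeds any configured threshold.
/// A `None` threshold means "don't consider this metric".
fn over_threshold(t: &ResourceThresholds, cpu_pct: f64, mem_pct: f64) -> bool {
    let cpu_over = t.cpu_percentage.map_or(false, |limit| cpu_pct > limit as f64);
    let mem_over = t.memory_percentage.map_or(false, |limit| mem_pct > limit as f64);
    cpu_over || mem_over
}
```

Making each threshold optional is what gives the fine-grained control listed above: a service can scale purely on CPU, purely on memory, or on either.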

Real-World Impact

These improvements have had significant real-world impact:

  • 30% reduction in unnecessary scaling operations

  • More stable performance under varying load conditions

  • Reduced resource usage in the orchestrator itself

  • Better handling of microservices with varying performance characteristics

  • Still managed to retain a <5MB binary size footprint

What's Next: Decentralized Clustering!?

We're excited to explore our next major development focus: a decentralized clustering solution. This will allow Orbit to:

  • Operate without a central control plane

  • Provide better resilience in edge deployments

  • Enable peer-to-peer node coordination

  • Support dynamic cluster topology changes

We have some initial ideas on how to design the solution, so please follow for our next update to see how we hope to make this happen!

Building at Scale with Air Pipe

While Orbit handles container orchestration, it's just one piece of the puzzle. At Air Pipe, we're building a comprehensive platform for creating scalable, resilient APIs, integrations, and workflows. Our platform enables you to:

  • Build and deploy scalable APIs with minimal boilerplate

  • Create robust integration workflows

  • Implement resilient data processing pipelines

  • Leverage edge computing capabilities

If you're building distributed systems or scalable applications, visit airpipe.io to learn how our platform can accelerate your development.

Get Involved

We're building Orbit in the open and value community input, whether you're interested in the technical details or want to contribute to our upcoming clustering features.

Stay tuned for our next technical deep-dive where we'll explore the architecture of our decentralized clustering approach!
