Haven't read Part 1? Start with Building Orbit: A Lightweight Container Orchestrator in Rust to learn about our journey's beginning.
In our previous article, we introduced Orbit, our lightweight container orchestrator built in Rust. Since then, we've made significant improvements driven by both community feedback and production requirements. Let's dive into the technical evolution that's making Orbit even more powerful and efficient.
Community-Driven Development
One of the most exciting aspects of Orbit's development has been the community engagement. A perfect example is our implementation of CoDel (Controlled Delay) for scaling decisions, which came directly from a community member's suggestion on Medium. We're also grateful to community members like Josselin Chevalay who contributed the pull_policy feature in our latest release, allowing control over container image pulling behavior. This collaborative approach will continue to help shape Orbit's feature set and technical direction.
Technical Evolution: Key Improvements
1. CoDel (Controlled Delay)-Inspired Scaling: Latency-Driven Container Orchestration
Unlike traditional orchestrators that rely solely on CPU and memory metrics, we've implemented CoDel-inspired scaling, a feature not natively available in Kubernetes or other major orchestrators. Here's how it works:
pub struct CoDelMetrics {
    service_name: String,
    sojourn_times: VecDeque<(Instant, Duration)>,
    first_above_time: Option<Instant>,
    last_scale_time: Instant,
    config: CoDelConfig,
}
A service opts into this behavior in its configuration:

name: adaptive-scaling
instance_count:
  min: 2
  max: 10
# CoDel-inspired adaptive scaling based on request latency
codel:
  target: 100ms              # Target latency threshold
  interval: 1s               # Interval for checking delays
  consecutive_intervals: 3   # Number of intervals above target before scaling
  max_scale_step: 1          # Maximum instances to scale up at once
  scale_cooldown: 30s        # Minimum time between scaling actions
  overload_status_code: 503  # Return 503 when overloaded
# Fine-tune scaling behavior
scaling_policy:
  cooldown_duration: 60s                 # Wait time between scaling actions
  scale_down_threshold_percentage: 50.0  # Scale down if usage below 50%
spec:
  containers:
    - name: main
      image: airpipeio/infoapp:latest
      ports:
        - port: 80
          node_port: 4335
The CoDel-inspired implementation monitors request latency and makes intelligent scaling decisions based on both immediate and historical performance data. Benefits include:
More responsive scaling based on actual service performance
Better handling of latency spikes
Prevention of unnecessary scale-ups during temporary load increases
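To make the idea concrete, here is a minimal sketch of what a CoDel-inspired scale-up check can look like. This is an illustration of the technique, not Orbit's actual implementation: the `CoDelScaler` type and its `record` method are hypothetical names, and the logic is simplified to the core idea of classic CoDel applied to requests: only scale up when even the *best-case* latency in the window stays above the target for a sustained interval, which filters out brief spikes.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical sketch of a CoDel-inspired scale-up check (not Orbit's real API).
pub struct CoDelScaler {
    sojourn_times: VecDeque<(Instant, Duration)>, // (arrival time, observed latency)
    first_above_time: Option<Instant>,            // when latency first exceeded target
    last_scale_time: Option<Instant>,             // last scaling action, for cooldown
    target: Duration,                             // e.g. 100ms
    interval: Duration,                           // e.g. 1s
    scale_cooldown: Duration,                     // e.g. 30s
}

impl CoDelScaler {
    pub fn new(target: Duration, interval: Duration, scale_cooldown: Duration) -> Self {
        Self {
            sojourn_times: VecDeque::new(),
            first_above_time: None,
            last_scale_time: None,
            target,
            interval,
            scale_cooldown,
        }
    }

    /// Record one request latency; returns true when a scale-up is warranted.
    pub fn record(&mut self, now: Instant, latency: Duration) -> bool {
        self.sojourn_times.push_back((now, latency));
        // Drop samples older than the observation interval.
        while let Some(&(t, _)) = self.sojourn_times.front() {
            if now.duration_since(t) > self.interval {
                self.sojourn_times.pop_front();
            } else {
                break;
            }
        }
        // Like classic CoDel, look at the *minimum* latency in the window:
        // if even the fastest request is slow, the service is truly overloaded.
        let min_latency = self.sojourn_times.iter().map(|&(_, d)| d).min().unwrap();
        if min_latency <= self.target {
            self.first_above_time = None; // latency recovered; reset the grace window
            return false;
        }
        // Latency above target: start (or continue) the grace window.
        let first = *self.first_above_time.get_or_insert(now);
        let sustained = now.duration_since(first) >= self.interval;
        let cooled_down = self
            .last_scale_time
            .map_or(true, |t| now.duration_since(t) >= self.scale_cooldown);
        if sustained && cooled_down {
            self.last_scale_time = Some(now);
            self.first_above_time = None;
            true
        } else {
            false
        }
    }
}
```

A momentary 200ms spike therefore never triggers a scale-up on its own; the latency has to stay above the target for a full interval first.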
Note that this is an initial implementation; we will continue to improve it where possible and may rename the feature when appropriate.
Key Differences from Traditional CoDel:

Service-Level Application:
- Our implementation applies CoDel principles at the service level rather than the packet level
- Uses request latency instead of packet sojourn time
- Focuses on scaling rather than packet dropping

State Management:
- Our state tracking is simpler than traditional CoDel's state machine:
pub struct CoDelMetrics {
    sojourn_times: VecDeque<(Instant, Duration)>,
    first_above_time: Option<Instant>,
    last_scale_time: Instant,
}
2. Health Monitoring
We've added health monitoring with TCP health checks:
pub struct HealthCheckConfig {
    pub startup_timeout: Duration,
    pub startup_failure_threshold: u32,
    pub liveness_period: Duration,
    pub liveness_failure_threshold: u32,
    pub tcp_check: Option<TcpHealthCheck>,
}
This system provides:
Configurable health check parameters
TCP-level connectivity verification
Granular control over failure thresholds
Separate startup and liveness checks
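As a rough sketch of what sits behind those knobs (illustrative only, not Orbit's actual code): a TCP check succeeds if a connection to the instance completes within a timeout, and a failure counter compared against the configured threshold decides when an instance is considered unhealthy. The names `tcp_health_check` and `LivenessState` below are hypothetical.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical sketch: a TCP health probe passes if a connection
/// to the instance's address completes within the timeout.
pub fn tcp_health_check(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Track consecutive probe failures against a configured threshold.
pub struct LivenessState {
    consecutive_failures: u32,
    failure_threshold: u32,
}

impl LivenessState {
    pub fn new(failure_threshold: u32) -> Self {
        Self { consecutive_failures: 0, failure_threshold }
    }

    /// Record one probe result; returns true while the instance is still healthy.
    pub fn observe(&mut self, probe_ok: bool) -> bool {
        if probe_ok {
            self.consecutive_failures = 0; // any success resets the counter
        } else {
            self.consecutive_failures += 1;
        }
        self.consecutive_failures < self.failure_threshold
    }
}
```

Keeping startup and liveness as separate counters, as the config above does, avoids killing slow-starting containers with a threshold tuned for steady-state probes.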
3. Performance Optimizations
We've made several low-level optimizations to improve performance:
Switching to FxHashMap/FxHashSet
use rustc_hash::{FxHashMap, FxHashSet};

pub static INSTANCE_STORE: OnceLock<
    Arc<RwLock<FxHashMap<String, FxHashMap<Uuid, InstanceMetadata>>>>,
> = OnceLock::new();
By replacing the standard HashMap with FxHashMap, we gained:
Reduced memory overhead
Faster hash computation
Better performance for string keys
Lower collision rates in our specific use cases
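Because FxHashMap is API-compatible with the standard library's HashMap, the swap itself is essentially a type change. The sketch below shows the lazily initialized global-store pattern behind `INSTANCE_STORE`; it uses std's `HashMap`, a `u64` instance id, and a `String` payload as stand-ins for `FxHashMap`, `Uuid`, and `InstanceMetadata` so it runs without extra dependencies, and the helper names are illustrative.

```rust
use std::collections::HashMap;
use std::sync::{Arc, OnceLock, RwLock};

// Stand-in for Orbit's store type: in Orbit these are FxHashMap (rustc_hash),
// Uuid, and InstanceMetadata; the structure and access pattern are the same.
type InstanceStore = Arc<RwLock<HashMap<String, HashMap<u64, String>>>>;

static INSTANCE_STORE: OnceLock<InstanceStore> = OnceLock::new();

/// Lazily initialize the global store on first access.
fn instance_store() -> &'static InstanceStore {
    INSTANCE_STORE.get_or_init(|| Arc::new(RwLock::new(HashMap::new())))
}

/// Register an instance under a service name.
pub fn register_instance(service: &str, instance_id: u64, meta: String) {
    instance_store()
        .write()
        .unwrap()
        .entry(service.to_string())
        .or_default()
        .insert(instance_id, meta);
}

/// Count the registered instances for a service.
pub fn instance_count(service: &str) -> usize {
    instance_store()
        .read()
        .unwrap()
        .get(service)
        .map_or(0, |m| m.len())
}
```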
4. Improved Resource Management
We've implemented a more sophisticated resource management system:
pub struct ResourceThresholds {
    pub cpu_percentage: Option<u8>,
    pub cpu_percentage_relative: Option<u8>,
    pub memory_percentage: Option<u8>,
    pub metrics_strategy: PodMetricsStrategy,
}
This allows for:
Fine-grained control over resource utilization
Better handling of CPU quota management
More accurate memory tracking
Customizable metrics aggregation strategies
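As a simplified illustration of how such thresholds can gate a decision (this is a sketch, not Orbit's actual logic, and `exceeds_thresholds` is a hypothetical helper): each `Option` field means "only consider this metric if it is configured", and a decision fires when any configured limit is exceeded.

```rust
/// Simplified stand-in for the thresholds struct above; `None` means
/// "don't consider this metric when making decisions".
pub struct ResourceThresholds {
    pub cpu_percentage: Option<u8>,
    pub memory_percentage: Option<u8>,
}

/// Hypothetical helper: true if any *configured* threshold is exceeded
/// by the observed usage percentages.
pub fn exceeds_thresholds(t: &ResourceThresholds, cpu_pct: f64, mem_pct: f64) -> bool {
    let cpu_over = t.cpu_percentage.map_or(false, |lim| cpu_pct > lim as f64);
    let mem_over = t.memory_percentage.map_or(false, |lim| mem_pct > lim as f64);
    cpu_over || mem_over
}
```

Leaving a field unset simply takes that metric out of the decision, which is what makes the control fine-grained per service.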
Real-World Impact
These improvements have had significant real-world impact:
30% reduction in unnecessary scaling operations
More stable performance under varying load conditions
Reduced resource usage in the orchestrator itself
Better handling of microservices with varying performance characteristics
Still managed to retain a <5MB binary size footprint
What's Next: Decentralized Clustering!?
We're excited to explore our next major development focus: a decentralized clustering solution. This will allow Orbit to:
Operate without a central control plane
Provide better resilience in edge deployments
Enable peer-to-peer node coordination
Support dynamic cluster topology changes
We have some initial ideas on how to design the solution, so please follow for our next update to see how we hope to make this happen!
Building at Scale with Air Pipe
While Orbit handles container orchestration, it's just one piece of the puzzle. At Air Pipe, we're building a comprehensive platform for creating scalable, resilient APIs, integrations, and workflows. Our platform enables you to:
Build and deploy scalable APIs with minimal boilerplate
Create robust integration workflows
Implement resilient data processing pipelines
Leverage edge computing capabilities
If you're building distributed systems or scalable applications, visit airpipe.io to learn how our platform can accelerate your development.
Get Involved
We're building Orbit in the open and value community input. Whether you're interested in the technical details or want to contribute to our upcoming clustering features:
Star us on GitHub
Join our Discord community
Visit Air Pipe
Follow our progress as we build the decentralized clustering solution
Stay tuned for our next technical deep-dive where we'll explore the architecture of our decentralized clustering approach!