Running Cluster on 100% Spot Instances: How K8s Does It Better Than ECS

Rex Zhen

The Challenge

Spot instances offer 60-90% cost savings, but come with a catch: a termination warning of just two minutes on AWS (and as little as 30 seconds on some other clouds). This creates reliability challenges: pod disruptions, sudden capacity drops, and potential service degradation.

After running workloads on both ECS and Kubernetes with spot instances, I've found K8s provides architectural advantages that ECS simply cannot match. K8s has native features for coordinated shutdown, flexible scheduling constraints, and priority-based resource management that make 100% spot clusters production-viable.

Here's how K8s handles spot terminations differently.

K8s Features for Spot Reliability (Overview)

K8s provides a comprehensive set of primitives for handling spot terminations:

  1. Graceful Shutdown: Application-level SIGTERM handling with request draining
  2. Readiness Probe: Fast endpoint removal with failureThreshold: 1 (ECS equivalent: ALB health checks, limited to load balancer scenarios)
  3. PreStop Hook: Coordinate shutdown timing before SIGTERM (No ECS equivalent - critical gap)
  4. Over-Provisioning: Run excess capacity; still cheaper on spot than minimal on-demand
  5. topologySpreadConstraints: Automatic multi-zone distribution with rebalancing
  6. Soft Anti-Affinity: preferredDuringScheduling adapts to capacity (ECS has only hard constraints)
  7. PriorityClass: Priority-based eviction for instant capacity reclamation (No ECS equivalent)
  8. HorizontalPodAutoscaler: Asymmetric scaling - fast up, slow down

Key insight: 9 pods on spot ($270/mo) < 5 pods on-demand ($500/mo) with superior reliability.
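To make that arithmetic explicit (the per-pod figures are illustrative, derived from the totals above by assuming roughly $100/month of on-demand compute per pod and a 70% spot discount):

  • On-demand minimum: 5 pods × $100 = $500/mo
  • Spot over-provisioned: 9 pods × $30 = $270/mo

That's 80% more capacity at roughly half the price, which is what makes the over-provisioning strategies below affordable.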

(See Appendix for complete production-ready K8s configuration)


K8s vs ECS: Feature Comparison

Platform capability analysis for spot instance workloads:

| Strategy | Kubernetes | ECS | Key Difference |
| --- | --- | --- | --- |
| 1. Graceful shutdown | ✓ Yes | ✓ Yes | Application-level; identical implementation |
| 2a. Readiness probe | ✓ Yes | ⚠ Partial | K8s: any Service, failureThreshold: 1. ECS: ALB/NLB only, minimum 2 checks |
| 2b. PreStop hook | ✓ Yes | ✗ No | Critical gap: K8s delays SIGTERM for coordination. ECS: immediate SIGTERM causes ALB draining race condition |
| 3. Over-provisioning | ✓ Yes | ✓ Yes | Conceptually similar; K8s features amplify effectiveness |
| 4. Multi-zone | ✓ Yes | ⚠ Limited | K8s: topologySpreadConstraints with auto-rebalancing. ECS: task placement strategies, less dynamic |
| 5. Soft anti-affinity | ✓ Yes | ✗ No | K8s exclusive: adaptive constraints for dynamic capacity. ECS: hard constraints only, tasks can block |
| 6. Overprovisioner | ✓ Yes | ✗ No | K8s exclusive: PriorityClass enables instant replacement. ECS: no priority-based eviction |
| 7. Asymmetric HPA | ✓ Yes | ✓ Yes | K8s: HorizontalPodAutoscaler. ECS: Application Auto Scaling, comparable |
| PodDisruptionBudget | ✓ Yes | ✗ No | K8s only (voluntary disruptions, not spot reclaims) |
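A side note on that last row: a PodDisruptionBudget only protects against voluntary disruptions (node drains, cluster upgrades), not spot reclaims, but it pairs naturally with the other strategies. A minimal sketch, with an illustrative name and threshold:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
spec:
  minAvailable: 5          # Voluntary drains may never take us below the 5-pod minimum
  selector:
    matchLabels:
      app: web-api         # Matches the Deployment's pod labels
```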

Key Architectural Gaps in ECS

ECS missing capabilities:

  1. PreStop hooks - No coordination mechanism; immediate SIGTERM creates load balancer draining race conditions
  2. Soft constraints - All-or-nothing placement; tasks remain Pending when constraints conflict with capacity
  3. Priority-based eviction - No overprovisioner pattern; cannot reclaim capacity from low-priority workloads
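Since the overprovisioner pattern comes up repeatedly, here is a minimal sketch of what ECS cannot express (names and sizes are illustrative): a Deployment of pause pods at negative priority reserves headroom, and the scheduler preempts them the instant a real pod needs the space.

```yaml
# Negative priority: every real workload outranks these placeholder pods
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                   # Below the default priority (0), so real pods preempt instantly
globalDefault: false
description: "Placeholder pods reserving spare capacity for spot churn"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-reservation
spec:
  replicas: 3
  selector:
    matchLabels:
      app: capacity-reservation
  template:
    metadata:
      labels:
        app: capacity-reservation
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9   # Does nothing; exists only to hold resource requests
        resources:
          requests:
            cpu: "500m"                    # Sized to match one web-api pod
            memory: "512Mi"
```

When a spot node disappears, displaced workload pods preempt these pause pods immediately instead of waiting for a replacement node to boot.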

ECS limited capabilities:

  1. Health checks - ALB/NLB only (K8s: any Service including internal mesh)
  2. Multi-zone placement - Static strategies (K8s: dynamic rebalancing)

ECS equivalent capabilities:

  1. Graceful shutdown - Application-level implementation
  2. Over-provisioning - Task count management
  3. Auto-scaling - Target tracking policies

Observed impact:

  • ECS: 0.1-1% error rate during spot terminations
  • K8s: <0.05% error rate with proper configuration

Node Management Layer: AWS wins

Beyond pod orchestration, there's another layer to consider: node management automation. AWS provides superior options here—EKS with Karpenter offers intelligent bin-packing at EC2 pricing ($89/month for our workload), while GKE Autopilot charges a serverless premium ($118/month for the same). For cost-conscious architectures, AWS's node management solutions (Karpenter in EKS, managed scaling in ECS) deliver better economics than GKE Autopilot's per-pod pricing model. (I'll cover this in detail in a separate post on Karpenter vs Autopilot cost models.)
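As a teaser for that post, here is roughly what a spot-only Karpenter NodePool looks like. This is a sketch, not a complete setup: it assumes the karpenter.sh/v1 API and an existing EC2NodeClass named "default" that defines AMIs, subnets, and security groups.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-only
spec:
  template:
    spec:
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot"]             # Provision spot capacity only
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                # Assumed to exist in your cluster
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # The bin-packing behavior mentioned above
```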


Conclusion

K8s provides architectural primitives that enable production-grade spot instance reliability:

  • PreStop hooks eliminate shutdown race conditions
  • Soft constraints adapt to dynamic capacity
  • PriorityClass enables instant replacement

Combined with over-provisioning economics (spot discounts make excess capacity cheaper than minimal on-demand), these features make 100% spot clusters viable for production workloads.

The key difference from ECS: K8s doesn't just manage containers - it provides coordination mechanisms that enable graceful degradation under failure.

Running spot workloads? What strategies have worked for your architecture?


Appendix: Complete K8s Configuration

Production-ready configuration implementing all strategies:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-production
spec:
  replicas: 9  # Over-provisioning: run 80% more pods than minimum (need 5, run 9)

  selector:
    matchLabels:
      app: web-api

  strategy:
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 1

  template:
    metadata:
      labels:
        app: web-api  # Referenced by the selector, spread constraints, and anti-affinity below
    spec:
      terminationGracePeriodSeconds: 30  # Total time budget for graceful shutdown
      priorityClassName: high-priority   # For overprovisioner pattern with pause pods

      # Multi-zone distribution (ECS equivalent: task placement strategies)
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web-api

      # Soft anti-affinity (K8s exclusive - ECS can't do soft constraints!)
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:  # "preferred" = soft
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web-api
              topologyKey: kubernetes.io/hostname

      containers:
      - name: api
        image: my-api:v1.0.0

        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "2000m"    # Allow 4x burst
            memory: "2Gi"

        # Readiness probe (similar to ECS ALB health checks, but works without LB)
        readinessProbe:
          httpGet:
            path: /ready    # Your app must implement this endpoint
            port: 8080
          periodSeconds: 5
          failureThreshold: 1  # K8s allows 1, ECS minimum is 2

        # PreStop hook (K8s exclusive - ECS has NO equivalent!)
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - |
                sleep 5      # Delay SIGTERM for endpoint removal propagation
                kill -TERM 1 # Trigger app's graceful shutdown handler
                sleep 20     # Allow app to drain in-flight requests

---
# HorizontalPodAutoscaler - Asymmetric scaling (fast up, slow down)
# ECS equivalent: Application Auto Scaling with target tracking policies
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api-production
  minReplicas: 9              # Maintain over-provisioned baseline
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60  # Scale proactively before capacity exhaustion
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0   # No delay - respond to spikes immediately
      policies:
      - type: Percent
        value: 100            # Aggressive: double capacity if needed
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300  # Conservative: wait 5 min to preserve buffer
      policies:
      - type: Pods
        value: 1              # Remove only 1 pod/min to maintain over-provisioning
        periodSeconds: 60
```
