We encountered a strange issue where build agents were failing randomly while running Jenkins pipelines on Kubernetes.
Agent pods would start normally and builds would run successfully for some time. Then the pods would terminate unexpectedly, causing pipeline failures.
Initially, we investigated:
• Jenkins logs
• Pipeline configuration
• Docker build stages
• SonarQube scans
Everything looked normal.
The real cause became clear after inspecting pod events:
kubectl describe pod jenkins-agent-pod
The Events section showed:
Evicted
The node was low on resource: ephemeral-storage
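To confirm that an eviction is storage-related, a couple of read-only kubectl checks help (the node name below is a placeholder for one of your cluster's nodes):

```shell
# Look for a DiskPressure condition with Status=True on the node
kubectl describe node <node-name> | grep -A 10 "Conditions:"

# List recent eviction events in the current namespace
kubectl get events --field-selector reason=Evicted
```

If DiskPressure is True, the kubelet will keep evicting pods on that node until disk usage falls back below the eviction threshold.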
What Consumes Ephemeral Storage in CI Agents?
CI agent pods often consume more disk than expected due to:
• Docker image layers
• Dependency downloads
• Temporary build files
• Test artifacts
• Coverage reports
• SonarQube cache
• Package manager caches
Unlike CPU and memory, ephemeral storage is frequently ignored in resource configuration.
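When you suspect a specific agent pod, a quick way to see where the disk is going is to run du inside it. The paths below are typical for the Jenkins inbound agent image; adjust them for your own workspace layout:

```shell
# Show the largest directories in the agent workspace and /tmp,
# sorted by size (paths are an assumption about the agent image)
kubectl exec jenkins-agent-pod -- sh -c \
  'du -sh /home/jenkins/agent/* /tmp/* 2>/dev/null | sort -h | tail -20'
```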
Why Does This Cause Pipeline Failures?
When ephemeral storage usage exceeds node capacity:
• Kubernetes marks the node as under disk pressure
• Pods get evicted
• Jenkins agents disappear
• Pipelines fail unexpectedly
Since Jenkins does not clearly indicate storage-related failures, this often looks like a Jenkins or pipeline problem.
The Fix
We resolved the issue by explicitly defining ephemeral storage resources:
resources:
  requests:
    cpu: 500m
    memory: 1Gi
    ephemeral-storage: 4Gi
  limits:
    cpu: 2
    memory: 4Gi
    ephemeral-storage: 10Gi
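For context, this block sits under the agent container in the pod template. A minimal sketch of where it goes (the container name and image here are illustrative, not taken from our actual setup):

```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: jnlp
      image: jenkins/inbound-agent:latest
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
          ephemeral-storage: 4Gi
        limits:
          cpu: 2
          memory: 4Gi
          ephemeral-storage: 10Gi
```

With the request set, the scheduler only places the agent on nodes with enough free ephemeral storage; with the limit set, a runaway build gets its own pod evicted instead of pushing the whole node into disk pressure.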
Additional improvements included:
• Cleaning workspace after builds
• Splitting heavy pipelines into separate agents
• Increasing node storage capacity
• Reducing artifact retention
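The workspace cleanup step can be done declaratively in the Jenkinsfile using the cleanWs step from the Workspace Cleanup plugin. A sketch, assuming that plugin is installed and the build step is a placeholder:

```groovy
pipeline {
  agent { label 'kubernetes' }
  stages {
    stage('Build') {
      steps {
        sh './build.sh'   // placeholder for your actual build
      }
    }
  }
  post {
    always {
      cleanWs()   // frees the agent's ephemeral storage after every run
    }
  }
}
```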
Key Takeaway
If Jenkins agents fail randomly in Kubernetes, always check pod events:
kubectl describe pod jenkins-agent-pod
Ephemeral storage exhaustion is one of the most common but overlooked causes of CI/CD instability.