Shriharsh Pandurang Gaikwad

Dynamic Jenkins Agents with Kubernetes and Docker: Scale Your CI/CD Infrastructure Elastically

Introduction

Hook
Start with a relatable pain point:

  • "Remember the days of maintaining a pool of static Jenkins build servers? Constant capacity planning, resource waste during off-hours, and bottlenecks during peak deployment times?"
  • Brief story of a team spending thousands on idle build infrastructure

The Problem Statement

  • Static Jenkins agents are expensive and inefficient
  • Resource contention during peak hours
  • Different projects require different build environments
  • Maintenance overhead of keeping agents updated
  • Difficulty scaling globally across teams

The Solution Preview

  • Introduce dynamic agent provisioning with Kubernetes
  • Benefits: elasticity, isolation, cost optimization, consistency
  • What readers will learn: architecture, implementation, optimization strategies

Article Roadmap
Quick overview of sections to set expectations


Section 1: Architecture Deep Dive
1.1 Traditional Jenkins Architecture Review

  • Master-agent architecture recap
  • Static agent pool limitations
  • Resource allocation challenges

1.2 Kubernetes-Native Jenkins Architecture
Diagram/Visual: Jenkins Master → Kubernetes API → Dynamic Pods
Key Components:

  • Jenkins Master: Orchestrates builds, runs in Kubernetes as a deployment
  • Kubernetes Plugin: Communicates with K8s API to provision agents
  • Pod Templates: Define agent specifications (containers, resources, volumes)
  • Dynamic Agents: Ephemeral pods created on-demand, destroyed after use

1.3 How It Works: The Agent Lifecycle
Step-by-step flow:

  1. Pipeline triggered
  2. Jenkins requests agent from Kubernetes
  3. K8s schedules pod with specified containers
  4. Pod pulls Docker images and starts
  5. The agent in the pod connects back to Jenkins over JNLP or WebSocket
  6. Build executes in pod containers
  7. Pod terminates and cleans up automatically

1.4 Benefits Quantified

  • Cost savings: Real metrics (e.g., "Reduce idle resource costs by 60-80%")
  • Scalability: Handle 10x more concurrent builds
  • Isolation: Every build gets fresh environment
  • Flexibility: Different tools/versions per project

Section 2: Prerequisites and Setup
2.1 What You'll Need
Infrastructure:

  • Kubernetes cluster (v1.24+) - EKS, GKE, AKS, or self-managed
  • Minimum 3 nodes recommended
  • kubectl configured and authenticated

Jenkins:

  • Jenkins 2.4xx or newer (LTS recommended)
  • Admin access to install plugins
  • Existing Jenkins or new installation

Knowledge Prerequisites:

  • Understanding of Kubernetes concepts (pods, namespaces, services)
  • Familiarity with Jenkins pipelines (declarative or scripted)
  • Docker image building basics

2.2 Namespace and RBAC Setup
Code Block: Kubernetes manifest for:

# ServiceAccount, Role, RoleBinding for Jenkins
# Permissions needed: pods (create, delete, list, watch)
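A minimal manifest granting those permissions might look like the following sketch (namespace and resource names are illustrative; the pods/exec and pods/log rules are needed for the container() step and log streaming):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
  namespace: jenkins
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-agent-manager
  namespace: jenkins
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec", "pods/log"]
    verbs: ["create", "get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins-agent-manager
  namespace: jenkins
subjects:
  - kind: ServiceAccount
    name: jenkins
    namespace: jenkins
roleRef:
  kind: Role
  name: jenkins-agent-manager
  apiGroup: rbac.authorization.k8s.io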

2.3 Installing the Kubernetes Plugin

  • Navigate to Manage Jenkins → Plugin Manager
  • Search for "Kubernetes" plugin
  • Install and restart Jenkins
  • Verify installation

Section 3: Configuring Jenkins Kubernetes Cloud
3.1 Initial Cloud Configuration
Step-by-step with screenshots/annotations:

Navigate to: Manage Jenkins → Clouds → New Cloud

Configure Kubernetes connection:

  • Kubernetes URL (in-cluster or external)
  • Kubernetes Namespace
  • Credentials (service account token)
  • Jenkins URL and tunnel

Code Block: Example configuration as code (JCasC):

jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"
        serverUrl: "https://kubernetes.default"
        namespace: "jenkins"
        jenkinsUrl: "http://jenkins:8080"
        jenkinsTunnel: "jenkins-agent:50000"

3.2 Testing the Connection

  • Use "Test Connection" button
  • Troubleshooting common issues:

    • Certificate validation errors
    • Network connectivity
    • RBAC permissions

3.3 Pod Template Configuration Basics
Essential settings explained (an annotated example follows this list):

  • Name and Labels: Identifying agents
  • Containers: Define build environment(s)
  • Volumes: Persistent data, Docker socket, caching
  • Resource Limits: CPU and memory constraints
  • Service Account: Pod-level permissions
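To see how these settings fit together, here is a bare-bones pod template YAML with each of them annotated (the image, labels, and PVC names are placeholders to adapt):

apiVersion: v1
kind: Pod
metadata:
  labels:
    jenkins: agent                      # Name and Labels: how pipelines target this agent
spec:
  serviceAccountName: jenkins-agent     # Service Account: pod-level permissions
  containers:
  - name: build
    image: maven:3.8-openjdk-11         # Containers: the build environment
    command: ['cat']
    tty: true
    resources:                          # Resource Limits: CPU and memory constraints
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1"
    volumeMounts:
    - name: build-cache                 # Volumes: persistent data and caching
      mountPath: /root/.m2
  volumes:
  - name: build-cache
    persistentVolumeClaim:
      claimName: maven-cache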

Section 4: Creating Your First Pod Template
4.1 Simple Pod Template: Single Container
Practical Example: Basic Maven build agent
Configuration walkthrough:

// Pod template definition
podTemplate(
  name: 'maven-agent',
  label: 'maven',
  containers: [
    containerTemplate(
      name: 'maven',
      image: 'maven:3.8-openjdk-11',
      ttyEnabled: true,
      command: 'cat'
    )
  ]
)

Explanation of the key parameters:

  • command: 'cat' keeps the container alive so Jenkins can run build steps inside it
  • ttyEnabled: true allocates a TTY for interactive shell steps

4.2 Using the Template in a Pipeline
Complete pipeline example:

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: maven
    image: maven:3.8-openjdk-11
    command:
    - cat
    tty: true
'''
        }
    }
    stages {
        stage('Build') {
            steps {
                container('maven') {
                    sh 'mvn clean package'
                }
            }
        }
    }
}

Running your first build:

  • Create new pipeline job
  • Watch pod creation in K8s: kubectl get pods -n jenkins -w
  • Observe automatic cleanup after completion

Section 5: Advanced Pod Templates
5.1 Multi-Container Pods
Use Case: Build that requires multiple tools (build, test, scan)
Example: Node.js app with Docker build capability

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: node
    image: node:18-alpine
    command: ['cat']
    tty: true
  - name: docker
    image: docker:24-dind
    securityContext:
      privileged: true
  - name: trivy
    image: aquasec/trivy:latest
    command: ['cat']
    tty: true
'''
        }
    }
    stages {
        stage('Install Dependencies') {
            steps {
                container('node') {
                    sh 'npm ci'
                }
            }
        }
        stage('Build Docker Image') {
            steps {
                container('docker') {
                    sh 'docker build -t myapp:${BUILD_NUMBER} .'
                }
            }
        }
        stage('Security Scan') {
            steps {
                container('trivy') {
                    sh 'trivy image myapp:${BUILD_NUMBER}'
                }
            }
        }
    }
}

Key Concepts Explained:

  • Container isolation and when to use multiple containers
  • Switching between containers with container() block
  • Shared workspace volume across containers

5.2 Volume Mounts and Caching
Problem: Repeated dependency downloads slow builds
Solution: Persistent volume claims for caching

containers:
  - name: build
    volumeMounts:
      - name: maven-cache
        mountPath: /root/.m2
      - name: npm-cache
        mountPath: /root/.npm
volumes:
  - name: maven-cache
    persistentVolumeClaim:
      claimName: maven-cache
  - name: npm-cache
    persistentVolumeClaim:
      claimName: npm-cache

Implementation tips:

  • PVC creation and sizing (sample manifest below)
  • Cache invalidation strategies
  • ReadWriteMany vs ReadWriteOnce considerations
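A sketch of the cache PVC referenced above (size, namespace, and access mode are assumptions to adjust for your cluster and storage class):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: maven-cache
  namespace: jenkins
spec:
  accessModes:
    - ReadWriteMany      # required if several concurrent agent pods share the cache
  resources:
    requests:
      storage: 10Gi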

5.3 Docker-in-Docker (DinD) Configuration
Two approaches compared:
Approach 1: Docker-in-Docker (privileged)

- name: docker
  image: docker:24-dind
  securityContext:
    privileged: true
  volumeMounts:
    - name: docker-sock      # emptyDir shared with build containers so they can reach the DinD daemon's socket
      mountPath: /var/run

Approach 2: Docker socket mounting (host Docker)

volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock

Security considerations:

  • Risks of running privileged containers
  • Implications of exposing the host Docker socket
  • Alternatives: Kaniko or Buildah for rootless builds (sketch below; full Kaniko example in Section 8)
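As a rootless sketch, a Kaniko builder container can replace the privileged DinD container entirely; registry credentials come from a mounted secret (names mirror the full example in Section 8):

containers:
- name: kaniko
  image: gcr.io/kaniko-project/executor:debug
  command: ['/busybox/cat']
  tty: true
  volumeMounts:
  - name: docker-config
    mountPath: /kaniko/.docker
volumes:
- name: docker-config
  secret:
    secretName: docker-registry-credentials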

5.4 Environment Variables and Secrets
Injecting configuration securely:

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: builder
    image: alpine
    env:
    - name: ENVIRONMENT
      value: "production"
    - name: API_KEY
      valueFrom:
        secretKeyRef:
          name: build-secrets
          key: api-key
'''
        }
    }
}

Best practices:

  • Kubernetes Secrets for sensitive data (manifest sketch below)
  • ConfigMaps for non-sensitive configuration
  • Jenkins credentials integration
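For completeness, the build-secrets Secret referenced in the pod spec above could be created from a manifest like this (the value is a placeholder; in practice it would come from your secret store):

apiVersion: v1
kind: Secret
metadata:
  name: build-secrets
  namespace: jenkins
type: Opaque
stringData:
  api-key: "replace-me"   # stringData accepts the plain value; Kubernetes stores it base64-encoded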

5.5 Resource Management
Setting requests and limits:

containers:
  - name: maven
    image: maven:3.8-openjdk-11
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1000m"

Guidance:

  • Rightsizing based on build requirements
  • Impact of resource constraints on scheduling (see the namespace ResourceQuota sketch below)
  • Monitoring and adjustment strategies
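To keep runaway or queued builds from starving the namespace, a ResourceQuota on the agent namespace is a common companion to per-pod limits (the figures below are assumptions to size for your cluster):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: jenkins-agents
  namespace: jenkins
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "20"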

Section 6: Reusable Pod Templates with Shared Libraries
6.1 The Problem with Inline YAML

  • Duplication across pipelines
  • Difficult to maintain and update
  • No single, centrally versioned source of truth

6.2 Creating Centralized Pod Templates
Option 1: Jenkins Configuration as Code

jenkins:
  clouds:
    - kubernetes:
        templates:
          - name: "maven-template"
            label: "maven"
            yaml: |
              apiVersion: v1
              kind: Pod
              spec:
                containers:
                - name: maven
                  image: maven:3.8-openjdk-11

Option 2: Shared Library with Pod Template

// vars/mavenPod.groovy — the file name must match the mavenPod step used below
def call(Closure body) {
    podTemplate(
        yaml: libraryResource('podtemplates/maven.yaml')
    ) {
        body()
    }
}

// Usage in pipeline:
mavenPod {
    node(POD_LABEL) {
        container('maven') {
            sh 'mvn clean install'
        }
    }
}
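The template itself lives in the shared library under resources/podtemplates/maven.yaml — a minimal sketch mirroring the earlier example:

# resources/podtemplates/maven.yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: maven
    image: maven:3.8-openjdk-11
    command: ['cat']
    tty: true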

Benefits:

  • Single source of truth
  • Version controlled templates
  • Easy updates across all pipelines

Section 7: Production Best Practices
7.1 Resource Optimization
Strategies:

  • Pod retention: Keep pods for debugging (podRetention: onFailure)
  • Idle timeout: Reclaim resources from stalled builds
  • Concurrent build limits: Per-label restrictions
  • Node affinity: Direct builds to appropriate node pools

Example configuration:

podTemplate(
    idleMinutes: 5,
    podRetention: onFailure(),
    activeDeadlineSeconds: 3600,
    nodeSelector: 'workload=builds'
) { ... }

7.2 Security Hardening
Essential measures:

  • Namespace isolation: Separate namespaces per team/project
  • Network policies: Restrict pod-to-pod communication
  • Pod security policies/standards: Enforce non-root containers (securityContext sketch after the network policy below)
  • Image scanning: Integrate Trivy/Anchore in templates
  • Secret management: External secret stores (Vault, AWS Secrets Manager)

Example network policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: jenkins-agents
spec:
  podSelector:
    matchLabels:
      jenkins: agent
  policyTypes:
  - Ingress
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: jenkins
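And a non-root securityContext sketch of the kind pod security standards would enforce on agent containers (UID/GID values are arbitrary examples):

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
  containers:
  - name: build
    image: maven:3.8-openjdk-11
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]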

7.3 High Availability Configuration
Jenkins Master HA:

  • Active-passive setup with shared PVC
  • Cloud provider load balancers
  • Health checks and auto-recovery

Agent connectivity resilience:

  • WebSocket connection (preferred over JNLP; JCasC snippet below)
  • Connection retry configuration
  • Graceful pod eviction handling
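WebSocket mode can be switched on in the same JCasC cloud definition shown earlier (a minimal sketch; with WebSocket the separate agent tunnel port is no longer needed):

jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"
        serverUrl: "https://kubernetes.default"
        namespace: "jenkins"
        jenkinsUrl: "http://jenkins:8080"
        webSocket: true   # agents connect back over HTTP(S) instead of the JNLP TCP port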

7.4 Monitoring and Observability
Key metrics to track:

  • Agent provisioning time
  • Build queue length
  • Pod failure rates
  • Resource utilization

Tools integration:

  • Prometheus metrics from Jenkins
  • Kubernetes metrics server
  • Grafana dashboards

Sample queries:

# Average agent startup time
rate(jenkins_pod_launch_duration_seconds_sum[5m]) / 
rate(jenkins_pod_launch_duration_seconds_count[5m])

# Failed pod launches
sum(rate(jenkins_pod_launch_failed_total[5m]))

7.5 Troubleshooting Common Issues
Problem-Solution quick reference:

  • Pod stuck in Pending → check node capacity, resource quotas, and nodeSelector/tolerations
  • ImagePullBackOff → verify the image name/tag and image pull secrets
  • Agent pod starts but never connects → check the Jenkins URL/tunnel, service account token, and network policies
  • Pod is gone before you can debug → set podRetention: onFailure() on the template
  • Builds drop mid-run → prefer WebSocket connections and review idle/eviction settings

Debugging techniques:

  • kubectl describe pod for pod events
  • kubectl logs for container logs
  • Jenkins system logs for connection issues
  • Enable debug logging: add a Jenkins logger at FINEST for org.csanchez.jenkins.plugins.kubernetes (e.g. java.util.logging.ConsoleHandler.level = FINEST)

Section 8: Real-World Implementation Example
8.1 Case Study: Multi-Stage Application Pipeline
Scenario: Complete CI/CD for a microservice
Architecture:

  • Source: Git repository
  • Build: Maven/Gradle
  • Test: JUnit, integration tests
  • Security: SAST, container scanning
  • Deploy: Helm chart to staging

Complete pipeline with optimized pod template:

@Library('shared-library') _

pipeline {
    agent {
        kubernetes {
            yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: microservice-builder
spec:
  serviceAccountName: jenkins-agent
  containers:
  - name: maven
    image: maven:3.8-openjdk-11
    command: ['cat']
    tty: true
    volumeMounts:
    - name: m2-cache
      mountPath: /root/.m2
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1000m"
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug
    command: ['/busybox/cat']
    tty: true
    volumeMounts:
    - name: docker-config
      mountPath: /kaniko/.docker
  - name: trivy
    image: aquasec/trivy:latest
    command: ['cat']
    tty: true
  - name: helm
    image: alpine/helm:latest
    command: ['cat']
    tty: true
  volumes:
  - name: m2-cache
    persistentVolumeClaim:
      claimName: maven-cache
  - name: docker-config
    secret:
      secretName: docker-registry-credentials
"""
        }
    }

    environment {
        APP_NAME = 'my-microservice'
        IMAGE_REGISTRY = 'myregistry.io'
        IMAGE_TAG = "${env.BUILD_NUMBER}"
    }

    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }

        stage('Build & Test') {
            steps {
                container('maven') {
                    sh '''
                        mvn clean verify
                        mvn sonar:sonar -Dsonar.host.url=${SONAR_URL}
                    '''
                }
            }
            post {
                always {
                    junit '**/target/surefire-reports/*.xml'
                    jacoco()
                }
            }
        }

        stage('Build Image') {
            steps {
                container('kaniko') {
                    sh """
                        /kaniko/executor \
                          --context=\${WORKSPACE} \
                          --dockerfile=Dockerfile \
                          --destination=${IMAGE_REGISTRY}/${APP_NAME}:${IMAGE_TAG} \
                          --destination=${IMAGE_REGISTRY}/${APP_NAME}:latest \
                          --cache=true \
                          --cache-ttl=24h
                    """
                }
            }
        }

        stage('Security Scan') {
            steps {
                container('trivy') {
                    sh """
                        trivy image \
                          --severity HIGH,CRITICAL \
                          --exit-code 1 \
                          ${IMAGE_REGISTRY}/${APP_NAME}:${IMAGE_TAG}
                    """
                }
            }
        }

        stage('Deploy to Staging') {
            when {
                branch 'main'
            }
            steps {
                container('helm') {
                    sh """
                        helm upgrade --install ${APP_NAME} ./helm \
                          --namespace staging \
                          --set image.tag=${IMAGE_TAG} \
                          --wait
                    """
                }
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "Build ${env.BUILD_NUMBER} succeeded!"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "Build ${env.BUILD_NUMBER} failed!"
            )
        }
    }
}

8.2 Performance Analysis
Metrics from this implementation:

  • Agent startup: 30-45 seconds (cold start)
  • Maven build cache hit: 70% faster subsequent builds
  • Kaniko layer caching: 60% faster image builds
  • Total pipeline time: ~8 minutes (vs 15 minutes with static agents)

8.3 Cost Comparison
Before (Static Agents):

  • 5 x m5.xlarge EC2 instances (24/7)
  • Monthly cost: ~$720

After (Dynamic K8s Agents):

  • Shared K8s cluster resources
  • Average concurrent builds: 3-4 pods
  • Monthly cost: ~$200-250
  • Savings: 65-70%

Section 9: Advanced Patterns and Tips
9.1 Matrix Builds with Dynamic Agents
Running parallel builds across configurations:

pipeline {
    agent none
    stages {
        stage('Test Matrix') {
            matrix {
                axes {
                    axis {
                        name 'JAVA_VERSION'
                        values '11', '17', '21'
                    }
                    axis {
                        name 'OS'
                        values 'alpine', 'ubuntu'
                    }
                }
                agent {
                    kubernetes {
                        yaml """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: java
    image: openjdk:${JAVA_VERSION}-${OS}
    command: ['cat']
    tty: true
"""
                    }
                }
                stages {
                    stage('Test') {
                        steps {
                            container('java') {
                                sh 'java -version'
                                sh './mvnw test'  // Maven wrapper, since the plain JDK image does not bundle mvn
                            }
                        }
                    }
                }
            }
        }
    }
}

9.2 Spot/Preemptible Instances for Cost Savings
Kubernetes node pools strategy:

  • Regular node pool for critical jobs
  • Spot instance pool for non-critical builds
  • Pod tolerations and node affinity, for example:

spec:
  tolerations:
  - key: "spot"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  nodeSelector:
    workload: spot-builds

9.3 Git Repository Caching
Speed up checkouts with persistent volumes:

volumes:
  - name: git-cache
    persistentVolumeClaim:
      claimName: git-reference-cache

Pipeline usage:

checkout([
    $class: 'GitSCM',
    userRemoteConfigs: [[
        url: 'https://github.com/myorg/myrepo',
        refspec: '+refs/heads/*:refs/remotes/origin/*'
    ]],
    extensions: [
        [$class: 'CloneOption', 
         reference: '/git-cache/myrepo.git']
    ]
])

9.4 Workload Identity / IAM Roles for Service Accounts
AWS EKS example:

spec:
  serviceAccountName: jenkins-builder
  # Service account annotated with IAM role
  # Pods automatically get temporary AWS credentials
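On EKS this typically means annotating the agent service account with an IAM role ARN (IRSA); the role ARN below is a placeholder:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-builder
  namespace: jenkins
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/jenkins-builder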

Benefits:

  • No hardcoded credentials
  • Automatic credential rotation
  • Fine-grained permissions per namespace/pod

Conclusion
Key Takeaways
What you've learned:

  • Architecture of Kubernetes-native Jenkins with dynamic agents
  • Step-by-step setup and configuration
  • Creating simple to advanced pod templates
  • Production-grade best practices for security, performance, and reliability
  • Real-world implementation patterns

The Impact
Transformation achieved:

  • ✅ 70% cost reduction through elastic scaling
  • Zero idle resource waste - pay only for active builds
  • 10x scalability - handle massive build spikes
  • 100% environment consistency - fresh containers every build
  • Faster onboarding - developers get custom environments instantly

Next Steps
Your journey continues:

  • Start small: Deploy single-container pod template
  • Iterate: Add multi-container templates as needed
  • Optimize: Implement caching and resource tuning
  • Scale: Roll out across teams with shared libraries
  • Monitor: Establish observability dashboards

Additional Resources
Further learning:

  • Jenkins Kubernetes Plugin Documentation
  • Kubernetes Pod Spec Reference
  • Jenkins Configuration as Code
  • GitHub repository with example configurations: github.com/yourorg/jenkins-k8s-examples

Call to Action
Engage with readers:

  • "What's your biggest challenge with Jenkins scaling? Drop a comment below!"

  • "Share your pod template configurations - let's learn from each other!"

  • "Subscribe for more advanced DevOps content"


Appendix: Quick Reference
Useful kubectl Commands

# Watch agent pods being created/destroyed
kubectl get pods -n jenkins -w -l jenkins=agent

# Check pod logs
kubectl logs -n jenkins <pod-name> -c <container-name>

# Describe pod for troubleshooting
kubectl describe pod -n jenkins <pod-name>

# Get pod YAML
kubectl get pod -n jenkins <pod-name> -o yaml

Common Pod Template Snippets
Bookmarkable code blocks for quick reference

  • Python agent (example below)
  • Node.js agent
  • Go agent
  • Rust agent
  • Multi-language polyglot agent
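As one example of the kind of snippet worth bookmarking, a minimal Python agent pod (the image tag is just a suggestion):

apiVersion: v1
kind: Pod
spec:
  containers:
  - name: python
    image: python:3.12-slim
    command: ['cat']
    tty: true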

Troubleshooting Checklist
Quick diagnostic steps:

  • Kubernetes connectivity working?
  • Namespace has correct RBAC permissions?
  • Jenkins URL/tunnel accessible from pods?
  • Image pull secrets configured?
  • Resource quotas not exceeded?
  • Network policies allowing traffic?
