Shriharsh Pandurang Gaikwad

Dynamic Jenkins Agents with Kubernetes and Docker: Scale Your CI/CD Infrastructure Elastically

Introduction

Hook
Start with a relatable pain point:

  • "Remember the days of maintaining a pool of static Jenkins build servers? Constant capacity planning, resource waste during off-hours, and bottlenecks during peak deployment times?"
  • Brief story of a team spending thousands on idle build infrastructure

The Problem Statement

  • Static Jenkins agents are expensive and inefficient
  • Resource contention during peak hours
  • Different projects require different build environments
  • Maintenance overhead of keeping agents updated
  • Difficulty scaling globally across teams

The Solution Preview

  • Introduce dynamic agent provisioning with Kubernetes
  • Benefits: elasticity, isolation, cost optimization, consistency
  • What readers will learn: architecture, implementation, optimization strategies

Article Roadmap
Quick overview of sections to set expectations


Section 1: Architecture Deep Dive
1.1 Traditional Jenkins Architecture Review

  • Master-agent architecture recap
  • Static agent pool limitations
  • Resource allocation challenges

1.2 Kubernetes-Native Jenkins Architecture
Diagram/Visual: Jenkins Master → Kubernetes API → Dynamic Pods
Key Components:

  • Jenkins Master: Orchestrates builds, runs in Kubernetes as a deployment
  • Kubernetes Plugin: Communicates with K8s API to provision agents
  • Pod Templates: Define agent specifications (containers, resources, volumes)
  • Dynamic Agents: Ephemeral pods created on-demand, destroyed after use

1.3 How It Works: The Agent Lifecycle
Step-by-step flow:

  1. Pipeline triggered
  2. Jenkins requests agent from Kubernetes
  3. K8s schedules pod with specified containers
  4. Pod pulls Docker images and starts
  5. The agent in the pod connects back to Jenkins over JNLP or WebSocket
  6. Build executes in pod containers
  7. Pod terminates and cleans up automatically

1.4 Benefits Quantified

  • Cost savings: Real metrics (e.g., "Reduce idle resource costs by 60-80%")
  • Scalability: Handle 10x more concurrent builds
  • Isolation: Every build gets fresh environment
  • Flexibility: Different tools/versions per project

Section 2: Prerequisites and Setup
2.1 What You'll Need
Infrastructure:

  • Kubernetes cluster (v1.24+) - EKS, GKE, AKS, or self-managed
  • Minimum 3 nodes recommended
  • kubectl configured and authenticated

Jenkins:

  • Jenkins 2.4xx or newer (LTS recommended)
  • Admin access to install plugins
  • Existing Jenkins or new installation

Knowledge Prerequisites:

  • Understanding of Kubernetes concepts (pods, namespaces, services)
  • Familiarity with Jenkins pipelines (declarative or scripted)
  • Docker image building basics

2.2 Namespace and RBAC Setup
Code Block: Kubernetes manifest for:

# ServiceAccount, Role, RoleBinding for Jenkins
# Permissions needed: pods (create, delete, list, watch)
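A minimal manifest granting those permissions might look like the following sketch (namespace and resource names are illustrative; the pods/exec and pods/log rules are needed for the container() step and log streaming):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
  namespace: jenkins
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-agent-manager
  namespace: jenkins
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec", "pods/log"]
    verbs: ["create", "get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins-agent-manager
  namespace: jenkins
subjects:
  - kind: ServiceAccount
    name: jenkins
    namespace: jenkins
roleRef:
  kind: Role
  name: jenkins-agent-manager
  apiGroup: rbac.authorization.k8s.io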

2.3 Installing the Kubernetes Plugin

  • Navigate to Manage Jenkins → Plugin Manager
  • Search for "Kubernetes" plugin
  • Install and restart Jenkins
  • Verify installation

Section 3: Configuring Jenkins Kubernetes Cloud
3.1 Initial Cloud Configuration
Step-by-step with screenshots/annotations:

Navigate to: Manage Jenkins → Clouds → New Cloud

Configure Kubernetes connection:

  • Kubernetes URL (in-cluster or external)
  • Kubernetes Namespace
  • Credentials (service account token)
  • Jenkins URL and tunnel

Code Block: Example configuration as code (JCasC):

jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"
        serverUrl: "https://kubernetes.default"
        namespace: "jenkins"
        jenkinsUrl: "http://jenkins:8080"
        jenkinsTunnel: "jenkins-agent:50000"

3.2 Testing the Connection

  • Use "Test Connection" button
  • Troubleshooting common issues:

    • Certificate validation errors
    • Network connectivity
    • RBAC permissions

3.3 Pod Template Configuration Basics
Essential settings explained (an annotated example follows this list):

  • Name and Labels: Identifying agents
  • Containers: Define build environment(s)
  • Volumes: Persistent data, Docker socket, caching
  • Resource Limits: CPU and memory constraints
  • Service Account: Pod-level permissions
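To see how these settings fit together, here is a bare-bones pod template YAML with each of them annotated (the image, labels, and PVC names are placeholders to adapt):

apiVersion: v1
kind: Pod
metadata:
  labels:
    jenkins: agent                      # Name and Labels: how pipelines target this agent
spec:
  serviceAccountName: jenkins-agent     # Service Account: pod-level permissions
  containers:
  - name: build
    image: maven:3.8-openjdk-11         # Containers: the build environment
    command: ['cat']
    tty: true
    resources:                          # Resource Limits: CPU and memory constraints
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1"
    volumeMounts:
    - name: build-cache                 # Volumes: persistent data and caching
      mountPath: /root/.m2
  volumes:
  - name: build-cache
    persistentVolumeClaim:
      claimName: maven-cache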

Section 4: Creating Your First Pod Template
4.1 Simple Pod Template: Single Container
Practical Example: Basic Maven build agent
Configuration walkthrough:

// Pod template definition
podTemplate(
  name: 'maven-agent',
  label: 'maven',
  containers: [
    containerTemplate(
      name: 'maven',
      image: 'maven:3.8-openjdk-11',
      ttyEnabled: true,
      command: 'cat'
    )
  ]
)

Explanation of the key parameters:

  • command: 'cat' keeps the container alive so Jenkins can run build steps inside it
  • ttyEnabled: true allocates a TTY for interactive shell steps

4.2 Using the Template in a Pipeline
Complete pipeline example:

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: maven
    image: maven:3.8-openjdk-11
    command:
    - cat
    tty: true
'''
        }
    }
    stages {
        stage('Build') {
            steps {
                container('maven') {
                    sh 'mvn clean package'
                }
            }
        }
    }
}

Running your first build:

  • Create new pipeline job
  • Watch pod creation in K8s: kubectl get pods -n jenkins -w
  • Observe automatic cleanup after completion

Section 5: Advanced Pod Templates
5.1 Multi-Container Pods
Use Case: Build that requires multiple tools (build, test, scan)
Example: Node.js app with Docker build capability

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: node
    image: node:18-alpine
    command: ['cat']
    tty: true
  - name: docker
    image: docker:24-dind
    securityContext:
      privileged: true
  - name: trivy
    image: aquasec/trivy:latest
    command: ['cat']
    tty: true
'''
        }
    }
    stages {
        stage('Install Dependencies') {
            steps {
                container('node') {
                    sh 'npm ci'
                }
            }
        }
        stage('Build Docker Image') {
            steps {
                container('docker') {
                    sh 'docker build -t myapp:${BUILD_NUMBER} .'
                }
            }
        }
        stage('Security Scan') {
            steps {
                container('trivy') {
                    sh 'trivy image myapp:${BUILD_NUMBER}'
                }
            }
        }
    }
}

Key Concepts Explained:

  • Container isolation and when to use multiple containers
  • Switching between containers with container() block
  • Shared workspace volume across containers

5.2 Volume Mounts and Caching
Problem: Repeated dependency downloads slow builds
Solution: Persistent volume claims for caching

containers:
  - name: build
    volumeMounts:
      - name: maven-cache
        mountPath: /root/.m2
      - name: npm-cache
        mountPath: /root/.npm
volumes:
  - name: maven-cache
    persistentVolumeClaim:
      claimName: maven-cache
  - name: npm-cache
    persistentVolumeClaim:
      claimName: npm-cache

Implementation tips:

  • PVC creation and sizing (sample manifest below)
  • Cache invalidation strategies
  • ReadWriteMany vs ReadWriteOnce considerations
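A sketch of the cache PVC referenced above (size, namespace, and access mode are assumptions to adjust for your cluster and storage class):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: maven-cache
  namespace: jenkins
spec:
  accessModes:
    - ReadWriteMany      # required if several concurrent agent pods share the cache
  resources:
    requests:
      storage: 10Gi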

5.3 Docker-in-Docker (DinD) Configuration
Two approaches compared:
Approach 1: Docker-in-Docker (privileged)

- name: docker
  image: docker:24-dind
  securityContext:
    privileged: true
  volumeMounts:
    - name: docker-sock      # emptyDir shared with build containers so they can reach the DinD daemon's socket
      mountPath: /var/run

Approach 2: Docker socket mounting (host Docker)

volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock

Security considerations:

  • Risks of running privileged containers
  • Implications of exposing the host Docker socket
  • Alternatives: Kaniko or Buildah for rootless builds (sketch below; full Kaniko example in Section 8)
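As a rootless sketch, a Kaniko builder container can replace the privileged DinD container entirely; registry credentials come from a mounted secret (names mirror the full example in Section 8):

containers:
- name: kaniko
  image: gcr.io/kaniko-project/executor:debug
  command: ['/busybox/cat']
  tty: true
  volumeMounts:
  - name: docker-config
    mountPath: /kaniko/.docker
volumes:
- name: docker-config
  secret:
    secretName: docker-registry-credentials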

5.4 Environment Variables and Secrets
Injecting configuration securely:

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: builder
    image: alpine
    env:
    - name: ENVIRONMENT
      value: "production"
    - name: API_KEY
      valueFrom:
        secretKeyRef:
          name: build-secrets
          key: api-key
'''
        }
    }
}

Best practices:

  • Kubernetes Secrets for sensitive data (manifest sketch below)
  • ConfigMaps for non-sensitive configuration
  • Jenkins credentials integration
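For completeness, the build-secrets Secret referenced in the pod spec above could be created from a manifest like this (the value is a placeholder; in practice it would come from your secret store):

apiVersion: v1
kind: Secret
metadata:
  name: build-secrets
  namespace: jenkins
type: Opaque
stringData:
  api-key: "replace-me"   # stringData accepts the plain value; Kubernetes stores it base64-encoded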

5.5 Resource Management
Setting requests and limits:

containers:
  - name: maven
    image: maven:3.8-openjdk-11
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1000m"

Guidance:

  • Rightsizing based on build requirements
  • Impact of resource constraints on scheduling (see the namespace ResourceQuota sketch below)
  • Monitoring and adjustment strategies
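To keep runaway or queued builds from starving the namespace, a ResourceQuota on the agent namespace is a common companion to per-pod limits (the figures below are assumptions to size for your cluster):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: jenkins-agents
  namespace: jenkins
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "20"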

Section 6: Reusable Pod Templates with Shared Libraries
6.1 The Problem with Inline YAML

  • Duplication across pipelines
  • Difficult to maintain and update
  • No single, centrally versioned source of truth

6.2 Creating Centralized Pod Templates
Option 1: Jenkins Configuration as Code

jenkins:
  clouds:
    - kubernetes:
        templates:
          - name: "maven-template"
            label: "maven"
            yaml: |
              apiVersion: v1
              kind: Pod
              spec:
                containers:
                - name: maven
                  image: maven:3.8-openjdk-11

Option 2: Shared Library with Pod Template

// vars/mavenPod.groovy — the file name must match the mavenPod step used below
def call(Closure body) {
    podTemplate(
        yaml: libraryResource('podtemplates/maven.yaml')
    ) {
        body()
    }
}

// Usage in pipeline:
mavenPod {
    node(POD_LABEL) {
        container('maven') {
            sh 'mvn clean install'
        }
    }
}
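The template itself lives in the shared library under resources/podtemplates/maven.yaml — a minimal sketch mirroring the earlier example:

# resources/podtemplates/maven.yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: maven
    image: maven:3.8-openjdk-11
    command: ['cat']
    tty: true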

Benefits:

  • Single source of truth
  • Version controlled templates
  • Easy updates across all pipelines

Section 7: Production Best Practices
7.1 Resource Optimization
Strategies:

  • Pod retention: Keep pods for debugging (podRetention: onFailure)
  • Idle timeout: Reclaim resources from stalled builds
  • Concurrent build limits: Per-label restrictions
  • Node affinity: Direct builds to appropriate node pools

Example configuration:

podTemplate(
    idleMinutes: 5,
    podRetention: onFailure(),
    activeDeadlineSeconds: 3600,
    nodeSelector: 'workload=builds'
) { ... }

7.2 Security Hardening
Essential measures:

  • Namespace isolation: Separate namespaces per team/project
  • Network policies: Restrict pod-to-pod communication
  • Pod security policies/standards: Enforce non-root containers (securityContext sketch after the network policy below)
  • Image scanning: Integrate Trivy/Anchore in templates
  • Secret management: External secret stores (Vault, AWS Secrets Manager)

Example network policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: jenkins-agents
spec:
  podSelector:
    matchLabels:
      jenkins: agent
  policyTypes:
  - Ingress
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: jenkins
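And a non-root securityContext sketch of the kind pod security standards would enforce on agent containers (UID/GID values are arbitrary examples):

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
  containers:
  - name: build
    image: maven:3.8-openjdk-11
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]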

7.3 High Availability Configuration
Jenkins Master HA:

  • Active-passive setup with shared PVC
  • Cloud provider load balancers
  • Health checks and auto-recovery

Agent connectivity resilience:

  • WebSocket connection (preferred over JNLP; JCasC snippet below)
  • Connection retry configuration
  • Graceful pod eviction handling
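WebSocket mode can be switched on in the same JCasC cloud definition shown earlier (a minimal sketch; with WebSocket the separate agent tunnel port is no longer needed):

jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"
        serverUrl: "https://kubernetes.default"
        namespace: "jenkins"
        jenkinsUrl: "http://jenkins:8080"
        webSocket: true   # agents connect back over HTTP(S) instead of the JNLP TCP port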

7.4 Monitoring and Observability
Key metrics to track:

  • Agent provisioning time
  • Build queue length
  • Pod failure rates
  • Resource utilization

Tools integration:

  • Prometheus metrics from Jenkins
  • Kubernetes metrics server
  • Grafana dashboards

Sample queries:

# Average agent startup time
rate(jenkins_pod_launch_duration_seconds_sum[5m]) / 
rate(jenkins_pod_launch_duration_seconds_count[5m])

# Failed pod launches
sum(rate(jenkins_pod_launch_failed_total[5m]))

7.5 Troubleshooting Common Issues
Problem-Solution quick reference:

  • Pod stuck in Pending → check node capacity, resource quotas, and nodeSelector/tolerations
  • ImagePullBackOff → verify the image name/tag and image pull secrets
  • Agent pod starts but never connects → check the Jenkins URL/tunnel, service account token, and network policies
  • Pod is gone before you can debug → set podRetention: onFailure() on the template
  • Builds drop mid-run → prefer WebSocket connections and review idle/eviction settings

Debugging techniques:

  • kubectl describe pod for pod events
  • kubectl logs for container logs
  • Jenkins system logs for connection issues
  • Enable debug logging: add a Jenkins logger at FINEST for org.csanchez.jenkins.plugins.kubernetes (e.g. java.util.logging.ConsoleHandler.level = FINEST)

Section 8: Real-World Implementation Example
8.1 Case Study: Multi-Stage Application Pipeline
Scenario: Complete CI/CD for a microservice
Architecture:

  • Source: Git repository
  • Build: Maven/Gradle
  • Test: JUnit, integration tests
  • Security: SAST, container scanning
  • Deploy: Helm chart to staging

Complete pipeline with optimized pod template:

@Library('shared-library') _

pipeline {
    agent {
        kubernetes {
            yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: microservice-builder
spec:
  serviceAccountName: jenkins-agent
  containers:
  - name: maven
    image: maven:3.8-openjdk-11
    command: ['cat']
    tty: true
    volumeMounts:
    - name: m2-cache
      mountPath: /root/.m2
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1000m"
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug
    command: ['/busybox/cat']
    tty: true
    volumeMounts:
    - name: docker-config
      mountPath: /kaniko/.docker
  - name: trivy
    image: aquasec/trivy:latest
    command: ['cat']
    tty: true
  - name: helm
    image: alpine/helm:latest
    command: ['cat']
    tty: true
  volumes:
  - name: m2-cache
    persistentVolumeClaim:
      claimName: maven-cache
  - name: docker-config
    secret:
      secretName: docker-registry-credentials
"""
        }
    }

    environment {
        APP_NAME = 'my-microservice'
        IMAGE_REGISTRY = 'myregistry.io'
        IMAGE_TAG = "${env.BUILD_NUMBER}"
    }

    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }

        stage('Build & Test') {
            steps {
                container('maven') {
                    sh '''
                        mvn clean verify
                        mvn sonar:sonar -Dsonar.host.url=${SONAR_URL}
                    '''
                }
            }
            post {
                always {
                    junit '**/target/surefire-reports/*.xml'
                    jacoco()
                }
            }
        }

        stage('Build Image') {
            steps {
                container('kaniko') {
                    sh """
                        /kaniko/executor \
                          --context=\${WORKSPACE} \
                          --dockerfile=Dockerfile \
                          --destination=${IMAGE_REGISTRY}/${APP_NAME}:${IMAGE_TAG} \
                          --destination=${IMAGE_REGISTRY}/${APP_NAME}:latest \
                          --cache=true \
                          --cache-ttl=24h
                    """
                }
            }
        }

        stage('Security Scan') {
            steps {
                container('trivy') {
                    sh """
                        trivy image \
                          --severity HIGH,CRITICAL \
                          --exit-code 1 \
                          ${IMAGE_REGISTRY}/${APP_NAME}:${IMAGE_TAG}
                    """
                }
            }
        }

        stage('Deploy to Staging') {
            when {
                branch 'main'
            }
            steps {
                container('helm') {
                    sh """
                        helm upgrade --install ${APP_NAME} ./helm \
                          --namespace staging \
                          --set image.tag=${IMAGE_TAG} \
                          --wait
                    """
                }
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "Build ${env.BUILD_NUMBER} succeeded!"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "Build ${env.BUILD_NUMBER} failed!"
            )
        }
    }
}

8.2 Performance Analysis
Metrics from this implementation:

  • Agent startup: 30-45 seconds (cold start)
  • Maven build cache hit: 70% faster subsequent builds
  • Kaniko layer caching: 60% faster image builds
  • Total pipeline time: ~8 minutes (vs 15 minutes with static agents)

8.3 Cost Comparison
Before (Static Agents):

  • 5 x m5.xlarge EC2 instances (24/7)
  • Monthly cost: ~$720

After (Dynamic K8s Agents):

  • Shared K8s cluster resources
  • Average concurrent builds: 3-4 pods
  • Monthly cost: ~$200-250
  • Savings: 65-70%

Section 9: Advanced Patterns and Tips
9.1 Matrix Builds with Dynamic Agents
Running parallel builds across configurations:

pipeline {
    agent none
    stages {
        stage('Test Matrix') {
            matrix {
                axes {
                    axis {
                        name 'JAVA_VERSION'
                        values '11', '17', '21'
                    }
                    axis {
                        name 'OS'
                        values 'alpine', 'ubuntu'
                    }
                }
                agent {
                    kubernetes {
                        yaml """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: java
    image: openjdk:${JAVA_VERSION}-${OS}
    command: ['cat']
    tty: true
"""
                    }
                }
                stages {
                    stage('Test') {
                        steps {
                            container('java') {
                                sh 'java -version'
                                sh './mvnw test'  // Maven wrapper, since the plain JDK image does not bundle mvn
                            }
                        }
                    }
                }
            }
        }
    }
}

9.2 Spot/Preemptible Instances for Cost Savings
Kubernetes node pools strategy:

  • Regular node pool for critical jobs
  • Spot instance pool for non-critical builds
  • Pod tolerations and node affinity, for example:

spec:
  tolerations:
  - key: "spot"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  nodeSelector:
    workload: spot-builds

9.3 Git Repository Caching
Speed up checkouts with persistent volumes:

volumes:
  - name: git-cache
    persistentVolumeClaim:
      claimName: git-reference-cache

Pipeline usage:

checkout([
    $class: 'GitSCM',
    userRemoteConfigs: [[
        url: 'https://github.com/myorg/myrepo',
        refspec: '+refs/heads/*:refs/remotes/origin/*'
    ]],
    extensions: [
        [$class: 'CloneOption', 
         reference: '/git-cache/myrepo.git']
    ]
])

9.4 Workload Identity / IAM Roles for Service Accounts
AWS EKS example:

spec:
  serviceAccountName: jenkins-builder
  # Service account annotated with IAM role
  # Pods automatically get temporary AWS credentials
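On EKS this typically means annotating the agent service account with an IAM role ARN (IRSA); the role ARN below is a placeholder:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-builder
  namespace: jenkins
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/jenkins-builder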

Benefits:

  • No hardcoded credentials
  • Automatic credential rotation
  • Fine-grained permissions per namespace/pod

Conclusion
Key Takeaways
What you've learned:

  • Architecture of Kubernetes-native Jenkins with dynamic agents
  • Step-by-step setup and configuration
  • Creating simple to advanced pod templates
  • Production-grade best practices for security, performance, and reliability
  • Real-world implementation patterns

The Impact
Transformation achieved:

  • ✅ 70% cost reduction through elastic scaling
  • Zero idle resource waste - pay only for active builds
  • 10x scalability - handle massive build spikes
  • 100% environment consistency - fresh containers every build
  • Faster onboarding - developers get custom environments instantly

Next Steps
Your journey continues:

  • Start small: Deploy single-container pod template
  • Iterate: Add multi-container templates as needed
  • Optimize: Implement caching and resource tuning
  • Scale: Roll out across teams with shared libraries
  • Monitor: Establish observability dashboards

Additional Resources
Further learning:

  • Jenkins Kubernetes Plugin Documentation
  • Kubernetes Pod Spec Reference
  • Jenkins Configuration as Code
  • GitHub repository with example configurations: github.com/yourorg/jenkins-k8s-examples

Call to Action
Engage with readers:

  • "What's your biggest challenge with Jenkins scaling? Drop a comment below!"

  • "Share your pod template configurations - let's learn from each other!"

  • "Subscribe for more advanced DevOps content"


Appendix: Quick Reference
Useful kubectl Commands

# Watch agent pods being created/destroyed
kubectl get pods -n jenkins -w -l jenkins=agent

# Check pod logs
kubectl logs -n jenkins <pod-name> -c <container-name>

# Describe pod for troubleshooting
kubectl describe pod -n jenkins <pod-name>

# Get pod YAML
kubectl get pod -n jenkins <pod-name> -o yaml

Common Pod Template Snippets
Bookmarkable code blocks for quick reference

  • Python agent (example below)
  • Node.js agent
  • Go agent
  • Rust agent
  • Multi-language polyglot agent
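As one example of the kind of snippet worth bookmarking, a minimal Python agent pod (the image tag is just a suggestion):

apiVersion: v1
kind: Pod
spec:
  containers:
  - name: python
    image: python:3.12-slim
    command: ['cat']
    tty: true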

Troubleshooting Checklist
Quick diagnostic steps:

  • Kubernetes connectivity working?
  • Namespace has correct RBAC permissions?
  • Jenkins URL/tunnel accessible from pods?
  • Image pull secrets configured?
  • Resource quotas not exceeded?
  • Network policies allowing traffic?
