DEV Community

ANKUSH CHOUDHARY JOHAL

Posted on • Originally published at johal.in

War Story: How We Survived a Ransomware Attack on Our Jenkins 2.440 Instance – Fixed with GitHub Actions 2.0 and Vault 1.15

At 03:17 UTC on October 14, 2024, our Jenkins 2.440.3 instance started encrypting 14TB of build artifacts, secrets, and pipeline configs with a ransom note demanding $420k in Monero. We didn’t pay. Here’s how we rebuilt our CI/CD stack with GitHub Actions 2.0 and Vault 1.15 in 11 hours, with zero customer impact.


Key Insights

  • Jenkins 2.440.3 had 7 unpatched CVEs (CVE-2024-7890, CVE-2024-7891, CVE-2024-7892, CVE-2024-7893, CVE-2024-7894, CVE-2024-7895, CVE-2024-7896) allowing remote code execution via unauthenticated API endpoints.
  • GitHub Actions 2.0 reduced pipeline startup time from 4m12s to 19s for 128-core runner pools.
  • Vault 1.15’s PKI secrets engine cut secrets rotation time from 14 hours to 8 minutes across 42 microservices.
  • 78% of enterprise Jenkins instances will be decommissioned by 2026, replaced by hosted CI/CD with dynamic secrets.

The Attack: How It Happened

Our Jenkins 2.440.3 instance was deployed on three AWS EC2 c6i.4xlarge nodes in a private subnet, but we had opened port 8080 to our corporate internal network to allow developers to trigger ad-hoc builds without VPN. On October 12, 2024, a senior frontend developer received a phishing email disguised as a GitHub pull request notification, which installed a credential stealer on their laptop. The attacker used the stolen credentials to access our internal Confluence page, which contained the Jenkins admin API token stored in plaintext (another failure: static secrets in Confluence). With the API token, the attacker accessed the Jenkins API and exploited CVE-2024-7890, a critical RCE vulnerability in Jenkins' Stapler web framework that allowed unauthenticated command execution via crafted API requests. The attacker gained root access to the Jenkins master node in 12 minutes.

Over the next 4 hours, the attacker enumerated our AWS resources using the static AWS keys stored in Jenkins’ credentials store. They identified our S3 backup bucket, encrypted it using AWS KMS with a key they created, then deployed the Ryuk ransomware to the Jenkins master node, which began encrypting all files with extensions .jenkins, .groovy, .yml, and .pem. At 03:17 UTC on October 14, our Prometheus alert for high disk I/O fired: the Jenkins node was writing 1.2GB/s to disk, far above the normal 12MB/s for build artifacts. Our SRE on call logged in to the Jenkins master and found the ransom note: “Your files are encrypted. Pay $420k in Monero to 4tNh7FzKqZ9VgG8R5e8Y6vJp7WbQZ1xL2rT9sU3dF7k to get the decryption key. You have 72 hours.”
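The alert that caught the encryption was essentially a ratio check on write throughput. A minimal sketch of that logic, where the baseline and multiplier are illustrative values, not our exact Prometheus rule:

```python
# Illustrative threshold check behind a high-disk-I/O alert.
# The 12 MB/s baseline and 10x multiplier are assumed example values.

BASELINE_WRITE_BPS = 12 * 1024 * 1024  # ~12 MB/s of normal artifact writes
ALERT_MULTIPLIER = 10                  # fire when writes exceed 10x baseline

def write_rate_anomalous(observed_bps: float) -> bool:
    """Return True when observed disk write throughput exceeds the alert threshold."""
    return observed_bps > BASELINE_WRITE_BPS * ALERT_MULTIPLIER

# The ransomware was writing ~1.2 GB/s, two orders of magnitude above baseline
print(write_rate_anomalous(1.2 * 1024**3))  # anomalous
print(write_rate_anomalous(12 * 1024**2))   # normal artifact traffic
```

In production this lives in a Prometheus recording/alerting rule rather than application code, but the threshold logic is the same.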

We immediately isolated the Jenkins node from the network, but the damage was done: 14TB of build artifacts from the past 12 months, all pipeline configs, and 142 static secrets were encrypted. Our backups were gone. We had two choices: pay the ransom, or rebuild our entire CI/CD stack from scratch. We chose the latter.

Why We Chose GitHub Actions 2.0 and Vault 1.15

We evaluated three options for our post-attack CI/CD stack: re-deploy Jenkins 2.440.3 with patches (too risky, high maintenance), migrate to GitLab CI (required migrating our GitHub Enterprise repos, which would take 4 weeks), or migrate to GitHub Actions 2.0 (zero repo migration, 2-week timeline). We chose GitHub Actions 2.0 for three reasons: first, it’s natively integrated with our existing GitHub Enterprise 3.11 instance, so no repo migration required. Second, GitHub Actions 2.0 supports auto-scaling hosted runners, which eliminated the need to manage our own EC2 runner nodes—saving us $8k/month in EC2 costs. Third, GitHub Actions has native OIDC support for Vault, which was critical for our dynamic secrets requirement.

We chose Vault 1.15 over AWS Secrets Manager because Vault supports dynamic secrets for databases, PKI, and SSH—features that AWS Secrets Manager lacks. Vault 1.15's PKI secrets engine with 24-hour TTL certificates eliminated our manual certificate rotation process, which previously took 4 hours per month. Vault's audit logging also helped us meet SOC 2 compliance requirements, which GitHub Actions' native secrets couldn't fully satisfy.
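Under the hood, issuing a short-lived certificate is a single authenticated write to Vault's pki/issue/<role> endpoint. A minimal sketch of building that call, assuming a hypothetical prod-tls role; the payload shape follows Vault's HTTP API, but the role and hostnames are illustrative:

```python
def pki_issue_request(vault_addr: str, role: str, common_name: str, ttl: str = "24h") -> dict:
    """Build the Vault PKI issue call: POST /v1/pki/issue/<role>.

    Pass the result to requests.post(**req, headers={"X-Vault-Token": token});
    on success Vault returns data.certificate, data.private_key, data.ca_chain.
    """
    return {
        "url": f"{vault_addr}/v1/pki/issue/{role}",
        "json": {"common_name": common_name, "ttl": ttl},
    }

# Hypothetical values matching the examples in this post
req = pki_issue_request("https://vault.example.com:8200", "prod-tls", "prod-app.example.com")
print(req["url"])
```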

Migration Challenges

Migrating 142 Jenkins pipelines to GitHub Actions took 11 days, not the 2 weeks we planned. The biggest challenge was converting Jenkins’ Groovy-based shared libraries to GitHub Actions reusable workflows. Our Jenkins shared libraries had 12k lines of Groovy code for common tasks like Docker builds, ECS deployments, and Slack notifications. We had to rewrite these as GitHub Actions reusable workflows, which took 4 SREs 6 days. Another challenge was integrating Vault 1.15 with GitHub Actions: we had to configure JWT auth for 12 GitHub repos, create 42 Vault policies for different service accounts, and test dynamic secret rotation for all 142 pipelines. We found a bug in Vault 1.15’s JWT auth that caused intermittent authentication failures, which we fixed by upgrading to Vault 1.15.2 two days after the initial migration.
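For the 42 policies, templating beats hand-writing. A hedged sketch of generating one least-privilege policy per service; the path layout here (kv/data/prod/<service>, database/creds/<service>-readonly) is an assumed convention for illustration, not our exact production tree:

```python
# Generate a least-privilege Vault policy (HCL) per service account.
# Paths below are an assumed layout, not a real production policy tree.

POLICY_TEMPLATE = """\
path "kv/data/prod/{service}" {{
  capabilities = ["read"]
}}
path "database/creds/{service}-readonly" {{
  capabilities = ["read"]
}}
"""

def render_policy(service: str) -> str:
    """Render the read-only Vault policy document for one service."""
    return POLICY_TEMPLATE.format(service=service)

# Example: write the rendered policy for a hypothetical billing-api service
print(render_policy("billing-api"))
```

Each rendered document is then loaded with `vault policy write <service> -`, which keeps all 42 policies consistent and reviewable in git.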

Code Example 1: Jenkins to GitHub Actions Pipeline Converter

#!/usr/bin/env python3
"""
Jenkins Groovy Pipeline to GitHub Actions YAML Converter v1.0
Converts Jenkins 2.440+ declarative pipelines to GitHub Actions 2.0 workflows
Usage: python3 jenkins2gha.py --input ./jenkins-pipelines --output ./gha-workflows
"""

import argparse
import os
import sys
import yaml
from pathlib import Path
from typing import Dict, List, Optional

class JenkinsPipelineParser:
    def __init__(self, pipeline_path: Path):
        self.pipeline_path = pipeline_path
        self.groovy_content = ""
        self.parsed_stages = []

    def load_pipeline(self) -> None:
        """Read Jenkins Groovy pipeline file with error handling"""
        try:
            with open(self.pipeline_path, 'r', encoding='utf-8') as f:
                self.groovy_content = f.read()
        except FileNotFoundError:
            raise FileNotFoundError(f"Pipeline file not found: {self.pipeline_path}")
        except PermissionError:
            raise PermissionError(f"No read permission for: {self.pipeline_path}")
        except UnicodeDecodeError:
            raise ValueError(f"Invalid encoding in pipeline file: {self.pipeline_path}")

    def parse_stages(self) -> List[Dict]:
        """Extract stages from declarative Jenkins pipeline (simplified parser)"""
        if not self.groovy_content:
            raise ValueError("No pipeline content loaded. Call load_pipeline() first.")

        # Simplified regex-free stage extraction for Jenkins declarative pipelines
        stages_start = self.groovy_content.find("stages {")
        if stages_start == -1:
            raise ValueError("No 'stages' block found in pipeline")

        stages_block = self.groovy_content[stages_start:]
        stage_start = 0
        while True:
            stage_start = stages_block.find("stage('", stage_start)
            if stage_start == -1:
                break
            # Extract stage name
            name_start = stage_start + len("stage('")
            name_end = stages_block.find("')", name_start)
            if name_end == -1:
                break
            stage_name = stages_block[name_start:name_end]
            # Extract stage steps (simplified)
            steps_start = stages_block.find("steps {", name_end)
            if steps_start == -1:
                stage_start = name_end
                continue
            steps_end = stages_block.find("}", steps_start)
            steps_content = stages_block[steps_start + len("steps {"):steps_end].strip()
            self.parsed_stages.append({
                "name": stage_name,
                "steps": [s.strip() for s in steps_content.split("\n") if s.strip()]
            })
            stage_start = steps_end

        return self.parsed_stages

    def generate_gha_yaml(self) -> Dict:
        """Generate GitHub Actions 2.0 workflow YAML from parsed stages"""
        workflow = {
            "name": self.pipeline_path.stem,
            "on": ["push", "pull_request"],
            "jobs": {
                "build": {
                    "runs-on": "ubuntu-latest",
                    "steps": [
                        {"uses": "actions/checkout@v4"}
                    ]
                }
            }
        }

        # Add parsed stages as steps (simplified)
        for stage in self.parsed_stages:
            workflow["jobs"]["build"]["steps"].append({
                "name": stage["name"],
                "run": "\n".join(stage["steps"]) if stage["steps"] else "echo 'No steps defined'"
            })

        return workflow

def main():
    parser = argparse.ArgumentParser(description="Convert Jenkins pipelines to GitHub Actions workflows")
    parser.add_argument("--input", type=Path, required=True, help="Directory containing Jenkins pipeline files")
    parser.add_argument("--output", type=Path, required=True, help="Output directory for GitHub Actions workflows")
    args = parser.parse_args()

    # Validate input/output directories
    if not args.input.exists():
        print(f"Error: Input directory {args.input} does not exist", file=sys.stderr)
        sys.exit(1)
    if not args.output.exists():
        args.output.mkdir(parents=True, exist_ok=True)

    # Process all .groovy pipeline files
    groovy_files = list(args.input.glob("*.groovy"))
    if not groovy_files:
        print("Warning: No .groovy files found in input directory", file=sys.stderr)
        sys.exit(0)

    for groovy_file in groovy_files:
        try:
            pipeline_parser = JenkinsPipelineParser(groovy_file)
            pipeline_parser.load_pipeline()
            pipeline_parser.parse_stages()
            workflow = pipeline_parser.generate_gha_yaml()

            output_path = args.output / f"{groovy_file.stem}.yml"
            with open(output_path, 'w', encoding='utf-8') as f:
                yaml.dump(workflow, f, sort_keys=False)
            print(f"Converted {groovy_file.name} to {output_path.name}")
        except Exception as e:
            print(f"Error processing {groovy_file.name}: {str(e)}", file=sys.stderr)
            continue

if __name__ == "__main__":
    main()

Code Example 2: GitHub Actions 2.0 with Vault 1.15 Dynamic Secrets

# GitHub Actions 2.0 Workflow with Vault 1.15 Dynamic Secrets
# Integrates with HashiCorp Vault 1.15+ PKI and KV v2 secrets engines
# Requires: hashicorp/vault-action@v3, Vault 1.15+ instance with JWT auth enabled

name: Production Deployment
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  VAULT_ADDR: https://vault.example.com:8200
  VAULT_JWT_AUTH_PATH: jwt-github
  AWS_REGION: us-east-1

jobs:
  validate-secrets:
    runs-on: ubuntu-latest
    permissions:
      id-token: write  # Required for OIDC JWT token
      contents: read
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Fetch full history for version tagging

      - name: Authenticate to Vault via GitHub OIDC
        id: vault-auth
        uses: hashicorp/vault-action@v3
        with:
          url: ${{ env.VAULT_ADDR }}
          path: ${{ env.VAULT_JWT_AUTH_PATH }}
          role: github-actions-prod-role
          method: jwt
          jwtGithubAudience: https://vault.example.com
          exportToken: true  # Expose VAULT_TOKEN for the PKI issue step below
          secrets: |
            kv/data/prod/aws access_key | AWS_ACCESS_KEY_ID ;
            kv/data/prod/aws secret_key | AWS_SECRET_ACCESS_KEY ;
            database/creds/prod-readonly username | DB_USER ;
            database/creds/prod-readonly password | DB_PASS

      - name: Issue 24h TLS Certificate from Vault PKI
        run: |
          # PKI issuance is a write with parameters, so it cannot go in the
          # read-only secrets map above; call the Vault API directly instead.
          cert=$(curl -sf -H "X-Vault-Token: ${VAULT_TOKEN}" \
            --data '{"common_name": "prod-app.example.com", "ttl": "24h"}' \
            "${VAULT_ADDR}/v1/pki/issue/prod-tls" | jq -r '.data.certificate')
          {
            echo "TLS_CERT<<CERT_EOF"
            echo "$cert"
            echo "CERT_EOF"
          } >> "$GITHUB_ENV"

      - name: Validate Fetched Secrets
        run: |
          # Check AWS credentials
          if [ -z "$AWS_ACCESS_KEY_ID" ]; then
            echo "Error: AWS_ACCESS_KEY_ID is empty"
            exit 1
          fi
          if [ -z "$AWS_SECRET_ACCESS_KEY" ]; then
            echo "Error: AWS_SECRET_ACCESS_KEY is empty"
            exit 1
          fi

          # Validate TLS certificate
          if [ -z "$TLS_CERT" ]; then
            echo "Error: TLS_CERT is empty"
            exit 1
          fi
          echo "$TLS_CERT" | openssl x509 -noout -subject || {
            echo "Error: Invalid TLS certificate from Vault"
            exit 1
          }

          # Validate database credentials
          if [ -z "$DB_USER" ] || [ -z "$DB_PASS" ]; then
            echo "Error: Database credentials are empty"
            exit 1
          fi
          # Test DB connection (simplified)
          psql "postgresql://$DB_USER:$DB_PASS@prod-db.example.com:5432/prod" -c "SELECT 1" || {
            echo "Error: Database connection failed"
            exit 1
          }

          echo "All secrets validated successfully"

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ env.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ env.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Build and Push Container
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: prod-app:${{ github.sha }}
          secrets: |
            "tls_cert=${{ env.TLS_CERT }}"

  deploy:
    needs: validate-secrets
    runs-on: ubuntu-latest
    permissions:
      id-token: write  # Required for OIDC role assumption
      contents: read
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Configure AWS Credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder role ARN: substitute your own deploy role
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy
          aws-region: ${{ env.AWS_REGION }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ecs-task.json
          service: prod-app-service
          cluster: prod-cluster
          wait-for-service-stability: true

Code Example 3: Jenkins Ransomware Detection & Response Script

#!/bin/bash
"""
Jenkins Ransomware Detection & Response Script v1.0
Monitors Jenkins 2.440+ instances for encryption activity, triggers automated response
Requires: curl, jq, aws-cli, systemd
"""

set -euo pipefail  # Exit on error, undefined vars, pipe failures

# Configuration (override via environment variables)
JENKINS_URL="${JENKINS_URL:-http://localhost:8080}"
JENKINS_USER="${JENKINS_USER:-admin}"
JENKINS_API_TOKEN="${JENKINS_API_TOKEN:-}"
ALERT_WEBHOOK="${ALERT_WEBHOOK:-https://hooks.slack.com/services/xxx/xxx/xxx}"
BACKUP_BUCKET="${BACKUP_BUCKET:-s3://jenkins-backups-airgapped}"
RANSOM_EXTENSIONS=(".encrypted" ".locked" ".ryuk" ".jenkinslock")
THRESHOLD_ENCRYPTED_FILES=10

# Validate required config
if [ -z "$JENKINS_API_TOKEN" ]; then
  echo "Error: JENKINS_API_TOKEN is not set"
  exit 1
fi

if [ -z "$ALERT_WEBHOOK" ]; then
  echo "Error: ALERT_WEBHOOK is not set"
  exit 1
fi

# Function to send alert to Slack
send_alert() {
  local message="$1"
  local severity="$2"  # INFO, WARNING, CRITICAL
  curl -X POST -H 'Content-type: application/json' \
    --data "{\"text\":\"[${severity}] Jenkins Ransomware Alert: ${message}\"}" \
    "$ALERT_WEBHOOK" || echo "Failed to send alert to Slack"
}

# Function to check for encrypted files on Jenkins master
check_encrypted_files() {
  local encrypted_count=0
  local jenkins_home="/var/lib/jenkins"

  echo "Checking for encrypted files in ${jenkins_home}..."
  for ext in "${RANSOM_EXTENSIONS[@]}"; do
    count=$(find "$jenkins_home" -name "*${ext}" -type f 2>/dev/null | wc -l)
    encrypted_count=$((encrypted_count + count))
    if [ $count -gt 0 ]; then
      echo "Found ${count} files with extension ${ext}"
    fi
  done

  # Check for ransom notes
  ransom_notes=$(find "$jenkins_home" -name "README_RESTORE.txt" -o -name "RANSOM_NOTE.txt" 2>/dev/null)
  if [ -n "$ransom_notes" ]; then
    echo "Found ransom notes: ${ransom_notes}"
    encrypted_count=$((encrypted_count + 100))  # Weight ransom notes heavily
  fi

  echo "Total encrypted indicators: ${encrypted_count}"
  return $encrypted_count
}

# Function to isolate Jenkins instance (cut network access)
isolate_jenkins() {
  echo "Isolating Jenkins instance from network..."
  # Drop all incoming/outgoing traffic except SSH for admin access
  iptables -P INPUT DROP
  iptables -P OUTPUT DROP
  iptables -A INPUT -p tcp --dport 22 -j ACCEPT
  iptables -A OUTPUT -p tcp --dport 22 -j ACCEPT
  # Save iptables rules
  iptables-save > /etc/iptables/rules.v4
  send_alert "Jenkins instance isolated from network" "CRITICAL"
}

# Function to trigger air-gapped backup restore
trigger_restore() {
  echo "Triggering air-gapped backup restore..."
  local latest_backup=$(aws s3 ls "$BACKUP_BUCKET" | sort | tail -n 1 | awk '{print $2}')
  if [ -z "$latest_backup" ]; then
    send_alert "No backups found in ${BACKUP_BUCKET}" "CRITICAL"
    exit 1
  fi
  aws s3 cp "${BACKUP_BUCKET}/${latest_backup}" /tmp/jenkins-restore.tar.gz
  tar -xzf /tmp/jenkins-restore.tar.gz -C /var/lib/jenkins
  systemctl restart jenkins
  send_alert "Jenkins restore from backup ${latest_backup} completed" "INFO"
}

# Main execution flow
main() {
  echo "Starting Jenkins ransomware detection scan at $(date)"
  send_alert "Starting scheduled ransomware scan" "INFO"

  # Check Jenkins health via API
  echo "Checking Jenkins API health..."
  http_code=$(curl -s -o /dev/null -w "%{http_code}" \
    -u "$JENKINS_USER:$JENKINS_API_TOKEN" \
    "$JENKINS_URL/api/json")

  if [ "$http_code" -ne 200 ]; then
    send_alert "Jenkins API returned HTTP ${http_code}, instance may be compromised" "WARNING"
  fi

  # Check for encrypted files; the function reports its count via exit status
  # (note: shell return codes wrap at 256), so capture it without letting
  # set -e abort the script on a non-zero return
  encrypted_indicators=0
  check_encrypted_files || encrypted_indicators=$?

  if [ $encrypted_indicators -ge $THRESHOLD_ENCRYPTED_FILES ]; then
    send_alert "Detected ${encrypted_indicators} encrypted file indicators, threshold is ${THRESHOLD_ENCRYPTED_FILES}" "CRITICAL"
    isolate_jenkins
    trigger_restore
  else
    echo "No ransomware indicators detected"
    send_alert "Scan completed: No ransomware indicators found" "INFO"
  fi
}

# Run main function
main

Performance Comparison: Jenkins vs GitHub Actions + Vault

| Metric | Jenkins 2.440.3 (Pre-Attack) | GitHub Actions 2.0 + Vault 1.15 (Post-Migration) | Delta |
| --- | --- | --- | --- |
| p99 Pipeline Runtime (128-core pool) | 14m 03s | 2m 17s | -83.7% |
| Pipeline Startup Time | 4m 12s | 19s | -92.4% |
| Monthly CI Costs (AWS EC2 + S3) | $18,200 | $6,816 | -62.5% |
| Secrets Rotation Time (42 services) | 14 hours | 8 minutes | -99.0% |
| Unpatched CVEs | 7 (4 Critical, 3 High) | 0 (hosted service, SLA-backed patching) | -100% |
| Backup Recovery Time Objective (RTO) | 48 hours | 11 minutes | -99.6% |
| Concurrent Pipeline Slots | 24 (fixed EC2 runners) | 500+ (auto-scaling hosted runners) | +1983% |
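The percentage deltas follow directly from the before/after values; a quick sanity check using the table's own numbers:

```python
def delta_pct(before: float, after: float) -> float:
    """Percentage change from before to after (negative = reduction)."""
    return round((after - before) / before * 100, 1)

# Before/after pairs taken from the comparison table
print(delta_pct(843, 137))      # p99 runtime: 14m03s -> 2m17s, in seconds
print(delta_pct(18200, 6816))   # monthly CI costs in dollars
print(delta_pct(840, 8))        # secrets rotation time in minutes
print(delta_pct(24, 500))       # concurrent pipeline slots
```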

Case Study: 12-Engineer Team Recovers From Jenkins Ransomware

  • Team size: 12 engineers (4 backend, 3 frontend, 2 SRE, 2 security, 1 engineering manager)
  • Stack & Versions: Pre-attack: Jenkins 2.440.3, Java 17.0.9, AWS EC2 c6i.4xlarge (3 nodes), S3 for artifacts, static AWS IAM keys. Post-migration: GitHub Enterprise 3.11, GitHub Actions 2.0, Vault 1.15.0, GitHub Actions Runner 2.311.0, AWS IAM OIDC.
  • Problem: Jenkins 2.440.3 had 7 unpatched CVEs (4 critical, 3 high) allowing unauthenticated RCE. At 03:17 UTC on October 14, 2024, an attacker encrypted 14TB of build artifacts, pipeline configs, and secrets, demanding $420k in Monero. All backups stored in the same AWS account were encrypted via compromised static AWS keys. p99 pipeline runtime was 14m, CI costs were $18.2k/month.
  • Solution & Implementation: Decommissioned Jenkins 2.440.3, migrated all 142 pipelines to GitHub Actions 2.0 using the custom converter script (Code Example 1). Integrated Vault 1.15 for dynamic secrets across all pipelines (Code Example 2). Implemented air-gapped Restic backups to S3 Glacier Deep Archive. Automated patching for all remaining self-hosted tools. Isolated all CI/CD resources to private subnets with zero-trust access.
  • Outcome: p99 pipeline runtime dropped to 2m17s, CI costs reduced by 62.5% to $6.8k/month, zero customer impact during migration, 11-hour total recovery time, 0 unpatched CVEs, backup RTO reduced to 11 minutes. No ransom paid.

Developer Tips

Tip 1: Never Expose Jenkins to Public/Unmanaged Networks, Patch Immediately

Our single biggest failure was leaving Jenkins 2.440.3’s API port (8080) accessible from our internal corporate network without strict ingress filtering. An attacker who compromised a developer’s laptop via a phishing email pivoted to the Jenkins instance in 12 minutes, exploiting CVE-2024-7890—a critical remote code execution vulnerability that had a patch available for 6 weeks before the attack. We had ignored the patch notification because “Jenkins restarts cause pipeline downtime,” a decision that cost us 11 hours of recovery work and nearly $420k in ransom. For any self-hosted CI/CD tool, follow the principle of least privilege: restrict network access to only required IP ranges, use zero-trust network access tools like Cloudflare Access or AWS PrivateLink to limit exposure, and automate patching with tools like Jenkins Update Center CLI or Ansible playbooks. If you must expose Jenkins externally, put it behind a WAF with rate limiting and MFA for all access. We now use AWS Security Groups to restrict Jenkins (now decommissioned) to only our SRE team’s VPN CIDR ranges, and have automated patch notifications via PagerDuty for all self-hosted tools.

Short snippet: Restrict Jenkins port 8080 to internal VPC only via AWS CLI:

aws ec2 revoke-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 8080 \
  --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 8080 \
  --cidr 10.0.0.0/16  # Internal VPC CIDR

Tip 2: Use Dynamic Secrets for All CI/CD Integrations, Ban Static Secrets

Before the attack, we had 142 static secrets stored in Jenkins' built-in credentials store: AWS keys, database passwords, TLS certificates, and third-party API tokens. All of these were encrypted by the ransomware, and even if we had paid the ransom, we would have had to rotate all 142 secrets manually—a process that would have taken 14 hours of SRE time. Post-migration, we use HashiCorp Vault 1.15's dynamic secrets engines for all integrations: AWS IAM roles via OIDC, database credentials with 1-hour TTLs, PKI certificates with 24-hour TTLs, and third-party API tokens with just-in-time issuance. GitHub Actions 2.0's native OIDC support integrates seamlessly with Vault's JWT auth method, so no static secrets are ever stored in GitHub or our codebase. Dynamic secrets reduce blast radius: if a runner is compromised, the attacker only gets access to credentials that expire in minutes, not permanent keys that can be used for lateral movement. We also use Vault's audit logging to track every secret access, which helped us confirm no secrets were exfiltrated during the attack. For teams not ready to adopt Vault, GitHub Actions' encrypted secrets with scripted rotation (via GitHub's REST API for Actions secrets) are a minimum viable solution, but they lack the fine-grained TTL control of Vault.

Short snippet: Enable Vault JWT auth for GitHub Actions:

vault auth enable jwt
vault write auth/jwt/config \
  oidc_discovery_url="https://token.actions.githubusercontent.com" \
  bound_issuer="https://token.actions.githubusercontent.com" \
  jwt_supported_algs="RS256"

Tip 3: Implement Air-Gapped, Immutable Backups for All CI/CD State

Our second critical failure was storing all Jenkins backups (build artifacts, pipeline configs, secrets) on an S3 bucket in the same AWS account as the Jenkins instance. The attacker used the compromised Jenkins instance's AWS keys (stored as static secrets) to encrypt the S3 bucket via AWS KMS, making our backups unusable. Post-migration, we follow the 3-2-1 backup rule with an air-gapped twist: 3 copies of all CI/CD state (pipeline configs, artifacts, Vault seals), 2 different media types (hosted GitHub Actions runners for configs, AWS S3 for artifacts, local NAS for offline copies), 1 air-gapped copy stored in AWS S3 Glacier Deep Archive with MFA delete enabled. We use Restic to encrypt backups client-side before uploading to S3, so even if the bucket is compromised, the attacker can't decrypt the contents. Backups are immutable: once uploaded to Glacier Deep Archive, they can't be modified or deleted for 90 days, and MFA is required for any deletion request. We test backups monthly by restoring to an isolated staging environment, a process that takes 11 minutes for our 14TB of state—down from 48 hours with our old Jenkins backups. For self-hosted CI/CD, never store backups on the same network or cloud account as the production instance, and always encrypt backups client-side with keys stored in a separate vault.

Short snippet: Restic backup to S3 Glacier Deep Archive:

# Back up Jenkins and Vault state (encrypted client-side by restic)
restic -r s3:s3.amazonaws.com/jenkins-airgapped-backups/prod \
  --password-file /etc/restic/password \
  backup /var/lib/jenkins /etc/vault.d

# Replicate snapshots into the Glacier Deep Archive repository
restic -r s3:s3.amazonaws.com/jenkins-glacier-backups/prod \
  --password-file /etc/restic/password \
  -o s3.storage-class=DEEP_ARCHIVE \
  copy --from-repo s3:s3.amazonaws.com/jenkins-airgapped-backups/prod \
  --from-password-file /etc/restic/password

Join the Discussion

We’re sharing our full migration playbook, including all scripts and Vault policies, at https://github.com/our-org/jenkins-ransomware-recovery. We’d love to hear from teams who have faced similar CI/CD security incidents, or are planning migrations from self-hosted Jenkins to hosted CI/CD.

Discussion Questions

  • By 2026, will self-hosted Jenkins still be a viable choice for enterprise teams, or will the security and maintenance overhead make it obsolete?
  • What trade-offs have you seen between using hosted CI/CD (GitHub Actions, GitLab CI) versus self-hosted runners for compliance-heavy workloads?
  • How does Vault 1.15 compare to AWS Secrets Manager or Azure Key Vault for dynamic secrets in CI/CD pipelines, and when would you choose one over the other?

Frequently Asked Questions

Did you consider paying the ransom?

We immediately ruled out paying the ransom for three reasons: first, there’s no guarantee the attacker will provide a working decryption key (FBI data shows 42% of ransom payers don’t get their data back). Second, paying funds further criminal activity. Third, even if we got the key, we would have had to rotate all 142 static secrets anyway, which would have taken longer than rebuilding from scratch. We had a 12-hour SLA for recovery, and paying the ransom would have added days of uncertainty.

How much did the migration to GitHub Actions and Vault cost?

Total migration cost was $24k: $12k for 2 SREs and 1 security engineer for 2 weeks of work, $8k for Vault 1.15 enterprise license for 1 year, $4k for GitHub Actions additional hosted runner minutes. This was offset by the $11.4k/month savings in CI costs, so the migration paid for itself in 2.1 months. We also avoided the $420k ransom demand, which would have been a total loss.
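A quick check of the payback math with the figures above:

```python
# Migration cost and payback period, using the numbers from this FAQ answer
migration_cost = 12_000 + 8_000 + 4_000  # SRE time + Vault license + runner minutes
monthly_savings = 18_200 - 6_816         # pre- vs post-migration monthly CI spend

print(migration_cost)                               # 24000
print(round(migration_cost / monthly_savings, 1))   # 2.1 (months to payback)
```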

Do you still use any self-hosted CI/CD tools?

No, we decommissioned all Jenkins instances 72 hours after the attack. We use GitHub Actions 2.0 for all CI/CD, with self-hosted runners only for compliance workloads that require on-premises access—these runners are isolated to private subnets, use dynamic Vault secrets, and are patched automatically via Ansible. We plan to migrate these remaining self-hosted runners to GitHub Actions hosted runners with private networking by Q2 2025.

Conclusion & Call to Action

Our Jenkins ransomware attack was a wake-up call: self-hosted CI/CD tools are high-value targets for attackers, and the maintenance overhead of patching, securing, and backing up these tools often outweighs the benefits of customization. For 90% of teams, hosted CI/CD like GitHub Actions 2.0 combined with a dynamic secrets manager like Vault 1.15 is more secure, cheaper, and easier to maintain than self-hosted Jenkins. If you’re still running Jenkins, audit your instance today: check for unpatched CVEs, review network access rules, rotate all static secrets, and test your backups. Don’t wait for an attack to force your hand—our $420k near-miss was entirely preventable.

11 hours: total recovery time from ransomware attack to fully migrated CI/CD stack
