DEV Community

Yash Pritwani

Posted on • Originally published at techsaas.cloud

CI/CD Pipeline Optimization: From 20-Minute to 3-Minute Builds


Real numbers from a startup that cut build times by 85% — every step with code.

The Problem: 20 Minutes of Watching Spinners

Our CI pipeline took 20 minutes per run. On a busy day with 30+ PRs, that meant 10 hours of cumulative CI time. Developers context-switched while waiting. Reviews stalled. Deployments backed up.

We're a 12-person team running 84 Docker containers on self-hosted infrastructure. Our stack: Python + TypeScript + Go microservices, GitHub Actions CI, Docker-based deploys, PostgreSQL + Redis.

Every optimization below is free. No paid CI tools. No enterprise cache services. Just configuration changes and architectural decisions.

The 6 Changes That Got Us to 3 Minutes

1. Docker Layer Caching (Saved: 6 minutes)

Before: Every build pulled fresh base images and reinstalled all dependencies.

# BAD: invalidates cache on every code change
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt

After: Separate dependency installation from code changes.

# GOOD: dependencies cached until requirements.txt changes
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

In GitHub Actions, enable BuildKit cache:

- name: Build
  uses: docker/build-push-action@v5
  with:
    context: .
    cache-from: type=gha
    cache-to: type=gha,mode=max
    push: true

Impact: First build unchanged. Subsequent builds skip the 6-minute dependency installation step entirely. Cache hit rate: ~92%.

2. Parallel Test Sharding (Saved: 5 minutes)

Before: 847 tests running sequentially took 8 minutes.

After: Split across 4 parallel runners using pytest-split:

strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - name: Run tests
    run: |
      pytest --splits 4 --group ${{ matrix.shard }} \
        --splitting-algorithm least_duration

The least_duration algorithm uses historical test timing data to balance shards evenly. We store timing data in .test_durations committed to the repo.

Impact: 8 minutes → 2.5 minutes (longest shard). The parallelism costs 4x the runner minutes, but wall-clock time dropped 68%.

For Indian startups on GitHub's free tier (2,000 minutes/month), this is a trade-off. We self-host our runners on the same server that runs production — more on that in step 6.
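To keep .test_durations from going stale, pytest-split can re-record timings with its --store-durations flag. One way to wire that up is a scheduled job on main that refreshes the file and commits it (a sketch; the job name, cadence, and bot identity here are our own assumptions):

```yaml
# Illustrative: refresh .test_durations weekly so least_duration stays balanced
name: update-test-durations
on:
  schedule:
    - cron: "0 2 * * 0"   # Sundays, 02:00
jobs:
  durations:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - run: pytest --store-durations   # writes .test_durations
      - run: |
          git config user.name ci-bot
          git config user.email ci-bot@example.com
          git add .test_durations
          git commit -m "chore: refresh test timing data" || true
          git push
```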

3. Dependency Pre-Build with Docker Compose (Saved: 3 minutes)

Before: Every microservice built its own node_modules or venv from scratch.

After: A shared base image with pre-installed dependencies, rebuilt only when lockfiles change.

# docker-compose.ci.yml
services:
  deps-python:
    build:
      context: .
      dockerfile: Dockerfile.deps-python
    image: registry.local/deps-python:latest

  service-api:
    build:
      context: ./services/api
      args:
        BASE_IMAGE: registry.local/deps-python:latest
The shared dependency image itself:
# Dockerfile.deps-python
FROM python:3.12-slim
COPY requirements/*.txt /deps/
RUN pip install -r /deps/base.txt -r /deps/test.txt

A separate nightly CI job rebuilds the deps image. Feature branch builds pull it from our local registry.
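The nightly rebuild can be a small scheduled workflow along these lines (a sketch; the cron time and job name are assumptions, the image and file names match the compose file above):

```yaml
# Illustrative nightly rebuild of the shared deps image
on:
  schedule:
    - cron: "0 1 * * *"   # nightly
jobs:
  rebuild-deps:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - run: |
          docker build -f Dockerfile.deps-python \
            -t registry.local/deps-python:latest .
          docker push registry.local/deps-python:latest
```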

Impact: Eliminated redundant dependency installation across 6 Python services. Saved ~3 minutes per build.

4. Smart Test Selection (Saved: 2 minutes)

Not every commit needs every test. We built a simple mapper:

# .github/scripts/test_selector.py
import subprocess, json, pathlib

changed = subprocess.check_output(
    ["git", "diff", "--name-only", "origin/main...HEAD"]
).decode().strip().split("\n")

test_map = {
    "services/api/": "tests/api/",
    "services/auth/": "tests/auth/",
    "services/billing/": "tests/billing/",
    "shared/": "tests/",  # shared code = run everything
}

tests_to_run = set()
for file in changed:
    for src, test_dir in test_map.items():
        if file.startswith(src):
            tests_to_run.add(test_dir)

# If nothing matched, run everything (safety net)
if not tests_to_run:
    tests_to_run.add("tests/")

print(" ".join(tests_to_run))
Wire the selector's output into the workflow:
- name: Select tests
  id: tests
  run: echo "dirs=$(python .github/scripts/test_selector.py)" >> $GITHUB_OUTPUT

- name: Run tests
  run: pytest ${{ steps.tests.outputs.dirs }}

Impact: Most PRs touch 1-2 services. Running only relevant tests: 2.5 minutes → 45 seconds. Full suite still runs on merge to main.

5. Artifact Caching for Lint and Type Checks (Saved: 2 minutes)

ESLint, mypy, and tsc have incremental modes. Use them:

- name: Cache mypy
  uses: actions/cache@v4
  with:
    path: .mypy_cache
    key: mypy-${{ hashFiles('**/*.py') }}
    restore-keys: mypy-

- name: Type check
  run: mypy --incremental src/

For ESLint:

- name: Cache ESLint
  uses: actions/cache@v4
  with:
    path: .eslintcache
    key: eslint-${{ hashFiles('**/*.ts', '**/*.tsx') }}

- name: Lint
  run: eslint --cache --cache-location .eslintcache src/

Impact: Incremental lint/type-check: 2 minutes → 15 seconds on most PRs.

6. Self-Hosted Runners (Saved: 2 minutes of queue time)

GitHub-hosted runners have 30-90 second startup times plus queue time during peak hours. We run our CI on the same bare metal server as our staging environment.

runs-on: self-hosted

# In our runner setup (systemd service)
# Runner installed at /opt/actions-runner
# Runs as dedicated ci-runner user with Docker socket access

Setup (one-time, 15 minutes):

  1. Download GitHub Actions runner binary
  2. Create systemd service
  3. Give the runner user Docker socket access
  4. Configure labels for routing
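The setup above boils down to a small systemd unit (an illustrative sketch; the official runner's svc.sh installer generates an equivalent unit, and the paths here mirror our layout rather than any required default):

```ini
# /etc/systemd/system/actions-runner.service
[Unit]
Description=GitHub Actions self-hosted runner
After=network.target docker.service

[Service]
User=ci-runner
WorkingDirectory=/opt/actions-runner
ExecStart=/opt/actions-runner/run.sh
Restart=always

[Install]
WantedBy=multi-user.target
```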

Self-hosted runners start instantly — no cloud VM boot, no image pull. Queue time went from 30-90 seconds to 0.

For teams in India or Southeast Asia, this also eliminates the latency penalty of GitHub's US-based runners pulling from your APAC Docker registry.

Impact: 2 minutes of queue/startup time eliminated. Free. Forever.

The Result

Step                 Before     After
Queue + startup      1.5 min    0 min
Dependency install   6 min      0 min (cached)
Lint + type check    2 min      0.25 min
Build                3 min      0.5 min
Tests                8 min      2.5 min
Total                20.5 min   3.25 min

85% reduction. Zero additional cost.

Common Mistakes That Negate These Gains

We've seen teams implement all six optimizations and still have slow pipelines. Here's why.

Mistake 1: Flaky tests that force re-runs. If 5% of your test suite is flaky, you'll re-run CI on average once every 3-4 PRs. That re-run costs the full pipeline time. We quarantine flaky tests into a separate non-blocking job: they run, their results are logged, but they don't block the PR. A weekly "flaky test cleanup" ticket keeps the quarantine from growing forever.
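A minimal version of the quarantine, assuming a pytest marker named flaky registered in pytest.ini (the marker name and job wiring are ours, not a pytest built-in); this is a fragment of the jobs: block:

```yaml
tests:                    # the blocking suite
  runs-on: self-hosted
  steps:
    - uses: actions/checkout@v4
    - run: pytest -m "not flaky" tests/

flaky-tests:              # quarantined tests: logged, never required
  runs-on: self-hosted
  continue-on-error: true
  steps:
    - uses: actions/checkout@v4
    - run: pytest -m flaky tests/
```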

Mistake 2: Not pinning dependency versions. If your requirements.txt has unpinned ranges (requests>=2.28), the dependency resolution step runs every time — even with caching — because pip needs to check if a newer version satisfies the constraint. Pin exact versions (requests==2.31.0) and use Dependabot or Renovate for updates. This alone can save 30-60 seconds per build.

Mistake 3: Running security scans synchronously. SAST/DAST tools (Snyk, Trivy, Bandit) are important but slow. Run them in a parallel job that doesn't block the main build. Your pipeline reports results, but developers can merge without waiting for a 3-minute vulnerability scan. Critical findings trigger a separate alert. This principle extends to secret scanning too — we cover the full secret management pipeline in our dedicated guide.

Mistake 4: Over-building in CI. Some teams build Docker images for every microservice on every PR, even when the service code didn't change. Use the same path-based filtering from Step 4 to skip builds for unchanged services. Our docker-compose.ci.yml has a --profile flag per service — CI only activates profiles for services with code changes.
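The profile activation can be sketched in a few lines of Python. The path layout mirrors Step 4; the convention that each service under services/&lt;name&gt;/ has a compose profile named &lt;name&gt; is our own, so treat the function as illustrative:

```python
# Sketch: derive docker compose --profile flags from changed file paths.
# Assumes services live under services/<name>/ with a matching profile <name>.

def build_command(changed_files):
    profiles = sorted({
        path.split("/")[1]
        for path in changed_files
        if path.startswith("services/")
    })
    if not profiles:
        return None  # no service code changed: skip image builds entirely
    flags = " ".join(f"--profile {p}" for p in profiles)
    return f"docker compose -f docker-compose.ci.yml {flags} build"

print(build_command(["services/api/app.py", "services/auth/jwt.py",
                     "docs/README.md"]))
# → docker compose -f docker-compose.ci.yml --profile api --profile auth build
```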

Mistake 5: Ignoring the feedback loop. After optimizing, most teams stop measuring. We track CI build times in Prometheus and alert if the p95 build time exceeds 5 minutes. Performance degrades slowly — a new dependency here, an extra test there — and without monitoring, you're back to 15 minutes within 6 months.
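For reference, the p95 we alert on is just a nearest-rank percentile over recent build durations. A standalone sketch (in practice Prometheus computes this for us; the sample numbers are made up):

```python
import math

def p95(durations_minutes):
    """Nearest-rank 95th percentile of recorded build times."""
    s = sorted(durations_minutes)
    rank = math.ceil(0.95 * len(s))  # 1-based nearest rank
    return s[rank - 1]

# 18 healthy builds plus two slow ones: the p95 crosses our 5-minute alert line
builds = [3.0] * 18 + [6.0, 12.0]
print(p95(builds))  # → 6.0
```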

Security Considerations in Fast Pipelines

Fast pipelines are only valuable if they're secure. Skipping security checks for speed is a false economy.

Our approach: security scans run in parallel, never blocking the main build path, but their results are mandatory before deploy. The build completes in 3 minutes, the security scan completes in 5, and the deploy job waits for both.

jobs:
  build-and-test:  # 3 minutes
    runs-on: self-hosted
    steps: [...]

  security-scan:  # 5 minutes, runs in parallel
    runs-on: self-hosted
    steps:
      - uses: aquasecurity/trivy-action@master
      - run: bandit -r src/ -f json -o bandit-report.json

  deploy:  # waits for BOTH
    needs: [build-and-test, security-scan]
    if: github.ref == 'refs/heads/main'
    steps: [...]

This means the critical path is still 5 minutes (the slower security scan), but the developer feedback loop (did my tests pass?) is 3 minutes. Developers get fast feedback; deploys get security guarantees.

For teams handling sensitive credentials in their pipelines, the secret management guide we published today covers how to avoid leaking secrets through CI logs — a common issue with fast, parallelized builds.

What We'd Add Next

  • Bazel or Nx for true incremental builds across a monorepo. We're not there yet — our repo isn't big enough to justify the complexity.
  • Test impact analysis using coverage data to be even more surgical about test selection.
  • Merge queues (GitHub's native feature) to batch CI runs and reduce total runner time.
  • Remote build caching (Turborepo, Gradle remote cache) for teams with larger monorepos — we've seen this shave another 40% off already-optimized builds.

The ROI Math

The ROI on CI optimization is absurd. A 12-person team saving 17 minutes per build across 30 daily builds reclaims 8.5 engineering hours per day. That's a full-time engineer's worth of productivity — recovered by spending 2 days on pipeline optimization.
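The arithmetic is easy to re-run with your own team's numbers:

```python
# Back-of-envelope ROI from the table above
minutes_saved_per_build = 17   # 20.5 min before vs 3.25 min after, rounded
builds_per_day = 30

hours_reclaimed = minutes_saved_per_build * builds_per_day / 60
print(f"~{hours_reclaimed} engineering hours reclaimed per day")  # → ~8.5
```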

But the real ROI isn't time saved — it's behavior change. When CI takes 3 minutes, developers wait for results before context-switching. When it takes 20 minutes, they start another task and the PR review sits for hours. Fast CI changes how your entire team works. The same build vs buy analysis applies here: investing 2 days in pipeline optimization is always better than buying an expensive CI SaaS tool.

Frequently Asked Questions

Q: Does this work for monorepos?

Yes, with adjustments. Steps 4 (smart test selection) and the Docker profile trick become even more valuable in monorepos because the ratio of "code changed" to "total code" is smaller. For monorepos over 50 services, consider Bazel, Nx, or Turborepo for incremental build tracking — they maintain a dependency graph that makes test selection automatic rather than manual.

Q: What about Windows or macOS builds?

Self-hosted runners (Step 6) work on all platforms, but the Docker caching strategy (Steps 1 and 3) is Linux-specific. For macOS CI (common in mobile development), focus on dependency caching (Cocoapods, Carthage) and parallel test sharding (XCTest supports this natively). The ROI is even higher for macOS builds because GitHub-hosted macOS runners are 10x more expensive than Linux runners.

Q: We use GitLab CI / Jenkins / CircleCI — does this still apply?

Every optimization except the GitHub-specific YAML applies to any CI system. Docker layer caching works everywhere Docker runs. Parallel test sharding works with any test framework. Dependency pre-builds work with any registry. Self-hosted runners exist for GitLab (gitlab-runner), Jenkins (agents), and CircleCI (self-hosted runner). The concepts transfer; only the config syntax changes.

We help teams audit and optimize their CI/CD pipelines. If your builds take longer than 5 minutes, there's almost certainly low-hanging fruit.

Get a free pipeline audit →

Subscribe to our newsletter for weekly deep-dives into developer productivity and infrastructure optimization.
