Oleksandr Kuryzhev

Docker BuildKit Cache Optimization for Faster CI Pipelines


Docker BuildKit cache optimization for faster CI pipelines is one of those infrastructure improvements that pays dividends immediately — every pipeline run after the first one benefits from layers that were already built and pushed. This post walks through the full setup: enabling BuildKit correctly, choosing between inline and registry cache modes, and writing a GitHub Actions workflow that actually reuses layers instead of rebuilding from scratch on every push.

The performance gap between a cold build and a warm cache hit can be dramatic. A Node.js or Python image that takes four minutes to install dependencies cold can complete that same stage in under ten seconds when the cache layer is pulled from a registry. Getting there requires more than just adding a flag — it requires understanding how BuildKit stores, retrieves, and validates cache manifests.

Requirements


Before touching a single YAML file, confirm your environment meets these prerequisites. Missing any one of them is the most common reason cache configurations appear to work but silently miss on every run.

  • Docker 20.10+ with BuildKit support: Docker Engine 23.0 and later use BuildKit as the default builder on Linux. On older engines, either set DOCKER_BUILDKIT=1 as an environment variable or add "features": {"buildkit": true} to /etc/docker/daemon.json on self-hosted runners.
  • Docker Buildx plugin — required for type=registry and type=gha cache backends. Plain docker build does not support these modes.
  • GitHub Actions runner on ubuntu-22.04 or later — the docker/setup-buildx-action works reliably on this image.
  • A container registry with write access — GitHub Container Registry (ghcr.io) works well here because the GITHUB_TOKEN handles authentication without managing separate credentials.
  • A Dockerfile with dependency installation separated from application code — if your COPY . . instruction appears before RUN pip install or RUN npm ci, every code change invalidates the dependency layer and the cache never helps where it matters most.
  • Network access from the runner to the registry — this sounds obvious, but using --cache-from without prior registry authentication causes a silent cache miss, not a build failure. You will not see an error; you will just see a slow build. If you are running self-hosted runners in a private network, review the BuildKit cache backend documentation for proxy and authentication considerations before proceeding.
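
The first two prerequisites can be checked from a shell on a self-hosted runner before any workflow is written. The following is a minimal sketch, not an official tool: version_ge and preflight are hypothetical helper names, and the 20.10.0 floor simply mirrors the requirement listed above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; names are illustrative, not part of Docker.

# Returns 0 when dotted version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Checks that the Docker daemon is reachable, new enough, and has Buildx.
preflight() {
  local min="20.10.0" server
  server="$(docker version --format '{{.Server.Version}}' 2>/dev/null)" || {
    echo "docker daemon not reachable" >&2
    return 1
  }
  # Strip vendor suffixes such as "-ce" or "+azure" before comparing.
  version_ge "${server%%[-+]*}" "$min" || {
    echo "Docker $server is older than $min" >&2
    return 1
  }
  docker buildx version >/dev/null 2>&1 || {
    echo "buildx plugin missing" >&2
    return 1
  }
  echo "ok: Docker $server with buildx"
}
```

Calling preflight on the runner prints a single ok line or the first failed check. It does not verify registry access, which still needs a successful docker login.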

Implementation
There are two cache strategies worth understanding before writing any configuration: inline cache and registry cache. Inline cache embeds cache metadata directly into the image manifest using the BUILDKIT_INLINE_CACHE=1 build argument. It requires no separate cache tag, but it only captures the final stage — useless for multi-stage builds where you want to cache the builder stage independently. Registry cache, by contrast, stores cache data as a separate manifest tag and supports mode=max, which pushes every intermediate layer rather than just the final one.

For most CI pipelines, registry cache with mode=max is the right choice. The difference between mode=min and mode=max is significant: min caches only the final stage output, while max caches every intermediate layer including builder stages, test runners, and dependency installers. If your Dockerfile has a dedicated build stage that compiles binaries, mode=max is what makes that stage reusable across runs.
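
To experiment with the two modes outside a workflow, the same cache settings can be passed to docker buildx build directly. The helper below is a minimal sketch: registry_cache_flags is a hypothetical name, and the image references in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Prints Buildx cache flags for a registry cache ref, one flag per line.
# Usage: registry_cache_flags <cache-ref> [min|max]   (defaults to max)
registry_cache_flags() {
  local ref="$1" mode="${2:-max}"
  case "$mode" in
    min|max) ;;
    *) echo "mode must be min or max" >&2; return 1 ;;
  esac
  printf '%s\n' \
    "--cache-from=type=registry,ref=${ref}" \
    "--cache-to=type=registry,ref=${ref},mode=${mode}"
}

# Example (requires Docker with Buildx and a prior registry login):
#   mapfile -t flags < <(registry_cache_flags ghcr.io/your-org/your-repo:cache max)
#   docker buildx build "${flags[@]}" --tag ghcr.io/your-org/your-repo:dev --push .
```

Running the build once with min and once with max against a multi-stage Dockerfile, then rebuilding, is a quick way to see which stages come back as CACHED in each mode.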

Here is the complete GitHub Actions workflow implementing registry cache with BuildKit. Note that DOCKER_BUILDKIT: "1" is set in the workflow-level env block, not inside a single step, so it is available to every step that invokes Docker. The docker/build-push-action step builds through Buildx and therefore always uses BuildKit; the variable only matters for steps that call plain docker build.

# .github/workflows/docker-build.yml
name: Docker Build with BuildKit Cache

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  DOCKER_BUILDKIT: "1"
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    runs-on: ubuntu-22.04
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout source
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata for Docker
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push with registry cache
        uses: docker/build-push-action@v5
        with:
          context: .
          file: ./Dockerfile
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          # Pull existing cache layers from registry before building
          cache-from: |
            type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache
          # Push all intermediate layers (mode=max) for maximum cache reuse
          cache-to: |
            type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache,mode=max
          # Optional inline-cache fallback; redundant while the registry cache above is in use
          build-args: |
            BUILDKIT_INLINE_CACHE=1

      - name: Verify cache hit in build log
        run: |
          echo "Check above build step output for 'CACHED' layer lines."
          echo "Re-run this workflow without code changes to confirm cache is working."

A few implementation details are worth highlighting. The cache tag (:cache) is stored as a separate manifest in the registry, so deleting your application image tag does not remove the cache tag. Manage them independently if you have registry retention policies. Also note that push: ${{ github.event_name != 'pull_request' }} only skips pushing the image on pull requests. It does not affect cache-to, so as written PR builds both read from the :cache tag and attempt to write to it. If you do not want untested branches polluting the cache used by main, make cache-to conditional on the event name as well, or point PR builds at a separate cache tag.
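
One way to keep pull requests read-only with respect to the cache is to emit the cache export flag only for non-PR events. The sketch below shows that logic for a workflow run: step that calls docker buildx build directly; cache_to_flag and CACHE_REF are hypothetical names, while GITHUB_EVENT_NAME is the variable the runner sets for every job.

```shell
#!/usr/bin/env bash
# Prints a --cache-to flag for non-PR events and nothing for pull requests.
# Usage: cache_to_flag <event-name> <cache-ref>
cache_to_flag() {
  local event="$1" ref="$2"
  if [ "$event" != "pull_request" ]; then
    printf '%s\n' "--cache-to=type=registry,ref=${ref},mode=max"
  fi
}

# Example inside a `run:` step (CACHE_REF is an assumed variable you define):
#   mapfile -t to_flag < <(cache_to_flag "$GITHUB_EVENT_NAME" "$CACHE_REF")
#   docker buildx build --cache-from="type=registry,ref=$CACHE_REF" "${to_flag[@]}" .
```

With this gate, PR builds still import layers through --cache-from but never export, which matches the intent of keeping the main cache free of untested branches.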

For teams that prefer GitHub Actions cache storage over a registry, switching to type=gha is straightforward: replace type=registry,ref=... with type=gha in both cache-from and cache-to, keeping mode=max on cache-to. The docker/build-push-action exposes the Actions runtime token and cache URL to BuildKit automatically. That said, registry cache is more portable and works identically on self-hosted runners without GitHub-specific configuration.

For more on structuring CI pipelines around reusable patterns, see the DevOps_DayS pipeline guides on kuryzhev.cloud, which cover related topics including Jenkins shared library design and GitLab CI artifact passing between stages.

Test the Setup
Verifying that cache is actually being used — not just configured — requires inspecting build output directly. The key signal is the CACHED prefix on layer lines in the build log.

Run the workflow twice without changing any application code. On the second run, add --progress=plain to your build command (or check the raw step output in GitHub Actions) and look for lines like this:

#8 [builder 3/5] RUN pip install --no-cache-dir -r requirements.txt
#8 CACHED

#9 [builder 4/5] COPY src/ /app/src/
#9 CACHED

If you see CACHED on dependency installation layers, the registry cache is working. If those steps instead print their command output and finish with a DONE line and a duration, the layers are being rebuilt. The most common causes are: the registry login step is missing or runs after the build step (move it before the build), the cache tag does not exist yet on the first run (expected; the second run will be warm), or a COPY instruction placed too early in the Dockerfile invalidates the cache before the expensive layers.
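
Scanning a long log by eye is error-prone, so counting the CACHED markers can help. This is a minimal sketch that assumes the build output was saved in --progress=plain format; count_cached_steps and build.log are hypothetical names.

```shell
#!/usr/bin/env bash
# Prints the number of build steps reported as CACHED in a plain-progress log.
# Usage: count_cached_steps <log-file>
count_cached_steps() {
  # grep -c prints 0 but exits non-zero when nothing matches; keep status 0.
  grep -cE '^#[0-9]+ CACHED' "$1" || true
}

# Example (Buildx writes progress to stderr, hence the 2>&1):
#   docker buildx build --progress=plain . 2>&1 | tee build.log
#   count_cached_steps build.log
```

A count of zero on the second run of an unchanged commit points to one of the causes listed above.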

To confirm the cache manifest exists in the registry independently of your application image, run the following against your registry:

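# Inspect the cache manifest tag directly
docker manifest inspect ghcr.io/your-org/your-repo:cache

# Alternative if the command above rejects the manifest type
docker buildx imagetools inspect --raw ghcr.io/your-org/your-repo:cache

# Depending on the BuildKit version, the top-level mediaType is either
#   "application/vnd.oci.image.index.v1+json" or
#   "application/vnd.oci.image.manifest.v1+json"
# A BuildKit cache manifest also references
#   "application/vnd.buildkit.cacheconfig.v0"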

If the tag holds a standard image manifest rather than a BuildKit cache manifest (one that references the application/vnd.buildkit.cacheconfig.v0 media type), or the tag does not exist at all, the cache-to export did not complete successfully. Check that the job has the packages: write permission and that the registry login step precedes the build step in your workflow.
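
The distinction can be checked mechanically by searching the raw manifest JSON for the BuildKit cache config media type. The following is a minimal sketch; manifest_kind is a hypothetical helper, and it relies on a plain text match, not on full JSON parsing.

```shell
#!/usr/bin/env bash
# Reads manifest JSON on stdin and prints "cache" when it references the
# BuildKit cache config media type, otherwise "image".
manifest_kind() {
  if grep -q 'application/vnd\.buildkit\.cacheconfig\.v0'; then
    echo cache
  else
    echo image
  fi
}

# Example (requires registry access):
#   docker buildx imagetools inspect --raw ghcr.io/your-org/your-repo:cache | manifest_kind
```

A text match is enough here because the media type string is distinctive. A stricter check would parse the JSON and look at the config or index entries explicitly.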

Timing comparison is the most practical validation. Record the build duration from the Actions summary on run one (cold) and run two (warm). A well-structured Dockerfile with stable dependency layers should show a 60–80% reduction in build time on cache hits for dependency-heavy images.
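
The reduction itself is simple arithmetic: (cold - warm) / cold. A small helper keeps the comparison consistent across runs; reduction_pct is a hypothetical name and the durations in the example are made-up values, not measurements.

```shell
#!/usr/bin/env bash
# Prints the percentage reduction from a cold to a warm build, using whole
# seconds and integer division (the result is rounded down).
# Usage: reduction_pct <cold-seconds> <warm-seconds>
reduction_pct() {
  local cold="$1" warm="$2"
  [ "$cold" -gt 0 ] || { echo "cold duration must be positive" >&2; return 1; }
  echo $(( (cold - warm) * 100 / cold ))
}

# Example with illustrative numbers: a 240 s cold build and a 60 s warm build.
#   reduction_pct 240 60   # prints 75
```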

  • Set DOCKER_BUILDKIT=1 in a workflow-level or job-level env block in GitHub Actions, not just inside a single step, so every step that calls plain docker build picks it up. Steps that build through Buildx use BuildKit regardless.
  • Use mode=max for multi-stage Dockerfiles; mode=min only caches the final stage and misses most of the value in complex builds.
  • The registry cache tag (:cache) is independent of your application image tag — manage its lifecycle separately if you have registry cleanup automation.
  • A missing or failed registry login before cache-from causes a silent miss — verify authentication order in your workflow steps.
  • Structure your Dockerfile so stable RUN dependency installation steps appear before frequently-changing COPY instructions — no amount of cache configuration compensates for poor layer ordering.
  • Inline cache (BUILDKIT_INLINE_CACHE=1) is useful as a fallback for single-stage images but should not replace registry cache in production CI pipelines.
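
The layer-ordering point can be turned into a rough automated check. The sketch below is a heuristic, not a Dockerfile parser: deps_before_copy_all is a hypothetical name, and it only recognizes pip install and npm ci as dependency steps and a literal COPY . . as the broad copy.

```shell
#!/usr/bin/env bash
# Returns 0 when the first dependency-install RUN appears before the first
# broad "COPY . ." in the given Dockerfile, and 1 when the order is reversed.
# Usage: deps_before_copy_all <dockerfile>
deps_before_copy_all() {
  local file="$1" copy_line dep_line
  copy_line="$(grep -nE '^[[:space:]]*COPY[[:space:]]+\.[[:space:]]+\.' "$file" | head -n 1 | cut -d: -f1)"
  dep_line="$(grep -nE '^[[:space:]]*RUN[[:space:]].*(pip install|npm ci)' "$file" | head -n 1 | cut -d: -f1)"
  # Nothing to compare if either instruction is absent.
  [ -n "$copy_line" ] && [ -n "$dep_line" ] || return 0
  [ "$dep_line" -lt "$copy_line" ]
}

# Example:
#   deps_before_copy_all ./Dockerfile || echo "COPY . . precedes dependency install"
```

Multi-stage files with several install steps would need a per-stage check; this version only compares the first occurrence of each instruction.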

More interesting topics I will touch on in my blog.
