Aniket Abhishek Soni

Posted on Jun 11 • Edited on Jun 13

Zero to Hardened: A Practical Migration Playbook for Docker Hardened Images in Regulated Industries

#docker #security #containers #devops

Why I chose this topic: In December 2025, Docker made Hardened Images (DHI) free and open for everyone — arguably the biggest shift in container security defaults since multi-stage builds. The announcement content is everywhere; the migration content is almost nowhere. Having spent years building data platforms inside banks and healthcare organizations — where every image ships through security review, every CVE generates a ticket, and "just patch it" involves three teams — I want to fill that gap with the playbook I'd hand a platform team on day one.

In 2025, software supply-chain attacks caused tens of billions of dollars in damage, and base images remained one of the largest attack surfaces in most organizations: hundreds of packages nobody asked for, shells nobody uses in production, and CVE backlogs that exist purely because ubuntu:22.04 ships a kitchen sink.

Docker Hardened Images change the default. They're minimal, distroless-style, near-zero-CVE bases with provenance attestations and SBOMs built in — and as of December 2025 they're free to use and build on. Vulnerability counts drop by up to ~95% compared to typical general-purpose bases, simply because the packages that carry those CVEs aren't there.

That's the easy paragraph. The hard part is what this article is about: actually migrating a fleet of real images — with their shell scripts, their apt-get habits, their healthchecks, and their compliance paperwork — onto hardened bases without breaking production.

First, recalibrate your mental model

A hardened/distroless-style image breaks four assumptions baked into a decade of Dockerfiles:

There is no shell. docker exec -it app sh returns an error. Every runbook that says "exec in and check" is now wrong.
There is no package manager. RUN apt-get install curl fails at build time. Good — that was the attack surface — but your Dockerfile patterns must change.
Non-root is the default. Anything writing to /, binding port 80, or assuming UID 0 breaks.
Debugging moves out of the image. Tools attach to containers instead of living in them.

If you communicate only one thing to your engineers before migration, make it this list. Most "DHI broke us" incidents are actually "our assumptions broke us."

The migration playbook

Phase 0 — Inventory and triage (one week, mostly scripting)

You cannot migrate what you can't see. Build a fleet inventory: every image in production, its base, its CVE count, and whether it needs a shell at runtime (spoiler: almost none truly do).

# Quick-and-dirty fleet triage from your registry
for img in $(cat production-images.txt); do
  base=$(docker buildx imagetools inspect "$img" --format '{{json .}}' \
         | jq -r '.image.config.Labels["org.opencontainers.image.base.name"] // "unknown"')
  cves=$(docker scout cves "$img" --format only-counts 2>/dev/null)
  echo "$img | $base | $cves"
done

Triage into three buckets:

Bucket	Criteria	Strategy
Green	Stock runtimes (Python, Node, JVM, Go) with no exotic native deps	Direct base-image swap
Yellow	Native dependencies, custom packages, shell-based entrypoints	Multi-stage rebuild
Red	Vendor images, legacy apps assuming a full OS	Defer; pressure the vendor; isolate harder

In my experience with data platform fleets, roughly 60–70% of images land in Green — far more than teams expect.

Phase 1 — The Green wave: swap and verify

For a Python service, the migration is often genuinely this small:

# Before
FROM python:3.12-slim
COPY requirements.lock .
RUN pip install --no-cache-dir -r requirements.lock
COPY src/ /app/src
CMD ["python", "/app/src/main.py"]

# After — build in a dev-variant, run in the hardened runtime
FROM <registry>/dhi/python:3.12-dev AS build
COPY requirements.lock .
RUN pip install --no-cache-dir --target=/deps -r requirements.lock

FROM <registry>/dhi/python:3.12
COPY --from=build /deps /deps
COPY --chown=nonroot:nonroot src/ /app/src
ENV PYTHONPATH=/deps
USER nonroot
ENTRYPOINT ["python", "/app/src/main.py"]

The pattern generalizes: hardened images come in -dev variants (compilers, package managers) for build stages, and minimal variants for runtime. Multi-stage builds are no longer an optimization; they're the migration mechanism.

Two things will bite you in this phase:

Healthchecks that shell out. HEALTHCHECK CMD curl -f localhost:8080/health dies with no curl and no shell. Replace with an exec-form check against a binary you ship, or move health checking to the orchestrator (Kubernetes probes hit the endpoint from outside the container — better anyway).
ENTRYPOINT scripts. entrypoint.sh needs a shell. Either compile your startup logic into the app, use exec-form with explicit args, or (transitionally) ship a static busybox into a known path — and write a ticket to remove it.

Phase 2 — The debugging story (this is where migrations die)

The single biggest organizational objection will be: "How do we debug in production without a shell?" If you don't answer it before migration, your on-call engineers will answer it for you by quietly pinning old images.

The answer is ephemeral, attachable tooling:

# Docker: attach a debug sidecar sharing the target's namespaces
docker debug my-app            # Docker Desktop / DD-adjacent tooling

# Kubernetes: ephemeral debug containers (stable since 1.25)
kubectl debug -it my-app-pod --image=busybox:1.36 --target=app

The debug container gets your shell, your tools, and visibility into the target's processes and filesystem — without those tools ever shipping in the production image. Frame it to your security team this way: the attacker no longer gets a shell, but your engineers still do, on demand, with an audit trail. In a bank, that sentence wins the meeting.

Phase 3 — Make compliance an output, not a project

Here's the part regulated-industry teams underestimate in the good direction: hardened images don't just reduce CVEs, they generate evidence.

SBOMs and provenance attestations ship with the image — auditors asking "what's in this container and where did it come from?" get a signed, machine-readable answer.
Patch SLAs become a vendor guarantee rather than an internal aspiration (enterprise tiers offer SLA-backed CVE remediation).
Vulnerability review meetings shrink. When the base contributes near-zero CVEs, every finding that remains is yours — signal, not noise. One platform team I worked alongside cut their weekly vuln-triage meeting from 90 minutes to 20, not because scanning improved but because the haystack disappeared.

Map this explicitly to your frameworks: container provenance and minimal-footprint requirements appear, in different language, in PCI DSS, HIPAA's risk-analysis expectations, SOC 2 change-management criteria, and FedRAMP container guidance. Build the mapping table once, attach it to the migration epic, and your security organization becomes the migration's sponsor instead of its blocker.

Phase 4 — Hold the line in CI

Migration without enforcement regresses in a quarter. Add a policy gate:

# CI policy gate (conceptual — adapt to your scanner/policy engine)
- name: Enforce hardened bases
  run: |
    docker scout policy "$IMAGE" --org "$ORG" \
      --exit-code   # fails the build on: non-approved base,
                    # missing SBOM/provenance, critical CVEs, root user

Pair it with a registry rule: production namespaces only accept signed images from CI. Now the secure path is the only path, and entropy works for you instead of against you.

Lessons learned

Migrate the noisiest image first, not the easiest. Pick the service with the worst CVE report and the most security-review friction. The before/after slide from that one migration buys you executive sponsorship for the other two hundred.
Don't gold-plate Red-bucket images. A legacy vendor app that needs a full OS won't be fixed by heroics. Contain the blast radius (network policy, read-only rootfs, no privileged mode) and put pressure on the vendor with your renewal leverage.
Watch the -dev-variant leak. The classic regression: someone ships the build-stage image to production "temporarily." Your policy gate should distinguish dev and runtime variants explicitly.
Time-zone and CA-certificate surprises. Minimal images may not carry tzdata or the CA bundle your code assumes. Test TLS calls and timestamp logic in staging, not in an incident review.
Budget for runbook rewrites. The technical migration took us weeks; updating every "exec into the container and…" runbook took longer. Plan it as real work.

Production considerations

Pin hardened bases by digest, not tag, and rebuild on a cadence — hardened upstreams patch fast, and you want those patches flowing. Mirror the images into your own registry for availability and policy control. And keep one escape hatch documented: the approved procedure for attaching a debug container in production, including who can do it and how it's logged. An escape hatch you designed is security; one your engineers improvise is an incident.

Conclusion

Free hardened images move container security from "aspirational best practice" to "default starting point." The teams that win in 2026 won't be the ones who adopted them first — they'll be the ones who migrated systematically: inventory, green-wave swaps, a real debugging answer, compliance-as-output, and CI enforcement.

Your move: run the triage script against your registry this week. I suspect your Green bucket is bigger than you think. Tell me in the comments what percentage you found — I'm collecting data points for a follow-up on fleet-scale migration metrics.

SEO keywords: Docker Hardened Images migration, DHI tutorial, distroless debugging, container security 2026, near-zero CVE base images, SBOM provenance attestation, container compliance PCI HIPAA SOC 2, kubectl debug ephemeral containers, secure base images, supply chain security Docker.

Tags: #docker #security #devops #containers #platformengineering

DEV Community