DEV Community

Ali Mehraji

Offline, Multistage Python Dockerfile

We needed to reduce the size of a Python application's Docker image. The build also had to work offline, so both the PyPI package installation and the APT update and package installation had to use private repositories.

Let's go through the optimization line by line.

Docker build buildx syntax

  • First of all, make sure Docker uses BuildKit (docker buildx) as its default builder.

BuildKit

If you have installed Docker Desktop, you don't need to enable BuildKit. If you are running a Docker Engine version earlier than 23.0, you can enable BuildKit either by setting an environment variable or by making BuildKit the default setting in the daemon configuration.

DOCKER_BUILDKIT=1 docker build --file /path/to/dockerfile -t docker_image_name:tag .
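For the daemon-configuration route, a minimal sketch follows. It only prints the fragment. The path /etc/docker/daemon.json is the usual Linux location and is an assumption about your setup, and the fragment has to be merged with any settings already in that file before the Docker daemon is restarted.

```shell
#!/usr/bin/env bash
# Sketch: print the daemon.json fragment that makes BuildKit the default builder.
# Assumption: the daemon config lives at /etc/docker/daemon.json (typical on Linux).
buildkit_daemon_config() {
  cat <<'JSON'
{
  "features": {
    "buildkit": true
  }
}
JSON
}

# Show the fragment; merge it into the existing file by hand, then restart Docker.
buildkit_daemon_config
```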

The Dockerfile itself must start with the syntax directive below, which is required for heredocs:

# syntax=docker/dockerfile:1.4

Project Directory Tree

├── main.py
├── requirements.txt
└── src
    ├── log.py
    └── prometheus.py

Multistage Dockerfile

The first step is to provision the base stage, which the build and runtime stages are both derived from.

base stage

  • base image
ARG JFROG=jfrog.example.com

FROM ${JFROG}/docker/python:3.13-slim AS base
  • Change the default SHELL
    • Use a custom shell with the pipefail and errexit options so that a failing command aborts the build. This is especially useful for the heredoc in the private Debian repository setup below.
SHELL ["/bin/bash", "-c", "-o", "pipefail", "-o", "errexit"]
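To see what the two options change, here is a small sketch that can be run in any bash shell outside Docker. The function name demo_shell_options is only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: how pipefail and errexit change the outcome of a failing pipeline.
demo_shell_options() {
  # Default bash: the pipeline status is that of the last command (true).
  bash -c 'false | true; echo "default: $?"'
  # pipefail: a failure anywhere in the pipeline becomes the pipeline status.
  bash -o pipefail -c 'false | true; echo "pipefail: $?"'
  # pipefail + errexit: the script stops at the failing pipeline.
  bash -o pipefail -o errexit -c 'false | true; echo "not reached"' \
    || echo "aborted: $?"
}

demo_shell_options
# prints:
#   default: 0
#   pipefail: 1
#   aborted: 1
```

In a RUN instruction, the third case is the one that matters: a command that fails in the middle of a pipeline or a multi-line heredoc stops the build instead of being silently ignored.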
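  • Re-declare the JFROG build argument (an ARG declared before FROM is only visible to FROM lines) and set the Python and pip environment variables.
ARG JFROG=jfrog.example.com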

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_TIMEOUT=60 \
    PIP_INDEX_URL=https://${JFROG}/artifactory/api/pypi/python/simple/
  • Private Debian Repository (Offline Installation)

A heredoc in the Dockerfile replaces the base image's apt sources so that packages are updated and installed from the private Debian repository. Heredocs need the Dockerfile syntax directive mentioned above.
If your private Debian repository is laid out differently, change the URIs accordingly. Note that Trusted: true disables signature verification for these sources.

DEB822 format (apt .sources files)

# Using DEB822 format (.sources files) - for newer systems
RUN <<EOF

CODENAME=$(grep VERSION_CODENAME /etc/os-release | cut -d'=' -f2)
DISTRO=$(grep '^ID=' /etc/os-release | cut -d'=' -f2)

cat > /etc/apt/sources.list.d/debian.sources <<SOURCE_FILE_CONTENT
Types: deb
URIs: https://${JFROG}/artifactory/debian/debian/
Suites: ${CODENAME} ${CODENAME}-updates
Components: main
Trusted: true

Types: deb
URIs: https://${JFROG}/artifactory/debian/debian-security/
Suites: ${CODENAME}-security
Components: main
Trusted: true
SOURCE_FILE_CONTENT
EOF
  • Install the packages shared by all stages.

    • Skip recommended packages (--no-install-recommends) to reduce the image size.
    • After installation, remove the apt package lists to keep the image small.
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    gnupg \
    lsb-release \
    && rm -rf /var/lib/apt/lists/*

build stage

  • Use the prepared base image as the build image
FROM base AS build
  • Build-specific packages are not needed in every stage, so install them only in the build stage.
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential && \
    rm -rf /var/lib/apt/lists/*
  • Create a virtual environment and install the dependencies from requirements.txt into it.
WORKDIR /app

RUN python -m venv .venv
ENV PATH="/app/.venv/bin:$PATH"

COPY requirements.txt .

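# The cache mount keeps pip's download cache outside the image layers,
# so --no-cache-dir (which would bypass the mount) is not used here.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip --timeout 100 install -r requirements.txt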

runtime stage

  • Use the prepared base image as the runtime image
FROM base AS runtime

WORKDIR /app
  • Security best practices

    • Create a dedicated group and user with fixed IDs (1001) that the Kubernetes securityContext fields runAsUser, runAsGroup and fsGroup can reference
RUN addgroup --gid 1001 --system nonroot && \
    adduser --no-create-home --shell /bin/false \
    --disabled-password --uid 1001 --system --group nonroot

USER nonroot:nonroot
  • Copy the virtual environment from the build stage and put it on the PATH.
ENV VIRTUAL_ENV=/app/.venv \
    PATH="/app/.venv/bin:$PATH"

COPY --from=build --chown=nonroot:nonroot /app/.venv /app/.venv
  • Copy the src directory and main.py.
COPY --chown=nonroot:nonroot src /app/src
COPY --chown=nonroot:nonroot main.py .
  • Set the CMD that runs the application when a container starts.
CMD ["python", "/app/main.py"]

Before and After the Optimization

Before the Optimization

The Dockerfile was:

FROM jfrog.example.com/docker/python:latest

WORKDIR /app
ADD src/ .

RUN pip config set global.index-url https://jfrog.example.com/artifactory/api/pypi/python/simple/ &&  \
    pip --timeout 100 install -r requirements.txt

CMD ["python","-u","main.py"]

After the build, the image size was 1.02GB.

Final Dockerfile After Optimization

After all the optimizations and the switch to a multistage Dockerfile, the image size dropped to 242MB, less than a quarter of the original.

# syntax=docker/dockerfile:1.4
# The syntax directive above is required for heredocs.

ARG PYTHON_VERSION=3.12.3
ARG JFROG=jfrog.example.com

FROM ${JFROG}/docker/python:${PYTHON_VERSION}-slim AS base
SHELL ["/bin/bash", "-c", "-o", "pipefail", "-o", "errexit"]

ARG JFROG=jfrog.example.com

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_TIMEOUT=60 \
    PIP_INDEX_URL=https://${JFROG}/artifactory/api/pypi/python/simple/

# Using DEB822 format (.sources files) - for newer systems
RUN <<EOF

CODENAME=$(grep VERSION_CODENAME /etc/os-release | cut -d'=' -f2)
# DISTRO=$(grep '^ID=' /etc/os-release | cut -d'=' -f2)

cat > /etc/apt/sources.list.d/debian.sources <<SOURCE_FILE_CONTENT
Types: deb
URIs: https://${JFROG}/artifactory/debian/debian/
Suites: ${CODENAME} ${CODENAME}-updates
Components: main
Trusted: true

Types: deb
URIs: https://${JFROG}/artifactory/debian/debian-security/
Suites: ${CODENAME}-security
Components: main
Trusted: true
SOURCE_FILE_CONTENT
EOF

# mount .netrc as a BuildKit secret (it is not copied into the image)
RUN --mount=type=secret,id=netrc,target=/root/.netrc \
    apt-get update && apt-get install --no-install-recommends --no-install-suggests -y \
      ca-certificates \
      gnupg \
      curl \
    && apt-get clean \
    && apt-get remove --purge --auto-remove -y \
    && rm -rf /var/lib/apt/lists/*

FROM base AS build

# mount .netrc as a BuildKit secret (it is not copied into the image)
RUN --mount=type=secret,id=netrc,target=/root/.netrc \
    apt-get update && apt-get install --no-install-recommends --no-install-suggests -y \
      build-essential \
    && apt-get clean \
    && apt-get remove --purge --auto-remove -y \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

RUN python -m venv .venv
ENV PATH="/app/.venv/bin:$PATH"

COPY requirements.txt .
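# The cache mount keeps pip's download cache out of the image layers,
# so --no-cache-dir is dropped to let the mount take effect.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --timeout 100 --upgrade pip && \
    pip install --timeout 100 -r requirements.txt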

FROM base AS runtime

WORKDIR /app

RUN addgroup --gid 1001 --system nonroot && \
    adduser --no-create-home --shell /bin/false \
    --disabled-password --uid 1001 --system --group nonroot

USER nonroot:nonroot

ENV VIRTUAL_ENV=/app/.venv \
    PATH="/app/.venv/bin:$PATH"

COPY --from=build --chown=nonroot:nonroot /app/.venv /app/.venv
COPY --chown=nonroot:nonroot src /app/src
COPY --chown=nonroot:nonroot main.py .

CMD ["python", "/app/main.py"]


Notice

On second thought, Dockerfile syntax 1.4 is not strictly needed to manipulate the apt sources; the same can be done with sed.

The new Dockerfile syntax was fun to learn, so I am keeping this guide as it is.

A Dockerfile without the new syntax would manipulate the apt sources via sed instead.
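
A minimal sketch of the sed approach, assuming the base image ships /etc/apt/sources.list.d/debian.sources with URIs under http://deb.debian.org/ and that the private mirror uses the same artifactory/debian/ layout as in the heredoc version. The function name rewrite_apt_sources and the sample file contents are hypothetical. Unlike the heredoc version, this leaves every other field of the file untouched, so it does not add Trusted: true.

```shell
#!/usr/bin/env bash
# Sketch: point the stock apt sources at a private mirror with sed.
# Assumptions: the sources use http://deb.debian.org/ and the mirror layout is
# https://<host>/artifactory/debian/{debian,debian-security}.
rewrite_apt_sources() {
  local file="$1" host="$2"
  # Print the rewritten file to stdout; '|' is the sed delimiter because URLs contain '/'.
  sed -e "s|http://deb.debian.org/|https://${host}/artifactory/debian/|g" "$file"
}

# Demo on a throwaway sample instead of the real /etc/apt file.
sample="$(mktemp)"
printf 'Types: deb\nURIs: http://deb.debian.org/debian\nSuites: bookworm bookworm-updates\n' > "$sample"
rewrite_apt_sources "$sample" jfrog.example.com
rm -f "$sample"

# In a Dockerfile the same expression could run in place, for example:
#   RUN sed -i -e "s|http://deb.debian.org/|https://${JFROG}/artifactory/debian/|g" \
#       /etc/apt/sources.list.d/debian.sources
```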

Updates

  • Mon Jul 14 2025

    • Added credentials for Artifactory repositories that require authentication.
    • The credentials live in a .netrc file that BuildKit mounts as a secret at build time.
    • Because it is a secret mount, the credentials are neither saved in the image nor exposed in the image history.
    • Added line:
    RUN --mount=type=secret,id=netrc,target=/root/.netrc

    • The package update and installation steps after the change:
    # mount .netrc as a BuildKit secret (it is not copied into the image)
    RUN --mount=type=secret,id=netrc,target=/root/.netrc \
        apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        && apt-get clean \
        && rm -rf /var/lib/apt/lists/*
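
The secret also has to be supplied when the image is built. A minimal sketch, assuming the credentials are in ~/.netrc on the build host; the function name build_with_netrc, the image tag, and the login and token values in the comment are placeholders. The id passed to --secret must match the id in the RUN --mount line.

```shell
#!/usr/bin/env bash
# Sketch: pass a local .netrc to the build as a BuildKit secret.
# A .netrc entry has this shape (placeholder values):
#   machine jfrog.example.com login <user> password <token>
build_with_netrc() {
  local netrc_file="${1:-$HOME/.netrc}" tag="${2:-docker_image_name:tag}"
  # Refuse to build without the credentials file.
  if [ ! -f "$netrc_file" ]; then
    echo "missing netrc file: $netrc_file" >&2
    return 1
  fi
  # id=netrc matches RUN --mount=type=secret,id=netrc in the Dockerfile.
  DOCKER_BUILDKIT=1 docker build \
    --secret "id=netrc,src=${netrc_file}" \
    -t "$tag" .
}

# Usage (run from the project directory):
#   build_with_netrc ~/.netrc docker_image_name:tag
```

If apt does not pick up the credentials from /root/.netrc in your setup, note that apt documents /etc/apt/auth.conf (same netrc-style syntax) as its credentials file, so the mount target may need to point there.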
    

Top comments (2)

Sawyer Wolfe

This is a great optimization! One potential limitation is relying on private repositories for both APT and PyPI—how would you handle builds if the private repos were temporarily unavailable, or for developers who may not have access? Any suggestions for fallbacks or more robust offline support?

Ali Mehraji

I will make sure the private repository is highly available (HA).

You're right: if a developer has no access, the image can't be built.

The entire pip index URL could be an ARG in case there is no private PyPI repository.
The APT sources could also be copied into the image from a local file, and that could be controlled by an ARG as well.

If the ARG is set, the private repository is used; otherwise the base image's apt sources are used.