
STeve Shary

Tips on Building a Docker image (quickly) in a CI/CD Pipeline

Building Docker images has evolved over the years, and there are new strategies for doing it well. One pain point that comes up frequently is the (lack of) speed when building an image in a CI/CD pipeline. Many pipelines are now Docker containers themselves and start as a total blank slate, so we lose many of Docker's caching optimizations without realizing it.

Below are a series of tips and tricks to speed up your Docker image builds in your pipeline. I have seen build times drop from tens of minutes to less than 60 seconds, though your mileage may vary!

  • Use .dockerignore: The first step of every Docker build is that the client tarballs the entire build context directory and sends it to the Docker daemon. This shows up as the line Sending build context to Docker daemon XX.XXMB. The context should contain only the source code and configuration files used in your build; it really should not be more than tens of MB. Make sure to exclude everything that is not needed (a sample .dockerignore follows this list).
  • Things that don't change often should be moved up in the Dockerfile: initial package installs, static environment variables, users, and folders. Docker caches each layer, but the first line whose inputs change invalidates the cache for every line after it. Adding or compiling your code should be the last step in the Dockerfile!
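
A sample .dockerignore (a minimal sketch; the exact entries depend on your project) looks like this:

  .git
  .idea
  node_modules
  **/__pycache__
  *.log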

A typical Dockerfile should have the following flow:

  FROM base-container

  # Add all static environment variables, folders, users...
  ENV IS_APPLICATION=1
  RUN mkdir /app
  WORKDIR /app
  RUN useradd -ms /bin/sh newuser
  CMD ["python3", "main.py"]

  # Install all base image packages
  RUN apt-get update && apt-get install -y ...

  # Copy over your library dependency files (Python example below)
  COPY Pipfile /app/Pipfile
  COPY Pipfile.lock /app/Pipfile.lock

  # Install your libraries (Python example below)
  RUN pipenv install --deploy

  # Lastly, copy the code over
  COPY . /app/
  • Don't use a wildcard with the COPY command. Docker will not cache that line reliably. Be explicit (see the sketch below)!
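
    For example, prefer spelling out each file over a glob (the file names here are illustrative):

      # Avoid: COPY *.json /app/
      COPY package.json /app/package.json
      COPY package-lock.json /app/package-lock.json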

  • Build your containers once. There should be zero reason to build your application multiple times in a pipeline; build it once, tag it, and reuse it (a sketch follows).
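
    A minimal sketch, assuming your CI exposes the commit SHA (GIT_SHA here is a placeholder for your CI system's variable): tag the single build uniquely and let every later stage pull that image instead of rebuilding.

      docker build --tag my-app:${GIT_SHA} .
      docker push [remote-repo-url]/my-app:${GIT_SHA}
      # Later pipeline stages run: docker pull [remote-repo-url]/my-app:${GIT_SHA}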

  • Build your containers at the same time. If you have application, test, integration, performance, et al. containers,
    build them all as parallel tasks/steps/jobs in your pipeline (as sketched below). Much of a Docker build is I/O-bound, so the work can and should be parallelized.
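
    If your CI system lacks native parallel jobs, a minimal shell sketch works too (the Dockerfile names are assumptions):

      docker build -f Dockerfile.app  --tag my-app  . &
      docker build -f Dockerfile.test --tag my-test . &
      wait  # block until both background builds complete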

  • Copy your library dependency definition files manually and then run the library install before you copy your source code! Doing
    this step first allows you to cache the pulling of libraries (which otherwise feels like downloading the entire internet on every build). Below are
    language-specific examples of doing this:

    • Javascript
      FROM [...]
      RUN mkdir /app
      WORKDIR /app
      # ...
      COPY package.json /app
      COPY package-lock.json /app
      RUN npm install
    
      COPY [all your code and configuration]    
    
    • Python (requirements.txt)
    FROM [...]
    RUN mkdir /app
    WORKDIR /app
    # ...
    COPY requirements.txt /app
    RUN pip3 install -r requirements.txt
    
    COPY [all your code and configuration]    
    
    • Python (pipenv)
    FROM [...]
    RUN mkdir /app
    WORKDIR /app
    # ...
    COPY Pipfile /app
    COPY Pipfile.lock /app
    RUN pipenv install --deploy
    
    COPY [all your code and configuration]    
    
    • Java (Maven)
    FROM [...]
    RUN mkdir /app
    WORKDIR /app
    # ...
    COPY pom.xml /app/pom.xml
    # Resolve and cache all dependencies without needing the sources
    RUN mvn dependency:go-offline
    
    COPY [all your code and configuration]    
    RUN mvn package
    
    • Java (Gradle)

    Add this to your build.gradle file:

    task downloadDependencies(type: Exec) {
        configurations.testRuntime.files
        commandLine 'echo', 'Downloaded all dependencies'
    }
    

    Example Dockerfile to pull dependencies as a cached layer.

    FROM [...]
    RUN mkdir /app
    WORKDIR /app
    # ...
    COPY build.gradle /app/build.gradle
    # The Gradle wrapper script and its support files are needed too
    COPY gradlew /app/gradlew
    COPY gradle /app/gradle
    RUN ./gradlew downloadDependencies
    
    COPY [all your code and configuration]    
    RUN ./gradlew build
    
    • DotNet (Nuget)
    FROM [...]
    RUN mkdir /app
    WORKDIR /app
    # ...
    # Copy all solution files .sln (NO WILDCARDS!)
    COPY WebApp.sln /app/WebApp.sln 
    # COPY all .csproj files (NO WILDCARDS!)
    COPY WebApp.Application/WebApp.Application.csproj /app/WebApp.Application/WebApp.Application.csproj
    # Yep... you have to manually add them all (gotta catch 'em all... Pokemon!)
    
    # This command will pull all library dependencies
    RUN dotnet restore
    
    COPY [all your code and configuration]    
    
  • Don't pre-pull images. It's slow and it doesn't work. With BuildKit (described below), Docker is smarter and only pulls the image layers it actually needs from the cache. This is faster, more efficient, and it works.
  • Use Docker's new build engine, BuildKit, and inline cache information into your builds. (This is a simple and big one.)
    There are three things we need to set to get proper image caching in a pipeline:

    1. Set the DOCKER_BUILDKIT environment variable to 1.
    2. Add a build argument to tell BuildKit to inline caching information into each image layer: --build-arg BUILDKIT_INLINE_CACHE=1
    3. Tell the docker build command which remote image to use as a cache source: --cache-from [remote-repo-url]/[CONTAINER_NAME]:latest

    The full command would look something like this: DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from [remote-repo-url]/[CONTAINER_NAME]:latest --tag [CONTAINER_NAME] .

Lastly, don't forget to push the updated image to your remote Docker repo so the next build can use it as its cache source!
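
Putting it all together, a minimal pipeline sketch ([remote-repo-url] and [CONTAINER_NAME] are placeholders, as above):

  export DOCKER_BUILDKIT=1
  # Build, reusing layers from the last pushed image as the cache
  docker build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from [remote-repo-url]/[CONTAINER_NAME]:latest \
    --tag [CONTAINER_NAME] .
  # Push the fresh image so the next pipeline run can cache from it
  docker tag [CONTAINER_NAME] [remote-repo-url]/[CONTAINER_NAME]:latest
  docker push [remote-repo-url]/[CONTAINER_NAME]:latest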

Top comments (1)

Mark Otway

Another great tip - if your docker image has other non-build dependencies (or transitive dependencies that you need to build before your app), have a separate base-image build that you only do when those deps change. Then your app build uses that as the base image.

For example, in my app (github.com/webreaper/damselfly) I need a bunch of unmanaged Linux libs to make GDI+ code work, OpenCV, etc., as well as ExifTool and a bunch of other apt-get packages. Instead of pulling/building all of that every time I build my app, I have a Damselfly-Base image with them all pre-installed that's used as the base image for my app's container.
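
A minimal sketch of that pattern (the base image tag and package names are illustrative, not Damselfly's actual setup): rebuild the base image only when system dependencies change, and start the app image FROM it.

  # Dockerfile.base -- rebuilt only when system deps change
  FROM mcr.microsoft.com/dotnet/sdk:6.0
  RUN apt-get update && apt-get install -y libgdiplus exiftool

  # Dockerfile.app -- the fast, frequent build
  FROM [remote-repo-url]/damselfly-base:latest
  COPY . /app/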