Kacey Gambill

Dive Into Docker part 4: Inspecting Docker Image

This post is going to be shorter. I'd like to highlight a tool that I really enjoy working with called "Dive".

Dive is an essential tool when building or inspecting Docker images. It can help pinpoint exactly what is contained in each layer of an image. Specifically, it
quickly combs through the image's layers and tries to show wasted space.

[Image: dive image wasted space]

Installing Dive

See Dive's installation instructions for all supported methods.
My preferred way to install Dive, if using a Mac, is to use brew: brew install dive

Using Dive

I prefer to use dive during local development of Docker images. To get started, I typically just run: dive image-name. If the image is not found locally, this will take care of pulling it from the remote repository.

Note: tmux keybindings can get in the way, so I usually detach from tmux or open another terminal session before using dive.

Running dive ruby:3.2.0
[Image: dive ruby:3.2.0]
It first pulls the image if it is not found locally, and then we are presented with "Layers", "Layer Details", "Image Details" and "Current Layer Contents".

Press Tab to move between views.
In each view, it presents us with a few more hotkeys that we can use to further inspect this image.

Looking at the "Layers" tab, it presents us with either "layer changes" or "aggregated changes" on the right-hand side.
You can press either Ctrl+L or Ctrl+A to switch between these two.

Before moving to the "Current Layer Contents" view, I like to pick through the
"Layer Details" of the various layers, shown right below "Layers".
[Image: dive ruby:3.2.0, layer details]

Here it shows the command that was run to generate that layer.

On the right-hand side of the screen we can see "Current Layer Contents". This includes the files that were added or removed, the permissions on those files, and how much space they take up.
If we Tab over to that view, it presents a few new options:

  • Space - collapse a single directory
  • Ctrl+Space - collapse all directories
  • Added
  • Removed
  • Modified
  • Unmodified
  • Attributes
  • Wrap

I prefer to start out by collapsing all directories and then digging into the layers that show the largest increase in file space.

[Image: dive ruby:3.2.0, current layer details]

Using Dive in a Continuous Integration Pipeline

Running Dive with CI=true skips the interactive UI and prints a report, which is one of the most effective ways to quickly find wasted space.
Example: CI=true dive ruby:3.2.0
This could also be plugged into a Docker image pipeline to ensure that an excessive amount of image space is not wasted.

Full output here:



CI=true dive ruby:3.2.0
  Using default CI config
Image Source: docker://ruby:3.2.0
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 98.8316 %
  wastedBytes: 11616315 bytes (12 MB)
  userWastedPercent: 1.6002 %
Inefficient Files:
Count  Wasted Space  File Path
    6        5.0 MB  /var/cache/debconf/templates.dat
    4        3.2 MB  /var/cache/debconf/templates.dat-old
    6        1.2 MB  /var/lib/dpkg/status
    6        1.2 MB  /var/lib/dpkg/status-old
    5        376 kB  /var/log/dpkg.log
    5        194 kB  /var/log/apt/term.log
    6         95 kB  /etc/ld.so.cache
    6         86 kB  /var/cache/debconf/config.dat
    6         71 kB  /var/lib/apt/extended_states
    5         54 kB  /var/cache/ldconfig/aux-cache
    5         52 kB  /var/log/apt/eipp.log.xz
    4         42 kB  /var/cache/debconf/config.dat-old
    5         36 kB  /var/log/apt/history.log
    4         26 kB  /var/log/alternatives.log
    2         903 B  /etc/group
    2         892 B  /etc/group-
    2         756 B  /etc/gshadow
    2           0 B  /etc/.pwd.lock
    6           0 B  /tmp
    5           0 B  /var/cache/apt/archives/partial
    3           0 B  /var/lib/dpkg/triggers/Unincorp
    6           0 B  /var/lib/dpkg/lock-frontend
    5           0 B  /var/cache/apt/archives/lock
    6           0 B  /var/lib/dpkg/lock
    6           0 B  /var/cache/debconf/passwords.dat
    5           0 B  /var/lib/apt/lists
    2           0 B  /usr/src
    6           0 B  /var/lib/dpkg/triggers/Lock
    6           0 B  /var/lib/dpkg/updates
Results:
  PASS: highestUserWastedPercent
  SKIP: highestWastedBytes: rule disabled
  PASS: lowestEfficiency
Result:PASS [Total:3] [Passed:2] [Failed:0] [Warn:0] [Skipped:1]



With this particular image we could go through and remove those files, but they only account for about 12 MB of wasted space, so it is unnecessary.
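
If a pipeline needs its own gate on top of Dive's built-in rules, the efficiency line of the report above can be checked directly. This is a minimal sketch: check_efficiency is a hypothetical helper name, and it assumes the report keeps the plain "efficiency: <number> %" format shown in the output above.

```shell
#!/bin/sh
# check_efficiency MIN_PERCENT
# Reads a Dive CI report on stdin and succeeds only if the value on its
# "efficiency:" line is at least MIN_PERCENT. (Hypothetical helper.)
check_efficiency() {
  awk -v min="$1" '
    $1 == "efficiency:" { found = 1; value = $2 + 0 }
    END {
      if (!found) {
        print "no efficiency line found" > "/dev/stderr"
        exit 2
      }
      printf "efficiency %.2f%% (minimum %.2f%%)\n", value, min + 0
      exit (value >= min + 0) ? 0 : 1
    }
  '
}

# Possible usage in a pipeline step:
#   CI=true dive ruby:3.2.0 | check_efficiency 95
```

Keep in mind that piping the report hides Dive's own exit status unless the shell is configured to propagate it.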
Dive has more configuration options for CI runs.
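
The three rule names in the Results section of the report above can be tuned through a rules file. A minimal sketch, assuming a Dive version that reads a .dive-ci YAML file from the working directory, as described in Dive's README; the threshold values are arbitrary examples, not recommendations.

```shell
#!/bin/sh
# Write an example rules file for Dive's CI mode.
# The values below are illustrative assumptions; tune them per image.
cat > .dive-ci <<'EOF'
rules:
  # Fail if image efficiency drops below this ratio (0-1).
  lowestEfficiency: 0.95
  # Fail if the wasted space reaches this size.
  highestWastedBytes: 20MB
  # Fail if this ratio (0-1) of the bytes added on top of the base image is wasted.
  highestUserWastedPercent: 0.20
EOF
```

With the file in place, CI=true dive ruby:3.2.0 should pick it up; the --ci-config flag can point Dive at a different path.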

Dealing with Sensitive Data

Do not pass sensitive details into Dockerfiles through build args or environment variables during image creation. These values end up in the image's metadata and history, so anyone who can inspect the resulting image can read them.

If a Dockerfile needs sensitive data, pass it in using BuildKit secret mounts, which are available through docker buildx build.

This can be done either with a file, containing the secret value, or an environment variable containing the secret.

First, create a file named build_key containing the value:
xyz:xyz

Next add this to the Dockerfile to access the secret.



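# The secret is only mounted for the RUN instruction that declares it,
# so the mount and the command that reads it must share one RUN.
# Note: this is an example only; never echo a real secret.
RUN --mount=type=secret,id=build_key \
    echo "using build_key: $(cat /run/secrets/build_key)"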




Finally, when running the docker build with buildx, we pass in the secret:
docker buildx build --secret id=build_key,src=build_key .

If we were to use an environment variable containing the secret, the command to build would look like:



# export the variable so the docker process can read it
export build_key=xyz:xyz
docker buildx build --secret id=build_key,env=build_key .



Example of Secrets Leaking

This is a rough example, because we would likely never need to add the db connection string at build time, but there are a few apps that require a build_license when installing packages, or a way to authenticate to a remote GitHub server.



FROM ubuntu

ARG build_license \
    postgres_db_string

ENV build_license=$build_license \
    postgres_db_string=$postgres_db_string

COPY . .

CMD echo "secret_sauce: $secret_sauce" \
 && echo "build_license: $build_license" \
 && echo "postgres_db_string: $postgres_db_string"




To see these details in an image, the image only needs to exist locally. Run docker save <image-name> -o <image.tar>, then open the tar archive with vim to browse the layer contents.



" tar.vim version v32
" Browsing tarfile /Users/kaceygambill/personal/ubuntu-mount/blog/4/test.tar
" Select a file with cursor and press ENTER

444f68a42c829ead4bff4566c6554c761e2075c92d2eef50cbb9152fde8b13cc/
444f68a42c829ead4bff4566c6554c761e2075c92d2eef50cbb9152fde8b13cc/VERSION
444f68a42c829ead4bff4566c6554c761e2075c92d2eef50cbb9152fde8b13cc/json
444f68a42c829ead4bff4566c6554c761e2075c92d2eef50cbb9152fde8b13cc/layer.tar
a93a4c1e4d72d16b55e6aae767bb48e862a4ad8a43ab33107f8d5dfdc749912b.json
ee72d37eae4759eeaadd189b4341c0418faa7662ebc5089ddb528b4640e08c2f/
ee72d37eae4759eeaadd189b4341c0418faa7662ebc5089ddb528b4640e08c2f/VERSION
ee72d37eae4759eeaadd189b4341c0418faa7662ebc5089ddb528b4640e08c2f/json
ee72d37eae4759eeaadd189b4341c0418faa7662ebc5089ddb528b4640e08c2f/layer.tar
manifest.json
repositories



Looking at any one of those .json files gives us more details about the layer.
Expanding 444f68a42c829ead4bff4566c6554c761e2075c92d2eef50cbb9152fde8b13cc/json
I can see a JSON object that includes the sensitive data.
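
To check a saved image for a value that should never have been baked in, the archive can also be scanned member by member instead of browsing it by hand. This is a rough sketch rather than a complete scanner: find_secret_in_archive is a hypothetical helper, it assumes an uncompressed archive as written by docker save, and it only finds the value if it is stored as plain text.

```shell
#!/bin/sh
# find_secret_in_archive ARCHIVE VALUE
# Prints every member of a tar archive whose raw contents include VALUE
# and succeeds only if at least one member matched. (Hypothetical helper.)
find_secret_in_archive() {
  archive=$1
  needle=$2
  hits=0
  # Skip directory entries (names ending in "/"); member names written by
  # docker save contain no spaces, so word splitting is safe here.
  for member in $(tar -tf "$archive" | grep -v '/$'); do
    # -a treats binary data as text, -F matches the value literally.
    if tar -xOf "$archive" "$member" | grep -a -F -- "$needle" >/dev/null 2>&1; then
      echo "found in: $member"
      hits=1
    fi
  done
  [ "$hits" -eq 1 ]
}

# Possible usage, with the archive and value from this post:
#   docker save <image-name> -o test.tar
#   find_secret_in_archive test.tar 'xyz:xyz'
```

A match in one of the JSON files means the value sits in the image metadata; a match in a layer.tar means it was written into the filesystem.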

If you haven't tried Dive yet, I'd highly
suggest checking it out and implementing it as a check in your CI/CD pipelines!

Oldest comments (5)

Mark Landeryou

Very interesting, will look into this more.

John P. Rouillard

I have used dive. But I have never been successful at reducing the wasted space except in limited circumstances.

For example, I was able to delete unused documentation installed
by pip.

However a bunch of other files that were needed for the application to work were listed as "wasted space" when using CI=true

Do you know how it determines that a file is "wasted space"?

Kacey Gambill

I'm glad you were able to get rid of the documentation that pip had installed, sometimes documentation can account for a lot of bloat when trying to run an application!

Any chance you have an example dockerfile? I can't say that I have really experienced that with this tool :\

John P. Rouillard

Try the image rounduptracker/roundup from the dockerfile
github.com/roundup-tracker/roundup....

I get output like:

    3         246 B  /usr/local/lib/python3.11/site-packages/setuptools/_vendor/more_itertools/__init__.py
    2         238 B  /usr/local/lib/python3.11/site-packages/README.txt
    2         232 B  /usr/local/lib/python3.11/site-packages/pip/_internal/cli/status_codes.py
    2         214 B  /usr/local/lib/python3.11/site-packages/pip/_internal/metadata/importlib/__init__.py
..
    2           0 B  /bin/tar
    4           0 B  /lib/apk/exec
...
Results:
  FAIL: highestUserWastedPercent: too many bytes wasted, relative to the user bytes added (%-user-wasted-bytes=0.26076964664250546 > threshold=0.1)
  SKIP: highestWastedBytes: rule disabled
  FAIL: lowestEfficiency: image efficiency is too low (efficiency=0.8647676588992164 < threshold=0.9)
Result:FAIL [Total:3] [Passed:0] [Failed:2] [Warn:0] [Skipped:1]

As I was writing this up, something occurred to me.

I think this is telling me that I have upgraded some files from the base image during my build and it's
counting those as waste.

But

    2         582 B  /usr/src/app/roundup_healthcheck

should only have one copy added by the dockerfile command:

COPY scripts/Docker/roundup_start scripts/Docker/roundup_healthcheck ./

Does running:

RUN chmod +x roundup_start roundup_healthcheck; \
    mkdir tracker; chown ${roundup_uid:-1000}:${roundup_uid:-1000} tracker

make it look to dive like a second copy of the file (with different perms) is present?

Thanks for looking into this for me.

Mia Hossain

wow