DEV Community

Vignesh Durai

The Art of Small Images: Practical Techniques for Shaving Hundreds of MB Off AI and Java Containers

Most teams know the feeling: the container finally works, models load, the JVM starts, and endpoints respond. Then someone points out the image size, sometimes eight hundred megabytes or more. The number rarely surprises anyone, but it still causes problems. It slows local development, puts pressure on CI pipelines, and quietly shapes how systems evolve.
Eventually, another question comes up: not just whether the container runs, but whether it really needs to be this large.
This isn’t about following strict rules or aiming for the smallest possible image. It’s about making thoughtful choices in container design. Small decisions can add up and save hundreds of megabytes, all while keeping things clear and reliable.

When “It Works” Becomes the Baseline

AI and Java containers often grow large for understandable reasons. Machine learning stacks need native libraries, Python wheels with compiled extensions, CUDA dependencies, and tools for testing. Java images might include full JDKs, debugging tools, and leftover build artifacts that were once helpful but never cleaned up.
These containers are usually built quickly to meet deadlines, with the main goal of getting things working. This is normal for most projects. The problem appears later, when those early choices become the standard way of doing things.
Container bloat usually isn’t caused by one big mistake. It often happens because of convenience, like installing extra system packages just in case, keeping build tools for debugging, or adding duplicate dependencies. Each choice seems fine alone, but together they add up.

Layers Tell Stories—If You Read Them

Try thinking of container layers as telling a story, not just as technical details. Each layer should answer questions like: Why is this here? When was it needed? Is it still needed?
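One way to start reading that story is to rank layers by size and look at the instruction that created each one. The sketch below assumes the layer records come from `docker history --format '{{json .}}'`, with a human-readable `Size` field and a `CreatedBy` field. The sample records are hypothetical and only illustrate the shape of the input.

```python
import json
import re

# Decimal unit multipliers for sizes such as "77.8MB" (assumed output format).
_UNITS = {"B": 1, "kB": 10**3, "KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text: str) -> int:
    """Convert a human-readable size such as '77.8MB' into bytes."""
    match = re.fullmatch(r"([\d.]+)\s*([A-Za-z]+)", text.strip())
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognised size: {text!r}")
    return round(float(match.group(1)) * _UNITS[match.group(2)])


def largest_layers(history_lines, top=5):
    """Return (bytes, instruction) pairs for the biggest layers, largest first."""
    layers = []
    for line in history_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        layers.append((parse_size(record["Size"]), record["CreatedBy"]))
    return sorted(layers, key=lambda layer: layer[0], reverse=True)[:top]


# Hypothetical sample records, one JSON object per layer.
SAMPLE = [
    '{"CreatedBy": "ENV JAVA_OPTS=-Xmx512m", "Size": "0B"}',
    '{"CreatedBy": "RUN apt-get install -y build-essential", "Size": "412MB"}',
    '{"CreatedBy": "COPY target/app.jar /app/app.jar", "Size": "48.3MB"}',
]

for size, instruction in largest_layers(SAMPLE):
    print(f"{size / 1e6:8.1f} MB  {instruction}")
```

For each layer near the top of the list, ask the same three questions: why is this here, when was it needed, and is it still needed?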
Multi-stage builds are often recommended as a best practice, but their main benefit is separating different purposes. Build-time dependencies, such as compilers, package managers, and test frameworks, are not the same as runtime libraries. Mixing these roles makes images larger than they need to be.
For Java containers, this often means compiling in one stage and running in another, moving only the needed runtime files to the final image. For AI workloads, it can involve installing Python dependencies and downloading models in a builder stage, then copying just the site-packages, model files, and required binaries into a smaller base image.
This approach leads to images that are not just smaller, but also easier to understand.
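The separation described above can be made concrete with a small example. The sketch below embeds a hypothetical two-stage Dockerfile for a Maven project and parses it to show which base image the final stage uses and what it copies from the build stage. The image tags, paths, and jar name are assumptions for illustration, not a recommendation. The parser is deliberately minimal and ignores line continuations.

```python
import re

# Hypothetical two-stage Dockerfile; tags, paths and the jar name are
# illustrative assumptions (the jar name depends on the project's build config).
DOCKERFILE = """\
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /src
COPY pom.xml .
COPY src ./src
RUN mvn -q package -DskipTests

FROM eclipse-temurin:21-jre
COPY --from=build /src/target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
"""

# FROM [--flag=value ...] image [AS name]
FROM_RE = re.compile(r"^FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE)
COPY_FROM_RE = re.compile(r"^COPY\s+--from=(\S+)\s+(.+)$", re.IGNORECASE)


def parse_stages(dockerfile: str):
    """Split a Dockerfile into stages and record cross-stage COPY instructions."""
    stages = []
    for raw in dockerfile.splitlines():
        line = raw.strip()
        from_match = FROM_RE.match(line)
        if from_match:
            stages.append({
                "image": from_match.group(1),
                "name": from_match.group(2),
                "copied_from": [],
            })
            continue
        copy_match = COPY_FROM_RE.match(line)
        if copy_match and stages:
            stages[-1]["copied_from"].append(
                (copy_match.group(1), copy_match.group(2))
            )
    return stages


stages = parse_stages(DOCKERFILE)
final = stages[-1]
print(f"{len(stages)} stages; final image is based on {final['image']}")
for source, paths in final["copied_from"]:
    print(f"  copies from stage '{source}': {paths}")
```

In this sketch only the jar crosses the stage boundary. Maven, the JDK, and the source tree stay behind in the build stage and never reach the runtime image.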

Base Images as Architectural Decisions

Teams often use base images as defaults. Ubuntu feels safe, and full distributions are familiar. However, the base image you choose affects everything that follows.
Many teams notice that moving from a general-purpose OS image to a focused runtime image can greatly reduce the container size. Distroless and slim images remove shells, package managers, and documentation that production containers rarely need. Alpine-based images have their own trade-offs, especially with native library compatibility: Alpine uses the musl C standard library instead of glibc, which Ubuntu, Debian, and Fedora use, so binaries and Python wheels built against glibc may not work there. As Octopus points out, teams need to check their actual dependencies when picking a base image. The point isn't to avoid any of these images, but to understand what they include. For GPU workloads, choose runtime-only variants and make sure the CUDA, driver, and framework versions match, so nothing extra is added.
The goal isn’t to make the image as small as possible. It’s about being intentional with your choices.
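For Python workloads, one quick check on a candidate base image is which C library the interpreter is linked against, since that affects which prebuilt wheels can be installed. A minimal sketch, using the standard library's `platform.libc_ver()`, which reports `glibc` on glibc systems and typically returns empty strings elsewhere (for example on musl-based Alpine):

```python
import platform


def libc_family() -> str:
    """Report whether the running interpreter is linked against glibc."""
    lib, version = platform.libc_ver()
    if lib == "glibc":
        return f"glibc {version}"
    # An empty result does not prove musl; it only means glibc was not detected.
    return "not glibc (possibly musl)"


print(libc_family())
```

Running this inside each candidate image shows whether `manylinux` wheels (built for glibc) will apply, or whether the image needs `musllinux` wheels or source builds, which usually pull in a compiler and headers.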

Dependency Discipline Over Dependency Hoarding

Another pattern shows up when teams review their dependencies. Many containers still include libraries that were helpful during testing but were never removed. In Python, transitive dependencies can quietly increase the image size. In Java, unused modules often stay on the classpath.
Some teams find it helpful to rebuild their dependency lists from scratch, starting with what the application really uses instead of what has built up over time. Others use tools to visualize dependency trees and decide what is still needed.
The key is less about the tools and more about the mindset. Containers reward discipline by making extra baggage easy to spot.
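As a sketch of starting from what the application really uses, the Python example below walks the dependency graph from a short list of root packages and reports every installed distribution that nothing on that list needs. The `installed_graph` helper reads the current environment through `importlib.metadata`. The example graph and its package names are hypothetical. Requirement parsing is simplified: optional extras are skipped and other environment markers are ignored, so treat the result as a list of candidates to review, not a removal list.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalise a distribution name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_graph() -> dict:
    """Map each installed distribution to the names it requires (extras skipped)."""
    graph = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue
        deps = set()
        for req in dist.requires or []:
            if "extra ==" in req:  # only needed when an optional extra is requested
                continue
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if match:
                deps.add(normalize(match.group(0)))
        graph[normalize(name)] = deps
    return graph


def unreachable(graph: dict, roots) -> set:
    """Return installed packages that no root needs, directly or transitively."""
    seen = set()
    stack = [normalize(root) for root in roots]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return set(graph) - seen


# Hypothetical graph: the service needs 'web-framework', but test tooling lingers.
EXAMPLE = {
    "web-framework": {"http-core", "validation"},
    "http-core": {"async-io"},
    "validation": set(),
    "async-io": set(),
    "test-runner": {"test-plugins"},
    "test-plugins": set(),
}
print(sorted(unreachable(EXAMPLE, ["web-framework"])))
```

Against a real environment, the call would be `unreachable(installed_graph(), roots)`, where `roots` lists the packages the application imports directly.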

Small Optimizations, Compounding Effects

It’s tempting to look for a single big fix. In practice, reducing image size usually comes from many small changes, like cleaning package caches, reordering layers for better reuse, stripping symbols from binaries, or using runtime flags to skip extra components.
Each change might seem minor by itself, but together they make a big impact. Saving a few dozen megabytes at a time adds up, and soon the container feels lighter, both in size and in purpose.
Many developers say that once a team adopts this approach, it becomes the standard. Smaller images become the expectation, not just a one-off achievement.
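Some of these small changes can be checked mechanically. The sketch below scans a Dockerfile for two common cache leftovers: `apt-get install` steps that do not remove `/var/lib/apt/lists` in the same layer, and `pip install` steps that omit `--no-cache-dir`. The Dockerfile fragment is hypothetical, and the checks are simple string matches, not a full linter.

```python
import re

# Hypothetical Dockerfile fragment; package names are illustrative.
DOCKERFILE = """\
FROM python:3.12-slim
COPY requirements.txt .
RUN apt-get update && apt-get install -y libgomp1
RUN pip install -r requirements.txt
RUN apt-get update && apt-get install -y --no-install-recommends curl \\
    && rm -rf /var/lib/apt/lists/*
"""


def run_instructions(dockerfile: str):
    """Yield each RUN instruction as one string, joining backslash continuations."""
    joined = re.sub(r"\\\n\s*", " ", dockerfile)
    for line in joined.splitlines():
        if line.strip().upper().startswith("RUN "):
            yield line.strip()


def cache_warnings(dockerfile: str):
    """Flag RUN steps that are likely to leave package caches in their layer."""
    warnings = []
    for run in run_instructions(dockerfile):
        if "apt-get install" in run and "/var/lib/apt/lists" not in run:
            warnings.append(f"apt lists not removed in the same layer: {run}")
        if re.search(r"\bpip3? install\b", run) and "--no-cache-dir" not in run:
            warnings.append(f"pip cache kept (no --no-cache-dir): {run}")
    return warnings


for warning in cache_warnings(DOCKERFILE):
    print(warning)
```

The cleanup has to happen in the same `RUN` instruction as the install. Files deleted in a later layer are hidden from the final filesystem but still count toward the image size.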

A Different Kind of Craft

Making containers smaller isn’t just about appearances or meeting requirements. It shows respect for build systems, runtime environments, and the engineers who will use these images later on.
Smaller images are usually easier to understand. They make assumptions clear, encourage curiosity, and help systems feel intentional rather than accidental.
So, cutting hundreds of megabytes from AI and Java containers isn’t just about better performance. It’s a design habit that values patience, curiosity, and the willingness to rethink old decisions that no longer fit.
Perhaps the real skill is knowing when it’s time to review what already seems good enough.
