When you're generating a Dockerfile for your app, you might get something that works but isn't optimized. This means it could be bloated with unnecessary packages (leading to potential vulnerabilities), not designed with dev/prod parity in mind, or not structured for faster builds.
We've put together a workflow for improving Dockerfiles with Gemini CLI. Gemini CLI is a great fit for this task: its large context window lets it take in a big chunk (if not all) of your codebase and use that context to write a working, optimized Dockerfile.
Why optimize your Dockerfile?
It's quite simple to write an MVP Dockerfile that works for one of your services. But writing it the fast way leaves a few considerations off the table, specifically speed, size, and security.
Since the images you build get pushed and pulled constantly for cloud-based dev and deployments, you'll want to keep them as small as you can. If you use a larger base image and add in a lot of extra tools, you'll hit your container registry quotas and CI build-minute limits much faster, without getting any benefit in return. Keeping your images lean also means a smaller attack surface (new vulnerabilities are found every single day, and fewer packages overall == less chance one of them affects you).
One of the best benefits of Docker is that it helps you make your software multi-environment friendly, so you can use the same (or similar) config from local dev to production. Maintaining a separate Dockerfile for every environment kind of defeats the purpose. Optimizing here means driving environment-specific differences through env vars and keeping the Dockerfile itself environment-agnostic.
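As a rough sketch (assuming a Node-style web service; the variable names and defaults are placeholders), a single Compose file can keep the image definition shared and inject per-environment values through variables:

```yaml
# docker-compose.yml: one service definition for every environment.
# Values after ":-" are local-dev defaults; CI/staging/prod override them
# through their own environment or an env file.
services:
  app:
    build: .
    image: my-app:${TAG:-dev}
    environment:
      - NODE_ENV=${NODE_ENV:-development}
      - DATABASE_URL=${DATABASE_URL:-postgres://db:5432/app_dev}
    ports:
      - "${PORT:-3000}:3000"
```

The same pattern applies inside the Dockerfile itself: bake in sensible defaults with ENV and let the runtime override them.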
Docker caches each image layer, so a build only re-runs instructions from the first changed line onward. For example, if lines 1-4 handle imports and installs and you swap around some instructions on lines 5 and 6, only layers 5 and 6 get rebuilt, while the earlier layers come straight from the cache. That means you have to be strategic about where instructions go: put the things that rarely change (base image, dependency installs) near the top and the things that change constantly (your source code) near the bottom. Multi-stage builds add to this by keeping build-time tooling out of the final runtime image. Unoptimized Dockerfiles often ignore both, so your builds will be slower and thus more expensive.
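Here's a minimal sketch of both ideas for a hypothetical Node app (the package manager, build command, and output paths are assumptions): dependency manifests are copied before the rest of the source so the install layer stays cached, and a multi-stage build keeps build tooling out of the final image.

```dockerfile
# --- Build stage: full toolchain, never shipped ---
FROM node:20-slim AS build
WORKDIR /app
# Dependency manifests change rarely, so this layer is reused
# on every build where package*.json is untouched.
COPY package*.json ./
RUN npm ci
# Source code changes constantly, so it comes last.
COPY . .
RUN npm run build

# --- Runtime stage: only what the app needs to run ---
FROM node:20-slim
WORKDIR /app
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
```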
As with any software you're writing with a coding agent, you'll want to do several passes over your Dockerfile to iteratively improve it.
The hidden costs of "lazy" Dockerfiles
Unoptimized Dockerfiles don't really have a huge impact locally, since you aren't paying for CPU, storage, or RAM expended. However, one of the major benefits of Dockerfiles is their versatility from local machine to cloud. And of course, when there's cloud, there are cloud costs.
Cloud resources are typically billed by usage. Slow-building, bloated images rack up your CI/CD minutes faster and eat into your container registry bandwidth and storage. Inefficient resource usage also adds up in pre-production and production environments.
A couple hundred extra MB might not make a huge difference outright, but when you think about how many times you're building/rebuilding/uploading per week, you're multiplying that spend quite a bit.
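As a purely illustrative back-of-the-envelope calculation (your numbers will differ):

```
300 MB of extra image size
× 20 builds/pushes per day
× 22 working days per month
≈ 132 GB/month of extra registry transfer and storage churn,
  on top of the extra minutes every one of those builds spends.
```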
Aside from literal costs, slow, unoptimized Dockerfiles also cost you in developer experience. Wait times add up, rebuilding after updates takes longer than it needs to, and bloated images are harder and more tedious to debug.
Refining a Dockerfile through multiple prompts
You'll get the best results if you optimize your Dockerfile in several different stages. This way, Gemini can be more thorough in every step, and you can review its suggestions in smaller, more palatable batches.
Step 1: Analyzing the Dockerfile
Before having it make any changes, you can kick off Gemini CLI's task with some context on what you want to accomplish.
Here, you're basically priming it on how to understand your codebase. With its massive context window, it should have plenty of room to do so.
During this stage, you should also be able to verify whether Gemini is on track. If it isn't, you can course-correct it and give it additional (paraphrased) context, or tell it to study certain files.
"Please analyze my codebase and current Dockerfile and Docker Compose. What type of application is this, what are the main dependencies, and what optimization opportunities do you see? Here are my key files: [files]"
Review Gemini's output to make sure it understands both the task and info on the codebase it'll need to work with.
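To give a sense of what this pass tends to surface, here's a hypothetical "before" Dockerfile with the kinds of issues Gemini will usually flag:

```dockerfile
# Typical unoptimized starting point (hypothetical)
# Large, general-purpose base image
FROM node:20
WORKDIR /app
# Copies everything first, so any file change busts the install cache
COPY . .
# Pulls in dev dependencies and skips the lockfile guarantees of `npm ci`
RUN npm install
# Debug tooling baked into the shipped image
RUN apt-get update && apt-get install -y vim curl
EXPOSE 3000
CMD ["npm", "start"]
```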
Step 2: The strategy
Now that Gemini has some background in your config + codebase, you can start using it to map out a strategy. For example, when you say "optimize", what do you mean? Speed, security, image size, all of the above? Do you want to use a minimal image and strip down to only necessary dependencies?
You'll also want to spell out the specs of your remote environment(s), especially since these might not be defined in your codebase. Are you deploying to plain EC2 instances, or to Kubernetes via EKS or GKE? Do you want to use the same Dockerfile across environments?
If you have any more nonnegotiables about image specs, you can list them here. This may include compliance and dev workflow needs.
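If you're not sure what counts as a nonnegotiable, they often translate into concrete Dockerfile constraints like these (illustrative only; replace the digest placeholder with a real value from your registry):

```dockerfile
# Pin the base image to a digest so builds are reproducible and auditable.
FROM node:20-slim@sha256:<digest>

# Run as a non-root user: a common hardening/compliance requirement.
# (The official Node images ship a "node" user for exactly this.)
USER node
```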
"Based on your analysis, I want to prioritize [your priorities e.g. speed/size/security]. Follow the patterns [good practice] and [other good practice]. My deployment environment is [context]. I will need it to be compatible with [devtool] and have [xyz compliance]. Can you recommend the best optimization strategy and explain the trade-offs of different approaches?"
By asking Gemini for alternatives, you force it to reason through more than just the most common solution and to actually weigh the pros and cons of each. It may even land on a better approach than its first instinct, and either way it arrives at a more fleshed-out conclusion. (Plus, you can and should have a say in which approach it takes.)
Step 3: Implementation
At this stage, you can ask Gemini CLI to revise your Dockerfile so that it's production-ready. Asking it to explain every decision it makes helps keep its changes deliberate rather than arbitrary.
You can also ask it to generate a Markdown guide with start and test commands. Optionally, you might also want to wrap those commands in a Makefile.
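A Makefile wrapper can be as small as this sketch (the image name and test command are placeholders; make requires the recipe lines to be indented with tabs):

```makefile
IMAGE ?= my-app:local

build:
	docker build -t $(IMAGE) .

test: build
	# Replace with your real test command
	docker run --rm $(IMAGE) npm test

run: build
	docker run --rm -p 3000:3000 $(IMAGE)
```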
"Please revise and optimize this Dockerfile and Docker Compose following the strategy we discussed. Include detailed comments explaining each optimization, a .dockerignore file, and a markdown guide with specs + the commands I should use to build and test it."
Step 4: Refining the Dockerfile
Now that you have an "improved" Dockerfile and Docker Compose, you'll want to make sure they actually work. Ask Gemini CLI to build and run the image and analyze the logs. You should also take a look at the performance yourself, to double-check what Gemini says.
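A few standard Docker commands are enough for that double-check (the image name here is assumed):

```bash
# Build and time it; run it twice to see how much the layer cache saves
time docker build -t my-app:optimized .

# Compare image sizes before and after the optimization
docker images my-app

# See which layers contribute the most size
docker history my-app:optimized

# Run it and watch the logs
docker run --rm -p 3000:3000 my-app:optimized
```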
Let it know your impressions, and whether you want it to improve any aspects from here.
You may tweak your Docker config via env vars: ask Gemini what their values should be and test from there. Once it's working locally, repeat in your PR, test, and staging environments.
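For example, the same image can be exercised with different settings per environment (the variable names and env file are placeholders):

```bash
# Local dev: override values inline
docker run --rm -e NODE_ENV=development -e LOG_LEVEL=debug -p 3000:3000 my-app:optimized

# Staging-like run: pull values from an env file instead
docker run --rm --env-file .env.staging -p 3000:3000 my-app:optimized
```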
What to avoid
As you've probably seen with LLMs, they can very much be "garbage in, garbage out" systems. You don't want to take shortcuts, and generic input will lead to generic output (not good for something as unique as your codebase). Here are a few particularly egregious things to avoid:
- Don't skip the planning/discovery phase; it's what gives Gemini the context it needs to give you customized results.
- Don't be vague. Saying "make it faster" won't get you solid results, because it casts too wide a net for an LLM to narrow down to the right answer.
- Don't skip testing, and don't trust an LLM to be correct when it attests that everything is working.
- Remember that the optimal solution for one codebase won't necessarily apply to yours. Take time to understand what you're working with. Above all, you should be the expert, not the LLM.
Better Dockerfiles for better deployments
Gemini CLI's huge context window means you can give it enough of your codebase to optimize your Dockerfile properly. We saw the best results with an iterative, multi-prompt approach, since it helped Gemini break the work into concrete steps and let us steer it in the right direction along the way.
Overall, better Dockerfiles lead to improvements all throughout the SDLC. You'll have better performance when running them locally, and cost savings in the cloud. They'll build faster and take advantage of Docker's layers/caching. You can apply these practices faster with Gemini's help.