A development environment is one of those things you don't notice until it goes wrong, and then you can't think about anything else. A new hire spends two days getting the project to run. A test passes locally and fails in CI. The Postgres version on your laptop is 13 while staging and production run 15, and a query that's fast for you takes forever in staging. Someone updates a dependency and half the team can't start the app until they delete and reinstall their entire toolchain. Every one of these is a small disaster that shouldn't have happened.
My stack for avoiding this is unfashionable and boring: Docker for the environment, Make for the workflow on top of it. It isn't the only way to do this and it isn't always the best way. But it's the combination I keep coming back to, and the reasons are worth writing down.
What Docker gives you
The pitch for Docker, after all the hype died down, is simple: the development environment is pinned, versioned, and identical to (or very close to) the production one. The image specifies the OS, the language runtime, the system libraries, the binaries. Two developers running the same image are running the same environment. The CI server running the same image is running the same environment. The container you deploy to production is running the same environment. "Works on my machine" stops being a meaningful thing to say, because the machine isn't yours — it's the image.
The same property extends to dependent services. With docker compose, adding a Postgres or a Redis or a RabbitMQ to your local stack is a few lines of YAML, and the version is pinned the same way. Nobody on the team has to install Postgres on their laptop. Nobody has to remember which Postgres version the project needs. The configuration is in the repo, and it stays in sync with the application code that depends on it.
Once you have this, you stop being surprised by environment differences. That alone is worth a lot.
Why Docker alone isn't enough
The catch is that Docker commands are verbose, unmemorable, and easy to invoke wrong. Running tests inside the container looks something like:
docker compose run --rm app bundle exec rspec spec/
That's four pieces of information (the orchestrator, the subcommand, the service name, the actual command), plus a cleanup flag, and you have to remember all of them in the right order. The first time you type it, you'll fight with it. The tenth time, you'll have it in muscle memory. The hundredth time, you'll mistype it under pressure and wonder why the test runner can't find your file.
It gets worse. Most developers will have bundle or npm or composer installed locally too, because their editor wants it for IntelliSense or because of some other tool that needs it. The temptation to run bundle exec rspec outside the container (because it's shorter and faster to type) is constant. And the moment some people on the team are running tests inside the container and others are running them outside, you're back to "works on my machine," just with extra steps. Discipline isn't a feature of the stack at that point; it's something each developer has to provide individually, which means at least one of them will provide less of it on a tired afternoon.
Make as the single front door
Make is what I use to fix this. It is older than most of the things on my computer, ubiquitous on Unix systems, and almost embarrassingly simple. A Makefile is a list of named targets, each of which expands to a command (or several). You type make test and the Makefile knows what that means.
A small Makefile might look like:
.PHONY: up down test lint shell migrate

up:
	docker compose up -d

down:
	docker compose down

test:
	docker compose run --rm app bundle exec rspec

lint:
	docker compose run --rm app bundle exec rubocop

shell:
	docker compose run --rm app bash

migrate:
	docker compose run --rm app bundle exec rails db:migrate
That's the whole game. The verbose Docker commands are now invoked by their semantic names. make test is shorter, more memorable, and harder to get wrong than the underlying command. It also doesn't matter whether you have Ruby or Node installed on your laptop, because make test always goes through the container, every time, for everyone. The discipline isn't something each developer carries; it's built into the workflow.
There's a secondary benefit I value a lot: the Makefile is documentation. A new developer reading the Makefile learns what the project's workflow is — these are the operations the team performs, with these names. The README can say "run make test" and the implementation details of how that actually happens are one click away, but invisible to anyone who just wants to do the thing.
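One way to lean into the Makefile-as-documentation idea is a self-describing help target. The sketch below is an assumption, not part of the Makefile above: the help target and the `##` comment convention are a common community pattern rather than a built-in Make feature, and the descriptions are made up for illustration. Recipe lines must start with a tab.

```makefile
.PHONY: help test lint

# Hypothetical convention: text after "## " on a target line is its description.
# grep keeps only the annotated target lines; awk splits each one into the
# target name and the description and prints them in two columns.
# $$1 and $$2 are written with a doubled $ so Make passes $1 and $2 to awk.
help: ## List the documented targets
	@grep -hE '^[a-zA-Z_-]+:.*## ' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*## "} {printf "%-10s %s\n", $$1, $$2}'

test: ## Run the test suite inside the container
	docker compose run --rm app bundle exec rspec

lint: ## Run the linter inside the container
	docker compose run --rm app bundle exec rubocop
```

With something like this in place, make help prints each annotated target next to its description, so a newcomer can see the team's vocabulary without opening the file.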
The agent angle
This is where the stack pays an extra dividend. When a coding agent works on your project, it has to figure out how to do basic operations — run the tests, format the code, start the database, apply migrations. Without a clear convention, it'll improvise, and it'll improvise differently every time. Sometimes it'll find your Docker commands. Sometimes it'll try to run things on the host and fail because the language runtime isn't installed there. Sometimes it'll invent a new approach that almost works.
With a well-documented Makefile and an instruction in the agent's harness that says "always use make targets; never invoke Docker or language tooling directly," you've given the agent a stable, narrow interface to the project. It doesn't have to figure out the environment. It just has to read the Makefile.
This is the same property that helps human developers, applied to a contributor that doesn't get tired or bored of typing make test for the thousandth time. The agent gets the same single front door. Your CI uses the same targets. Your local development uses the same targets. The whole team, humans and agents both, converges on one way to do each thing, and that one way runs inside the same environment as production.
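A minimal sketch of how CI might share the same front door, assuming a hypothetical aggregate target named ci (the name and the choice of steps are illustrative, not something the Makefile above defines). The CI job would then run make ci and nothing else.

```makefile
.PHONY: ci lint test

# Hypothetical aggregate target: lint and test are its prerequisites.
# In a serial run, Make builds them left to right; with `make -j` that
# ordering is no longer guaranteed.
ci: lint test

lint:
	docker compose run --rm app bundle exec rubocop

test:
	docker compose run --rm app bundle exec rspec
```

The design choice here is that the CI configuration stays a one-liner, and any change to how linting or testing works happens in the Makefile, where humans and agents pick it up too.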
The honest caveats
Make isn't fashionable. The syntax is finicky (tabs, not spaces, and the rules around variable expansion will bite you eventually). For very complex workflows you'll outgrow it, and you'll either learn to live with its quirks or move to another task runner such as Grunt, Rake, or just (the last of which is essentially Make with the rough edges sanded off). I still prefer Make over many other task runners because it doesn't depend on a language runtime: Grunt needs Node and Rake needs Ruby, while Make is already present on most Unix systems.
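To make the tab and variable-expansion quirks concrete, here is a small sketch. The COMPOSE and RUN variables and the env-check target are hypothetical names introduced for illustration, and the example assumes the app image ships a POSIX sh.

```makefile
# Simple assignment (:=): the right-hand side is expanded once, right here.
COMPOSE := docker compose

# Recursive assignment (=): the right-hand side is re-expanded each time
# RUN is used, so it would pick up a later change to COMPOSE.
RUN = $(COMPOSE) run --rm app

.PHONY: env-check

# The recipe line below must begin with a tab, not spaces; otherwise Make
# stops with a "missing separator" error.
# Make treats a single $ as the start of one of its own variables, so a $
# meant for the shell has to be written as $$. Here $$HOME reaches the
# container's shell as $HOME.
env-check:
	$(RUN) sh -c 'echo "home inside the container: $$HOME"'
```

Writing $HOME with a single dollar sign would have Make expand its own variable H and append the literal text OME, which is the kind of surprise the caveat above is warning about.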
Docker has its own costs. The image build cycle adds friction. File watching across the container boundary can be slow. On macOS, the virtualization layer has historically been a source of mysterious performance issues, though it's gotten much better. None of this is free.
But neither is the alternative. The alternative is the new hire on day one, the version skew on Postgres, the test that passes for one developer and fails for another, the agent that can't figure out how to run anything. Docker plus Make is a way of paying these costs once, up front, in the form of a Dockerfile and a Makefile, instead of paying them again and again as small ongoing surprises. That trade has consistently worked out in my favor.