Jeremy Moore

Docker-first Python development

I've always been a bit annoyed at how difficult it can be to avoid shipping test code and dependencies with Python applications. A typical build process might look something like:

  1. create a virtual environment
  2. install service dependencies
  3. install test dependencies
  4. run tests
  5. package up code and dependencies into an RPM.

At this point, my service dependencies and test dependencies are intermingled in the virtual environment. To detangle them, I now have to do something like destroy the venv and create a new one, reinstalling the service dependencies.
Regardless of the packaging method, I don't want to pull down dependencies when I deploy my service.

At Twilio, we are in the process of embracing container-based deployments. Docker containers are great for Python services, as you no longer have to worry about multiple Python versions or virtual environments. You just use an image with exactly the version of Python your service needs and install your dependencies directly into the system.

One thing I've noticed is that while many services are built and packaged as Docker images, few use exclusively Docker-based development environments. Virtual environments and pyenv .python-version files abound!

I recently started writing a new Python service with the knowledge that this would be exclusively deployed via containers. This felt like the right opportunity to go all in on containers and build out a strategy for Docker-first localdev. I set out with the following goals:

  1. don't ship tests and test dependencies with the final image
  2. tests run as part of the Docker build
  3. failing tests will fail the build
  4. IDE (PyCharm) integration

A bit of research (aka Googling) suggested that multi-stage builds might be useful in this endeavor. Eventually I ended up with a Dockerfile that looks something like this:

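# Stage 1: service dependencies and code
# (python:3 is the official image; pin the exact version your service needs)
FROM python:3 as builder
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY src ./src

# Stage 2: test dependencies and code; a failing test run fails the build
FROM builder as tests
COPY test_requirements.txt ./
RUN pip install -r test_requirements.txt
COPY tests ./tests
RUN pytest tests

# Stage 3: the image that ships, built on builder so it contains no test code
FROM builder as service
COPY docker-entrypoint.sh ./
# The script is copied into the working directory, which is not on PATH
ENTRYPOINT ["./docker-entrypoint.sh"]
EXPOSE 3000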

When building an image from this Dockerfile, Docker will build three images, one for each of the FROM statements in the Dockerfile. If you've worked with Dockerfiles before, you know that statement ordering is critical for making efficient use of layer caching, and multi-stage builds are no different. Docker builds each of the images in the order they are defined. All of the intermediate stages are ephemeral; only the last image is output by the build process. (One caveat: this is how the classic builder behaves. BuildKit only builds the stages that the target stage depends on, so with BuildKit the tests stage has to be requested explicitly with --target tests.)

In this case, the first stage (builder) builds an image with all the service dependencies and code. The second stage (tests) installs the test requirements and test code, and runs the tests. If the tests pass, the build process will continue on to the next stage. If the tests fail, the entire build will fail. This ensures that only images with passing tests are built! Finally, the last stage (service) builds on top of our builder image, adding the entrypoint script, defining the entrypoint command and exposing port 3000.
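
Here is a rough sketch of how a build script might drive this, assuming a POSIX shell. The function name build_service and the image tags are hypothetical, not part of the original setup. It builds the tests stage explicitly before the service stage, so it stops on failing tests whether or not the builder skips stages that service doesn't depend on.

```shell
#!/bin/sh
# Sketch only: build_service and the "-tests" tag suffix are made-up names.

# Build the tests stage, then the service stage.
# $1 is the tag to give the final image.
build_service() {
    image="$1"

    # docker build exits non-zero when a RUN step fails (here: pytest tests).
    if ! docker build --target tests -t "${image}-tests" .; then
        echo "tests failed; not building ${image}" >&2
        return 1
    fi

    # Only reached when the tests stage built successfully.
    docker build --target service -t "${image}" .
}
```

Calling build_service myservice from the directory containing the Dockerfile would then produce a myservice image only when the tests pass.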

So how did I do wrt the initial goals?

  1. don't ship tests and test dependencies with the final image ✓
  2. tests run as part of the Docker build ✓
  3. failing tests will fail the build ✓
  4. IDE (PyCharm) integration ❌

I've met most of the goals, but what about the actual development experience? If I open up PyCharm and import my source code, it complains that I have unsatisfied dependencies :( Fortunately, PyCharm Professional has the ability to select a Python interpreter from inside a Docker image! Cool, but I have to build the image before I can use its interpreter. And thanks to goal #3, if my tests are failing, I can't build my image...

Lucky for us, we can tell docker build to build one of our intermediate stages explicitly, stopping the build after the desired stage. Now if I run docker build --target builder -t builder ., I can select the interpreter from the builder image.

Uh oh! The builder image doesn't include my test dependencies! Of course, that's the whole point of the builder image. Let's add another stage we can use for running and debugging our tests.

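# Service dependencies and code
FROM python:3 as builder
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY src ./src

# Test dependencies and code; a container run from this stage runs pytest
FROM builder as localdev
COPY test_requirements.txt ./
RUN pip install -r test_requirements.txt
COPY tests ./tests
ENTRYPOINT ["pytest"]
CMD ["tests"]

# Runs the tests at build time, so failing tests fail the build
FROM localdev as tests
RUN pytest tests

FROM builder as service
COPY docker-entrypoint.sh ./
# The script is copied into the working directory, which is not on PATH
ENTRYPOINT ["./docker-entrypoint.sh"]
EXPOSE 3000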

With the localdev stage, I can build an image with all my service and test code and dependencies. I can even make the localdev container run the tests by default when the container is run. By using the interpreter from this image, I can now debug my failing tests.
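
For running the tests outside the IDE, something like the following could work. It's only a sketch, assuming the Dockerfile above and a POSIX shell; the helper names localdev_build and localdev_test and the localdev image tag are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: helper names and the "localdev" tag are hypothetical.

# Build just the localdev stage. The build stops there, so the
# "RUN pytest tests" step in the tests stage never runs and the
# image builds even while tests are failing.
localdev_build() {
    docker build --target localdev -t localdev .
}

# Run pytest in a throwaway container. With no arguments the image's
# CMD applies and this runs "pytest tests"; any arguments replace CMD
# and are passed to pytest instead.
localdev_test() {
    docker run --rm localdev "$@"
}
```

For example, localdev_test tests/test_api.py -x would run a single (hypothetical) test file and stop at the first failure. The container sees the code as of the last localdev_build, so a rebuild is needed after editing.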

Let's take a look again at the initial goals:

  1. don't ship tests and test dependencies with the final image ✓
  2. tests run as part of the Docker build ✓
  3. failing tests will fail the build ✓
  4. IDE (PyCharm) integration ✓

Hooray!

Except there's one thing still bothering me: changes to the service code trigger a reinstallation of our test dependencies. Yuck! Let's take another whack at our Dockerfile:

# Service dependencies only
FROM python:3 as service_deps
COPY requirements.txt ./
RUN pip install -r requirements.txt

# Test dependencies on top of the service dependencies; no source code yet,
# so code changes don't invalidate these layers
FROM service_deps as test_deps
COPY test_requirements.txt ./
RUN pip install -r test_requirements.txt

FROM service_deps as builder
COPY src ./src

FROM test_deps as tests_builder
COPY src ./src
COPY tests ./tests

FROM tests_builder as localdev
ENTRYPOINT ["pytest"]
CMD ["tests"]

FROM tests_builder as tests
RUN pytest tests

FROM builder as service
COPY docker-entrypoint.sh ./
# The script is copied into the working directory, which is not on PATH
ENTRYPOINT ["./docker-entrypoint.sh"]
EXPOSE 3000

Ok, that seems pretty complicated. Here's a graph of our image topology:


 service_deps                    
    |   \                        
    |    -\                      
    |      \                     
    |       -test_deps           
    |            |               
    |            |               
 builder     tests_builder       
    |            |  -\           
    |            |    -\         
    |        localdev   -\       
    |                     --tests
 service                         

I don't love that the builder and tests_builder stages both copy over the source directory, but the real question is, does this still meet our initial goals while avoiding excessive re-installs of test dependencies? Yeah, it seems to work pretty well. Thanks to Docker's layer caching, we rarely have to re-install dependencies.

That's it! If you have any questions or suggestions, please let me know!

Discussion (3)

David Coy

This is pretty awesome. I wonder if you could copy from the builder stage to the tests_builder stage by modifying tests_builder to copy from the builder stage like so:

FROM test_deps as tests_builder

COPY --from=builder ./src ./src

COPY tests ./tests

Disclaimer: I'm not 100% sure this would fit your use case, but it's worth a shot. If you try this out, let us know how it worked for you.

Jeremy Moore Author

You can definitely do this, but Docker still has to create a new layer for it. Copying it from the builder stage could definitely help if building the src dir is more complicated than copying over a single directory.

David Coy

That's true, I didn't think of the extra layer Docker would add in the process.