Vladimir Dementyev for Evil Martians

Posted on Jul 24, 2019 • Originally published at evilmartians.com

Ruby on Whales: Dockerizing Ruby and Rails development

#ruby #rails #docker

Originally posted in Martian Chronicles.

This post is a b-side of my recent RailsConf talk "Terraforming legacy Rails applications" (video, slides).

In this post, I am not going to convince you to switch to Docker for application development (though you can check the RailsConf video for some arguments). My goal is to share the configuration I currently use for Rails projects, and which was born in ~~production~~ development at Evil Martians. Feel free to use it!

I started using Docker in my development environment about three years ago (instead of Vagrant, which was too heavy for my 4GB RAM laptop). It wasn't all roses from the start, of course: I spent two years trying to find a configuration that was good enough, suitable not only for myself but also for my team.

Let me present this config here and explain (almost) every line of it, because we've all had enough of cryptic tutorials that just assume you know stuff.

The source code can be found in the evilmartians/terraforming-rails repository on GitHub.

We use the following stack in this example:

Ruby 2.6.3
PostgreSQL 11
NodeJS 11 & Yarn (for Webpacker-backed assets compilation)

`Dockerfile`

The Dockerfile defines the environment for our Ruby application: this is where we run servers, the console (rails c), tests, and Rake tasks, and where we interact with our code in any way as developers:

ARG RUBY_VERSION
# See explanation below
FROM ruby:$RUBY_VERSION

ARG PG_MAJOR
ARG NODE_MAJOR
ARG BUNDLER_VERSION
ARG YARN_VERSION

# Add PostgreSQL to sources list
RUN curl -sSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
  && echo 'deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main' $PG_MAJOR > /etc/apt/sources.list.d/pgdg.list

# Add NodeJS to sources list
RUN curl -sL https://deb.nodesource.com/setup_$NODE_MAJOR.x | bash -

# Add Yarn to the sources list
RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
  && echo 'deb http://dl.yarnpkg.com/debian/ stable main' > /etc/apt/sources.list.d/yarn.list

# Install dependencies
# We use an external Aptfile for that, stay tuned
COPY .dockerdev/Aptfile /tmp/Aptfile
RUN apt-get update -qq && DEBIAN_FRONTEND=noninteractive apt-get -yq dist-upgrade && \
  DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends \
    build-essential \
    postgresql-client-$PG_MAJOR \
    nodejs \
    yarn=$YARN_VERSION-1 \
    $(cat /tmp/Aptfile | xargs) && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
    truncate -s 0 /var/log/*log

# Configure bundler and PATH
ENV LANG=C.UTF-8 \
  GEM_HOME=/bundle \
  BUNDLE_JOBS=4 \
  BUNDLE_RETRY=3
ENV BUNDLE_PATH $GEM_HOME
ENV BUNDLE_APP_CONFIG=$BUNDLE_PATH \
  BUNDLE_BIN=$BUNDLE_PATH/bin
ENV PATH /app/bin:$BUNDLE_BIN:$PATH

# Upgrade RubyGems and install required Bundler version
RUN gem update --system && \
    gem install bundler:$BUNDLER_VERSION

# Create a directory for the app code
RUN mkdir -p /app

WORKDIR /app

This configuration contains only the essentials and can be used as a starting point. Let me show what we are doing here.

The first two lines could look a bit strange:

ARG RUBY_VERSION
FROM ruby:$RUBY_VERSION

Why not just FROM ruby:2.6.3, or whatever the stable Ruby version du jour is? We want to make our environment configurable from the outside, using the Dockerfile as a sort of template:

the exact versions of runtime dependencies are specified in the docker-compose.yml (see below);
the list of apt-installable dependencies is stored in a separate file (also see below).

The following four lines define arguments for the PostgreSQL, NodeJS, Bundler, and Yarn versions:

ARG PG_MAJOR
ARG NODE_MAJOR
ARG BUNDLER_VERSION
ARG YARN_VERSION

Since we do not expect anyone to use this Dockerfile without Docker Compose, we do not provide default values.
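
If you ever do need a one-off build without Compose, every argument has to be passed by hand. Here is a minimal sketch of what that could look like; the build_command helper is hypothetical, it only prints the command rather than running it, and the version values are the ones used elsewhere in this post:

```shell
# Hypothetical helper: prints (does not run) a docker build command
# that supplies every ARG declared in the Dockerfile.
build_command() {
  printf 'docker build'
  for pair in "RUBY_VERSION=2.6.3" "PG_MAJOR=11" "NODE_MAJOR=11" \
              "YARN_VERSION=1.13.0" "BUNDLER_VERSION=2.0.2"; do
    printf ' --build-arg %s' "$pair"
  done
  # -f points at the Dockerfile, -t tags the image, "." is the build context
  printf ' -f .dockerdev/Dockerfile -t example-dev:1.0.0 .\n'
}

build_command
```

Forgetting one of the arguments leaves the corresponding variable empty inside the build, which is why we let Docker Compose fill them in.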

Installing PostgreSQL, NodeJS, and Yarn via apt requires adding their deb package repos to the sources list.

For PostgreSQL (based on the official documentation):

RUN curl -sSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
  && echo 'deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main' $PG_MAJOR > /etc/apt/sources.list.d/pgdg.list

For NodeJS (from NodeSource repo):

RUN curl -sL https://deb.nodesource.com/setup_$NODE_MAJOR.x | bash -

For Yarn (from the official website):

RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
  && echo 'deb http://dl.yarnpkg.com/debian/ stable main' > /etc/apt/sources.list.d/yarn.list

Now it's time to install the dependencies, i.e. run apt-get install:

COPY .dockerdev/Aptfile /tmp/Aptfile
RUN apt-get update -qq && DEBIAN_FRONTEND=noninteractive apt-get -yq dist-upgrade && \
  DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends \
    build-essential \
    postgresql-client-$PG_MAJOR \
    nodejs \
    yarn=$YARN_VERSION-1 \
    $(cat /tmp/Aptfile | xargs) && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
    truncate -s 0 /var/log/*log

First, let's talk about the Aptfile trick:

COPY .dockerdev/Aptfile /tmp/Aptfile
RUN apt-get install \
    $(cat /tmp/Aptfile | xargs)

I borrowed this idea from heroku-buildpack-apt, which allows installing additional packages on Heroku. If you're using this buildpack, you can even re-use the same Aptfile for local and production environments (though the buildpack's version provides more functionality).

Our default Aptfile contains only a single package (we use Vim to edit Rails Credentials):

vim

In one of the previous projects I worked on, we generated PDFs using LaTeX and TeX Live. Our Aptfile could have looked like this (back then I didn't use this trick):

vim
texlive
texlive-latex-recommended
texlive-fonts-recommended
texlive-lang-cyrillic

This way, we keep the task-specific dependencies in a separate file, making our Dockerfile more universal.
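
To see what the $(cat /tmp/Aptfile | xargs) part actually hands over to apt-get, here is a small sketch you can run in any shell. It uses a throwaway temp file instead of the real Aptfile and only echoes the resulting command:

```shell
# Create a throwaway Aptfile (one package per line), similar to the one above.
aptfile="$(mktemp)"
printf '%s\n' vim texlive texlive-latex-recommended > "$aptfile"

# xargs without a command runs echo, which joins the lines into a single
# space-separated line; the shell then splits it into separate arguments.
packages="$(cat "$aptfile" | xargs)"
echo "apt-get install -yq --no-install-recommends $packages"

rm -f "$aptfile"
```

The echoed line ends with vim texlive texlive-latex-recommended, i.e., every line of the file becomes one more package argument.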

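As for DEBIAN_FRONTEND=noninteractive, it tells apt-get not to ask interactive configuration questions during the installation; I kindly ask you to take a look at the related answer on Ask Ubuntu for details.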

The --no-install-recommends switch helps us to save some space (and make our image slimmer) by not installing recommended packages. See more here.

The last part of this RUN (apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && truncate -s 0 /var/log/*log) serves the same purpose: it clears out the local repository of retrieved package files (we've installed everything, so we don't need them anymore) along with all the temporary files and logs created during the installation. We need this cleanup to be in the same RUN statement to make sure this particular Docker layer doesn't contain any garbage.

The final part is mostly devoted to Bundler:

ENV LANG=C.UTF-8 \
  GEM_HOME=/bundle \
  BUNDLE_JOBS=4 \
  BUNDLE_RETRY=3
ENV BUNDLE_PATH $GEM_HOME
ENV BUNDLE_APP_CONFIG=$BUNDLE_PATH \
  BUNDLE_BIN=$BUNDLE_PATH/bin
ENV PATH /app/bin:$BUNDLE_BIN:$PATH

# Upgrade RubyGems and install required Bundler version
RUN gem update --system && \
    gem install bundler:$BUNDLER_VERSION

The LANG=C.UTF-8 sets the default locale to UTF-8. Otherwise Ruby uses US-ASCII for strings and bye-bye those sweet sweet emojis 👋

We set the path for gem installations via GEM_HOME=/bundle. What is /bundle? That's the path we're going to mount as a volume later to persist the dependencies on the host system, i.e., your development machine (see below in docker-compose.yml).

The BUNDLE_PATH and BUNDLE_BIN variables tell Bundler where to look for gems and Ruby executables.

Finally, we expose Ruby and application binaries globally:

ENV PATH /app/bin:$BUNDLE_BIN:$PATH

That allows us to run rails, rake, rspec and other binstubbed commands without prefixing them with bundle exec.
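
The order of the PATH entries matters: the first directory that contains a matching executable wins. Here is a minimal sketch of that lookup, assuming two fake rails scripts in temp directories that stand in for /app/bin and $BUNDLE_BIN (nothing here touches a real Rails app):

```shell
# Hypothetical stand-ins for /app/bin and $BUNDLE_BIN, created in a temp dir.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/app/bin" "$sandbox/bundle/bin"

# A fake executable named "rails" in each directory.
printf '#!/bin/sh\necho "app binstub"\n' > "$sandbox/app/bin/rails"
printf '#!/bin/sh\necho "bundle bin"\n' > "$sandbox/bundle/bin/rails"
chmod +x "$sandbox/app/bin/rails" "$sandbox/bundle/bin/rails"

# Same ordering as in the Dockerfile: app binstubs first, then Bundler's bin.
PATH="$sandbox/app/bin:$sandbox/bundle/bin:$PATH"

resolved="$(command -v rails)"   # which file the shell picks
output="$(rails)"                # what that file prints
echo "$resolved -> $output"

rm -rf "$sandbox"
```

The application's own binstub is picked up first, so project-specific wrappers in /app/bin take precedence over the executables installed by gems.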

`docker-compose.yml`

Docker Compose is a tool to orchestrate our containerized environment. It allows us to link containers to each other, define persistent volumes and services.

Below is the compose file for developing a typical Rails application with PostgreSQL as a database and Sidekiq as a background job processor:

version: '3.4'

services:
  app: &app
    build:
      context: .
      dockerfile: ./.dockerdev/Dockerfile
      args:
        RUBY_VERSION: '2.6.3'
        PG_MAJOR: '11'
        NODE_MAJOR: '11'
        YARN_VERSION: '1.13.0'
        BUNDLER_VERSION: '2.0.2'
    image: example-dev:1.0.0
    tmpfs:
      - /tmp

  backend: &backend
    <<: *app
    stdin_open: true
    tty: true
    volumes:
      - .:/app:cached
      - rails_cache:/app/tmp/cache
      - bundle:/bundle
      - node_modules:/app/node_modules
      - packs:/app/public/packs
      - .dockerdev/.psqlrc:/root/.psqlrc:ro
    environment:
      - NODE_ENV=${NODE_ENV:-development}
      - RAILS_ENV=${RAILS_ENV:-development}
      - REDIS_URL=redis://redis:6379/
      - DATABASE_URL=postgres://postgres:postgres@postgres:5432
      - WEBPACKER_DEV_SERVER_HOST=webpacker
      - BOOTSNAP_CACHE_DIR=/bundle/bootsnap
      - HISTFILE=/app/log/.bash_history
      - PSQL_HISTFILE=/app/log/.psql_history
      - EDITOR=vi
      - MALLOC_ARENA_MAX=2
      - WEB_CONCURRENCY=${WEB_CONCURRENCY:-1}
    depends_on:
      - postgres
      - redis

  runner:
    <<: *backend
    command: /bin/bash
    ports:
      - '3000:3000'
      - '3002:3002'

  rails:
    <<: *backend
    command: bundle exec rails server -b 0.0.0.0
    ports:
      - '3000:3000'

  sidekiq:
    <<: *backend
    command: bundle exec sidekiq -C config/sidekiq.yml

  postgres:
    image: postgres:11.1
    volumes:
      - .dockerdev/.psqlrc:/root/.psqlrc:ro
      - postgres:/var/lib/postgresql/data
      - ./log:/root/log:cached
    environment:
      - PSQL_HISTFILE=/root/log/.psql_history
    ports:
      - 5432

  redis:
    image: redis:3.2-alpine
    volumes:
      - redis:/data
    ports:
      - 6379

  webpacker:
    <<: *app
    command: ./bin/webpack-dev-server
    ports:
      - '3035:3035'
    volumes:
      - .:/app:cached
      - bundle:/bundle
      - node_modules:/app/node_modules
      - packs:/app/public/packs
    environment:
      - NODE_ENV=${NODE_ENV:-development}
      - RAILS_ENV=${RAILS_ENV:-development}
      - WEBPACKER_DEV_SERVER_HOST=0.0.0.0

volumes:
  postgres:
  redis:
  bundle:
  node_modules:
  rails_cache:
  packs:

We define eight services. Why so many? Some of them only define shared configuration for others (abstract services, e.g., app and backend); others are used to run specific commands using the application container (e.g., runner).

With this approach, we do not use the docker-compose up command to run our application, but always specify the exact service we want to run (e.g., docker-compose up rails). That makes sense: in development, you rarely need all of the services up and running (Webpacker, Sidekiq, etc.).

Let's take a thorough look at each service.

`app`

The main purpose of this service is to provide all the required information to build our application container (the one defined in the Dockerfile above):

build:
  context: .
  dockerfile: ./.dockerdev/Dockerfile
  args:
    RUBY_VERSION: '2.6.3'
    PG_MAJOR: '11'
    NODE_MAJOR: '11'
    YARN_VERSION: '1.13.0'
    BUNDLER_VERSION: '2.0.2'

The context directory defines the build context for Docker: it is something like a working directory for the build process and is used, for example, by the COPY command.

We explicitly specify the path to Dockerfile since we do not keep it in the project root, packing all Docker-related files inside a hidden .dockerdev directory.

And, as we mentioned earlier, we specify the exact version of dependencies using args declared in the Dockerfile.

One thing that we should pay attention to is the way we tag images:

image: example-dev:1.0.0

One of the benefits of using Docker for development is the ability to synchronize the configuration changes across the team automatically. You only need to upgrade the local image version every time you make changes to it (or to the arguments or files it relies on). The worst thing you can do is to use example-dev:latest as your build tag.

Keeping an image version also helps to work with two different environments without any additional hassle. For example, when you work on a long-running "chore/upgrade-to-ruby-3" branch, you can easily switch to master and use the older image with the older Ruby, no need to rebuild anything.

We also tell Docker to use tmpfs for /tmp folder within a container to speed things up:

tmpfs:
  - /tmp

`backend`

We've reached the most interesting part of this post.

This service defines the shared behavior of all Ruby services.

Let's talk about the volumes first:

volumes:
  - .:/app:cached
  - bundle:/bundle
  - rails_cache:/app/tmp/cache
  - node_modules:/app/node_modules
  - packs:/app/public/packs
  - .dockerdev/.psqlrc:/root/.psqlrc:ro

The first item in the volumes list mounts the current working directory (the project's root) to the /app folder within a container using the cached strategy. This cached modifier is the key to efficient Docker development on macOS. We're not going to dig deeper in this post (we're working on a separate one on this subject 😉), but you can take a look at the docs.

The next line tells our container to use a volume named bundle to store /bundle contents. This way we persist our gems data across runs: all the volumes defined in the docker-compose.yml stay put until we run docker-compose down --volumes.

The following three lines are also there to get rid of the "Docker is slow on Mac" curse. We put all the generated files into Docker volumes to avoid heavy disk operations on the host machine:

- rails_cache:/app/tmp/cache
- node_modules:/app/node_modules
- packs:/app/public/packs

To make Docker fast enough on macOS, follow these two rules: use :cached to mount source files, and use volumes for generated content (assets, bundle, etc.).

The last line adds a specific psql configuration to the container. We mostly need it to persist the commands history by storing it in the app's log/.psql_history file. Why psql in the Ruby container? It's used internally when you run rails dbconsole.

Our .psqlrc file contains the following trick, which allows specifying the path to the history file via the PSQL_HISTFILE env variable and falls back to the default $HOME/.psql_history otherwise:

\set HISTFILE `[ -z "$PSQL_HISTFILE" ] && echo $HOME/.psql_history || echo $PSQL_HISTFILE`
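
The part inside the backticks is an ordinary shell one-liner, so you can try the fallback logic outside of psql. Below is a minimal sketch; the psql_histfile function name is made up for this illustration, and I use the POSIX [ test so that it runs under any /bin/sh:

```shell
# Hypothetical helper mirroring the expression inside the backticks.
# ${PSQL_HISTFILE:-} expands to an empty string when the variable is unset.
psql_histfile() {
  [ -z "${PSQL_HISTFILE:-}" ] && echo "$HOME/.psql_history" || echo "$PSQL_HISTFILE"
}

# Without the variable we fall back to the default location...
unset PSQL_HISTFILE
default_path="$(psql_histfile)"

# ...and with it we get the path from the environment.
PSQL_HISTFILE=/app/log/.psql_history
custom_path="$(psql_histfile)"

echo "unset: $default_path"
echo "set:   $custom_path"
```

In a regular shell script, the same fallback is usually written more compactly as ${PSQL_HISTFILE:-$HOME/.psql_history}.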

Let's talk about the environment variables:

environment:
  - NODE_ENV=${NODE_ENV:-development}
  - RAILS_ENV=${RAILS_ENV:-development}
  - REDIS_URL=redis://redis:6379/
  - DATABASE_URL=postgres://postgres:postgres@postgres:5432
  - WEBPACKER_DEV_SERVER_HOST=webpacker
  - BOOTSNAP_CACHE_DIR=/bundle/bootsnap
  - HISTFILE=/app/log/.bash_history
  - PSQL_HISTFILE=/app/log/.psql_history
  - EDITOR=vi
  - MALLOC_ARENA_MAX=2
  - WEB_CONCURRENCY=${WEB_CONCURRENCY:-1}

There are several things here, and I'd like to focus on a few of them.

First, the X=${X:-smth} syntax. It could be translated as "For X variable within the container use the host machine X env variable value if present and another value otherwise". Thus, we make it possible to run a service in a different environment provided along with the command, e.g., RAILS_ENV=test docker-compose up rails.
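
Docker Compose borrows this syntax from the shell, so the behavior is easy to check without Docker at all. A minimal sketch (the resolve_rails_env function is just a name I made up for the demo):

```shell
# Resolve RAILS_ENV the way the compose file does: use the outer value
# if it is set and non-empty, otherwise use the fallback after ":-".
resolve_rails_env() {
  echo "${RAILS_ENV:-development}"
}

unset RAILS_ENV
without_override="$(resolve_rails_env)"

# The variable is provided along with the command, like RAILS_ENV=test docker-compose up rails
with_override="$(RAILS_ENV=test resolve_rails_env)"

# An empty value counts as "not provided" because of the colon in ":-"
with_empty="$(RAILS_ENV= resolve_rails_env)"

echo "$without_override / $with_override / $with_empty"
```

This prints development / test / development: only a non-empty value from the host overrides the default.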

The DATABASE_URL, REDIS_URL, and WEBPACKER_DEV_SERVER_HOST variables connect our Ruby application to other services. The DATABASE_URL and WEBPACKER_DEV_SERVER_HOST variables are supported by Rails (ActiveRecord and Webpacker respectively) out-of-the-box. Some libraries support REDIS_URL as well (Sidekiq) but not all of them (for instance, Action Cable must be configured explicitly).
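
It may help to see how the DATABASE_URL value breaks down. The sketch below takes it apart with plain shell parameter expansion (the variable names are mine, and the parsing assumes exactly the scheme://user:password@host:port shape used above):

```shell
url="postgres://postgres:postgres@postgres:5432"

rest="${url#*://}"          # drop the scheme: postgres:postgres@postgres:5432
creds="${rest%%@*}"         # everything before "@": postgres:postgres
hostport="${rest#*@}"       # everything after "@":  postgres:5432

db_user="${creds%%:*}"
db_password="${creds#*:}"
db_host="${hostport%%:*}"   # the name of the Compose service, not localhost
db_port="${hostport#*:}"

echo "user=$db_user password=$db_password host=$db_host port=$db_port"
```

The host part is simply the service name from docker-compose.yml: Compose makes services reachable from each other by name, and the same goes for redis in REDIS_URL and webpacker in WEBPACKER_DEV_SERVER_HOST.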

We use bootsnap to speed up the application load time. We store its cache in the same volume as the Bundler data because this cache mostly contains the gems data; thus, we should drop everything altogether in case we do another Ruby version upgrade, for instance.

The HISTFILE=/app/log/.bash_history is a significant setting from the developer's UX point of view: it tells Bash to store its history in the specified location, thus making it persistent.

The EDITOR=vi is used, for example, by rails credentials:edit command to manage credentials files.

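Finally, the last two settings, MALLOC_ARENA_MAX and WEB_CONCURRENCY, are there to help you keep Rails memory usage in check: MALLOC_ARENA_MAX caps the number of memory arenas that glibc's malloc creates for multi-threaded processes, and WEB_CONCURRENCY is conventionally used to set the number of Puma workers.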

The only lines in this service yet to cover are:

stdin_open: true
tty: true

They make this service interactive, i.e., provide a TTY. We need it, for example, to run Rails console or Bash within a container.

It is the same as running a Docker container with the -it options.

`webpacker`

The only thing I want to mention here is the WEBPACKER_DEV_SERVER_HOST=0.0.0.0 setting: it makes Webpack dev server accessible from the outside (by default it runs on localhost).

`runner`

To explain what this service is for, let me share the way I use Docker for development:

I start a Docker daemon running a custom docker-start script:

#!/bin/sh

if ! docker info > /dev/null 2>&1; then
  echo "Opening Docker for Mac..."
  open -a /Applications/Docker.app
  while ! docker system info > /dev/null 2>&1; do sleep 1; done
  echo "Docker is ready to rock!"
else
  echo "Docker is up and running."
fi

Then I run dcr runner (dcr is an alias for docker-compose run) in the project directory to log into the container's shell; this is an alias for:

$ docker-compose run --rm runner

I run (almost) everything from within this container: tests, migrations, Rake tasks, whatever.

As you can see, I do not spin up a new container every time I need to run a task, and I'm always using the same one.

Thus, I'm using dcr runner the same way I used vagrant ssh years ago.

The only reason why it's called runner and not shell, for example, is that it also could be used to run arbitrary commands within a container.
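
Here is one way such a dcr helper could be defined as a shell function. This is a sketch, not my exact setup: the DRY_RUN switch is an assumption added so you can see the resulting command without having Docker around:

```shell
# Hypothetical dcr helper: forwards all arguments to docker-compose run --rm.
# DRY_RUN=1 (an assumption for this demo) prints the command instead of running it.
dcr() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker-compose run --rm $*"
  else
    docker-compose run --rm "$@"
  fi
}

DRY_RUN=1
shell_cmd="$(dcr runner)"                 # opens the default /bin/bash
task_cmd="$(dcr runner rake db:migrate)"  # runs an arbitrary command instead
echo "$shell_cmd"
echo "$task_cmd"
```

Anything that follows the service name replaces the default command, which is what makes the same service usable both as a shell and as a one-off task runner.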

Note: The runner service is a matter of taste; it doesn't bring anything new compared to the rails service, apart from the default command (/bin/bash). Thus, docker-compose run runner is essentially the same as docker-compose run rails /bin/bash (but shorter 😉).

Bonus: dip.yml

If you still think that the Docker Compose way is too complicated, there is a tool called Dip, developed by one of my colleagues at Evil Martians, that aims to make the developer experience smoother.

It is especially useful if you have multiple compose files or platform-dependent configurations because it could glue them together and provide a universal interface to manage the Docker development environment.

We're going to tell you more about it in the future. Stay tuned!

P.S. Special thanks to Sergey Ponomarev and Mikhail Merkushin for sharing their tips on the subject. 🤘

Read more dev articles on https://evilmartians.com/chronicles!

Top comments (3)

Victor Hazbun • Jul 25 '19

Thanks for sharing.

Daniel Golant • Jul 26 '19

Hey man, great writeup! Out of curiosity, have you managed to hook up a debugger to the app running in your rails container? We use a similar setup at work, and I've tried multiple times to get the VSCode debugger attached to our Rails servers with no luck.

Vladimir Dementyev • Jul 27 '19

You mean IDE debugger? Then I can't help here a lot: I'm using binding.pry dropped in the code and Terminal.
Not sure about VS Code but I heard that RubyMine's debugger works with Dockerized dev env.