I'm currently taking a Docker MOOC. In part 3 of the course, there is an exercise about deploying dockerized software to Heroku with a CI/CD pipeline. The software can be anything, so I decided to deploy the course materials and use GitHub Actions for the deployment. GitHub Actions was very straightforward to use in my opinion, and I had no problems with it. Instead, I ran into some issues getting the materials to work on Heroku. In this blog post, I'll share the challenges I faced and how I solved them.
Course materials and how they're built
The course material is a simple website built with Jekyll. The website was already containerized, so the course material repository had a Dockerfile. I forked the repository so that I could set up a CI/CD pipeline for it and deploy it to Heroku.
The Dockerfile that already existed in the course repository looks like this:
FROM jekyll/jekyll:3.8.3 as build-stage
WORKDIR /tmp
COPY Gemfile* ./
RUN bundle install
WORKDIR /usr/src/app
COPY . .
RUN chown -R jekyll .
RUN jekyll build
FROM nginx:alpine
COPY --from=build-stage /usr/src/app/_site/ /usr/share/nginx/html
From the Dockerfile you can see that it uses a multi-stage build: the first stage uses the Jekyll image to build the site, and the second stage uses NginX to serve it. The repository doesn't contain any configuration for NginX, so it just uses the default configuration file.
The problem with NginX and Heroku
NginX is a widely used web server that can also act for example as a load balancer or reverse proxy.
As a reverse proxy, it takes a request coming from a client and forwards it to a server. This way the client doesn't communicate with the server directly, because the proxy is internet-facing, not the server. As a load balancer, it distributes requests from clients evenly across the servers if there are several.
In this case, NginX is used as a basic web server, serving the static pages produced by the jekyll build command.
Heroku is a PaaS (platform as a service) where you can deploy your software and host it in the cloud. It supports multiple programming languages, and also Docker containers.
By default, NginX listens on port 80. When I build the Docker image locally on my laptop and run it with the command
docker run -p 8080:80 docker-course-material
I can open localhost:8080 in a browser, and it displays the course materials. When I first deployed the materials to Heroku, the deployment itself succeeded, but opening the website showed a message that something went wrong. I noticed an error in the logs:
State changed from starting to crashed
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
After googling for a while, I found out that Heroku assigns a random port to a web app when it is deployed. Heroku also runs the container as a non-root user, which is why binding to the privileged port 80 fails with Permission denied. The Heroku documentation says:
Each web process simply binds to a port, and listens for requests coming in on that port. The port to bind to is assigned by Heroku as the PORT environment variable.
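To make the quoted contract concrete, here is a minimal shell sketch of how a web process can read that variable at startup. The fallback value 8080 and the function name resolve_port are my own assumptions for local runs, not something Heroku defines.

```shell
#!/bin/sh
# Resolve the port the web process should bind to.
# Heroku sets PORT when it starts the container; 8080 is a hypothetical
# fallback that is used only when PORT is unset or empty (e.g. a local run).
resolve_port() {
    echo "${PORT:-8080}"
}

echo "web process should listen on port $(resolve_port)"
```

On Heroku the fallback is never used, because the platform always provides PORT.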
This means that I needed to make NginX listen on the random port that Heroku assigns to the application. How could I accomplish that?
First version of the solution
Let me be clear: although I worked as a software developer before I moved to data engineering, I had no experience with NginX whatsoever. I did realise, however, that I had to create a configuration file to make it listen on another port. My first goal was to make it listen on some other port, such as 8080, locally so that I could test the configuration.
I read that NginX reads its configuration from the path /etc/nginx/conf.d/default.conf. My first solution was to create a configuration file that defines a different port and replaces the default file added by the parent image. I struggled with the configuration a bit, but the first working version of the configuration file, with a hardcoded port, looks like this:
server {
listen 0.0.0.0:8080;
location / {
root /usr/share/nginx/html;
index index.html;
}
}
The line
listen 0.0.0.0:8080;
tells NginX to listen on port 8080 on all network interfaces (0.0.0.0), not only on localhost. After copying the configuration file into the image in the Dockerfile, I was able to bind to port 8080 and access the materials. This didn't solve the problem with Heroku yet, though.
Binding PORT environment variable
Now that I had succeeded in changing the port to 8080, I still needed to figure out how to define the port dynamically when Heroku assigns it. I couldn't just hardcode the port number; instead, I needed a placeholder that is replaced when the container is run.
So I replaced the hardcoded 8080 in the configuration with $PORT and added a line to the Dockerfile that replaces it with the value of the environment variable. Because this has to happen at runtime, I did it in CMD. I couldn't use a RUN instruction, because the PORT environment variable set by Heroku is available only when the container is started, whereas RUN is executed once at build time.
To replace the placeholder with the value of the environment variable, I used the sed s command. Sed is a simple stream editor that can be used to transform text. sed s is probably the most commonly known sed command, and it replaces text matching a regular expression with other text.
For example, the command
sed -i 's/cat/dog/g' input-file.txt
replaces all occurrences of the string cat with the string dog in the file input-file.txt. The -i option means that the substitution is done in place.
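The same substitution can be tried safely on a throwaway file. This is a minimal sketch that assumes GNU or BusyBox sed (BSD/macOS sed expects an extra argument after -i); the file contents and the variable names are made up for illustration.

```shell
#!/bin/sh
# Create a temporary file so the example does not touch real data.
demo_file="$(mktemp)"
printf 'the cat sat\nanother cat here, cat\n' > "$demo_file"

# s/cat/dog/g replaces every occurrence of "cat" on each line;
# -i writes the result back into the same file.
sed -i 's/cat/dog/g' "$demo_file"

# Read the result back and clean up the temporary file.
result="$(cat "$demo_file")"
rm -f "$demo_file"
echo "$result"
```

Without the trailing g flag, sed would replace only the first match on each line.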
Final version
So in the end, this is what the configuration file looks like:
server {
listen 0.0.0.0:$PORT;
location / {
root /usr/share/nginx/html;
index index.html;
}
}
and this is what the Dockerfile looks like:
FROM jekyll/jekyll:3.8.3 as build-stage
ARG PORT
WORKDIR /tmp
COPY Gemfile* ./
RUN bundle install
RUN echo $PORT
WORKDIR /usr/src/app
COPY . .
RUN chown -R jekyll .
RUN jekyll build
FROM nginx:alpine
COPY --from=build-stage /usr/src/app/_site/ /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
CMD sed -i -e 's/$PORT/'"$PORT"'/g' /etc/nginx/conf.d/default.conf && nginx -g 'daemon off;'
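The final stage of the Dockerfile has only two new lines: one copies the NginX configuration file, and the other replaces the placeholder with the value of $PORT when the container starts. The ARG PORT and RUN echo $PORT lines in the build stage are not needed for the solution, because PORT is only set at runtime. The final command nginx -g 'daemon off;' is needed with Docker so that NginX stays in the foreground and the container doesn't stop immediately.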
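The quoting in the CMD line is easy to misread, so here is a minimal sketch of the same substitution as a shell function, run against a temporary file instead of the real configuration. The function name render_port and the port 5000 are hypothetical. I also escape the dollar sign as \$ to be explicit, although the unescaped form used in the Dockerfile matches the literal text too, because $ only acts as an anchor at the end of a sed pattern.

```shell
#!/bin/sh
# render_port: replace the literal placeholder $PORT in a config file
# with the given port number (hypothetical helper for illustration).
render_port() {
    conf_file="$1"
    port="$2"
    # 's/\$PORT/' is in single quotes, so the shell leaves it alone and
    # sed matches the literal text $PORT. "$port" is in double quotes,
    # so the shell expands it to the number before sed runs.
    sed -i -e 's/\$PORT/'"$port"'/g' "$conf_file"
}

# Try it on a temporary file holding the listen line from the config.
conf="$(mktemp)"
printf 'listen 0.0.0.0:$PORT;\n' > "$conf"
render_port "$conf" 5000

# Read the result back and clean up the temporary file.
rendered="$(cat "$conf")"
rm -f "$conf"
echo "$rendered"
```

Note that the substitution rewrites the file, so it only works the first time a given container starts; after that the placeholder is already gone, which is harmless here because the port does not change during the container's lifetime.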
The final thing I need to mention is that the problem I encountered is not, in fact, NginX-specific, but rather Heroku-specific. It can occur with any web server on Heroku, and it is solved by binding to the port that Heroku assigns. I just happened to encounter this problem for the first time with NginX.
The solution is pretty simple, but I managed to spend a couple of hours on this problem. If you do encounter the issue, I hope this blog post helps!
Top comments (11)
Thanks for this - I had the same problem and it put me on the right track.
One observation I have is that you don't really need to use ARG in your Dockerfile as it would only be available at image build time rather than container runtime. It's not actually doing anything in your file (the sed command gets run at container startup so the ARG is not actually available to it).
The reason that your solution is working is that Heroku creates a PORT environment variable when it instantiates the container (think -e when running docker containers locally) so you will always have access to this variable when the container is running on the Heroku platform.
However, if you run the container locally you have to provide the environment variable yourself, e.g. docker run -e PORT=4200 ...
If you don't want to do this then you can replace the ARG entry in your Dockerfile with an ENV statement and provide a default value, e.g. ENV PORT=4200. ENV values are available at container instantiation and can be overridden. Heroku will always override it and you can also run it locally without providing a command line environment variable.
Once again, thanks for the insight as it gave me what I needed to solve my issue.
Thank you, I'm happy that I could help! 🤩
And you're also right about the ARG, thank you for noticing it. I actually think I just forgot to remove it after I realised the PORT environment variable is defined at runtime and not at build time in Heroku.
Very nice explanation about how to define the environment variable when you run the container locally. I don't have that information in my blog post, but that is definitely useful information!
Correct me if I'm wrong.
What is the use of deploying nginx and app within the same container? Basically if you're using nginx for load balancing, then it has to be a separate container and the backend microservice can be replicated across machines as containers. So you can achieve load balancing across machines.
You are right! The app and the load balancer would have separate containers. In this case we don't have a separate application, though.
In the Dockerfile,
RUN jekyll build
generates static HTML pages, and in
COPY --from=build-stage /usr/src/app/_site/ /usr/share/nginx/html
we copy the generated pages for nginx to use. In the nginx config, we tell nginx to serve those pages. Thus, we don't need any other containers.
Did I answer your question?
Yes! Thanks
Maybe I didn't go through the Dockerfile properly. For static sites this is a good approach.
Nice article. I've been using Nginx as a proxy (jwilder/nginx-proxy docker image), but I'm unfamiliar with its use as a static web server, and Heroku is also new to me, but this gave me a nice glimpse of what to expect if I get better acquainted with them.
Thank you Saija! I've used Heroku before for some school projects. In my opinion it's a very nice service, and it's easy to deploy and host your application there.
Nginx as a web server serving static web pages was new to me as well.
Sed is (too?) often your friend when writing infra scripts.
Question though... won't it be cheaper to use an infrastructure like DigitalOcean to host, since you can already create Docker and nginx?
Saved my day! Thanks
Thanks a lot, I spent more than 2 days to figure out the core issue without any luck