Spyros Gi
Recipes for scalable, cost-effective web apps on Heroku with npm and NodeJS

I am using Heroku to deploy a web application. The application is starting out as an MVP and until real users use it, I want the deployment to be as cheap and simple as possible, yet future-proof.

For example, I want to have a solid foundation for the code by splitting the frontend from the backend. There are many ways to achieve that. One of them is a split at the development/build stage, which means that a change in the frontend, for example, does not also require building, testing and restarting the backend. In big projects such coupling can increase build times and significantly hinder developer productivity.

Another (better) way is to separate the builds but deploy/serve the backend and frontend from the same server. This is neither very scalable nor cost-effective in the long run: we may find down the road, for example, that we need more backend instances to handle the load without necessarily increasing the number of frontend servers. The ideal split, therefore, ensures that the frontend and backend don't share any data (apart maybe from configuration on where to access each other), communicate entirely over an API, and can be built & deployed independently (aka the "micro-services" way).

For convenience, and since the codebase and team are very small (em, just me actually 🙋‍♂), I want to use the monorepo approach. We are still in an MVP phase and the API as well as the database schema will evolve over time. Having everything under one repo is convenient: any full-stack developer can build features without switching between codebases, and the whole development environment can be started with a single npm start command. More importantly, in the case of JavaScript it also enables code reuse between the frontend and backend, e.g. for constants, validation errors etc. The monorepo approach has scaled well for tech giants like Google and Facebook, so I don't see how it wouldn't work for a small web app.

To sum up, my (prioritised) requirements are:

  • The frontend and backend are as independent as possible.
  • Simple is better than complex.
  • Stay within the free tier of Heroku (or as cheap as possible).
  • Use a single repo to hold the code.

TL;DR

Given some Heroku restrictions, it turns out it's not trivial to satisfy all 4 requirements. I found 2 ways to accomplish this, but neither is completely satisfactory. The situation would be a lot simpler if the monorepo requirement were dropped: the overhead required to make it work with Heroku probably outweighs the advantages for most projects.

Since others are asking too and many solutions out there don't work anymore (or require upgrading to the Hobby tier), my goal with this blog post is to clarify the current situation and explore the various tradeoffs. The tech stack I am using for my app and the examples here is NodeJS and Express for the backend, Angular on the frontend, and npm scripts to build/serve everything.


Some Heroku basics first

The usual Heroku use-case is that you have a code repository that you deploy using git push heroku master. This deploys an app, so there is a 1:1 relationship between repositories and apps. Each app can run on multiple dynos (think of them as the Heroku containers). What the dynos run is defined as a process (think of processes as the dyno type/class). Heroku uses a file called Procfile to define these processes for each application, which means 1 Procfile ↔️ 1 app. Of all processes you can define, only the web process can receive traffic from the outside (the users). This is the first limitation to keep in mind.
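To make that concrete, a Procfile is just a list of process-type: command lines. A minimal sketch for a Node app with a web and a worker process might look like this (the file paths are hypothetical, not from the actual project):

web: node backend/server.js
worker: node backend/worker.js

Only the web process above would be reachable from the outside; the worker would just run in the background.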


Things I tried that don't work

Since we want the frontend to communicate with the backend over an API, we need to have a backend that gets traffic from the outside world. Fair enough, we just need 2 web processes: one for the frontend and the other for the backend, right? Unfortunately, on the free tier you can create as many as 100 apps, but each app can use at most 1 web and 1 worker, and as we said only web processes receive traffic.

Let’s say we relax the cost constraint and upgrade to the Hobby tier, which allows for 10 process types. This still wouldn’t work: there can be only 1 web process per Procfile/application.

OK then, you say, let’s have 2 applications, each with a web process. That would work, but then we are breaking the monorepo requirement since one repo equals one Heroku app. Or are we..? 💡 We will get back to that idea in a second.

Backtracking: what if we have a single web process scaled out to 2 dynos, with a config variable so that one dyno handles frontend calls and the other backend calls? When a call is routed to the wrong dyno, it would (somehow) internally call the other. First of all, to do this we would need to use professional dynos, since you can’t scale out hobby dynos. But even then this wouldn’t work, because dynos are completely isolated from one another in the common runtime (which is what you get by default).

The Heroku (?) way (async) - could work

A way to achieve what we want would be to use 2 different processes (web and worker) within the same Procfile, communicating over a queue/datastore. This solution is within the free tier limitations and is what is depicted in the Heroku docs. To adapt it to our model, the web dyno is the one that receives HTTP requests from the outside world: it delivers the (minified, uglified, bundled…) frontend code (HTML, CSS, JS) and in the case of API calls it writes the request to the queue. The worker dyno picks up requests and does the backend work. The web dyno keeps polling the queue for updates on the request and updates the UI based on the result (or uses optimistic updates).
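As a rough illustration of that pattern, here is a minimal sketch using a Redis-backed job queue. The Bull library, the file names and the REDIS_URL config variable are my assumptions about one possible setup, not something prescribed by Heroku:

// queue.js - shared by both processes (assumes Bull and a Redis add-on exposing REDIS_URL)
const Queue = require('bull');
module.exports = new Queue('api-requests', process.env.REDIS_URL);

// web.js - the web dyno: serves the frontend assets and enqueues API work
const express = require('express');
const queue = require('./queue');
const app = express();
app.use(express.json());
app.post('/api/tasks', async (req, res) => {
  const job = await queue.add(req.body);     // hand the request over to the worker
  res.status(202).json({ jobId: job.id });   // respond immediately; the client polls for the result
});
app.listen(process.env.PORT);

// worker.js - the worker dyno: picks up requests and does the backend work
const queue = require('./queue');
queue.process(async (job) => {
  // ...do the actual backend work here and persist the result...
  return { done: true };
});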

Obviously this is a very complex setup for a simple web application: there are additional components that need to be configured (queue, websockets etc.) and many edge cases to be covered in the application code (e.g. what happens if a worker process is terminated abruptly while handling an async task?). While asynchronous processing makes sense for some tasks (e.g. sending notifications, logging or computationally intensive work), most web applications won’t benefit from it (certainly not the app I am building). So I rejected this option due to the complexity.


What actually works

1. The "manual" way - without independent deployment

One of the requirements has been to build and deploy the frontend independently from the backend. Since at the moment there are no users, however, we can relax the independent-deployment requirement by building the frontend and then serving it from the backend server. This is the official recommendation in the Angular docs.

To see it in practice, given the following project structure:

fullstack/                  # top level folder
├── backend/
│   ├── package.json
│   ├── api/                # API files
│   └── ...
├── frontend/
│   ├── package.json
│   └── ...
├── package.json
├── ...

The top level package.json includes this:

"scripts": {
    "install": "(cd backend && npm i) & (cd frontend && npm i)",   
    "heroku-postbuild": "cd frontend && npm run build-prod && mv dist/frontend ../backend/",
    "start": "if [ \"$NODE_ENV\" == \"production\" ]; then cd backend && npm run start-prod; else cd backend && npm start & (cd frontend && npm start); fi"
}

Notice there is no Procfile. This is because Heroku also supports npm scripts to start a web process.

The independent builds are achieved using different npm modules for backend and frontend that install dependencies, watch files for changes and serve files.
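For example, the per-project scripts could look roughly like this. The nodemon, ng serve and proxy details below are my assumptions about a typical setup, not taken from the actual project. In backend/package.json:

"scripts": {
    "start": "nodemon server.js",
    "start-prod": "node server.js"
}

and in frontend/package.json:

"scripts": {
    "start": "ng serve --proxy-config proxy.conf.json",
    "build-prod": "ng build --prod"
}

Here nodemon restarts the Express server on changes during development, while ng serve watches and serves the Angular app (proxying /api calls to the backend).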

For the deployment, after the install step the heroku-postbuild script runs: it builds the production version of the frontend (with e.g. ng build --prod) and moves the output to the backend/ folder. Then we start the production backend server (Express), which contains something like this:

const express = require('express');
const path = require('path');

const app = express();
if (process.env.NODE_ENV === 'production') {
   // serve the built frontend that heroku-postbuild moved into backend/frontend
   app.use(express.static(path.join(__dirname, '/frontend')));
}

which serves static files from the frontend/ folder, while the Angular app (frontend) is configured to use /api to access data.
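Continuing the server above, the rest might look roughly like this (a sketch: the router file, the fallback route and the port handling are my assumptions):

// mount the API under /api, as the Angular app expects
app.use('/api', require('./api'));

// in production, let the Angular router handle any other path
if (process.env.NODE_ENV === 'production') {
  app.get('*', (req, res) => res.sendFile(path.join(__dirname, '/frontend/index.html')));
}

app.listen(process.env.PORT || 3000);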

2. The Multi-Procfile way

The other option I found while researching is the Multi-Procfile buildpack created by Heroku engineers. This essentially removes the Heroku limitation we encountered before: a repo no longer has to correspond to exactly one Heroku app (and we are still within the free tier!) 🎉

Applying the instructions on how to use the buildpack:

  • We create 2 Heroku apps, e.g. awesomeapp (frontend) and awesomeapp-backend.
  • We point each app to its Procfile: fullstack/Procfile for the frontend and fullstack/backend/Procfile for the backend (see the sketch after this list).
  • Whenever we deploy a new version, we need to push to both Git-Heroku endpoints.
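Concretely, the setup would look something like this (a sketch based on the buildpack's README; double-check the exact buildpack identifier against the current docs):

heroku buildpacks:add -a awesomeapp heroku-community/multi-procfile
heroku config:set -a awesomeapp PROCFILE=Procfile

heroku buildpacks:add -a awesomeapp-backend heroku-community/multi-procfile
heroku config:set -a awesomeapp-backend PROCFILE=backend/Procfile

The regular Node.js buildpack still performs the actual build; the multi-Procfile buildpack only copies the selected Procfile to the app root.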

Pushing to both endpoints can be made easier by specifying 2 different remotes with git config -e:

[remote "heroku"]
    url = https://git.heroku.com/**awesomeapp**.git
    fetch = +refs/heads/*:refs/remotes/heroku/*
[remote "heroku-backend"]
    url = https://git.heroku.com/**awesomeapp-backend**.git
    fetch = +refs/heads/*:refs/remotes/heroku/*

and then use git push heroku master and git push heroku-backend master for the frontend and backend respectively (or automate both on git push).
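One way to automate both pushes is a small script in the top-level package.json (the script name deploy is arbitrary):

"scripts": {
    "deploy": "git push heroku master && git push heroku-backend master"
}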

The Procfile used for the frontend is web: cd frontend && npm run start-prod. The start-prod script starts an Express server that serves the frontend assets.
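That server can be as small as something like this (a sketch, assuming the production build lands in frontend/dist/frontend as per the Angular CLI defaults and the server file lives in frontend/):

const express = require('express');
const path = require('path');

const app = express();
// serve the built Angular bundle
app.use(express.static(path.join(__dirname, 'dist/frontend')));
// let the Angular router handle all other paths
app.get('*', (req, res) => res.sendFile(path.join(__dirname, 'dist/frontend/index.html')));
app.listen(process.env.PORT || 4200);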

The backend/Procfile follows the same pattern: web: cd backend && npm run start-prod. The start-prod script starts an Express server that serves the api folder. Note that cd backend is actually wrong here and won’t work locally with heroku local. It works on Heroku because the buildpack copies the Procfile to the root folder fullstack/: unfortunately, we have to give up dev-prod parity.

Since the frontend is now on a different domain (awesomeapp.herokuapp.com), we also have to enable CORS in the backend:

app.use(function(req, res, next) {
  // allow the deployed frontend origin in production, the local dev server otherwise
  // (the origin must not have a trailing slash, otherwise the browser check fails)
  res.header('Access-Control-Allow-Origin', process.env.NODE_ENV === 'production' ? 'https://awesomeapp.herokuapp.com' : 'http://localhost:4200');
  res.header('Access-Control-Allow-Headers', 'Origin, X-Requested-With, Content-Type, Accept');
  next();
});

It’s also worth noting that in both Heroku apps the same code is committed, and the install step installs both the frontend and backend dependencies even if only one of them is used: certainly not ideal, but acceptable.


In this blog post we explored various options to structure, build and deploy a web application on Heroku. Both of the solutions presented here are a bit “hacky” and neither achieves parity between the dev and production environments: the “manual” way is probably simpler to understand (no magic coming from the buildpack) and easier to develop with (no need to push to and set up 2 applications), but it would also need more work to deploy fully independently in the future. The multi-Procfile way, on the other hand, comes with some overhead but allows fully independent deploys of the frontend and the backend, using a single Git repository.


What are your Heroku best practices to deploy a microservices web application? Let me know in the comments!

This is my first post here, originally published on my Medium.

Top comments (4)

rhymes

Have you considered going serverless instead of using Heroku? I might be wrong here but it seems the perfect use case for independently scalable frontends and backends

Spyros Gi

I am not sure I understand what you suggest: running a serverless function on every request from a user? Not sure how suitable that would be for a regular CRUD application. Can you elaborate?

rhymes

My bad, I haven't been correct enough in my initial statement. Theoretically Heroku is also serverless, as is Google App Engine for example. We tend to conflate serverless with functions but the term is broader.

While you could probably do it with GAE as well, I was indeed referring to serverless functions. If you can work within the constraints of the paradigm (statelessness, short requests, cold starts mitigated by pinging), why not?

You're trying to save money, and functions tend to be free for the first million requests.

Spyros Gi

Thanks for the clarification. Yeah, Heroku is a PaaS (Platform as a Service) like GAE. I chose it over GAE because it gives you more freedom on what you run on top of it, so the lock-in effect is not as strong (at the cost of having more to set up, of course).

Theoretically you could work with serverless functions, and it might make sense from a cost perspective for the very first MVP (<100 users), but I don't think they are the right candidate for this kind of application, and setting them up is not trivial either (e.g. you still need to persist their output or ping them periodically to avoid cold starts, so you need a "backend" anyway). I am not so much into the details, but there are also other limitations, like 1000 concurrent executions (at least in AWS).