When I deployed my first web app, trailstatus.nz, I realised that there are very few beginner resources on how to go from your perfect localhost app to a properly deployed site with a custom domain and redirects in place. That's why I wrote this article: to tell you about all the other bits no one talks about when you deploy a web app to the public, which I ran into on my journey to get my app online.
In New Zealand where I live, mountain biking is a popular sport. Unfortunately, trails are often closed without warning and if you are not local to an area, it is hard to know whether a trail you want to ride is open or not. Therefore I created a web app which would keep track of trail status in NZ.
GitHub repo: ryan-mooore/trailscrape
The website is powered by a number of scraping scripts, which pull information from the websites of individual regions across the country. Users can also update the status of any individual trail from my web app. The app is a single page application using React and React-router.
This might not come as a surprise, but scraping data from other people's websites is not the simplest process. Websites frequently change their layout and UI design. Even more often, a website will change the class name of a tag or slightly alter the HTML structure, and just like that, your script no longer works. This happened multiple times during the development of my app: seemingly at random, one of my scrapers would stop working. I would then find that a class name had changed or a p element had been moved out of its parent element, breaking my script.
Unfortunately there's not really a good solution for this. My advice would be:
- Scrape based on the elements that are least likely to change.
- Use regex or fuzzy matching instead of exact string matching. If the website owner adds a full stop at the end of a text field, you don't want it to break your scraping script. Match for keywords that are less likely to change.
- Make sure you have good error handling in the front-end. Make sure your app doesn't crash completely if a scraper fails as this could be a common occurrence. In my case, I just hid the region if there was an error. An alternative solution would have been to display the last successful data and have a disclaimer saying when it was last updated.
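As a rough illustration of the last two points, here is a minimal sketch in JavaScript. The function names and the keyword list are hypothetical and not taken from the trailscrape repo; adjust the keywords to the wording used by the sites you scrape.

```javascript
// Hypothetical helper: map free-form status text to a normalised value.
// Keyword matching survives small edits such as added punctuation or casing.
function parseTrailStatus(rawText) {
  const text = String(rawText).toLowerCase();
  // Check the "closed" wording first so that "not open" is not read as open.
  if (/\b(closed|not\s+open)\b/.test(text)) return 'closed';
  if (/\bopen\b/.test(text)) return 'open';
  return 'unknown';
}

// Hypothetical wrapper: run one scraper without letting its failure
// take down the rest. `scrape` is any async function returning region data.
async function scrapeSafely(regionName, scrape) {
  try {
    return { region: regionName, ok: true, data: await scrape() };
  } catch (err) {
    // The front-end can hide this region or show the last successful data.
    return { region: regionName, ok: false, error: err.message };
  }
}
```

The front-end can then check the `ok` flag for each region rather than assuming every scraper succeeded.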
Some websites load their content dynamically with JavaScript, so a plain HTTP request returns a page without the data you want. The solution to this is to use a browser driver like chromedriver instead of just making a request, as it will act as a browser, forcing the website to load the dynamic content. Many scraping libraries like Selenium support using a browser driver. If you want the scrape to work reliably, leave a decent delay between loading the website and reading its HTML. I used ~5s and it has worked every time.
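A minimal sketch of that idea, assuming the Node bindings for Selenium (the selenium-webdriver package); the original scrapers may well use a different language or library. The function name is hypothetical, and the driver is passed in so the function itself has no dependencies.

```javascript
// Load a page in a real browser and return the HTML after scripts have run.
// `driver` is assumed to be a selenium-webdriver WebDriver instance, e.g.
//   const { Builder } = require('selenium-webdriver');
//   const driver = await new Builder().forBrowser('chrome').build();
async function fetchRenderedHtml(driver, url, waitMs = 5000) {
  await driver.get(url);
  // Fixed pause so client-side rendering can finish (the ~5s mentioned above).
  await driver.sleep(waitMs);
  return driver.getPageSource();
}
```

A fixed pause is the simplest option. Waiting for a specific element to appear would be faster, but it ties the scraper to one more selector that could change.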
I've used SQL databases in previous projects and in my education, so I am more familiar with them. However, for this project I decided to use MongoDB as I had heard good things about it. I can't recommend it enough for small projects that do not use a lot of data. It's super easy to set up, as it is schema-less and you don't have to define your data structure up front. Deploying to a production environment is also straightforward with MongoDB Atlas, and because the data storage is essentially just JSON, it's easy to manage the data, especially if you are using Node in your backend/API.
When I'd finished programming my app, I thought it wouldn't be too much more work to get it up and running - I was very wrong. There's so much more to do when you finish the programming side of your website - automatic deploys, analytics, SEO, domain names, redirects, SSL, and of course gaining a user base. Along the way, I ran into what felt like every bug and issue imaginable - getting my website up the way I wanted took almost as much work as the coding in the first place.
I decided to use Heroku for deployment and immediately ran into my first issue - how to run my backend and frontend on the same Heroku dyno. I couldn't use serve as create-react-app suggests, and ended up serving my static build files from my Express backend. However, this then overrode my react-router client-side routing. This was fixed by sending my homepage no matter what the user has requested, using the * wildcard, and then letting react-router handle the routing logic.
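The setup described above could look roughly like this. It is a sketch that assumes Express 4 (newer major versions changed the wildcard path syntax); the function name and the build directory are hypothetical, and the app and module are passed in so the snippet stands alone.

```javascript
const path = require('path');

// Serve a create-react-app build and fall back to index.html for any other
// path, so react-router can resolve the route in the browser.
// `app` is an Express 4 application and `express` the imported module.
// `buildDir` must be an absolute path, e.g. path.join(__dirname, 'build').
function serveReactApp(app, express, buildDir) {
  // Real files (JS bundles, CSS, images) are served directly.
  app.use(express.static(buildDir));
  // Everything else gets the homepage. Register API routes before this,
  // otherwise the wildcard would swallow them too.
  app.get('*', (req, res) => {
    res.sendFile(path.join(buildDir, 'index.html'));
  });
}
```

The order matters: the static middleware comes first so real assets are found, and the wildcard comes last so it only catches paths nothing else handled.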
I then needed to make sure Heroku rebuilt the static files every time I deployed. I did this by attaching my build script to the nodejs buildpack's heroku-postbuild. This may differ depending on your buildpack and backend language.
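For illustration, the relevant part of a package.json might look like the fragment below. The script contents and the server.js filename are assumptions; the only fixed name is heroku-postbuild, which the Node.js buildpack runs after installing dependencies.

```json
{
  "scripts": {
    "start": "node server.js",
    "build": "react-scripts build",
    "heroku-postbuild": "npm run build"
  }
}
```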
I then tried to deploy but was served with every Heroku user's nightmare: "Application Error". If this ever happens, I strongly recommend doing a local deploy. This consisted of downloading the Heroku CLI and running heroku local. This is supposed to deploy your app in exactly the same way as it does on their servers, except on localhost, so you can debug the issue with more ease.
But it worked fine locally! The only time my deploy would fail is if I deployed it to the Heroku servers themselves.
If this ever happens to you, I can say that the issue is 95% likely to be environment variables. The issue was that when I deployed to Heroku, it was using the environment variables that I had set, but when I ran heroku local, it wasn't using my environment variables and fell back to my localhost database. After running heroku local -e .env with identical env vars in the .env file as on Heroku, I could debug the problem locally and fix the problematic variable.
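One way to make this kind of problem show up immediately is to fail at startup when a variable is missing, rather than silently falling back to a local default. This is a small sketch, not what the app actually does; the function name and MONGODB_URI are hypothetical.

```javascript
// Fail fast when configuration is missing instead of quietly using a
// fallback such as a localhost database.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  // Return only the requested variables, keyed by name.
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// Example (MONGODB_URI is a hypothetical variable name):
// const { MONGODB_URI } = requireEnv(['MONGODB_URI']);
```

With a check like this, a missing variable produces a clear error message both locally and in the Heroku logs.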
After all this, I got a successful deploy, and after adding automatic deploys from my GitHub repo, which Heroku made super easy, moved on to adding my custom domain.
Heroku gives you a default domain name of *.herokuapp.com for free, which is nice, but I wanted something more memorable, so I bought the domain name trailstatus.nz. Adding a custom domain on Heroku is more difficult than just resolving to an IP address, as servers are constantly reallocated internally. This means that you cannot use "A" records to resolve to an IP. Instead, Heroku gives you a DNS target in the form <haiku>.herokudns.com, which then resolves to the IPs your website is hosted on. Simple, right? Not really.
Setting up resolution for the www. subdomain of my domain name was easy. All nameservers provide a "CNAME" record type, which is basically just a pointer to another DNS target instead of a static IP. However, the difficult part was setting up DNS resolution for when the user just types trailstatus.nz instead of www.trailstatus.nz. "CNAME" records don't support root domains, and only work with subdomains, like www. If you want to resolve for users that type your domain without the www. subdomain, you need to use an "A" record pointing to a static IP. But Heroku doesn't use static IPs... so what is the solution?
Well, it turns out that this is becoming a very common use case, with many people hosting their web apps on PaaS services which don't give you an IP, so many DNS providers have created a new type of record which is functionally similar to "CNAME" but for root domains. However, this record type is not standardised and has different names on different providers; it may be called "ALIAS" or "ANAME", for example. For most developers, I would strongly recommend buying a domain from a registrar that provides this record type, like GoDaddy.
Unfortunately for me, there are only a handful of New Zealand companies that sell .nz TLDs, and none of them support "ANAME" DNS records as far as I am aware. To get around this, I created my DNS records on an external DNS provider that did support these records (I used NameCheap), and then added their free DNS servers to my domain name through the settings on my registrar's page under "Nameservers". I then had working DNS resolution whether the user entered www.trailstatus.nz or just trailstatus.nz. However, DNS propagation did take some time - my website could be found from Google's public DNS resolver (8.8.8.8) within minutes and appeared on Cloudflare's (1.1.1.1) a few hours later, but it took over a day to resolve using my default ISP's resolver. Your experience may vary though. I recommend using this very useful website to check whether your DNS is propagating and where.
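If you would rather check propagation from your own machine, Node's built-in dns module can query specific resolvers directly. The sketch below asks Google's (8.8.8.8) and Cloudflare's (1.1.1.1) public resolvers for a hostname's A records. The function name is hypothetical, and the resolver factory is injectable only so the function can be exercised without network access.

```javascript
const { Resolver } = require('dns').promises;

// Ask each listed DNS server for the A records of `hostname`.
// Returns an object mapping server IP -> array of addresses (empty if none).
async function checkPropagation(hostname, servers, makeResolver = () => new Resolver()) {
  const results = {};
  for (const server of servers) {
    const resolver = makeResolver();
    resolver.setServers([server]);
    try {
      results[server] = await resolver.resolve4(hostname);
    } catch (err) {
      // A lookup error usually means this resolver has no record yet.
      results[server] = [];
    }
  }
  return results;
}

// Example:
// checkPropagation('trailstatus.nz', ['8.8.8.8', '1.1.1.1']).then(console.log);
```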
Now there was only one thing left - SSL. Heroku provides SSL certificates for all of your custom domains for free if you are using a paid dyno, giving you https:// URLs for your domain names. However, if a user requests your domain without explicitly entering https:// then they will, by default, be served the insecure http:// version of your app. HTTPS redirection cannot be done through Heroku, so you will have to build it into your backend code. In my Express backend, I had to add a few lines of code that check for an HTTP request and redirect it to HTTPS before it is fulfilled, so that users are always redirected to the SSL-certified version of my website. After a change to the backend and a redeploy, my custom domains and redirects were sorted!
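Those few lines could look something like the middleware below. This is a sketch rather than the exact code from my backend, and the function name is hypothetical. It relies on the X-Forwarded-Proto header, which Heroku's router sets to the protocol the client originally used.

```javascript
// Express-style middleware: redirect plain HTTP requests to HTTPS.
// Only redirects when the proxy reports "http", so local development
// (where the header is absent) keeps working without TLS.
function forceHttps(req, res, next) {
  if (req.headers['x-forwarded-proto'] === 'http') {
    return res.redirect(301, `https://${req.headers.host}${req.originalUrl}`);
  }
  next();
}

// Usage, assuming an Express app: app.use(forceHttps) before other routes.
```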
Buying and setting up a custom domain name really opened my eyes to a lot of the technology that runs when you simply enter a URL into your browser, that I had always taken for granted until now. It was very interesting and I will be looking into it more in the future.
Before I started mentioning the web app to users, I added analytics with Google Analytics. I found a very helpful npm package called react-router-ga, and setting it up consisted of not much more than wrapping my <Router /> in a custom component. Note that react-router-ga, at the time of writing, only works with Google Universal Analytics properties and not the new GA4 properties - yes, there is a difference, and yes, this took me more hours than I'd like to admit to figure out!
To get some users onto the website, I announced the project in a couple of relevant Facebook groups. I picked New Zealand mountain biking groups which would be interested in the project, in the hope of getting recurring users. I also advertised the app in a few other small places like my Twitter (which currently has almost no followers - follow me!) and on dev.to, which is part of the reason I am writing this article right now!
Although there was a very steep learning curve to set up my first web app properly, it was extremely rewarding and satisfying to see hundreds of users flooding onto my site in my analytics and leaving positive comments on the project.
I hope you learnt something about deploying and hosting a web app that you didn't know before, and that this inspired you to get that last project you programmed but didn't quite get deployed onto the web! It really is rewarding having your web app online for anyone to visit, and even if it is quite a process sometimes, if I can do it, I'm sure you can! :)