
Doug Barrett

My Cardinal Rules of Writing Software That Scales

We're in the final days of the holiday season, which brings swaths of traffic to any website where someone can purchase something. Ad spend is up for the last push of the year, and some businesses are going through their last "make or break" moments.

Hopefully, as a developer, you've come out mostly unscathed. If you haven't, just know that it gets better, and every bug in production is a lesson learned.

Over time, I've been in a position to write software that had to scale anywhere from millions of impressions a day to millions of impressions every few seconds. There are things you have to do to go from 0 to 1 million impressions/day, and then there are some equally challenging but not impossible things you need to do to go from 1 million impressions/day to 1 million impressions/min.

1. Cache

If this is the only step you read, let it be this one. Caching is undoubtedly the most important step to writing software that scales. The further you can cache from the slowest part of your system, typically a datastore, the better. This means implementing a CDN like Fastly, Cloudflare, or CloudFront and learning how to use the Cache-Control header.

Obviously some of this cannot be tested in your development environment, but the browser is great at caching, so please look into setting appropriate headers and using ETags where possible.
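
To make that concrete, here's a minimal sketch of what those headers can look like, assuming a Flask app (the route and the fetch_products helper are placeholders, not anything from a real codebase):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def fetch_products():
    # Stand-in for the slowest part of the system: the datastore.
    return [{"id": 1, "name": "widget"}]

@app.get("/products")
def products():
    response = jsonify(fetch_products())
    # "public" lets CDNs like Fastly/Cloudflare/CloudFront cache it too;
    # max-age=300 lets browsers and CDNs reuse it for 5 minutes.
    response.headers["Cache-Control"] = "public, max-age=300"
    # An ETag lets clients revalidate and get a cheap 304 instead of a full body.
    response.add_etag()
    return response.make_conditional(request)
```

With something like that in place, a burst of traffic mostly hits the CDN and browser caches instead of your datastore.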

2. Proper development environment

Make sure your development environment matches your production environment as closely as possible - Docker and similar tools have obviously made this much easier to do than it was when I first started and was running XAMPP locally. Beyond your web server and pinning your language version, make sure that if you use a database, you're running the same version locally that you are running in production. If you're not spinning up your own infrastructure, the version you should match is most likely specified for you by your managed service.
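
For example, if production runs on a managed Postgres, a docker-compose file is an easy place to pin that same version locally. A sketch, with a made-up service layout and version number:

```yaml
services:
  app:
    build: .
    depends_on:
      - db
  db:
    # Pin the exact version your managed provider runs in production,
    # not a floating tag like "postgres" or "postgres:latest".
    image: postgres:15.4
    environment:
      POSTGRES_PASSWORD: dev-only-password
```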

3. Proper release/deployment process

Thankfully, the days of having to spin up your own server to run your web app are mostly over, with the plethora of services out there like Fly.io, Render, Heroku, and Elastic Beanstalk, on top of the handful of serverless function hosting companies that seem to be cropping up week over week. Make sure you're paying attention to a few things:

  • You should release your new code alongside your old code, and only decommission the old build once the new build is confirmed healthy. Most services have ways of setting up deploys like this - it can be a bit more costly because you may have to temporarily double your infra spend on frontend resources, but zero-downtime deployments are worth their weight in gold.
  • Use tools to do the releases for you - as mentioned above, there are a ton of services that will automate the release process. Unless you want to take on the role of managing servers, it's highly unlikely that you'll have the bandwidth to both manage infrastructure and write code on a small team. If you still want a bit more control over your infrastructure but want some assistance with the release process, I highly recommend dokku - it gives you a release process very similar to Heroku and similar services, but you self-host it. You'll be on the hook for server maintenance, updates, and debugging issues, but for smaller, lower-traffic applications it's a great, cost-effective option.
  • Know how to do a rollback, apply a platform update, and scale resources. Shit's going to happen, upgrades are still going to be required, and a server is still something with finite resources - take the time to learn how to do basic infra operations on whatever platform you're deploying to before you actually need to do them (see the sketch just after this list).
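
As an example of that last point, here's roughly what those basics look like with the Heroku CLI; other platforms have their own equivalents:

```sh
heroku releases          # list recent releases
heroku rollback v101     # point the app back at a known-good release
heroku ps:scale web=4    # scale web dynos up under load
```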

4. Map Reduce Data

When I was working full time writing analytics software, I noticed that there is a lot of duplicate data out there - I may just need to know how many times a user fired an event on a page in the last hour, and that may be granular enough to get by, meaning I don't need a record for each individual impression.

While there are big data pipelines that make it easier than ever to aggregate data, I still see a need for compacting data for performance reasons.

For example, say I'm tracking a next_button impression on a frequently hit webpage. I might keep a list somewhere of all the impressions coming through, which my app can write to in realtime. Behind the scenes, either within the app or via a queue manager, that list is iterated through and reduced to a smaller map - so instead of writing one record per impression, I might write a single record to the database with an additional field holding the count of impressions that occurred.
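
A minimal sketch of that reduce step in Python (write_row is a hypothetical stand-in for your actual database layer):

```python
from collections import Counter

impressions = Counter()

def track(event: str, page: str) -> None:
    # Hot path: a cheap in-memory increment, no database write per impression.
    impressions[(event, page)] += 1

def write_row(event: str, page: str, count: int) -> None:
    # Hypothetical stand-in for the real database write.
    print(f"writing row: event={event} page={page} count={count}")

def flush() -> None:
    # Run on a timer or from a queue worker: one row per (event, page)
    # with a count, instead of one row per raw impression.
    for (event, page), count in impressions.items():
        write_row(event, page, count)
    impressions.clear()

track("next_button", "/checkout")
track("next_button", "/checkout")
flush()  # writes a single row with count=2
```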

In that same vein, if I'm writing these records to a table and I only need a record of the events that happened at a per-day level, then I may only be writing one row of data per day per user per event, with the count field incrementing. In my code, instead of attempting an insert when 99% of the time I'm going to be doing an update, I may want to do the update first and fall back to an insert when nothing was updated - this reduces the thrash of update/insert queries happening.
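
Here's what that update-first pattern can look like, using sqlite3 as a stand-in datastore (the table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE daily_events (
        user_id TEXT, event TEXT, day TEXT, count INTEGER,
        UNIQUE (user_id, event, day)
    )
""")

def record(user_id: str, event: str, day: str, hits: int = 1) -> None:
    # Common case first: bump the counter on the existing row.
    cur = conn.execute(
        "UPDATE daily_events SET count = count + ? "
        "WHERE user_id = ? AND event = ? AND day = ?",
        (hits, user_id, event, day),
    )
    if cur.rowcount == 0:
        # First event of the day for this user/event: fall back to an insert.
        conn.execute(
            "INSERT INTO daily_events VALUES (?, ?, ?, ?)",
            (user_id, event, day, hits),
        )
    conn.commit()
```

Many databases also have a native upsert (INSERT ... ON CONFLICT or ON DUPLICATE KEY UPDATE) that collapses this into a single statement.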

Another trick here: over time you may want to track more information, which means adding more fields, which in turn means your unique index keeps growing. That's not sustainable long term, so one technique I've used in the past is to add the fields but be conservative with the indexing - hash the fields I want treated as unique and store the result in its own column with a unique constraint against it. That way, if another data point needs to be considered part of the uniqueness, I'm not indexing and therefore querying against that field directly.
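
A sketch of that hashing step (SHA-256 here, but any stable hash works, and the field values are illustrative):

```python
import hashlib

def unique_key(*fields: str) -> str:
    # Join the fields that define uniqueness with a separator that won't
    # appear in the data, then hash them into one fixed-width value.
    joined = "\x1f".join(fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

# One UNIQUE column holding this hash replaces a growing multi-column index;
# adding another field to the uniqueness rule doesn't touch the index at all.
key = unique_key("user-123", "next_button", "2023-12-28")
```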

5. Don't be verbose

Well, if this isn't the most appropriate bullet point to end on in a very long post, I don't know what is. Be cognizant of the data you are sending - if you are sending JSON, realize that it doesn't necessarily need to be human readable. Shortening field names from something like userName or firstName to u or f may only save a few bytes over the wire, but when you add those up over time you're going to see savings in bandwidth and therefore cost.
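
A quick illustration of the difference (sizes assume Python's default JSON separators):

```python
import json

verbose = {"userName": "dbarrett", "firstName": "Doug", "impressionCount": 42}
compact = {"u": "dbarrett", "f": "Doug", "c": 42}

print(len(json.dumps(verbose)))  # 68 bytes
print(len(json.dumps(compact)))  # 39 bytes - roughly half, on every request
```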

====

Well, that's all I've got - hopefully someone finds this useful. If you disagree with any of this, agree with any of this, have your own tips, or found something useful, I'd really appreciate the feedback. As devs, it's our job to always be learning.

