Making dev.to insanely fast
Ben Halpern Feb 02, 2017
It makes me smile when someone raves about how fast this website loads, because that's no accident. We put a lot of effort into making it so. It is the sort of thing that usually goes unnoticed, but when your readers are developers, there's a better chance they notice and appreciate it. I have written about this in the past, but it's worth re-examining because these ideas are always evolving.
From the beginning, the idea was that if we could make everything insanely fast, every other UX consideration was going to be a lot easier going forward. Performance of a webpage is the most important UX consideration for me. Nobody wants to spend their time staring at a blank screen. Regardless of the little touches, folks are going to want to come back to a site that isn't going to waste their time on a white screen.
But speed is not free. In order to ensure performance, a project must have a set of reasonable constraints. Focusing on performance early helped map out the site's architecture to fit within the constraints that ensure great time-to-render speeds. Future decisions are less likely to have to weigh performance, because the performance consideration is center to the whole structure of the website.
Most performance issues on the web come down to understanding where the bottlenecks are. From there we need to determine what tradeoffs we can make to eliminate them. Tradeoffs are really hard to communicate between people with different concerns. I believe a big part of this project's success has been because I had a pretty good grasp of the top-to-bottom problem and how to implement it. It would have been really hard to explain to a designer right off the bat why we can't always have custom content on a page or have a designer explain to me why they need it.
What can we understand about the application to determine what decisions we can make about architecture? For dev.to, I took to thinking that this is a very read-heavy application, and there was a great opportunity to perform a lot of caching. It was an opportunity to be fairly minimalist in features and min-max the hell out of what's important: Content consumption. It is also vital to have an understanding of the available infrastructure tools. Don't build a project around infrastructure that nobody is providing as a reasonable service.
Fundamentally, maximizing web performance is about caching and latency mitigation. This application is a series of blog posts. There is a lot of dynamic read-write action any time a post is updated or a comment is made, etc. But most of the behavior is that of a user coming to the site, reading, or glancing over the content, and leaving. In order to best service this behavior, we make good use of edge caching. That means that if you visit this site from New York, you will be served a static, gzipped HTML page right from New York. If you visit from London, you get the London version of the cached page. If you visit from Nigeria, you get Spain's version, as that is the closest node in that case. This is a far better experience than when a site serves all its content from Utah, or something.
If you cannot serve fully static content, the same principles would apply with region-based data replication. As our the feature set of our site grows, I am excited to expand our capabilities in this sense, but starting with optimal performance in a simple way was key for a one-person operation. We use Fastly for our CDN edge-caching needs. We really like their service. Full disclosure, Fastly is now a sponsor of ours, but we made that happen because we love them, not the other way around.
Initial page request
A typical request first returns a fully-cached, pre-gzipped HTML page served from the closest Fastly node if one is available. Additional "dynamic" info, such as which comments you've liked, etc. go through a second, lightweight, request that hits the application database and returns the relevant information. A lot of this data is, itself, cached, but its cache keys are based on the specific user session, so we could not return this from the edge as of now. We plan to store more of this data on the client as well so we do not have to fetch from the source as often, but that is not implemented as of yet.
Eliminating render-blocking styles and scripts
Yes, all the CSS relevant for the initial render come over the wire in a
Images are automatically optimized for compression and served from the most efficient format depending on the browser (webp for Chrome, jpeg for Safari, etc.). This is a service provided by Cloudinary. Cloudinary also fully leverages HTTP2 where possible so we do not have to think about it.
Large cover images have their background color set to the approximate color of the image, so when the image loads, it is a nicer transition. I played around with a very low-res, blurred image inlined into the page, but that itself added a extra page weight and increased the time it took for the rendering engine to do its job. There may be other solutions for this in the future.
I am excited to play around with the newish native
<picture> element, which should help us serve even smaller images. However, we have not yet made use of that.
The bottom line
Performance on the web is the most important user experience consideration. When people click a link to this site, I don't want them to habitually go to another tab while the page loads. I want them to be confident in clicking through on mobile and expecting to get a very fast page load time, even if network conditions are spotty. Try hitting a page with lots of external CSS and custom fonts on the network. You're going to have a bad time. I want our readers all over the world to have the same lightning-quick page request times regardless of their devices, networks, and distance from our main region. I want developers who put their blog posts up on our site to trust that we are continuously drilling to improve the important parts of user experience.
I hope this was helpful. Please feel free to ask questions or leave comments below.