This is a copy/paste job from the Remix newsletter. I just couldn't stand reading it inside of Gmail, so I touched up the formatting, headings, and added some emoji, etc. and published it here.
Remix Newsletter #7
Don’t forget, October 28th (two weeks!) is our Beta Launch. Tell your friends!
When we build a website, pretty much everybody involved wants it to load fast. One of the best ways to make something fast is to cache it.
Today let's talk about HTTP caching.
Overview
Caching, generally, is just keeping a copy close to the user so you don’t have to calculate it again or go get it from somewhere farther away. You might have a stash of treats by your desk so you don’t have to go to the kitchen when you’re feeling snacky 😋.
A cache is like a stash full of copied stuff (wish we could make endless copies of chili cheese Fritos though).
Here’s a great video that goes along with this newsletter. Watch as Ryan builds a bare-bones HTTP server with Node.js to discuss HTTP caching.
HTTP Caching is part of the HTTP/1.1 spec from like 25 years ago. It gives us a way to keep a stash of responses right next to the user, in their browser’s memory. You might be thinking we’re talking about “service worker caching”. We’re not. This is WAY older and yet it seems newer devs know very little about it.
The way you make a Remix app fast is not by sitting around waiting for a static site build to finish and getting distracted by your empty stash of Fritos, but by actually knowing a bit about HTTP caching. You don’t have to become an expert–we’ve provided some nice APIs to simplify it all–but you do need to learn a few standard web APIs and be able to make good cache decisions to be successful.
So let’s get started and talk about what HTTP caching is. If you already know, skip to the end and check out what it means for Remix.
What is HTTP Caching
Let’s say the user visits your website at "example.com/docs". Here’s what happens:
- Browser sends a request
- Server
  - receives the request
  - renders the page into HTML
  - sends a response with the HTML in the body and a status code of 200
- Browser downloads the body of the response
- Browser renders the page
The next time the user visits the page, all of that has to happen all over again unless the server sends a specific response header that tells the browser to cache it: Cache-Control.
That’s right, your browser already knows how to cache responses so it doesn’t have to download them more than once.
So if your server was sending a response like this:
return new Response(body, {
  headers: {
    "Content-Type": "text/html"
  }
})
You just add the Cache-Control and Etag headers:
return new Response(body, {
  headers: {
    "Content-Type": "text/html",
    "Cache-Control": "max-age=0, must-revalidate",
    "Etag": md5(body)
  }
})
Those two lines change quite a bit about the next time the user visits this resource. Let’s check it out:
A Request Cycle with Caching
First visit:
- Browser sends request
- Server:
  - receives the request
  - renders the page into HTML
  - sends a response with the HTML in the body and a status code of 200
  - sends “Cache-Control” and “Etag” headers
- Browser:
  - downloads the body of the response
  - renders the page
  - caches the page on disk or in memory (up to the browser)
Second visit:
- Browser sends request. This time, it also sends the request header “If-None-Match” with the Etag from the last response. This is like the identifier for the content of the page.
- Server:
  - receives the request
  - renders the page into HTML
  - compares the Etag in the “If-None-Match” header to its new Etag
  - if the Etag matches, sends an empty response and a status code of 304
- Browser:
  - sees the 304
  - downloads nothing
  - reads from cache
  - renders the page
So how is this faster? The browser doesn’t have to download the page again!
You’ll also note 🎵, however, that our server still had to generate the content to compare the Etag to know if it could send a 304 or not. In the case of Remix, this is when we use React to render the HTML. We didn’t get to skip that part (yet).
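Here’s a rough sketch of that server-side check, assuming Node 18+ (for the global Request and Response) and Node’s built-in crypto module. The names etagFor and respond are made up for illustration, and render stands in for whatever builds your HTML. It’s not how Remix does it internally, just the idea:

```javascript
const { createHash } = require("crypto");

// Hypothetical helper: hash the body into a quoted entity tag.
function etagFor(body) {
  return `"${createHash("md5").update(body).digest("hex")}"`;
}

// Hypothetical handler: `render` stands in for whatever builds the HTML.
function respond(request, render) {
  const body = render(); // the render still runs on every request
  const etag = etagFor(body);
  const headers = {
    "Cache-Control": "max-age=0, must-revalidate",
    ETag: etag,
  };

  // Same content as the browser's copy: empty body, 304.
  if (request.headers.get("If-None-Match") === etag) {
    return new Response(null, { status: 304, headers });
  }

  // New or changed content: full body, 200.
  return new Response(body, {
    status: 200,
    headers: { ...headers, "Content-Type": "text/html" },
  });
}

const render = () => "<h1>Docs</h1>";
const first = respond(new Request("https://example.com/docs"), render);
const second = respond(
  new Request("https://example.com/docs", {
    headers: { "If-None-Match": first.headers.get("ETag") },
  }),
  render
);
console.log(first.status, second.status); // 200 304
```

Notice that render() is called before the comparison both times. The 304 saves the download, not the work.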
Changing max-age
We used “max-age=0, must-revalidate” last time. What happens if we use “max-age=3600”? This tells the browser to cache this for an hour (3,600 seconds).
Second request for this resource
- Browser:
  - looks at max-age from the last response
  - sees it’s within an hour, doesn’t even make a request
  - reads from cache
  - renders the page
Woah! That’s pretty cool 😎. For the next hour, while the user is on your website, they won’t be downloading any of the same pages twice. Anybody remember “Temporary Internet Files”? Yep, that was the cache. It’s that old.
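If you’re curious what that decision looks like, here’s a simplified sketch. Real browsers weigh more than this (the Age header, other directives, heuristics), and the entry shape and function names below are made up for illustration:

```javascript
// Pull the max-age value (in seconds) out of a Cache-Control header.
function parseMaxAge(cacheControl) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  return match ? Number(match[1]) : null;
}

// Hypothetical cache entry: { cacheControl, storedAt }, with storedAt in milliseconds.
function isFresh(entry, now) {
  const maxAge = parseMaxAge(entry.cacheControl);
  if (maxAge === null) return false;
  const ageInSeconds = (now - entry.storedAt) / 1000;
  return ageInSeconds < maxAge;
}

const entry = { cacheControl: "max-age=3600", storedAt: 0 };
console.log(isFresh(entry, 30 * 60 * 1000)); // 30 minutes later: true
console.log(isFresh(entry, 2 * 60 * 60 * 1000)); // 2 hours later: false
```

With “max-age=0, must-revalidate” the entry is never fresh, which is why the browser went back to the server with “If-None-Match” in the earlier example.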
What about Other Visitors and CDNs?
This is where things get really interesting. So far we’ve optimized the second visit to a resource for a single user. But when another visitor shows up, we’re going to have to do the whole “request, server builds the page, sends a response” cycle again.
That’s where a CDN can come in. Think of a CDN as a shared cache among your visitors.
A CDN is a “content delivery network” that puts servers (and your website) closer to users. There’s your “origin server”, that’s the one you deployed somewhere, and then a bunch of CDN servers that your visitors’ browsers actually talk to.
Let’s take our last example, with max-age=3600, and see how it looks with a CDN:
First visit by anybody to a CDN server:
- Browser sends request
- CDN server receives the request and makes a request to your origin server (not cached yet)
- Origin server:
  - receives the request from CDN
  - renders the page into HTML
  - sends a response with the HTML in the body and a status code of 200
  - sends “Cache-Control” and “Etag” headers
- CDN server:
  - downloads the body of the response
  - caches the page for the next visitor
  - sends to the browser
- Browser:
  - downloads the body of the response
  - renders the page
  - caches the page
Now if the same person visits that page, the browser will just read it from the cache like before, but check out what it looks like for a totally different visitor!
- Browser sends request
- CDN server:
  - receives the request
  - pulls the document from the cache
  - does not hit the origin server
  - sends to the browser
- Browser:
  - downloads the body of the response
  - renders the page
  - caches the page
Not only does the CDN put your website closer geographically to the user, but it also lets you skip hitting your server to build dynamic pages for as long as you specify in max-age 🤯. To get all this, you’ll need to change your header to be “public, max-age=3600”.
If this is an old page that isn’t expected to change very often, set the max-age to a week or a month and avoid doing render work on your origin server no matter who visits the page, and avoid hits to your CDN when the same person keeps visiting the page day after day.
If it’s a page that changes often, set a shorter max-age so that people see your updates sooner.
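To make the “shared cache” idea concrete, here’s a toy model. It’s nothing like a real CDN implementation, and createCdn, origin, and the fake clock are all invented for the example. It just counts how often the origin actually gets asked:

```javascript
// Hypothetical shared cache: one Map shared by every visitor.
function createCdn(origin, now = () => Date.now()) {
  const cache = new Map();
  return function handle(url) {
    const hit = cache.get(url);
    // Still within max-age: serve the stored copy, skip the origin.
    if (hit && now() < hit.expiresAt) return hit.body;
    // Missing or expired: ask the origin and store the result.
    const { body, maxAge } = origin(url);
    cache.set(url, { body, expiresAt: now() + maxAge * 1000 });
    return body;
  };
}

let originHits = 0;
let clock = 0; // fake clock in milliseconds so the example is deterministic
const origin = (url) => {
  originHits += 1;
  return { body: `<h1>${url}</h1>`, maxAge: 3600 };
};
const cdn = createCdn(origin, () => clock);

cdn("/docs"); // first visitor: goes to the origin
cdn("/docs"); // second visitor: served from the shared cache
clock = 3601 * 1000;
cdn("/docs"); // just over an hour later: the origin gets asked again
console.log(originHits); // 2
```

Three visits, two renders. Bump maxAge to a week and the origin sits idle for that page all week, which is the trade-off described above.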
What Does This Mean for Remix?
The Remix APIs here are built around the Web Fetch API. In this case, we’ll be using the Response and Headers constructors. (Their docs could use some more examples, we’ll have plenty in ours though.)
You return responses from route loaders (the functions that only run server-side that fetch data for a page route or nested route) and you can specify headers on routes themselves. These two APIs give you total control over cache control.
Specifying Cache-Control in a Route
It’s pretty simple. Export a “headers” function from a route and we’ll use the headers it returns.
export function headers() {
  return {
    "cache-control": "max-age=3600, public"
  }
}
That’s it. If the page changes often, keep it low. If it rarely changes, pump those numbers up. But be careful! You don’t want to cache something for WAY too long. You can clear your CDN but you can’t clear your user’s browser. We'll talk more about other strategies in the next newsletter (like stale-while-revalidate and setting different max-ages for your CDN vs. the user's browser).
Specifying Cache Control in Route Loaders
This part is pretty 🆒. When the user is navigating around your app, we’re not doing full HTML renders on the server, we’re just fetching the data from your route loaders and updating the page with client-side routing. Because you return full Responses from route loaders, you can specify the cache control and make these navigations super fast.
let db = require("./db");
let toHTML = require("./markdownToHtml");

module.exports = async function privacyPolicy() {
  let { lastUpdated, markdown } = await db.get(`markdown/pages/privacy`);
  let body = { lastUpdated, html: toHTML(markdown) };
  // A Response body must be a string, so serialize the object as JSON
  return new Response(JSON.stringify(body), {
    headers: {
      "Content-Type": "application/json",
      // cache for one week for everybody
      "Cache-Control": "public, max-age=604800",
    },
  });
};
The first person to ever request the privacy policy in a client transition will get it cached for a week. Their browser won’t ask for it again.
The next visitor will hit the CDN, getting a cached version of it from the last visitor. Your route loader won’t even get called, meaning you won’t hit your database or build the HTML out of Markdown for another week.
(Note, you can also just return objects from your route loaders, you just can’t specify any headers that way.)
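If you find yourself building a lot of these responses by hand, a tiny wrapper can keep loaders tidy. This is only a sketch: jsonResponse and ONE_WEEK are made-up names, not Remix APIs, and it assumes a runtime with a global Response (Node 18+ or any Fetch-compatible environment):

```javascript
const ONE_WEEK = 7 * 24 * 60 * 60; // 604800 seconds

// Hypothetical helper (not a Remix API): serialize data and attach cache headers.
function jsonResponse(data, cacheControl) {
  return new Response(JSON.stringify(data), {
    headers: {
      "Content-Type": "application/json",
      "Cache-Control": cacheControl,
    },
  });
}

const res = jsonResponse(
  { html: "<p>Privacy</p>" },
  `public, max-age=${ONE_WEEK}`
);
console.log(res.headers.get("Cache-Control")); // public, max-age=604800
```

Spelling out the arithmetic for ONE_WEEK also makes it easier to see at a glance how long you’re asking everybody to hold on to a response.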
This stuff matters because the modern React apps we use for our business, even the really great ones, have way slower transitions than they need to have. When I pop open the network tab, I can see they keep fetching the same resources with no cache control. That’s probably because it’s not straightforward to specify on a route-by-route, or data-endpoint-by-data-endpoint basis. Remix makes this critical (and 25-year-old) piece of the web much more accessible.
Now, go subscribe to their newsletter!