DEV Community

Ben

Browser Caching Explained: From Principles to Practice

In today's internet era, website performance optimization has become an indispensable part of frontend development. Among various optimization techniques, browser caching is one of the most effective ways to improve website performance. This article will explain browser caching mechanisms in depth, helping you better understand and use caching strategies.


What is Browser Caching?

Code and resources on web pages are downloaded from servers. If the server and the user's browser are far apart, the download can be time-consuming, making the webpage load slowly. Without caching, visiting the same webpage again means downloading everything again, which is unnecessary if the resources haven't changed. Therefore, HTTP defines a caching mechanism that saves downloaded resources locally. When you open the webpage again, the browser reads them directly from the cache, which speeds things up significantly.

Why Do We Need Browser Caching?

In practical applications, browser caching offers several important advantages:

  1. Improve Access Speed

    • Read resources directly from local storage, avoiding the network round trip
    • Shorten the blank-screen time before the page renders, improving user experience
  2. Save Network Resources

    • Reduce duplicate network requests
    • Lower server pressure
    • Save user bandwidth
  3. Enhance Website Performance

    • Reduce server load
    • Improve website response speed
    • Optimize overall user experience

Browser Caching Mechanisms Explained

Browser caching mechanisms are mainly divided into two categories: Force Cache and Negotiation Cache. (In the HTTP specification these correspond to reusing a fresh stored response and validating a stale one with a conditional request.) It's like shopping: some products have fixed expiration dates (force cache), while others need to be opened and checked to know if they're still usable (negotiation cache).

1. Force Cache

Force cache is the process of looking up the request result in the browser cache and deciding whether to use the cached result based on the cache rules. However, we can't cache forever, otherwise when resources change, users will still see old resources. So we need to set a cache expiration time. Expires and Cache-Control are used to set the cache expiration time.

Control Fields: Cache-Control and Expires

1. Expires (HTTP/1.0)

Expires is a header in HTTP/1.0 that represents the expiration time of a resource.

Expires: Tue, 21 Oct 2025 07:28:00 GMT

Example:

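const fs = require('fs');

// Inside an http.createServer((req, res) => { ... }) handler
const data = fs.readFileSync('./01.png');
res.writeHead(200, {
  // ISO 8601 input is parsed as local time; toUTCString() emits the GMT HTTP-date format
  Expires: new Date('2024-10-15T22:39:00').toUTCString()
});
res.end(data);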

We set the expiration time to 2024-10-15 22:39:00. On the first request, the server returns a 200 status code and the resource, and the browser caches this resource.

If you refresh before 22:39, the image is displayed directly from disk cache.

If you refresh after 22:39, the browser requests the resource again and displays the copy from the server.

The obvious drawback is that Expires is an absolute time (expressed in GMT, so time zones are not the issue) that the browser compares against its own clock. What if the client's local time is inconsistent with the server time? For example, if I set my computer's clock to 2023, the resource stays cached long after it should have expired; if I set it to 2025, the cache is treated as expired immediately.
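
To make the clock dependency concrete, here is a minimal sketch of the comparison an Expires-only cache has to make. The helper name isFreshByExpires is made up for illustration, and the expiry time is written in GMT for simplicity; real browsers do more than this.

```javascript
// Hypothetical helper: decide freshness the way an Expires-only cache would,
// by comparing the absolute expiry time with the client's own clock.
function isFreshByExpires(expiresHeader, clientNow) {
  const expiresAt = Date.parse(expiresHeader); // HTTP-date -> ms since epoch
  if (Number.isNaN(expiresAt)) return false;   // an invalid date counts as already expired
  return clientNow.getTime() < expiresAt;
}

const expires = 'Tue, 15 Oct 2024 22:39:00 GMT';

// The same response judged by three different client clocks
console.log(isFreshByExpires(expires, new Date('2024-10-15T22:00:00Z'))); // true
console.log(isFreshByExpires(expires, new Date('2025-01-01T00:00:00Z'))); // false
console.log(isFreshByExpires(expires, new Date('2023-01-01T00:00:00Z'))); // true
```

The response never changes; only the client clock does, and that alone flips the result.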

2. Cache-Control (HTTP/1.1)

To solve the problem of client-server time inconsistency, Cache-Control was introduced, which uses relative time to solve this problem, allowing the browser to calculate whether it has expired.

Cache-Control: max-age=2000

The above means the resource is cached for 2000 seconds, which is about 33 minutes.

  • no-cache means the response can be stored, but negotiation cache must be used to validate it with the server before every reuse
  • no-store means the response must not be stored in any cache, so every request goes to the server
  • public means it can be cached by any intermediary
  • private means it can only be cached by the user's browser

cache-control:public,max-age=2000 means it can be cached by any intermediary and cached for 2000s

cache-control:private,max-age=2000 means it can only be cached by the user's browser and cached for 2000s

cache-control:no-cache,max-age=2000 means the response is stored, but negotiation cache must verify it with the server before every use; because of no-cache, max-age=2000 does not let the browser skip that check

cache-control:no-store means browser caching is prohibited, always request from the server

Note: Most HTTP headers carry a single key: value pair, but HTTP/1.1's Cache-Control header can carry multiple directives separated by commas, which centralizes cache-related settings in one header.

  • Expires: xxx is called a message header (header)
  • max-age in Cache-Control: max-age=xxx is called a directive
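
As a rough illustration of the comma-separated directive format, the following sketch splits a Cache-Control value into its directives. parseCacheControl is a hypothetical helper, not part of Node.js, and it deliberately ignores quoted values that contain commas.

```javascript
// Hypothetical helper: split a Cache-Control value into its directives.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;                          // skip empty segments such as a trailing comma
    const name = rawName.toLowerCase();              // directive names are case-insensitive
    if (rawValue === undefined) {
      directives[name] = true;                       // flag directive, e.g. no-cache
    } else {
      const value = rawValue.replace(/^"|"$/g, '');  // drop optional surrounding quotes
      directives[name] = /^\d+$/.test(value) ? Number(value) : value;
    }
  }
  return directives;
}

console.log(parseCacheControl('public, max-age=2000')); // { public: true, 'max-age': 2000 }
```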

Example:

// Inside an http.createServer((req, res) => { ... }) handler
const data = fs.readFileSync('./01.png');
res.writeHead(200, {
  'Cache-Control': 'max-age=60' // fresh for 60 seconds
});
res.end(data);

max-age=60 means force cache can be used within 60 seconds. If more than 60 seconds have passed, it needs to be re-fetched.

The browser uses the response's Date header as the starting point for calculating the resource's age. Within one minute, subsequent fetches go directly to cache.

If more than 60 seconds have passed, it re-fetches.
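
The freshness check can be sketched roughly as follows. This is a simplification under stated assumptions: isFreshByMaxAge is a hypothetical helper, and real browsers use a more elaborate age calculation that also considers the Age header and request and response times.

```javascript
// Hypothetical helper: is a response stored with max-age still fresh?
// dateHeader is the response's Date header, maxAgeSeconds comes from
// Cache-Control, and now is the time of the new request.
function isFreshByMaxAge(dateHeader, maxAgeSeconds, now) {
  const responseTime = Date.parse(dateHeader);
  const ageSeconds = Math.max(0, (now.getTime() - responseTime) / 1000);
  return ageSeconds < maxAgeSeconds;
}

const date = 'Tue, 15 Oct 2024 22:00:00 GMT';

console.log(isFreshByMaxAge(date, 60, new Date('2024-10-15T22:00:30Z'))); // true  (30 s old)
console.log(isFreshByMaxAge(date, 60, new Date('2024-10-15T22:01:30Z'))); // false (90 s old)
```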

Note: Advantages of Cache-Control over Expires:

  • Uses relative time, avoiding the problem of client time being out of sync with server time
  • Provides more control options

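However, an expired cache entry is not necessarily out of date: the file on the server may not have changed at all, in which case downloading it again is wasted work. Conversely, within the max-age window the browser keeps using its cached copy even if I modify the resource on the server without changing its name. Negotiation cache addresses the first problem by letting the browser ask the server whether its copy is still valid.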

2. Negotiation Cache

Negotiation cache is the process where, after force cache expires, the browser sends a request to the server with cache identifiers, and the server decides whether to use the cache based on these identifiers.
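
The way force cache and negotiation cache fit together can be sketched as a small decision function. The cache entry shape ({ storedAt, maxAge, etag, lastModified }) and the name planRequest are assumptions made for this illustration, not how any particular browser stores its cache.

```javascript
// Hypothetical model of a cache entry: { storedAt, maxAge, etag, lastModified }
// storedAt is in milliseconds, maxAge in seconds.
function planRequest(entry, nowMs) {
  if (!entry) return { action: 'fetch', headers: {} };          // nothing cached yet

  const ageSeconds = (nowMs - entry.storedAt) / 1000;
  if (ageSeconds < entry.maxAge) {
    return { action: 'use-cache', headers: {} };                // force cache: no request at all
  }

  // Expired: send the stored identifiers so the server can decide
  const headers = {};
  if (entry.etag) headers['If-None-Match'] = entry.etag;
  if (entry.lastModified) headers['If-Modified-Since'] = entry.lastModified;

  // Without any identifier there is nothing to negotiate with, so fetch in full
  const action = Object.keys(headers).length > 0 ? 'revalidate' : 'fetch';
  return { action, headers };
}

const entry = { storedAt: 0, maxAge: 60, etag: '"abc"', lastModified: 'Tue, 15 Oct 2024 22:39:00 GMT' };
console.log(planRequest(entry, 30000).action); // use-cache
console.log(planRequest(entry, 90000).action); // revalidate
```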

Control Fields: Last-Modified and ETag

1. Time-based Negotiation Cache

Last-Modified is a header in HTTP/1.0 that represents the last modification time of a resource.

// Server response header
Last-Modified: Sat, 21 Oct 2023 07:28:00 GMT

// Client request header
If-Modified-Since: Sat, 21 Oct 2023 07:28:00 GMT

The server checks if the resource has changed. If it has, it returns 200 with the new content in the response body, and the browser uses this newly downloaded resource. If there's no change, it returns 304 with an empty response body, and the browser reads directly from cache.

Example:

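// Inside an http.createServer((req, res) => { ... }) handler
const data = fs.readFileSync('./01.png');
const { mtime } = fs.statSync('./01.png'); // last modification time of the file
res.setHeader('Last-Modified', mtime.toUTCString());
res.end(data);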

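Refresh again:

You can see it's read from disk cache, meaning there's no interaction with the server, equivalent to force cache. This happens because the response carries Last-Modified but no Cache-Control or Expires, so the browser falls back to heuristic freshness: it treats the response as fresh for a period derived from Last-Modified (commonly about 10% of the time since the last modification).

To make the browser validate on every request, add Cache-Control: no-cache. The second request then carries an If-Modified-Since header whose value is the Last-Modified value from the previous response. If the server finds that If-Modified-Since matches the file's modification time, it tells the browser to use the cache; otherwise it sends the new file.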

const data = fs.readFileSync('./01.png');
const { mtime } = fs.statSync('./01.png');
const lastModified = mtime.toUTCString();
res.setHeader('Last-Modified', lastModified);
res.setHeader('Cache-Control', 'no-cache'); // always validate before using the cached copy

const ifModifiedSince = req.headers['if-modified-since'];
// Compare the date the browser sent with the file's current modification time
if (ifModifiedSince === lastModified) {
  res.statusCode = 304; // Not Modified: empty body, the browser reuses its cached copy
  res.end();
  return;
}

res.end(data);

If the If-Modified-Since value sent by the browser matches the file's current modification time, the server returns 304; otherwise it returns 200 with the new resource.

However, Last-Modified has a resolution of one second. If a file is modified more than once within the same second, the header value stays the same even though the content has changed, so Last-Modified cannot reliably detect the change. In this case, we need to use ETag to solve this problem.
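
A small sketch of why changes made in quick succession can slip past Last-Modified: the HTTP date format carries no milliseconds, so two modification times within the same second produce the same header value. The timestamps below are made up for illustration.

```javascript
// Two modification times 400 ms apart, within the same second
const firstWrite = new Date('2024-10-15T22:39:00.100Z');
const secondWrite = new Date('2024-10-15T22:39:00.500Z');

// toUTCString() produces the HTTP-date format, which drops milliseconds
const firstHeader = firstWrite.toUTCString();
const secondHeader = secondWrite.toUTCString();

console.log(firstHeader);                  // Tue, 15 Oct 2024 22:39:00 GMT
console.log(firstHeader === secondHeader); // true
```

A server comparing If-Modified-Since with the second value would answer 304 even though the file was rewritten.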

2. Content-based Negotiation Cache

ETag is a header in HTTP/1.1 that identifies a specific version of a resource.

Servers commonly generate it by hashing the resource content. As long as the content differs, the hash, and therefore the ETag, will differ (barring hash collisions).
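
For illustration, a content-based identifier can be derived with Node's built-in crypto module. This is only a sketch: computeETag is a hypothetical helper, and SHA-1 in hex is just one possible choice, not necessarily what any given library produces.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: derive an ETag value from the response body.
function computeETag(body) {
  const hash = crypto.createHash('sha1').update(body).digest('hex');
  return `"${hash}"`; // ETag values are quoted strings
}

const v1 = computeETag(Buffer.from('body { color: red; }'));
const v1Again = computeETag(Buffer.from('body { color: red; }'));
const v2 = computeETag(Buffer.from('body { color: blue; }'));

console.log(v1 === v1Again); // true: same content, same ETag
console.log(v1 === v2);      // false: different content, different ETag
```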

// Server response header
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"

// Client request header
If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"

Example:

const etag = require('etag'); // npm package that builds an ETag from a Buffer or string

if (pathname === '/02.png') {
  const data = fs.readFileSync('./02.png');
  const etagContent = etag(data);
  res.setHeader('ETag', etagContent);
  res.setHeader('Cache-Control', 'no-cache');
  res.end(data);
}

Perform hash operation on the 02.png file. On the first request, the response header includes etag.

On the second request, the request header adds if-none-match.

If the If-None-Match value matches the ETag computed from the current file, the server returns 304; otherwise it returns 200 with the new resource.

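const data = fs.readFileSync('./02.png');
const etagContent = etag(data);
res.setHeader('ETag', etagContent);
res.setHeader('Cache-Control', 'no-cache');

const ifNoneMatch = req.headers['if-none-match'];
if (ifNoneMatch === etagContent) {
  res.statusCode = 304; // content unchanged, the browser reuses its cached copy
  res.end();
  return;
}

res.end(data); // content changed (or first request): send the full body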
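
A plain string comparison covers the common case, but If-None-Match may also carry several ETags, a weak W/ prefix, or *. A slightly more tolerant check might look like the sketch below; etagMatches is a hypothetical helper, and splitting on commas assumes the tags themselves contain no commas.

```javascript
// Hypothetical helper: does an If-None-Match request header match the current ETag?
function etagMatches(ifNoneMatch, currentETag) {
  if (!ifNoneMatch) return false;                 // header absent: nothing to compare
  if (ifNoneMatch.trim() === '*') return true;    // "*" matches any current version

  // If-None-Match uses weak comparison, so the W/ prefix is ignored on both sides
  const stripWeak = (tag) => tag.trim().replace(/^W\//, '');
  const current = stripWeak(currentETag);

  return ifNoneMatch.split(',').some((tag) => stripWeak(tag) === current);
}

console.log(etagMatches('"abc", "def"', '"def"')); // true
console.log(etagMatches('W/"abc"', '"abc"'));      // true
console.log(etagMatches('"abc"', '"xyz"'));        // false
```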

Proxy Server Caching

Caches in browsers are user-specific and called private caches, while caches on proxy servers can be accessed by everyone and are called public (shared) caches. If you only want the resource cached in the browser and not on the proxy server, set it to private; otherwise set it to public.

For example, this setting allows the resource to be cached on the proxy server for one year (proxy server's max-age is set with s-maxage), and cached in the browser for 10 minutes:

Cache-Control: public, max-age=600, s-maxage=31536000

This setting means only the browser can cache:

Cache-control:private, max-age=600
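
To see how the same header is read differently by a browser and by a proxy, here is a rough sketch. freshnessLifetime is a hypothetical helper with deliberately simple parsing; it returns 0 when the cache may not store the response at all.

```javascript
// Hypothetical helper: how many seconds may a given kind of cache treat a response as fresh?
// cacheType is 'private' (browser) or 'shared' (proxy server or CDN).
function freshnessLifetime(cacheControl, cacheType) {
  const directives = new Map(
    cacheControl.split(',').map((part) => {
      const [name, value] = part.trim().toLowerCase().split('=');
      return [name, value === undefined ? true : Number(value)];
    })
  );

  if (directives.has('no-store')) return 0;
  if (cacheType === 'shared') {
    if (directives.has('private')) return 0;                            // proxies must not store it
    if (directives.has('s-maxage')) return directives.get('s-maxage'); // overrides max-age for proxies
  }
  return directives.get('max-age') ?? 0;
}

const header = 'public, max-age=600, s-maxage=31536000';
console.log(freshnessLifetime(header, 'private')); // 600
console.log(freshnessLifetime(header, 'shared'));  // 31536000
console.log(freshnessLifetime('private, max-age=600', 'shared')); // 0
```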

Also, when the cache expires, is it completely unusable? No, actually expired resources can still be used. There are directives for this:

Cache-control: max-stale=600

"Stale" means not fresh. Including max-stale in the request with 600s means if it's expired for 10 minutes, it can still be used, but not longer.

Cache-control: stale-while-revalidate=600

You can also set stale-while-revalidate in the response. For up to 600 seconds after expiry, the cache can return the expired response immediately while it revalidates in the background.

Cache-control: stale-if-error=600

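Or set stale-if-error, which means that if revalidation fails (for example, the server returns a 5xx error or is unreachable), the cache can serve the expired response for up to 600 seconds after expiry.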

So, the expiration time of max-age is not completely mandatory - it can allow using expired resources for a period of time.
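
The extra window that stale-while-revalidate adds on top of max-age can be sketched like this. classifyCachedResponse and its return labels are made up for illustration; stale-if-error would follow the same pattern but only apply when revalidation fails.

```javascript
// Hypothetical helper: classify a cached response by its age in seconds.
function classifyCachedResponse(ageSeconds, { maxAge, staleWhileRevalidate = 0 }) {
  if (ageSeconds < maxAge) return 'fresh';                             // use directly
  if (ageSeconds < maxAge + staleWhileRevalidate) {
    return 'stale-but-usable';                                         // serve now, revalidate in the background
  }
  return 'must-revalidate-first';                                      // wait for the server's answer
}

const policy = { maxAge: 600, staleWhileRevalidate: 600 };
console.log(classifyCachedResponse(100, policy));  // fresh
console.log(classifyCachedResponse(700, policy));  // stale-but-usable
console.log(classifyCachedResponse(1300, policy)); // must-revalidate-first
```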

Caching Strategy Examples for Different Scenarios

1. HTML File Caching

HTML files usually need to stay up-to-date, so it's recommended to use negotiation cache:

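// Node.js server configuration (Express)
app.get('/*.html', (req, res) => {
  res.setHeader('Cache-Control', 'no-cache');
  // Use the file's real modification time; new Date() would change on every
  // request, so the validator would never match. filePath is a placeholder
  // for the path of the requested HTML file.
  res.setHeader('Last-Modified', fs.statSync(filePath).mtime.toUTCString());
  // ... return HTML content
});

# Nginx configuration
location ~ \.html$ {
  add_header Cache-Control "no-cache";
  etag on;
  if_modified_since exact;
}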

2. Static Resource Caching

2.1 JavaScript Files

// webpack configuration
module.exports = {
  output: {
    filename: '[name].[contenthash].js'  // Generate filename with hash
  }
}

// Node.js server configuration
app.get('/*.js', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=31536000');  // Cache for one year
  // ... return JS content
});
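
The reason a one-year cache is safe here is that the filename changes whenever the content does. The sketch below imitates that idea by hand; hashedFilename is a hypothetical helper, and the MD5-based 8-character hash is an assumption for illustration, not webpack's actual algorithm.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Hypothetical helper: build "name.<hash>.ext" from a file's content.
function hashedFilename(filename, content) {
  const hash = crypto.createHash('md5').update(content).digest('hex').slice(0, 8);
  const { name, ext } = path.parse(filename);
  return `${name}.${hash}${ext}`;
}

const before = hashedFilename('app.js', 'console.log("v1");');
const after = hashedFilename('app.js', 'console.log("v2");');

console.log(before === after); // false: new content yields a new URL, so the old cache entry is never reused
```

Because the HTML that references the file is validated on every request, browsers pick up the new filename as soon as it is deployed.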

2.2 Image Resources

// Node.js server configuration
// Fixed images like logos
app.get('/static/images/logo.png', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=31536000');
  // ... return image content
});

// Images that may change, like user avatars
app.get('/avatars/*', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=3600'); // Cache for 1 hour
  res.setHeader('ETag', generateETag(imageContent));
  // ... return image content
});

// Images that shouldn't be cached, like captcha images
app.get('/captcha', (req, res) => {
  res.setHeader('Cache-Control', 'no-store');
  // ... return captcha image
});

3. API Interface Caching

3.1 Real-time Data Interfaces

// Real-time data like stock prices
app.get('/api/stock-price', (req, res) => {
  res.setHeader('Cache-Control', 'no-store');
  // ... return real-time data
});

3.2 Short-term Cache Interfaces

// Short-term cached data like product lists
app.get('/api/products', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=300'); // Cache for 5 minutes
  // ... return product list
});

4. CDN Cache Configuration

# Nginx CDN node configuration
proxy_cache_path /tmp/cache levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m use_temp_path=off;

server {
    location / {
        proxy_pass http://origin_backend;  # hypothetical upstream name; point this at your origin server
        proxy_cache my_cache;
        proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
        proxy_cache_valid 200 302 1d;    # Cache successful responses for 1 day
        proxy_cache_valid 404 1m;        # Cache 404 responses for 1 minute
        proxy_cache_key $scheme$proxy_host$request_uri;

        # Add cache status header
        add_header X-Cache-Status $upstream_cache_status;
    }
}

Cache Effect Verification

Chrome Developer Tools Verification

  1. Open Chrome Developer Tools (F12)
  2. Switch to the Network panel
  3. Observe the Size and Status columns:
    • (memory cache) in Size: loaded from memory cache
    • (disk cache) in Size: loaded from disk cache
    • 304 in Status: negotiation cache is active; the server confirmed that the cached copy is still valid

Caching Strategy Best Practices

  1. HTML Files

    • Use negotiation cache
    • Ensure content freshness
  2. Static Resources (JS/CSS/Images)

    • Use force cache
    • Include content hash in filenames
    • Set longer expiration times
  3. API Requests

    • Generally don't cache
    • Can use negotiation cache in special scenarios
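
These rules could be collected into a single policy function, sketched below. Everything here is an assumption for illustration: the name cacheControlFor, the /api/ prefix, the "name.<hash>.ext" naming convention, and the one-hour fallback for other files.

```javascript
// Hypothetical helper: pick a Cache-Control value from the request path.
function cacheControlFor(urlPath) {
  // API requests: generally don't cache
  if (urlPath.startsWith('/api/')) return 'no-store';

  // HTML: negotiation cache, so content stays fresh
  if (urlPath === '/' || urlPath.endsWith('.html')) return 'no-cache';

  // Static resources with a content hash in the filename: long-lived force cache
  if (/\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$/.test(urlPath)) {
    return 'public, max-age=31536000';
  }

  // Anything else: a short force cache as a middle ground (assumed default)
  return 'public, max-age=3600';
}

console.log(cacheControlFor('/index.html'));             // no-cache
console.log(cacheControlFor('/static/app.3f2a9c1b.js')); // public, max-age=31536000
console.log(cacheControlFor('/api/products'));           // no-store
```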

Summary

Caching can speed up page loading and reduce server pressure, so HTTP designed a caching mechanism.

In HTTP 1.0, the Expires header was used to control caching, specifying a GMT expiration time. However, this caused problems when browser time was inaccurate.

In HTTP 1.1, it was changed to the max-age method to set expiration time, letting the browser calculate it itself. All cache-related controls were put into the Cache-Control header, with things like max-age called directives.

After cache expiration, HTTP 1.1 also designed a negotiation phase, which sends the resource's ETag and Last-Modified values to the server through the If-None-Match and If-Modified-Since headers to ask whether the resource has changed. If it has changed, the server returns 200 with the new content; otherwise it returns 304, telling the browser to use the cache.

Besides the max-age directive, these directives are also worth knowing:

  • public: Allow proxy servers to cache resources
  • s-maxage: Resource expiration time for proxy servers
  • private: Don't allow proxy servers to cache resources, only browsers can cache
  • immutable: The resource will not change while it is fresh, so the browser can skip revalidation even when the user reloads the page
  • max-stale: Request directive; the client accepts a response that has been expired for up to the given number of seconds
  • stale-while-revalidate: During validation (negotiation), return expired resources
  • stale-if-error: If validation (negotiation) fails, return expired resources
  • must-revalidate: Don't allow using expired resources after expiration, must wait for negotiation to finish
  • no-store: Prohibit storing the response in any cache
  • no-cache: Allow caching, but must negotiate every time

