Francesco Leardini

Posted on Jun 30, 2019 • Updated on Mar 9, 2022

Service workers and caching strategies explained

#pwa #frontend #webdev #javascript

This episode covers two other important PWA topics: service workers (SW) and the caching strategies we can implement to leverage the full potential of PWAs.

Excited? Let's start!

What is a service worker?

A SW is similar to a web worker: both are plain JavaScript files.
A web worker does not have a specific task; it is typically used to offload work from the main thread (where the main web app runs).

On the other hand, a service worker has a specific task: acting as a proxy between our web application and the network. It can intercept HTTP requests and serve the responses from the network or from a local cache, according to the caching strategy we implemented (more details later).

Let's list some SW characteristics:

Secure

Service workers only function over HTTPS connections.

This is a safe decision, because otherwise we would easily expose our application to man-in-the-middle attacks. Let's just imagine what might happen if anybody could substitute our SW with a manipulated one...scary, isn't it?

On the other hand, localhost is considered secure, which allows us to test the application before deploying it.
If we work with Angular though, we cannot use the ng serve command to build and serve our application locally, as it does not work with service workers. In this case we have to use an HTTP server of our choice, for example the http-server package or the Web Server Chrome extension.

No direct DOM interaction

Service workers cannot access the DOM directly. They can, however, communicate with the pages under their scope through the postMessage interface. Those pages can then manipulate the DOM, giving the SW indirect access to it.
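
As a minimal sketch of this indirect access, the snippet below sends a message from the SW to every open page, and the page updates the DOM on the SW's behalf. The function name notifyClients, the message shape and the element id 'status' are assumptions made for illustration. The clients object is passed in as a parameter only so the function can be tried outside a browser; the two parts would normally live in different files.

```javascript
// Service worker side: send a message to every page (client) under the SW scope.
// Inside a service worker, pass the global `clients` object as `clientsApi`.
async function notifyClients(clientsApi, message) {
  const windowClients = await clientsApi.matchAll({ type: 'window' });
  windowClients.forEach(function (client) {
    client.postMessage(message);
  });
  // Number of pages that were notified
  return windowClients.length;
}

// Page side (runs only where a DOM and the service worker API are available):
// react to the message and update the DOM on behalf of the SW.
// The element id 'status' is a hypothetical example.
if (typeof document !== 'undefined' &&
    typeof navigator !== 'undefined' &&
    'serviceWorker' in navigator) {
  navigator.serviceWorker.addEventListener('message', function (event) {
    document.getElementById('status').textContent = event.data.text;
  });
}
```

Inside the SW we could call it, for example, as notifyClients(self.clients, { text: 'New content available' }).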

Non blocking

A SW runs on an independent thread, separate from the one used by our application. Hence the definition "non blocking".
Indeed, we do not want heavy operations or crashes to affect the performance of our web application in any way.

The capability of running in the background is also the reason why we can show push notifications to our users, even when they are not actively visiting our web site.

Life cycle

if ('serviceWorker' in navigator) {
    navigator.serviceWorker.register('/serviceWorker.js')
      .then(function(registration) { 
         // The registration was successful       
      })
      .catch(function(err) {
         // The registration failed
      });
  }

As we saw previously, not all browsers support SW. Therefore we first need to check whether the service worker API is available before attempting to register the SW when the user accesses our app and the page is loaded.

The schema above describes the different lifecycle steps of a service worker.
During registration, the whole operation is canceled if an error occurs or the SW file cannot be fetched.
The register method is triggered again when the user reloads the page. The browser can identify whether the SW is already installed and acts accordingly.

Once registered, a SW does not remain constantly active. The browser can terminate it at any time and reactivate it when an event needs to be handled. That is why, if we need to persist state used within the service worker (I do not mean caching assets or API requests here), we should use IndexedDB or a similar solution.

In the install step, pre-fetch operations are typically executed. Their goal is to ensure that the target assets are downloaded and already available in the cache for the SW. These assets are commonly static files (e.g. js, css) representing the core shell of our application: the minimum files and styles that should be available immediately to the user, even when offline.

⚠️ We have to be careful, though, not to cache too many assets in this phase. If an error occurs or the SW cannot cache all the specified resources, the whole installation phase is aborted and the SW won't be activated on the client side. The install step is triggered again the next time the user accesses the web page or reloads it.

This step happens only at the beginning of a SW lifetime or when a new version is available on the server.

var urlsToCache = [
  '/',
  '/styles/styles.css',
  '/script/home.js'
];

self.addEventListener('install', function(event) {
  event.waitUntil(
    caches.open('my-cache')
      .then(function(cache) {
        return cache.addAll(urlsToCache);
      })
  );
});
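
One way to limit the risk described in the warning above is to split the assets into two groups, so that only the essential ones can abort the installation. The sketch below is an illustration, not part of the original setup: the precache helper and the two URL lists are hypothetical names, and the cache is passed in as a parameter so the helper can be tried outside a service worker.

```javascript
// Hypothetical asset lists: only the critical ones can abort the installation.
const CRITICAL_URLS = ['/', '/styles/styles.css', '/script/home.js'];
const OPTIONAL_URLS = ['/images/banner.jpg', '/fonts/app-font.woff2'];

function precache(cache, criticalUrls, optionalUrls) {
  // Not returned: a failure here is only logged and does not fail the install step
  cache.addAll(optionalUrls).catch(function (err) {
    console.warn('Optional assets could not be cached:', err);
  });
  // Returned: the install step waits for these and is aborted if any of them fails
  return cache.addAll(criticalUrls);
}

// Only wire the listener when running inside a service worker
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('install', function (event) {
    event.waitUntil(
      caches.open('my-cache').then(function (cache) {
        return precache(cache, CRITICAL_URLS, OPTIONAL_URLS);
      })
    );
  });
}
```

The trade-off is that the optional downloads are not guaranteed to complete, since the browser may stop the SW once the install step has finished.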

Once the installation ends, the SW gets activated. However, the SW will not immediately take control of the page where we registered it. This keeps the page's content consistent. Once we access or reload the page a second time, our service worker takes full control of it and new assets are fetched according to our implementation.

In the activate step we typically wipe old caches. We cannot do this in the installation step, otherwise the currently existing service workers that still use the old caches would behave unpredictably and might lead to errors.

The snippet below shows how we can remove all the caches that are not white-listed.

self.addEventListener('activate', event => {
  var validCaches = ['home-cache-v2', 'articles-cache-v2'];

  event.waitUntil(
    caches.keys()
      .then(keys => Promise.all(
        keys.map(key => {
          if (validCaches.indexOf(key) === -1) {
            return caches.delete(key);
          }
        })
      ))
      .then(() => {
        // We successfully deleted all the obsolete caches
      })
  );
});

At this point, if you open the DevTools, go to the Application tab and click on the service worker section, you will find the DEV SW (dev.to) registered in your browser:

Updating a service worker

If a new service worker version is available (a single byte of difference makes it a new version), it is downloaded and installed when the user visits our web application. However, the new service worker does not immediately replace the old one: it remains in the install step, waiting to be activated.

The browser ensures that only one service worker version is active on the client. The new service worker is finally activated only when all the tabs where the PWA is running are closed, or when the user navigates to a different URL and then comes back to our PWA. This is good to know, because simply refreshing the page is not sufficient, which often causes confusion.
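
Since a waiting service worker is easy to miss, a page can watch for it and, for instance, inform the user. The following is a minimal sketch under the assumption that we only want to be notified; watchForWaitingWorker is a hypothetical helper name, and the registration object is passed in so the logic can be tried with a stand-in object.

```javascript
// Calls `onWaiting` when a new service worker version has finished installing
// and is waiting to replace the version that is currently active.
function watchForWaitingWorker(registration, onWaiting) {
  // A new version may already be waiting when the page loads
  if (registration.waiting) {
    onWaiting(registration.waiting);
  }
  registration.addEventListener('updatefound', function () {
    const newWorker = registration.installing;
    if (!newWorker) return;
    newWorker.addEventListener('statechange', function () {
      // 'installed' while another worker is active means the new one has to wait
      if (newWorker.state === 'installed' && registration.active) {
        onWaiting(newWorker);
      }
    });
  });
}

// Page side: runs only where the service worker API is available
if (typeof navigator !== 'undefined' && 'serviceWorker' in navigator) {
  navigator.serviceWorker.ready.then(function (registration) {
    watchForWaitingWorker(registration, function () {
      console.log('A new version is waiting. Close all the tabs of this app to activate it.');
    });
  });
}
```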

Unregister a service worker

To remove a SW, you can click on the Unregister link inside the developer tools of the browser.

Chrome: press F12 > Application tab > Service worker section

Firefox: type about:debugging#workers in the address bar:

It is also possible to do this programmatically:

navigator.serviceWorker.getRegistration()
  .then(function(registration) {
    if (registration) {
      registration.unregister()
        .then(function(success) {
          // if success = true, unregister was successful
        });
    }
  });

Note: unregistering a SW will not free its cache. For this we have to click the “Clear storage” button in the Application panel of the developer tools (Chrome):

Via code we can use caches.delete():

if ('caches' in window) {
    caches.keys()
      .then(function(keyList) {
          return Promise.all(keyList.map(function(key) {
              return caches.delete(key);
          }));
      })
}

The caches.keys() method returns the keys of the CacheStorage, an interface representing the storage for the Cache objects that can be accessed by the service worker.

Caching strategies

There are different caching strategies we can adopt to improve the performance of our project.
If the response to a data request is cached, we can deliver it without accessing the network at all. This brings two advantages: the response is much faster, and we can provide the data even when the client is offline, since it is already available locally.

Before starting though, we have to instruct the SW on how to cache data requests, since this isn't done by default.

General caching strategies

Below are some common caching strategies, not bound to any framework.

Cache only

Diagrams source: Google web fundamentals guide

self.addEventListener('fetch', function(event) {

  event.respondWith(caches.match(event.request));
  // If the requested data isn't in the cache, the response
  // will look like a connection error
});

Probably the simplest one. The SW expects to find the requested assets already in the cache. This strategy can be used for the static resources that constitute our "app shell". Usually those are fetched while the SW is installing, so that they are available in the cache after this phase.

Network only

self.addEventListener('fetch', function(event) {
   // We can put some custom logic here, otherwise
   // the request will follow the default browser behaviour
});

This strategy is exactly the opposite of the previous one: we always access the network, without even querying the cache. It is best suited for logs or anything we do not need to make available offline.

Stale while revalidate

self.addEventListener('fetch', function(event) {

    event.respondWith(async function() {
        const cache = await caches.open('cache-v1');
        const cachedResponse = await cache.match(event.request);
        const fetchPromise = fetch(event.request);

        event.waitUntil(async function () {
            const networkResponse = await fetchPromise;
            // Update the cache with a newer version
            await cache.put(event.request, networkResponse.clone());
        }());

        // Serve the cached data if available, otherwise wait for the network
        return cachedResponse || fetchPromise;
    }());
});

Similarly to the cache only strategy, the goal is to ensure fast responses by delivering the data from the cache.
However, while the client request is served, a separate request is sent to the server to fetch a newer version, if available, and store it in the cache. This way, while we guarantee fast data delivery on one side, we also update the cached data on the other, so the next requests will receive a more up-to-date version.

Angular caching strategies

Angular provides only two kinds of caching strategies:

Performance (default)

Here the goal is to optimise the response time. If a resource is available in the cache, this version is delivered. Otherwise a network request is executed to fetch and then cache it.
This strategy is suited to resources that do not change often, like user profile images. In these cases we want to provide the fastest response to the user without worrying about delivering potentially obsolete data.

Freshness

This strategy is used when we need to deliver the latest data from the network. We can specify a timeout after which the request falls back to the cache and tries to deliver the required data from there.
A typical use of this strategy is delivering up-to-date information that changes frequently. Think of an application dealing with stock prices or newly published blog articles.
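
To make the difference between the two strategies concrete, here is a framework-free sketch of the same two ideas in plain JavaScript. It is not Angular's implementation: the helper names cacheFirst and networkFirst, the cache name 'api-cache' and the 3 second timeout are assumptions chosen for illustration. The cache and the fetch function are parameters so that the helpers can be tried outside a service worker.

```javascript
// "Performance"-like: serve from the cache, go to the network only on a miss.
async function cacheFirst(request, cache, fetchFn = fetch) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  await cache.put(request, response.clone());
  return response;
}

// "Freshness"-like: try the network first, fall back to the cache
// if the network fails or does not answer within `timeoutMs`.
async function networkFirst(request, cache, timeoutMs, fetchFn = fetch) {
  let timer;
  const timeout = new Promise(function (_, reject) {
    timer = setTimeout(function () { reject(new Error('timeout')); }, timeoutMs);
  });
  try {
    const response = await Promise.race([fetchFn(request), timeout]);
    // Keep the cache up to date for the next fallback
    await cache.put(request, response.clone());
    return response;
  } catch (err) {
    const cached = await cache.match(request);
    if (cached) return cached;
    // Nothing in the cache either: propagate the original error
    throw err;
  } finally {
    clearTimeout(timer);
  }
}

// Only wire the listener when running inside a service worker
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', function (event) {
    // The Cache API can only store GET requests
    if (event.request.method !== 'GET') return;
    event.respondWith(
      caches.open('api-cache').then(function (cache) {
        return networkFirst(event.request, cache, 3000);
      })
    );
  });
}
```

Swapping networkFirst for cacheFirst in the listener gives the performance-oriented behaviour instead.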

I won't go into too much detail on the Angular strategies here, since we will look at them more closely in the next post, where we will also write some code to implement both of them.

How to get rid of "zombie" service workers?

If we no longer want to use a service worker and want to get rid of all the old ones installed on our clients, we can use the following approach:

1 - Delete the code where we register the SW in our new app (so no new SW will be registered)

2 - Replace the (old) SW file content with the code below:

caches.keys()
    .then(keys =>
        Promise.all(keys.map(key => caches.delete(key))))
    // Inside the SW file, self.registration is the SW's own registration
    .then(() => self.registration.unregister())
    .catch((err) => console.error('Something went wrong: ', err));

This will have 2 effects:

1 - No new SW will be installed when new clients load our PWA

2 - Clients with already installed SW will download the new SW once they load the application again. Otherwise, the browser automatically checks (after a max of 24 hours since the previous check) if a new SW version is available and will replace the old SW code with the new one.

In both cases, the new code will delete the caches and uninstall the current SW.

How much data can we store?

The amount of storage available is not the same in every context: it differs for each browser, according to the device and storage conditions.

Chrome and Opera allocate the available storage per origin (our site domain). To check the remaining space we can use the estimate() method of the Storage API:

if (navigator.storage && navigator.storage.estimate) {
  navigator.storage.estimate()
    .then(function(info) {
       console.log(info.quota);
       // It gives us the quota in bytes

       console.log(info.usage);
       // It gives us the used data in bytes
    });
}

Some browsers start asking users whether they agree to store further data once specific thresholds are reached:

Firefox: after 50MB of data stored
Safari mobile: can use only 50MB at most
Safari desktop: has no storage limit (😳), but starts asking for confirmation after 5MB stored.

These initial posts focused on the theoretical fundamentals of PWAs.
The next article will present some tools to create PWAs.
Moreover we will create a demo with Angular and I will guide you step by step to make it a complete progressive web app. You can then use it as starting point for your next project!

Top comments (4)

martin rojas • Mar 8 '22

This is really helpful. I've been running into an issue with a site that used to be Gatsby and that I am replacing with Nuxt. However, the Gatsby site had service workers. Is there a way to invalidate those and load the new site?

Francesco Leardini • Mar 9 '22

If you mean to invalidate the cache, in the "activate" service worker lifecycle, you can delete all "other" service worker caches on the target client (the code snippet is in this article).

Otherwise, if you don't work with a service worker anymore and want to get rid of all the old, installed ones, you can refer to the new section of the article: "How to get rid of 'zombie' service workers".

martin rojas • Mar 9 '22

Thank you for adding that section. I had done some research and wanted to tweak your code a bit. Currently registration.unregister() throws an error because registration is never defined. So this is what I have ended up using.

// Unregister the service worker
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.getRegistrations().then(function (registrations) {
    for (let registration of registrations) {
      //unregister service worker for old domain
      registration.unregister()
    }
  }).catch(function (err) {
    // fail state, which is fine as we just don't want a service worker.
    console.log('Fail: ', err);
  });
}

caches.keys()
    .then(keys =>
        Promise.all(keys.map(async key => await caches.delete(key))))
    .catch((err) => console.error('Something went wrong: ', err));

Francesco Leardini • Mar 11 '22

Yeah, you need to invoke the code within the registration "scope", as you need it to invoke the unregister method.
