Aleksandrovich Dmitrii
How to make your app indefinitely lazy – Part 3: Vendors and Cache

Well, hello there! And welcome to part 3 of my ultimate guide! Brace yourself, because you are about to become a real pro.

⏱️ Reading time: ~18-20 minutes
🎓 Level: Intermediate+

Series Contents:

  1. How to make your app indefinitely lazy – Part 1: Why lazy loading is important
  2. How to make your app indefinitely lazy – Part 2: Dependency Graphs
  3. How to make your app indefinitely lazy – Part 3: Vendors and Cache
  4. How to make your app indefinitely lazy – Part 4: Preload in Advance

Earlier we covered how to make our project's dependency tree as clean as possible and why it is important for lazy loading. And in this article, we will cover the following:

  • How we should download vendor files to ensure the best lazy loading.
  • What "Lazy Loading" and "Cache" optimization strategies have in common, and how using one affects the other.
  • What is cacheability and how to make our application as cacheable as possible.
  • As well as how to correctly set up Webpack's cache groups and not mess up performance.

How to split your vendors

Lazy loading isn't just about splitting our own source files. It also applies to everything our website delivers, including external NPM packages. And it goes hand in hand with using the cache properly.

The terms "vendors" and "cache" are closely related in this article, so let's briefly talk about what caching is first. Caching is yet another strategy to improve loading time. In simplified terms, it works this way:

  • When a user opens our website for the very first time, no cache is applied; thus, all the files must be downloaded. At the same time, all the files the browser has downloaded are saved locally in cache.
  • Later, when the user reloads the page or opens the website a second time and onward, the files are retrieved from the local cache, which significantly reduces the loading time.
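To make the second bullet work in production, the server has to send caching headers. Here is a minimal sketch of a typical policy (the helper function and the hash-detecting regex are my own illustration, not from the article): files whose names carry a content hash can be cached aggressively, while HTML entry points must always be revalidated.

```javascript
// Hypothetical helper: pick a Cache-Control header based on whether
// the filename contains a content hash (e.g. "main.3f9a1c2b.js").
function cacheControlFor(filename) {
  const hasContentHash = /\.[0-9a-f]{8,}\.(js|css)$/.test(filename);
  return hasContentHash
    ? 'public, max-age=31536000, immutable' // safe: the name changes whenever content changes
    : 'no-cache'; // HTML must be revalidated on every load
}

console.log(cacheControlFor('main.3f9a1c2b.js')); // long-lived, immutable
console.log(cacheControlFor('index.html')); // no-cache
```

The design choice here is the standard one: hashed assets never change under the same name, so caching them "forever" is safe; everything unhashed must hit the server to pick up new releases.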

Now, let's go back to vendors. Imagine we use a few NPM packages in our project and we import different members from them in different lazy chunks.

// Title (loaded initially)
import { chunk, difference, intersection, sortedUniq, takeWhile } from 'lodash-es';
import { addDays, addHours, addYears, addMonths } from 'date-fns';

// Chapter 1 (loaded lazily)
import { countBy, partition, sample, sampleSize, orderBy } from 'lodash-es';
import { differenceInBusinessDays, differenceInCalendarDays, differenceInCalendarQuarters, differenceInHours } from 'date-fns';

// Chapter 2 (loaded lazily)
import { debounce, memoize, throttle, once, curry } from 'lodash-es';
import { isToday, isTomorrow, isAfter, isDate } from 'date-fns';

How should the browser download the members?

Common configuration mistakes

Some developers believe that creating a single vendor file for the whole assembly is a good approach to delivering vendors. Their reasoning:

It is easier to save to and retrieve from the cache a single file rather than several.

But that's actually a bad choice. Because:

  1. The difference in speed between retrieving one file and several files from the cache is negligible.
  2. If the user has no cache yet, they are required to download all vendors during the initial load. And since none of the vendors load lazily, the initial load suffers additional delays. And, again, that initial load is extremely important from the user's point of view.
  3. Plus, since all the vendors are bundled into a single file, any time we import a new member from any NPM package, the content hash of this file changes, the cache is lost, and we are back to problem #2.

Let's discuss the third point in more detail. Why can the cache be "lost"?

Usually, in real production applications, we need to control the cache of our files. Real websites change over time when introducing new features and fixing bugs. If we make our caching strategy too aggressive, users may stop downloading files from the server and start extracting files only from the cache. Because of this, users may miss the changes that we are implementing.

To avoid such problems, usually developers include some unique identifiers into generated filenames. So, once we deploy a new update:

  1. filenames of our application change;
  2. users lose their cache;
  3. and the browser is able to download the required changes.

And there are various ways to achieve this, but the ideal one is to include the file's content hash in its name. The easiest way to do this in Webpack is to add the template placeholder [contenthash] to output.filename and output.chunkFilename.

module.exports = {
  // ...
  output: {
    // ...
    filename: '[name].[contenthash].js',
  },
};

In output, besides [contenthash], you can also use [hash], [chunkhash], or [fullhash]. You can also provide some unique string yourself, like a timestamp of when the build happened. Or you can use the hash option from HtmlWebpackPlugin. But none of these options, besides [contenthash], are ideal for maintaining the best cache strategy. We will cover that a little later.

But for now, you should understand that using [contenthash] is considered the best practice to build an application. When we use this placeholder, once the content of any generated file changes, its name changes too, which results in cache being lost.

And if we store all vendors inside a single file, any additional import from any NPM library will lead this file to be changed.

So, remember: if you see something like this in your assembly, cache groups are probably set up incorrectly.

[Image: build output containing a single huge vendors file]

Another similar mistake some developers make is generating vendor chunks based on the name of each NPM package. This approach is much better in general, but it can still violate the lazy loading principle, because chances are these files will be downloaded initially. And, again, this strategy is still not the best in terms of caching.

[Image: build output with a separate chunk generated per NPM package]

Correct configuration

So, how can we make our vendors load truly lazily? We need to split our vendors into lazy-loaded parts:

  • If any exported entities (not whole packages) are required to be downloaded initially, only those entities should be.
  • But the rest should be loaded only when they are actually in use. For the example from the very beginning, the countBy and differenceInBusinessDays methods should only be loaded when Chapter 1 is opened.

The ideal assembly for the example from the beginning of the article should look like this:

[Image: build output with an initial vendor chunk plus a lazy vendor chunk for each chapter]

We have an initial chunk for vendors, plus asynchronous vendor chunks. Chapter1 only needs to download its own vendor chunk, and Chapter2 its own. Exactly how it should be with lazy loading.

[Image: the same assembly, with each chapter downloading only its own vendor chunk]

I would like to additionally note that in the example above, the creation of chunks does not depend on NPM packages in any way. The content of date-fns and lodash-es is split into 3 parts: they load partially during the initial download and partially during the loading of each of the lazy pages.
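For Webpack to produce those async vendor chunks at all, the chapters themselves must be loaded through dynamic import(). Here is a runnable sketch of the pattern; the module names are hypothetical, and the Promise.resolve stubs stand in for what would be real import('./chapter1') calls in an actual app:

```javascript
// Each chapter sits behind a loader function. In a real app the body would be
// `() => import('./chapter1')`, which makes Webpack emit a separate async
// chunk for the chapter together with its own vendor chunk.
const chapterLoaders = {
  chapter1: () => Promise.resolve({ render: () => 'Chapter 1 rendered' }),
  chapter2: () => Promise.resolve({ render: () => 'Chapter 2 rendered' }),
};

async function openChapter(name) {
  // Nothing is fetched until the user actually opens the chapter.
  const mod = await chapterLoaders[name]();
  return mod.render();
}

openChapter('chapter1').then(console.log); // prints "Chapter 1 rendered"
```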

In Webpack, such vendor splitting is really easy to achieve:

module.exports = {
  // ...
  optimization: {
    splitChunks: {
      chunks: 'all',
    },
  },
};

📌 Set up your Webpack to split vendor packages and load them lazily.


Make it cacheable

But what do I mean by "cacheability"? We do need to reset the cache of our files every time we deploy an update. However, we don't need to lose the cache for all the files. In fact, we can make only the required files lose their cache. And by "cacheability", I mean how many files don't lose their cache after we make a change.

Use Webpack correctly

Let's go back to how to properly configure file names in an assembly. As I said, most of the existing strategies other than using [contenthash] are not ideal to maintain a decent cache strategy.

  • Using [fullhash] or HTMLWebpackPlugin's hash leads to removing cache from all the files each time we make the slightest change.
  • Using [hash] and [chunkhash] will result in slightly more stable files. Although, they still will not be as stable as when using [contenthash].
  • As for creating your own strategy, it's either very complicated, or you will end up losing the cache for all the files on the smallest change.
    • For example, in my previous company, for some period of time we used Date.now() instead of the built-in hash functions. This made all files update their names in every single build, even when there were no changes at all.
const HtmlWebpackPlugin = require('html-webpack-plugin');

module.exports = {
  // ...
  output: {
    // Bad examples
-   filename: `[name].${Date.now()}.js`,
-   filename: '[name].[fullhash].js',
-   filename: '[name].[hash].js',
-   filename: '[name].[chunkhash].js',
    // Good example
+   filename: '[name].[contenthash].js',
  },
  plugins: [
    new HtmlWebpackPlugin({
      // Bad example
-     hash: true,
    }),
  ],
};

📌 Use only Webpack's [contenthash] to make your application as cacheable as possible.

Clean your dependency tree

At first glance, "lazy loading" and "caching" seem to be unrelated optimization strategies. However, they do share some common ground. For example, in the previous article, we found out how optimizing the dependency graph can reduce the number and size of downloaded files. But a "clean" dependency graph also affects how cacheable our site is.

Consider the following example. Due to a terrible source file dependency graph, Webpack created an entangled execution dependency graph, like in the picture below.

[Image: an entangled execution dependency graph, where Page 1 depends on chunks 1-5 and chunks 1-5 depend on chunks 6-10]

Let me remind you how to read such a graph. The black lines indicate the "hard" dependencies. Every time the browser needs to download a file, the hard-dependent files will also be downloaded. That is, in our bad example, when downloading Page 1, we need to download chunks 1-5. And to download chunks 1-5, we need to download chunks 6-10.

Let's say we made a one-line change that affects the content of [id].[hash].chunk6.js. We'd expect only this file to lose its cache. However, in reality, chunks 1-5, pages 1-3, and main.js all update their content hashes as well. Therefore, with the execution graph above, a single one-line change may cause most of the files to lose their cache.

Webpack is not able to change the name of just one file. The dependencies I mentioned, both "hard" and "soft", are stored inside the files themselves. Page 1 stores links to all chunks 1-5 inside itself. And if the name of one of them changes, page1.[hash].js is also required to update its content in order to be ready to download a new file.

To understand which files will be affected when certain changes are made, we can also visually analyze the execution dependency graph "in reverse order". Chunks 1-5 depend on chunk 6. Pages 1-3 depend on chunks 1-5. And main always depends on pages 1-3. Therefore, they all lose their cache.

Now, let's imagine we fixed the graph.

[Image: the fixed execution dependency graph with far fewer connections]

In reality, the number of generated files would have changed in such a case. But to keep things simple, let's imagine that the generated files stay exactly the same, while the number of connections becomes much lower.

What happens when we make exactly the same one-line change?

[Image: the fixed graph with the files affected by the chunk6 change highlighted]

If we analyze the dependency graph backwards, we'll see that now only chunk6 itself, plus chunk1, chunk5, page1, and main, have their hashes updated. Therefore, only 5 files lose their cache.
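This "backwards" analysis can itself be sketched as a tiny graph traversal. The dependents map below is my own model of the fixed graph (only the slice around chunk6), where each entry lists the files that embed a given file's hashed name:

```javascript
// file -> files that reference it (and therefore store its hashed name)
const dependents = {
  chunk6: ['chunk1', 'chunk5'],
  chunk1: ['page1'],
  chunk5: ['page1'],
  page1: ['main'],
  main: [],
};

// Walk the reverse edges: every file that (transitively) references the
// changed file must update its own content, and so loses its cache too.
function filesLosingCache(changed) {
  const lost = new Set([changed]);
  const queue = [changed];
  while (queue.length) {
    for (const parent of dependents[queue.shift()] ?? []) {
      if (!lost.has(parent)) {
        lost.add(parent);
        queue.push(parent);
      }
    }
  }
  return [...lost].sort();
}

console.log(filesLosingCache('chunk6'));
// → [ 'chunk1', 'chunk5', 'chunk6', 'main', 'page1' ]
```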

Now, a small task for you: what files will be changed, if chunk8 is updated?

Answer

Files main, page 2, page 3, chunk 2, chunk 3, and chunk 8 will lose their cache.

[Image: the fixed graph with the files affected by the chunk8 change highlighted]

📌 Keep your dependency graph clean to improve cacheability of your project. The cleaner the graph is, the fewer files lose their cache each time we make a change.

Beware: initial files always lose their cache

You've already noticed that in all the examples above, main.js always loses its cache. Even when we fixed our dependency tree. This is because the main JavaScript entry file will always depend on all other generated files from the assembly. Thus, this file always loses its cache every time any change is made.

However, I repeat: the cache of main and other chunks is lost this way only when our changes affect the dependency graph, i.e. when we add or remove imports. Files can also be modified without creating new imports. For example, you can simply correct a constant or a typo. In that case, significantly fewer files lose their cache, but at least 2 always do, and one of them is bound to be an initially downloaded file.

And this happens because of something called the runtime chunk. This is a piece of code that stores information about which files exist in the project. If any file name changes, the runtime chunk changes its contents to include the new file name. Therefore, if in our example with [id].[hash].chunk6.js we don't change the dependency graph, only 2 files lose their cache: chunk6 and main, because by default main includes the runtime chunk.
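A standard mitigation worth knowing here (a built-in Webpack option, though nothing above requires it) is to extract the runtime into its own tiny file, so that a renamed lazy chunk invalidates only that small runtime file instead of main:

```javascript
module.exports = {
  // ...
  optimization: {
    // Emit the runtime as a separate "runtime.[contenthash].js" chunk,
    // so main.js no longer has to change when other chunk names change.
    runtimeChunk: 'single',
  },
};
```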

This is another reason why we should pay special attention to the size of the initially loaded files. First of all, we need to reduce the loading time like there is no cache. And then, in addition to the already reduced loading time, apply caching on top.

ℹ️ Regardless of the size of our change, the main javascript file will always lose its cache. That's the second most important reason why we should make our initial javascript files as lean as possible.


Set up vendors cache groups

And now, since we already know what "cacheability" is and what defines how good it is, we are ready to continue with our vendors discussion.

You might have noticed that in our "correct configuration" example the size of the initial vendor file is quite big. And with the approach I described, it may become even bigger, especially when we use multiple NPM libraries: react, react-dom, zustand, zod, axios, etc. Even if some libraries can be downloaded lazily, quite a few of them still must be downloaded initially. And the large size of the initial vendor file can negatively affect the loading speed of the application. However, we can fix this issue by setting up cache groups.

With the cache groups configuration, we can tell Webpack how to generate JavaScript chunks, including what source code files and/or vendors should be included in certain chunks. By configuring cache groups, we can split one file into several or, conversely, combine them. Also, if we set them up correctly, we will improve cacheability.

In the example above, we can use cache groups to split the original vendor file into 2-3 files. Thus, these files can be downloaded in parallel, which can have a very good effect on download speed. However, the name "cache groups" implies that groups should be created to manage the cache, not to speed up downloads.

It is considered good practice to create cache groups only when we are sure that the group we are creating will be stable over time.

ℹ️ In order to make cache groups stable, we should include only "stable" vendors in the groups we create.

For example, react, react-dom, axios, and zod are all used in their entirety in our application, and therefore can be included in a cache group. But date-fns or lodash-es can vary in content based on which exported entities our project uses, so we should not create cache groups for them.

It is often not worth bothering too much about cache groups. It is enough to set up groups for initially loaded vendors. Creating other groups can negatively affect caching capabilities if we don't have a complete understanding of what's going on with our build.

In order to create such groups, we should use splitChunks.cacheGroups:

module.exports = {
  optimization: {
    splitChunks: {
      cacheGroups: {
        react: {
          filename: 'react.[contenthash:8].js',
          // it's better to create groups for initial vendors only,
          // but don't use it in micro-frontend apps
          chunks: 'initial',
          // Will include `react`, `react-dom`, and `react-router-dom`
          // in a single chunk
          test: /react/,
        },
      },
    },
  },
};

And voila! We killed two birds with one stone:

  1. Our initially downloaded vendor file is split into 2 files, which the browser can download in parallel, improving loading speed.
  2. The chunk created through cache groups will stay in the cache until its expiration date, because react, react-dom, and react-router-dom rarely change in an application.

📌 Configure Webpack cache groups for stable vendor packages to further split files and improve both download speed and cacheability.


Conclusion

Alright, that was long, but that's it for today. Thank you for joining me once again on our journey to make our web applications indefinitely lazy. If you have any questions feel free to ask them in the comments. And you can also read the following articles from this series.

And to summarize this article, let's list the rules we learned today:

  • 📌 Set up your Webpack to split vendor packages and load them lazily.
  • 📌 Use only Webpack's [contenthash] to make your application as cacheable as possible.
  • 📌 Keep your dependency graph clean to improve cacheability of your project. The cleaner the graph is, the fewer files lose their cache each time we make a change.
  • 📌 Configure Webpack cache groups for stable vendor packages to further split files and improve both download speed and cacheability.

You are almost there, just one more push:
How to make your app indefinitely lazy – Part 4: Preload in Advance

Here are my social links: LinkedIn Telegram GitHub. See you ✌️
