DEV Community

Francesco Di Donato

Web Perf - Large Files

When you have a lot of information to show on the screen, there are two common choices:

  • pagination
  • streaming

But you know what? You can also just send the whole thing to the user. Let's explore this straightforward approach together.

Headers

We'll explore three crucial headers - Content-Length, Content-Encoding, and Transfer-Encoding - and see how different combinations of them influence how quickly the browser receives and renders the page.

Content-Length

It specifies the size of the payload in bytes. It helps the recipient (in our case, the browser) know how much data to expect. It's like a heads-up, ensuring everyone is on the same page about the amount of content coming their way.
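
One detail worth seeing in code: the value is a byte count, not a character count. The snippet below is a minimal sketch; `contentLengthOf` is a hypothetical helper name, not something from the server shown later.

```javascript
// Content-Length counts bytes on the wire, not characters in the string.
// `contentLengthOf` is a hypothetical helper used only for this illustration.
function contentLengthOf(body) {
  return Buffer.byteLength(body, "utf8");
}

// "\u00e8" is the letter "e" with a grave accent: 1 character, 2 bytes in UTF-8.
const body = "<p>caff\u00e8</p>";

console.log(body.length);           // 12 (characters)
console.log(contentLengthOf(body)); // 13 (bytes)
```

Sending the character count instead of the byte count would make the browser stop reading one byte early in this case.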

Content-Encoding

It tells the browser how the content is encoded or compressed. It's like a secret code that both the server and the browser understand. Common encodings include gzip and deflate. When the browser sees this header, it knows how to decode the content for a smooth display.
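
As a rough sketch of both sides of that agreement, the snippet below compresses a payload with Node's built-in zlib module and then decodes it again, which is roughly what the browser does when it sees Content-Encoding: gzip. The payload and the `headers` object are made up for the illustration.

```javascript
import { gzipSync, gunzipSync } from "zlib";

// Hypothetical payload: repetitive markup, which compresses well.
const html = "<li>row</li>\n".repeat(10_000);

const plain = Buffer.from(html, "utf8");
const compressed = gzipSync(plain);

// What a server would declare for the compressed response.
const headers = {
  "Content-Encoding": "gzip",
  // Size of the gzipped bytes, not of the original document.
  "Content-Length": compressed.length,
};

// What the receiving side does: undo the encoding to get the original back.
const decoded = gunzipSync(compressed).toString("utf8");

console.log(plain.length, compressed.length, decoded === html);
```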

Transfer-Encoding

This lesser-known header describes how the message itself is encoded for transmission. The message can be sent all at once (like a single big parcel) or in smaller pieces (like breaking it into multiple smaller packages).


Server

To see how the browser reacts to different combinations of these headers, let's set up a basic Node.js HTTP server.

The server is designed to handle incoming requests for any path, offering flexibility through optional query parameters:

  • content-length: If set to true, the server adds the Content-Length header to the response.
  • identity: If set to true, the server sets Transfer-Encoding: identity, asking for the document to be sent as a whole. If omitted, Node.js falls back to its default, which sends the document in chunks (chunked).
  • gzip: If set to true, the server serves the gzipped version of the file. In this case, the Content-Length (if set) reflects the size of the gzipped version.

Support Code

index.mjs
import { createServer } from "http";
import { getView, combinations } from "./src/utils.mjs";

createServer(async (req, res) => {
  const { searchParams: sp } = new URL(req.url, "http://127.0.0.1:8000");

  // Read query params (anything other than "true" counts as false)
  const length = sp.get("content-length") === "true";
  const gzip = sp.get("gzip") === "true";
  const identity = sp.get("identity") === "true";

  // Either index.html or index.html.gz
  let view = "index.html";
  if (gzip) view += ".gz";

  const [data, stats, err] = await getView(view);
  if (err) {
    // Without this check, stats.size below would throw when the file is missing.
    return res.writeHead(500, { "Content-Type": "text/plain" }).end(err.message);
  }

  res.setHeader("Content-Type", "text/html");

  if (length) res.setHeader("Content-Length", stats.size);

  if (gzip) res.setHeader("Content-Encoding", "gzip");

  if (identity) res.setHeader("Transfer-Encoding", "identity");

  return res.writeHead(200).end(data);
}).listen(8000, "127.0.0.1", () => {
  // Print every URL worth trying in the browser.
  console.info(combinations().join("\n"));
});

src/utils.mjs
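import { resolve } from "path";
import { fileURLToPath } from "url";
import { readFile, stat } from "fs/promises";

// Project root: one level above this file's directory (src/).
// fileURLToPath copes with Windows paths and percent-encoded characters; .pathname does not.
const root = resolve(fileURLToPath(new URL(".", import.meta.url)), "..");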

export async function getView(name, encoding) {
  const filepath = resolve(root, "src", "views", name);

  const [data, stats] = await Promise.allSettled([
    readFile(filepath, encoding),
    stat(filepath),
  ]);
  if (data.status === "rejected")
    return [null, null, new Error("could not retrieve data")];
  if (stats.status === "rejected")
    return [null, null, new Error("could not retrieve stats")];

  return [data.value, stats.value, null];
}

// Ignore if not running on your machine.
export function createURL(
  { gzip, length, identity } = {
    gzip: false,
    length: false,
    identity: false,
  }
) {
  // Use pathname so that in Developer Tools you can filter out the favicon.
  const url = new URL("big", "http://127.0.0.1:8000");
  if (gzip) url.searchParams.set("gzip", "true");
  if (length) url.searchParams.set("content-length", "true");
  if (identity) url.searchParams.set("identity", "true");
  return url.toString();
}

// Ignore if not running on your machine.
export function combinations() {
  // Start empty: the loops below already produce the all-false (default) URL.
  const urls = [];
  for (const gzip of [false, true]) {
    for (const length of [false, true]) {
      for (const identity of [false, true]) {
        urls.push(
          createURL({
            gzip,
            length,
            identity,
          })
        );
      }
    }
  }
  return urls;
}

The content within the HTML files is not the focus; their substantial size is (AKA they're LARGE).

These files are kept in two versions: plain and gzipped. Real servers would not usually store both, but here it keeps the cost of compressing on the fly out of the measurements.
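
If you want to reproduce the setup, the gzipped copy can be generated once, ahead of time. The following is only a sketch of one way to do it; `precompress` is a hypothetical helper and not part of the support code above.

```javascript
import { readFileSync, writeFileSync } from "fs";
import { gzipSync } from "zlib";

// Hypothetical helper: writes `<filepath>.gz` next to the original file
// and returns the path of the compressed copy.
function precompress(filepath) {
  const target = filepath + ".gz";
  writeFileSync(target, gzipSync(readFileSync(filepath)));
  return target;
}

// Example call, assuming the layout used by getView():
// precompress("src/views/index.html"); // creates src/views/index.html.gz
```

Reading the whole file into memory is fine for a one-off script; for very large inputs a streaming approach would be the more careful choice.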


Testing

I'll now test the server using Firefox, monitoring how response times vary as the headers change. If you're following along on your machine, make sure to disable the cache in the Network tab and consider enabling 'Persist Logs' so that earlier requests stay visible for comparison.

Screenshot of the Network tab showing what is described below.

Initially, we make a request without query parameters, letting the Node.js HTTP server handle it by default. The Transfer-Encoding observed is chunked: when a response has no Content-Length, Node.js falls back to chunked encoding, where each chunk carries its own size, so the total never has to be declared up front.

Now, let's spice things up by passing the query parameter ?identity=true, which sets Transfer-Encoding: identity. This time, the request takes notably longer to complete.

To remedy this, we add the Content-Length header with ?content-length=true&identity=true, which significantly reduces the duration. It's like mailing a package in one piece: the Content-Length header is the friendly note that says, 'Hey, your package is this big!'. Without it, the client has no declared size to go by and can only tell that the body is complete once the server closes the connection.

🔑 message
In 'identity' mode, be a good server and always attach that Content-Length header.
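
To make the reasoning concrete, here is a deliberately simplified sketch of how a client can decide where a response body ends. `bodyFraming` is a hypothetical function, it assumes header names are already lower-cased, and it ignores the many edge cases a real HTTP parser has to handle.

```javascript
// Simplified sketch: how does the client know where the body ends?
// `headers` is assumed to be a plain object with lower-cased header names.
function bodyFraming(headers) {
  const codings = (headers["transfer-encoding"] || "")
    .toLowerCase()
    .split(",")
    .map((coding) => coding.trim());

  // Chunked bodies are self-delimiting: every chunk announces its own size.
  if (codings.includes("chunked")) return "chunked";

  // Otherwise a declared length tells the client exactly when to stop reading.
  if (headers["content-length"] !== undefined) return "content-length";

  // No framing information at all: wait until the server closes the connection.
  return "until-close";
}

console.log(bodyFraming({ "transfer-encoding": "chunked" })); // chunked
console.log(bodyFraming({ "content-length": "1024" }));       // content-length
console.log(bodyFraming({}));                                 // until-close
```

The last case is the slow one observed above: the response is only considered complete when the connection goes away.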

As a final observation, we note that the presence of the Content-Length has no impact when the transfer method is set to chunked.

🔑 message
Not only is there no need for the Content-Length header, but using both is actually contradictory.
In 'chunked' encoding, each chunk is prefixed with its own size, and a final zero-size chunk marks the end of the response.
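
The framing itself is simple enough to sketch by hand. The function below is an illustration of the wire format only (Node.js does this for you); `encodeChunked` is a hypothetical name, and optional features such as chunk extensions and trailers are left out.

```javascript
// Frames byte chunks the way HTTP/1.1 chunked transfer coding does:
//   <size in hex>\r\n<data>\r\n   for every chunk
//   0\r\n\r\n                     to mark the end of the body
function encodeChunked(chunks) {
  const parts = [];
  for (const chunk of chunks) {
    const data = Buffer.from(chunk);
    // Skip empty chunks: a zero-size chunk would terminate the body early.
    if (data.length === 0) continue;
    parts.push(Buffer.from(data.length.toString(16) + "\r\n"), data, Buffer.from("\r\n"));
  }
  parts.push(Buffer.from("0\r\n\r\n"));
  return Buffer.concat(parts);
}

const wire = encodeChunked(["Hello, ", "world!"]).toString();
console.log(JSON.stringify(wire));
// "7\r\nHello, \r\n6\r\nworld!\r\n0\r\n\r\n"
```

Because every piece announces its own length, the total size never needs to be known, which is why a Content-Length header adds nothing here.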

Compression

When using the gzip-compressed resource, the behavior aligns with what we've just explored. In the identity transfer mode, it remains crucial to provide information about the content length, regardless of the content encoding.

Screenshot of the Network tab showing what is described below.

Now, let's talk about compression benefits and a trade-off. Opting for gzip compression offers two wins:

  1. it saves disk space on your server (when only the compressed copy is stored).
  2. it cuts bandwidth usage.

However, there's a catch - the browser has to roll up its sleeves and spend a bit of CPU time decompressing the response before it can use it.


If you are interested in Web Performance you definitely need to know about Web Caching (posts series).
