Carlos Mateo

Beyond Static Resources: Delta Compression for Dynamic HTML

Compression Dictionary Transport (RFC 9842), authored by Patrick Meenan and Yoav Weiss, lets browsers use previously seen content as compression dictionaries, dramatically reducing transfer sizes. The standard has shipped in Chromium-based browsers (Chrome, Edge, Opera, etc.), with Firefox actively working toward enabling it. Early production deployments have shown double-digit percentage reductions in HTML payload sizes and measurable improvements to Largest Contentful Paint.

And there's a natural next step—one that could extend these benefits to the majority of the web's traffic.

(Note: While the standard supports multiple algorithms including Zstandard, this article will use Brotli and its dcb encoding for the purposes of demonstration.)

Where Compression Dictionaries shine today

Compression Dictionary Transport was designed around two powerful use cases:

Use case 1: Delta compression for versioned static resources. Your browser has app.v1.js cached. You deploy app.v2.js. The browser uses the old version as a compression dictionary for the new one, and the server only sends the diff. This works beautifully—but it requires the previous version to be sitting in the HTTP cache with a long max-age.

Use case 2: Pre-built shared dictionaries. You author a separate dictionary file, the browser fetches it during idle time via <link rel="compression-dictionary">, and subsequent responses compress against it. This captures general page structure—common HTML tags, CSS patterns, boilerplate—but it cannot include anything specific to a particular user session or request.

Both use cases are valuable. They also share a structural requirement: the dictionary—whether it's a previous version of a static file or a separately fetched dictionary resource—needs to remain available in the browser's dictionary cache long enough for a future response to reference it. Under the standard, that availability is tied to the response's Cache-Control lifetime. A resource with max-age=86400 stays usable as a dictionary for a day. A resource with no-cache or max-age=0 is evicted from the dictionary storage immediately—before the user even clicks the next link.

That's the starting point for what this article explores: what happens when you decouple those two lifetimes?

The opportunity: dynamic content

Think about the pages you visit most often: search results, social media feeds, news sites, ecommerce product listings, dashboards, email inboxes. Many of them share a common characteristic: they are often served with Cache-Control: no-cache or max-age=0. The HTML is personalized, timestamped, and never the same twice.

As currently specified, a response's usability as a compression dictionary is tied to its HTTP cache lifetime. When a dynamic page is served with no-cache, the browser discards it from the dictionary cache immediately. The previous response is gone before the user clicks.

This isn't a flaw in the standard—it's a deliberate scope boundary. While working on the standard, the HTTP Working Group actually discussed decoupling dictionary and cache lifetimes (issue #2649). However, without a compelling use case demonstrated at the time, they scoped it out of the initial specification but explicitly left the door open for future extensions.

Now that the initial implementation has shipped and demonstrated the massive potential for bandwidth savings, we want to look at the more complicated case, because we believe dynamic pages present an equally compelling use case. The same underlying mechanism (compression with a custom dictionary) can do more. The previous response from the same session is, in many ways, the ideal dictionary for dynamic content. We just need a way to keep it around long enough.

A two-by-two matrix showing RFC 9842's current coverage. The horizontal axis is dictionary source (previous version vs. pre-built shared dictionary) and the vertical axis is content type (static cached vs. dynamic no-cache). The two static-content cells are checked as supported today. The dynamic + previous response cell is marked as the opportunity—dictionaries are evicted immediately under current cache-control coupling. The dynamic + Dictionary TTL cell shows the solution: previous responses survive as dictionaries, achieving 10–27× smaller payloads than standard Brotli.

Dictionary TTL decouples dictionary availability from Cache-Control, enabling delta compression for dynamic content.

Why this matters: what's actually in a dynamic page?

To understand why this opportunity is significant, you need to look at what a real dynamic page contains: a page like the ones you'd get from Google Search, Facebook, Amazon, or any modern web application that serves customized content in the HTML document.

A production dynamic page is not just "a template plus some content." It's a complex mixture of:

  • Session-stable content that stays identical across every page the user visits in a session: CSS (often with hashed class names from CSS modules), user profile blocks, navigation elements, feature flag configurations, A/B test variant assignments, the JavaScript framework runtime, and inline configuration objects.

  • Per-request metadata that changes on every single page load: CSP nonces on every <script> and <style> tag, CSRF tokens, request IDs, trace IDs, server timestamps, ads data with unique bid IDs or inline JSON state blobs.

  • Dynamic content that varies per query/interaction: the actual search results, feed items, product cards—the part of the page the user is there to see.

Standard compression compresses each of these components from scratch on every request. A pre-built shared dictionary captures some of the general HTML and CSS patterns, but it can't match session-specific hashed class names, and it certainly can't deduplicate the 4-5 KB of inline JSON state that's unique per request but structurally identical to the previous request's state blob.

The previous response can. It has the exact same hashed class names, the same CSS values, the same navigation structure, the same inline JS configuration, and a structurally identical (though content-different) state blob. If you use it as the dictionary, the encoder matches everything that hasn't changed and only encodes the diff.
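To make this concrete, here is a minimal sketch of delta compression in Python, using zlib's zdict parameter as a standard-library stand-in for the Brotli dcb encoding the standard actually uses. The page generator and all of its contents are invented for illustration: a session-stable blob of hashed CSS class names, a per-request nonce, and query-dependent results.

```python
import hashlib
import zlib

def session_state(session_id: str) -> str:
    # Session-stable blob: hashed class names that are identical on every
    # page in a session but unique to it (and poorly self-compressible).
    rules = [
        f".c{i}_{hashlib.sha256(f'{session_id}:{i}'.encode()).hexdigest()[:8]}"
        + "{margin:0}"
        for i in range(120)
    ]
    return "<style>" + "".join(rules) + "</style>"

def make_page(session_id: str, query: str, nonce: str) -> bytes:
    results = "".join(
        f"<article>Result {i} for {query}</article>" for i in range(10)
    )
    return (
        session_state(session_id)
        + f'<script nonce="{nonce}">init()</script>'
        + results
    ).encode()

prev_page = make_page("user-abc-123", "coffee", nonce="abc123")
next_page = make_page("user-abc-123", "coffee beans", nonce="xyz789")

# Standard compression: every byte encoded from scratch.
plain = zlib.compress(next_page, 9)

# Delta compression: the previous response is the dictionary.
co = zlib.compressobj(level=9, zdict=prev_page)
delta = co.compress(next_page) + co.flush()

dec = zlib.decompressobj(zdict=prev_page)
assert dec.decompress(delta) == next_page  # lossless roundtrip
print(f"raw={len(next_page)} B  plain={len(plain)} B  delta={len(delta)} B")
```

The session-stable CSS dominates the page and is byte-identical between the two responses, so the delta stream is a fraction of the plain-compressed size: exactly the effect the demo measures with Brotli.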

A stacked bar breaks down a ~100 KB dynamic page into segments: hashed CSS (14 KB), session state and JSON payload (35 KB), HTML structure (8 KB), per-request tokens (2 KB), dynamic content (25 KB), and scripts/other (16 KB). Below the bar, three overlay rows show what each compression approach can match. Standard Brotli matches only scattered internal repetitions. A static dictionary matches generic patterns like navigation and scripts but misses hashed classes and session state. Delta compression using the previous response matches nearly the entire page—only genuinely new content and tokens need encoding—yielding roughly 97% savings compared to 68% for standard Brotli and 76% for a static dictionary.

The previous response from the same session matches CSS, state, navigation, and scripts byte-for-byte. Only the dynamic content and per-request tokens need to be encoded from scratch.

Dictionary TTL: extending the standard to dynamic content

Dictionary TTL is an experimental extension to the Use-As-Dictionary response header that decouples dictionary lifetime from HTTP cache lifetime:

Use-As-Dictionary: match="/search*", ttl=600
Cache-Control: no-cache

This tells the browser: "Yes, this response should be revalidated on every request (no-cache). But keep it available as a compression dictionary for 600 seconds, regardless." The dictionary cache becomes independent of the HTTP cache.
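On the server side, opting a dynamic response into this behavior is a matter of emitting the right headers. A minimal sketch follows; the header values track the standard, the ttl parameter is the experimental extension, and the helper function itself is hypothetical:

```python
def dictionary_headers(match: str, ttl_seconds: int) -> dict:
    # Headers for a dynamic HTML response that should also serve as a
    # compression dictionary for subsequent matching requests.
    return {
        # Revalidate the HTML itself on every request...
        "Cache-Control": "no-cache",
        # ...but keep it usable as a dictionary for ttl_seconds anyway.
        # (ttl is the experimental Dictionary TTL extension, not RFC 9842.)
        "Use-As-Dictionary": f'match="{match}", ttl={ttl_seconds}',
        # Cache keys must account for dictionary negotiation.
        "Vary": "Accept-Encoding, Available-Dictionary",
    }

print(dictionary_headers("/search*", 600))
```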

Dictionary TTL was part of the original Compression Dictionary Transport design. It was scoped out during the IETF standards process to keep the initial specification focused on the two well-understood use cases. It's currently available behind an experimental flag in Chromium (Chrome Canary and Beta), ready for developer testing, at chrome://flags/#enable-compression-dictionary-ttl

Demo

To demonstrate this, I built a live interactive prototype that simulates a realistic dynamic web application and measures compression ratios in real time.

The demo generates pages that model what production dynamic sites actually look like:

  • Session-dependent class name hashing. Every CSS class is hashed per session, like CSS modules in a real build pipeline: .result-card becomes .rC_9c3a64 for one session and .rC_c28ff9 for another. This means a static dictionary built from someone else's session literally cannot match any of your CSS selectors.

  • Session-variant CSS values. The primary color hue, accent colors, border radii, spacing units, font stacks, shadow values, and layout dimensions all vary per session. Two sessions visiting the same page get structurally different CSS.

  • Conditional component rendering. Feature flags (stable per session) determine which HTML sections exist: knowledge panels, filter chips, personalized recommendation blocks, feedback prompts. Different sessions get different DOM structures.

  • Large inline data blobs. A substantial window.__INITIAL_STATE__ JSON payload containing user profile data, activity metrics, notification feeds, and session configuration—mimicking the heavy personalized state that frameworks like Next.js and React inject into every server-rendered page.

  • Per-request tokens everywhere. CSP nonces, CSRF tokens, request IDs, trace IDs, server timestamps, ad auction IDs, and preload hints that change on every single request.

The combination means that zero CSS class names are shared between different sessions:

Session "user-abc-123":   .result-card → .rC_9c3a64
Session "dict-sample-0":  .result-card → .rC_c28ff9

Class names shared between sessions: 0/59

This is deliberately hostile to static dictionaries, but it's exactly how many production sites work.
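For illustration, a per-session class-name hash along these lines can be produced in a few lines of Python. This is a hypothetical stand-in for a CSS-modules build step, not the demo's actual implementation:

```python
import hashlib

def hash_class(session_id: str, class_name: str) -> str:
    # Abbreviate "result-card" to "rC", then append a short digest that
    # is stable within a session but different across sessions.
    parts = class_name.split("-")
    abbrev = parts[0][0] + "".join(p[0].upper() for p in parts[1:])
    digest = hashlib.sha256(f"{session_id}:{class_name}".encode()).hexdigest()[:6]
    return f"{abbrev}_{digest}"

# Same class, two sessions: same "rC_" prefix, disjoint digests.
print(hash_class("user-abc-123", "result-card"))
print(hash_class("dict-sample-0", "result-card"))
```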

A screenshot of the live metrics panel from the delta compression prototype. The panel shows real-time compression results for a dynamic page served with three encoding strategies side by side: standard Brotli, static dictionary, and delta compression using the previous response. Each column displays the compressed size in bytes, the compression ratio, and the savings percentage. The delta compression column shows dramatically smaller payloads—on the order of 1–2 KB for new queries and as low as 52 bytes for duplicate queries—compared to roughly 24 KB for standard Brotli and 17 KB for the static dictionary.

The metrics panel shows compression results in real time. Standard Brotli, static dictionary, and delta compression are compared on every page load.

Results

The demo measures three compression modes on every page load:

| Mode | Dictionary source | Content-Encoding |
| --- | --- | --- |
| Standard Brotli | None | br |
| Static Dictionary | Pre-built from 5 sample pages (different sessions) | dcb |
| Delta Compression | Previous response from the same session | dcb |

Here are the results across navigation patterns:

A grouped bar chart comparing compressed output size across three navigation patterns. For a duplicate query (tab restore or refresh), standard Brotli produces 24,323 bytes, a static dictionary produces 16,887 bytes, and delta compression produces just 52 bytes. For a new query (different search term), the three methods produce roughly 23,500, 16,000, and 1,800 bytes respectively. For a different topic (cross-domain navigation with a larger page), they produce roughly 38,000, 30,000, and 2,500 bytes. Delta compression bars are barely visible at this scale.

Real measurements from the live demo. Raw page sizes range from 50–122 KB.

Cross-query navigation (the most common case—user searches for something new):

  • Raw page size: ~72–122 KB
  • Standard Brotli: ~23–38 KB (68% savings)
  • Static Dictionary: ~15–30 KB (76–80% savings)
  • Delta Compression: ~1,000–2,700 B (96–98% savings)

Delta beats standard Brotli by 10–27×. Not 10–27 percent—ten to twenty-seven times smaller.

Duplicate queries and tab restores (the extreme case—user refreshes, hits back, or restores a tab):

  • Raw page size: ~78 KB
  • Standard Brotli: 24,323 B
  • Delta Compression: 52 bytes

Fifty-two bytes. The entire page—CSS, HTML, JavaScript, inline state blobs, sidebar, footer, everything—compresses to 52 bytes. That's a 468× reduction over standard Brotli. The only bytes that changed were the per-request CSRF token, CSP nonce, and server timestamp. Brotli encodes the rest as a single "copy from dictionary" instruction.

This isn't an edge case. Tab restores, non-cached back-button navigations, and pagination within the same query all produce near-identical pages. Any time the dynamic content hasn't changed between requests, delta compression reduces the transfer to effectively zero.

Horizontal bars illustrate the extreme case of a duplicate query within the same session—simulating a tab restore, back-button press, or page refresh. Raw HTML is 78,166 bytes. Standard Brotli compresses it to 24,323 bytes (68.9% savings). A static dictionary compresses it to 16,887 bytes (78.4% savings). Delta compression reduces it to just 52 bytes (99.93% savings)—a 468× improvement over standard Brotli and 1,503× over the raw page.

Same query, same session. The entire 78 KB page compresses to 52 bytes—only the per-request tokens changed. (Network overhead not included.)

Why are the results so dramatic? Because the static dictionary was built from different sessions. It sees different hashed class names, different CSS property values, and different DOM structures. It can match some general HTML patterns (<div class=", </article>) but nothing session-specific. The delta dictionary is from the same session—identical class names, identical CSS, identical navigation structure. Brotli's sliding window matches them byte-for-byte and only encodes what actually changed.

The key insight

Standard Brotli exploits repetition within a single response. A pre-built static dictionary captures general structural patterns across the web.

Delta compression using the previous response exploits repetition across sequential responses from the same session, achieving near-optimal deduplication of everything that hasn't changed.

The insight is that for dynamic web applications, the best compression dictionary isn't a carefully authored generic file—it's the page the user just visited. It contains exactly the right bytes: the user's session state, their CSS variant, their feature flags, the framework boilerplate, the navigation structure. All the compressor has to do is encode what's different.

A horizontal spectrum from less context (left) to more context (right) with three nodes. On the left, standard Brotli with no dictionary compresses to roughly 24 KB by exploiting repetition within a single response. In the middle, a shared pre-built dictionary compresses to roughly 17 KB by capturing common patterns across users—a 1.4× improvement. On the right, delta compression using the previous response compresses to roughly 1.8 KB (52 bytes on duplicate queries) by capturing exact session state byte-for-byte—a 10–27× improvement over standard Brotli.

For dynamic content, the jump from shared dictionary to previous-response dictionary is where the dramatic gains are.

Measurement methodology

Getting honest compression numbers requires some care. The demo uses a two-phase rendering approach:

  1. Clean render: Generate the full page (skeleton + dynamic content) with no metrics overlay. This is what gets compressed, measured, and cached as a dictionary.
  2. Display render: Inject the metrics panel into the clean page for the user to see.

This ensures the compression ratios reflect what you'd see in production. The metrics panel is purely educational—it wouldn't exist on a real site, and it doesn't inflate the measured page size.

The static dictionary is built from pages generated with different session IDs—simulating the realistic scenario where a CDN or origin server pre-builds a dictionary from diverse traffic. This is the best a static dictionary can do: capture cross-user structural patterns. It's still missing every session-specific byte.

Since Dictionary TTL is not yet enabled in all browsers, the demo uses a server-side cookie fallback to simulate the Available-Dictionary header, ensuring the protocol flow and compression measurements work regardless of your browser's current CDT support. You can inspect the real CDT headers (Use-As-Dictionary, Content-Dictionary, Vary) in DevTools on every response.
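The negotiation the demo simulates reduces to a hash lookup. Here is a hedged sketch of the server side; the function and store names are invented, and in the real protocol the hash travels in the Available-Dictionary request header:

```python
import base64
import hashlib

def dict_key(dictionary: bytes) -> str:
    # Available-Dictionary carries the SHA-256 of the dictionary,
    # base64-encoded and wrapped in colons (a structured-field
    # byte sequence).
    digest = hashlib.sha256(dictionary).digest()
    return ":" + base64.b64encode(digest).decode() + ":"

def negotiate(available_dictionary, dict_store):
    # If the dictionary the client advertises is still in the server's
    # store, compress against it (dcb); otherwise fall back to plain br.
    if available_dictionary and available_dictionary in dict_store:
        return "dcb", dict_store[available_dictionary]
    return "br", None

prev = b"<html>previous response for this session</html>"
store = {dict_key(prev): prev}

assert negotiate(dict_key(prev), store) == ("dcb", prev)
assert negotiate(None, store) == ("br", None)            # first visit
assert negotiate(":stale-hash:", store) == ("br", None)  # evicted dictionary
```

The two fallback branches are exactly the graceful degradation discussed below: a missing or evicted dictionary never breaks the page, it just costs compression ratio.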

Implications and what comes next

Dictionary TTL extends compression dictionary benefits to the largest class of web content that Compression Dictionaries hasn't reached yet. The technique doesn't require new compression algorithms, new wire formats, or new browser APIs. It uses the exact same compression-with-custom-dictionary mechanism (like dcb or dcz encoding) that RFC 9842 already standardizes. The only additional piece is a way for the dictionary to outlive the HTTP cache entry—which is exactly what Dictionary TTL provides.

The potential impact is significant. Dynamic HTML pages are commonly used on some of the most visited sites on the web. Every Google search, every Facebook scroll, every Amazon product page, every dashboard refresh. If even a fraction of these could use delta compression, the aggregate bandwidth savings across the web would be substantial.

But this isn't just about bandwidth. Fewer bytes on the wire means less time spent transferring content, which can translate to faster parsing and rendering. On constrained networks—mobile connections in emerging markets, satellite links, congested public WiFi—even a 2× reduction in transfer size translates directly to user-visible performance improvements.

The technique also composes well with existing optimizations. Sites already using shared dictionaries for their static resources (JS, CSS bundles) can additionally use delta compression for their dynamic HTML. The two approaches are complementary, not competing.

The server-side cost

While the bandwidth savings are compelling, keeping a unique HTML payload cached for every active session introduces a significant server-side storage challenge. Unlike static shared dictionaries (which are a single file served to everyone), dynamic delta compression requires the server to hold onto each user's previous response so it can compress the next one against it. At scale, this means potentially millions of unique dictionaries in memory simultaneously.

To manage this explosion of dictionaries, infrastructure will require smart caching strategies. Sticky routing—tying a user's session to a specific edge node or origin server—ensures the dictionary is already local when the next request arrives, avoiding cross-node lookups entirely. Alternatively, fast ephemeral key-value stores (Redis, Memcached) can hold dictionaries with TTL-based eviction that mirrors the ttl value in the Use-As-Dictionary header. This bounds memory growth to roughly active_sessions × avg_page_size, where the TTL window determines how long a session counts as active: for a site with 100,000 concurrent sessions, a 600-second TTL, and ~100 KB pages, that's approximately 10 GB, well within the capacity of a single Redis instance or a modest edge KV deployment.
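The TTL-mirroring store can be sketched as a toy in-memory version of what Redis or Memcached would provide with native expiry. All names here are invented for illustration:

```python
import time

class SessionDictionaryStore:
    """Minimal in-memory sketch of a per-session dictionary cache.

    TTL-based eviction mirrors the ttl advertised in Use-As-Dictionary,
    bounding memory to roughly active_sessions * avg_page_size
    (e.g. 100,000 sessions * ~100 KB pages is about 10 GB).
    """

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # session_id -> (expiry, previous page bytes)

    def put(self, session_id, page):
        self._store[session_id] = (time.monotonic() + self.ttl, page)

    def get(self, session_id):
        entry = self._store.get(session_id)
        if entry is None:
            return None
        expiry, page = entry
        if time.monotonic() > expiry:
            # Also expired in the browser's dictionary cache, so this
            # reclaim never forces a compression fallback mid-session.
            del self._store[session_id]
            return None
        return page

store = SessionDictionaryStore(ttl_seconds=600)
store.put("user-abc-123", b"<html>previous response</html>")
assert store.get("user-abc-123") == b"<html>previous response</html>"
```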

The protocol does handle missing dictionaries gracefully—if the server evicts a dictionary due to memory pressure, it simply ignores the Available-Dictionary hash and falls back to standard compression without a dictionary, so the page still loads correctly.

However, this fallback comes with a steep performance cost. Reverting to standard no-dictionary compression means the payload jumps back to its completely unoptimized size (in our demo, from ~2.6 KB all the way up to ~26 KB). When this happens, you completely miss out on the baseline savings that even a generic static dictionary (~18.5 KB) would have provided. Because of this significant drop-off in compression efficiency, the server-side dictionary cache must be highly reliable to deliver consistent, user-visible performance improvements. Eviction should be the exception, not the norm—and the TTL-based expiry model helps here, since dictionaries that have also expired in the browser's dictionary cache can be safely reclaimed without triggering a fallback.

Dictionary TTL is currently in experimental testing in Chromium. Demonstrations like this one can help build the case for bringing TTL back into the standards track—extending Compression Dictionary's reach to the dynamic content that dominates the web.

Try it yourself

The demo is live at delta-compression-demo.onrender.com. Search for anything. The first page shows standard Brotli only. The second page shows all three compression modes side by side. Keep searching to see how the ratios hold up across different queries.

The source code is on GitHub. It's a Flask server, about 500 lines total, and you can run it locally or deploy it with one click. The README walks through the protocol flow and measurement methodology.

I'd love to hear from anyone working on Compression Dictionary Transport (RFC 9842) implementations, content delivery, or web performance measurement. Is this use case relevant to your workload? Does the Dictionary TTL extension solve a problem you've encountered? The more real-world evidence we can gather, the stronger the case for bringing this capability into the standard.


Carlos Mateo Muñoz works on web performance and compression technologies. He co-authored the Chrome for Developers case study on compression dictionaries and works on cross-browser interoperability for IETF standards. You can find him on LinkedIn.
