Carlos Mateo

Beyond Static Resources: Delta Compression for Dynamic HTML

Compression Dictionary Transport (RFC 9842), authored by Patrick Meenan and Yoav Weiss, lets browsers use previously seen content as compression dictionaries, dramatically reducing transfer sizes. The standard has shipped in Chromium-based browsers (Chrome, Edge, Opera, etc.), with Firefox actively working toward enabling it. Early production deployments have shown double-digit percentage reductions in HTML payload sizes and measurable improvements to Largest Contentful Paint.

And there's a natural next step—one that could extend these benefits to the majority of the web's traffic.

(Note: While the standard supports multiple algorithms including Zstandard, this article will use Brotli and its dcb encoding for the purposes of demonstration.)

Where Compression Dictionaries shine today

Compression Dictionary Transport was designed around two powerful use cases:

Use case 1: Delta compression for versioned static resources. Your browser has app.v1.js cached. You deploy app.v2.js. The browser uses the old version as a compression dictionary for the new one, and the server only sends the diff. This works beautifully—but it requires the previous version to be sitting in the HTTP cache with a long max-age.

Use case 2: Pre-built shared dictionaries. You author a separate dictionary file, the browser fetches it during idle time via <link rel="compression-dictionary">, and subsequent responses compress against it. This captures general page structure—common HTML tags, CSS patterns, boilerplate—but it cannot include anything specific to a particular user session or request.

Both use cases are valuable. They also share a structural requirement: the dictionary—whether it's a previous version of a static file or a separately fetched dictionary resource—needs to remain available in the browser's dictionary cache long enough for a future response to reference it. Under the standard, that availability is tied to the response's Cache-Control lifetime. A resource with max-age=86400 stays usable as a dictionary for a day. A resource with no-cache or max-age=0 is evicted from the dictionary storage immediately—before the user even clicks the next link.

That's the starting point for what this article explores: what happens when you decouple those two lifetimes?

The opportunity: dynamic content

Think about the pages you visit most often: search results, social media feeds, news sites, ecommerce product listings, dashboards, email inboxes. Many of them share a common characteristic: they are often served with Cache-Control: no-cache or max-age=0. The HTML is personalized, timestamped, and never the same twice.

As currently specified, a response's usability as a compression dictionary is tied to its HTTP cache lifetime. When a dynamic page is served with no-cache, the browser discards it from the dictionary cache immediately. The previous response is gone before the user clicks.

This isn't a flaw in the standard—it's a deliberate scope boundary. While working on the standard, the HTTP Working Group actually discussed decoupling dictionary and cache lifetimes (issue #2649). However, without a compelling use case demonstrated at the time, they scoped it out of the initial specification but explicitly left the door open for future extensions.

Now that the initial implementation has shipped and demonstrated the massive potential for bandwidth savings, we want to look at the more complicated case, because we believe dynamic pages present an equally compelling use case. The same underlying mechanism (compression with a custom dictionary) can do more. The previous response from the same session is, in many ways, the ideal dictionary for dynamic content. We just need a way to keep it around long enough.

A two-by-two matrix showing RFC 9842's current coverage. The horizontal axis is dictionary source (previous version vs. pre-built shared dictionary) and the vertical axis is content type (static cached vs. dynamic no-cache). The two static-content cells are checked as supported today. The dynamic + previous response cell is marked as the opportunity—dictionaries are evicted immediately under current cache-control coupling. The dynamic + Dictionary TTL cell shows the solution: previous responses survive as dictionaries, achieving 10–27× smaller payloads than standard Brotli.

Dictionary TTL decouples dictionary availability from Cache-Control, enabling delta compression for dynamic content.

Why this matters: what's actually in a dynamic page?

To understand why this opportunity is significant, you need to look at what a real dynamic page contains: a page like the ones you'd get from Google Search, Facebook, Amazon, or any modern web application that serves customized content in the HTML document.

A production dynamic page is not just "a template plus some content." It's a complex mixture of:

  • Session-stable content that stays identical across every page the user visits in a session: CSS (often with hashed class names from CSS modules), user profile blocks, navigation elements, feature flag configurations, A/B test variant assignments, the JavaScript framework runtime, and inline configuration objects.

  • Per-request metadata that changes on every single page load: CSP nonces on every <script> and <style> tag, CSRF tokens, request IDs, trace IDs, server timestamps, ads data with unique bid IDs or inline JSON state blobs.

  • Dynamic content that varies per query/interaction: the actual search results, feed items, product cards—the part of the page the user is there to see.

Standard compression compresses each of these components from scratch on every request. A pre-built shared dictionary captures some of the general HTML and CSS patterns, but it can't match session-specific hashed class names, and it certainly can't deduplicate the 4-5 KB of inline JSON state that's unique per request but structurally identical to the previous request's state blob.

The previous response can. It has the exact same hashed class names, the same CSS values, the same navigation structure, the same inline JS configuration, and a structurally identical (though content-different) state blob. If you use it as the dictionary, the encoder matches everything that hasn't changed and only encodes the diff.
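To make this concrete, here is a minimal sketch of delta compression in Python, using zlib's zdict parameter as a standard-library stand-in for the Brotli dcb encoding the standard actually uses. The page generator and all of its contents are invented for illustration: a session-stable blob of hashed CSS class names, a per-request nonce, and query-dependent results.

```python
import hashlib
import zlib

def session_state(session_id: str) -> str:
    # Session-stable blob: hashed class names that are identical on every
    # page in a session but unique to it (and poorly self-compressible).
    rules = [
        f".c{i}_{hashlib.sha256(f'{session_id}:{i}'.encode()).hexdigest()[:8]}"
        + "{margin:0}"
        for i in range(120)
    ]
    return "<style>" + "".join(rules) + "</style>"

def make_page(session_id: str, query: str, nonce: str) -> bytes:
    results = "".join(
        f"<article>Result {i} for {query}</article>" for i in range(10)
    )
    return (
        session_state(session_id)
        + f'<script nonce="{nonce}">init()</script>'
        + results
    ).encode()

prev_page = make_page("user-abc-123", "coffee", nonce="abc123")
next_page = make_page("user-abc-123", "coffee beans", nonce="xyz789")

# Standard compression: every byte encoded from scratch.
plain = zlib.compress(next_page, 9)

# Delta compression: the previous response is the dictionary.
co = zlib.compressobj(level=9, zdict=prev_page)
delta = co.compress(next_page) + co.flush()

dec = zlib.decompressobj(zdict=prev_page)
assert dec.decompress(delta) == next_page  # lossless roundtrip
print(f"raw={len(next_page)} B  plain={len(plain)} B  delta={len(delta)} B")
```

The session-stable CSS dominates the page and is byte-identical between the two responses, so the delta stream is a fraction of the plain-compressed size: exactly the effect the demo measures with Brotli.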

A stacked bar breaks down a ~100 KB dynamic page into segments: hashed CSS (14 KB), session state and JSON payload (35 KB), HTML structure (8 KB), per-request tokens (2 KB), dynamic content (25 KB), and scripts/other (16 KB). Below the bar, three overlay rows show what each compression approach can match. Standard Brotli matches only scattered internal repetitions. A static dictionary matches generic patterns like navigation and scripts but misses hashed classes and session state. Delta compression using the previous response matches nearly the entire page—only genuinely new content and tokens need encoding—yielding roughly 97% savings compared to 68% for standard Brotli and 76% for a static dictionary.

The previous response from the same session matches CSS, state, navigation, and scripts byte-for-byte. Only the dynamic content and per-request tokens need to be encoded from scratch.

Dictionary TTL: extending the standard to dynamic content

Dictionary TTL is an experimental extension to the Use-As-Dictionary response header that decouples dictionary lifetime from HTTP cache lifetime:

Use-As-Dictionary: match="/search*", ttl=600
Cache-Control: no-cache

This tells the browser: "Yes, this response should be revalidated on every request (no-cache). But keep it available as a compression dictionary for 600 seconds, regardless." The dictionary cache becomes independent of the HTTP cache.
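On the server side, opting a dynamic response into this behavior is a matter of emitting the right headers. A minimal sketch follows; the header values track the standard, the ttl parameter is the experimental extension, and the helper function itself is hypothetical:

```python
def dictionary_headers(match: str, ttl_seconds: int) -> dict:
    # Headers for a dynamic HTML response that should also serve as a
    # compression dictionary for subsequent matching requests.
    return {
        # Revalidate the HTML itself on every request...
        "Cache-Control": "no-cache",
        # ...but keep it usable as a dictionary for ttl_seconds anyway.
        # (ttl is the experimental Dictionary TTL extension, not RFC 9842.)
        "Use-As-Dictionary": f'match="{match}", ttl={ttl_seconds}',
        # Cache keys must account for dictionary negotiation.
        "Vary": "Accept-Encoding, Available-Dictionary",
    }

print(dictionary_headers("/search*", 600))
```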

Dictionary TTL was part of the original Compression Dictionary Transport design. It was scoped out during the IETF standards process to keep the initial specification focused on the two well-understood use cases. It's currently available behind an experimental flag in Chromium (Chrome Canary and Beta), ready for developer testing, at chrome://flags/#enable-compression-dictionary-ttl

Demo

To demonstrate this, I built a live interactive prototype that simulates a realistic dynamic web application and measures compression ratios in real time.

The demo generates pages that model what production dynamic sites actually look like:

  • Session-dependent class name hashing. Every CSS class is hashed per session, like CSS modules in a real build pipeline: .result-card becomes .rC_9c3a64 for one session and .rC_c28ff9 for another. This means a static dictionary built from someone else's session literally cannot match any of your CSS selectors.

  • Session-variant CSS values. The primary color hue, accent colors, border radii, spacing units, font stacks, shadow values, and layout dimensions all vary per session. Two sessions visiting the same page get structurally different CSS.

  • Conditional component rendering. Feature flags (stable per session) determine which HTML sections exist: knowledge panels, filter chips, personalized recommendation blocks, feedback prompts. Different sessions get different DOM structures.

  • Large inline data blobs. A substantial window.__INITIAL_STATE__ JSON payload containing user profile data, activity metrics, notification feeds, and session configuration—mimicking the heavy personalized state that frameworks like Next.js and React inject into every server-rendered page.

  • Per-request tokens everywhere. CSP nonces, CSRF tokens, request IDs, trace IDs, server timestamps, ad auction IDs, and preload hints that change on every single request.

The combination means that zero CSS class names are shared between different sessions:

Session "user-abc-123":   .result-card → .rC_9c3a64
Session "dict-sample-0":  .result-card → .rC_c28ff9

Class names shared between sessions: 0/59

This is deliberately hostile to static dictionaries, but it's exactly how many production sites work.
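For illustration, a per-session class-name hash along these lines can be produced in a few lines of Python. This is a hypothetical stand-in for a CSS-modules build step, not the demo's actual implementation:

```python
import hashlib

def hash_class(session_id: str, class_name: str) -> str:
    # Abbreviate "result-card" to "rC", then append a short digest that
    # is stable within a session but different across sessions.
    parts = class_name.split("-")
    abbrev = parts[0][0] + "".join(p[0].upper() for p in parts[1:])
    digest = hashlib.sha256(f"{session_id}:{class_name}".encode()).hexdigest()[:6]
    return f"{abbrev}_{digest}"

# Same class, two sessions: same "rC_" prefix, disjoint digests.
print(hash_class("user-abc-123", "result-card"))
print(hash_class("dict-sample-0", "result-card"))
```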

A screenshot of the live metrics panel from the delta compression prototype. The panel shows real-time compression results for a dynamic page served with three encoding strategies side by side: standard Brotli, static dictionary, and delta compression using the previous response. Each column displays the compressed size in bytes, the compression ratio, and the savings percentage. The delta compression column shows dramatically smaller payloads—on the order of 1–2 KB for new queries and as low as 52 bytes for duplicate queries—compared to roughly 24 KB for standard Brotli and 17 KB for the static dictionary.

The metrics panel shows compression results in real time. Standard Brotli, static dictionary, and delta compression are compared on every page load.

Results

The demo measures three compression modes on every page load:

| Mode | Dictionary source | Content-Encoding |
| --- | --- | --- |
| Standard Brotli | None | br |
| Static Dictionary | Pre-built from 5 sample pages (different sessions) | dcb |
| Delta Compression | Previous response from the same session | dcb |

Here are the results across navigation patterns:

A grouped bar chart comparing compressed output size across three navigation patterns. For a duplicate query (tab restore or refresh), standard Brotli produces 24,323 bytes, a static dictionary produces 16,887 bytes, and delta compression produces just 52 bytes. For a new query (different search term), the three methods produce roughly 23,500, 16,000, and 1,800 bytes respectively. For a different topic (cross-domain navigation with a larger page), they produce roughly 38,000, 30,000, and 2,500 bytes. Delta compression bars are barely visible at this scale.

Real measurements from the live demo. Raw page sizes range from 50–122 KB.

Cross-query navigation (the most common case—user searches for something new):

  • Raw page size: ~72–122 KB
  • Standard Brotli: ~23–38 KB (68% savings)
  • Static Dictionary: ~15–30 KB (76–80% savings)
  • Delta Compression: ~1,000–2,700 B (96–98% savings)

Delta beats standard Brotli by 10–27×. Not 10–27 percent—ten to twenty-seven times smaller.

Duplicate queries and tab restores (the extreme case—user refreshes, hits back, or restores a tab):

  • Raw page size: ~78 KB
  • Standard Brotli: 24,323 B
  • Delta Compression: 52 bytes

Fifty-two bytes. The entire page—CSS, HTML, JavaScript, inline state blobs, sidebar, footer, everything—compresses to 52 bytes. That's a 468× reduction over standard Brotli. The only bytes that changed were the per-request CSRF token, CSP nonce, and server timestamp. Brotli encodes the rest as a single "copy from dictionary" instruction.

This isn't an edge case. Tab restores, non-cached back-button navigations, and pagination within the same query all produce near-identical pages. Any time the dynamic content hasn't changed between requests, delta compression reduces the transfer to effectively zero.

Horizontal bars illustrate the extreme case of a duplicate query within the same session—simulating a tab restore, back-button press, or page refresh. Raw HTML is 78,166 bytes. Standard Brotli compresses it to 24,323 bytes (68.9% savings). A static dictionary compresses it to 16,887 bytes (78.4% savings). Delta compression reduces it to just 52 bytes (99.93% savings)—a 468× improvement over standard Brotli and 1,503× over the raw page.

Same query, same session. The entire 78 KB page compresses to 52 bytes—only the per-request tokens changed. (Network overhead not included.)

Why are the results so dramatic? Because the static dictionary was built from different sessions. It sees different hashed class names, different CSS property values, and different DOM structures. It can match some general HTML patterns (<div class=", </article>) but nothing session-specific. The delta dictionary is from the same session—identical class names, identical CSS, identical navigation structure. Brotli's sliding window matches them byte-for-byte and only encodes what actually changed.

The key insight

Standard Brotli exploits repetition within a single response. A pre-built static dictionary captures general structural patterns across the web.

Delta compression using the previous response exploits repetition across sequential responses from the same session, achieving near-optimal deduplication of everything that hasn't changed.

The insight is that for dynamic web applications, the best compression dictionary isn't a carefully authored generic file—it's the page the user just visited. It contains exactly the right bytes: the user's session state, their CSS variant, their feature flags, the framework boilerplate, the navigation structure. All the compressor has to do is encode what's different.

A horizontal spectrum from less context (left) to more context (right) with three nodes. On the left, standard Brotli with no dictionary compresses to roughly 24 KB by exploiting repetition within a single response. In the middle, a shared pre-built dictionary compresses to roughly 17 KB by capturing common patterns across users—a 1.4× improvement. On the right, delta compression using the previous response compresses to roughly 1.8 KB (52 bytes on duplicate queries) by capturing exact session state byte-for-byte—a 10–27× improvement over standard Brotli.

For dynamic content, the jump from shared dictionary to previous-response dictionary is where the dramatic gains are.

Measurement methodology

Getting honest compression numbers requires some care. The demo uses a two-phase rendering approach:

  1. Clean render: Generate the full page (skeleton + dynamic content) with no metrics overlay. This is what gets compressed, measured, and cached as a dictionary.
  2. Display render: Inject the metrics panel into the clean page for the user to see.

This ensures the compression ratios reflect what you'd see in production. The metrics panel is purely educational—it wouldn't exist on a real site, and it doesn't inflate the measured page size.

The static dictionary is built from pages generated with different session IDs—simulating the realistic scenario where a CDN or origin server pre-builds a dictionary from diverse traffic. This is the best a static dictionary can do: capture cross-user structural patterns. It's still missing every session-specific byte.

Since Dictionary TTL is not yet enabled in all browsers, the demo uses a server-side cookie fallback to simulate the Available-Dictionary header, ensuring the protocol flow and compression measurements work regardless of your browser's current CDT support. You can inspect the real CDT headers (Use-As-Dictionary, Content-Dictionary, Vary) in DevTools on every response.
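The negotiation the demo simulates reduces to a hash lookup. Here is a hedged sketch of the server side; the function and store names are invented, and in the real protocol the hash travels in the Available-Dictionary request header:

```python
import base64
import hashlib

def dict_key(dictionary: bytes) -> str:
    # Available-Dictionary carries the SHA-256 of the dictionary,
    # base64-encoded and wrapped in colons (a structured-field
    # byte sequence).
    digest = hashlib.sha256(dictionary).digest()
    return ":" + base64.b64encode(digest).decode() + ":"

def negotiate(available_dictionary, dict_store):
    # If the dictionary the client advertises is still in the server's
    # store, compress against it (dcb); otherwise fall back to plain br.
    if available_dictionary and available_dictionary in dict_store:
        return "dcb", dict_store[available_dictionary]
    return "br", None

prev = b"<html>previous response for this session</html>"
store = {dict_key(prev): prev}

assert negotiate(dict_key(prev), store) == ("dcb", prev)
assert negotiate(None, store) == ("br", None)            # first visit
assert negotiate(":stale-hash:", store) == ("br", None)  # evicted dictionary
```

The two fallback branches are exactly the graceful degradation discussed below: a missing or evicted dictionary never breaks the page, it just costs compression ratio.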

Implications and what comes next

Dictionary TTL extends compression dictionary benefits to the largest class of web content that Compression Dictionaries hasn't reached yet. The technique doesn't require new compression algorithms, new wire formats, or new browser APIs. It uses the exact same compression-with-custom-dictionary mechanism (like dcb or dcz encoding) that RFC 9842 already standardizes. The only additional piece is a way for the dictionary to outlive the HTTP cache entry—which is exactly what Dictionary TTL provides.

The potential impact is significant. Dynamic HTML pages are commonly used on some of the most visited sites on the web. Every Google search, every Facebook scroll, every Amazon product page, every dashboard refresh. If even a fraction of these could use delta compression, the aggregate bandwidth savings across the web would be substantial.

But this isn't just about bandwidth. Fewer bytes on the wire means less time spent transferring content, which can translate to faster parsing and rendering. On constrained networks—mobile connections in emerging markets, satellite links, congested public WiFi—even a 2× reduction in transfer size translates directly to user-visible performance improvements.

The technique also composes well with existing optimizations. Sites already using shared dictionaries for their static resources (JS, CSS bundles) can additionally use delta compression for their dynamic HTML. The two approaches are complementary, not competing.

The server-side cost

While the bandwidth savings are compelling, keeping a unique HTML payload cached for every active session introduces a significant server-side storage challenge. Unlike static shared dictionaries (which are a single file served to everyone), dynamic delta compression requires the server to hold onto each user's previous response so it can compress the next one against it. At scale, this means potentially millions of unique dictionaries in memory simultaneously.

To manage this explosion of dictionaries, infrastructure will require smart caching strategies. Sticky routing—tying a user's session to a specific edge node or origin server—ensures the dictionary is already local when the next request arrives, avoiding cross-node lookups entirely. Alternatively, fast ephemeral key-value stores (Redis, Memcached) can hold dictionaries with TTL-based eviction that mirrors the ttl value in the Use-As-Dictionary header. This bounds memory growth to roughly active_sessions × avg_page_size, where the TTL window determines how long a session counts as active: for a site with 100,000 concurrent sessions, a 600-second TTL, and ~100 KB pages, that's approximately 10 GB, well within the capacity of a single Redis instance or a modest edge KV deployment.
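The TTL-mirroring store can be sketched as a toy in-memory version of what Redis or Memcached would provide with native expiry. All names here are invented for illustration:

```python
import time

class SessionDictionaryStore:
    """Minimal in-memory sketch of a per-session dictionary cache.

    TTL-based eviction mirrors the ttl advertised in Use-As-Dictionary,
    bounding memory to roughly active_sessions * avg_page_size
    (e.g. 100,000 sessions * ~100 KB pages is about 10 GB).
    """

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # session_id -> (expiry, previous page bytes)

    def put(self, session_id, page):
        self._store[session_id] = (time.monotonic() + self.ttl, page)

    def get(self, session_id):
        entry = self._store.get(session_id)
        if entry is None:
            return None
        expiry, page = entry
        if time.monotonic() > expiry:
            # Also expired in the browser's dictionary cache, so this
            # reclaim never forces a compression fallback mid-session.
            del self._store[session_id]
            return None
        return page

store = SessionDictionaryStore(ttl_seconds=600)
store.put("user-abc-123", b"<html>previous response</html>")
assert store.get("user-abc-123") == b"<html>previous response</html>"
```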

The protocol does handle missing dictionaries gracefully—if the server evicts a dictionary due to memory pressure, it simply ignores the Available-Dictionary hash and falls back to standard compression without a dictionary, so the page still loads correctly.

However, this fallback comes with a steep performance cost. Reverting to standard no-dictionary compression means the payload jumps back to its completely unoptimized size (in our demo, from ~2.6 KB all the way up to ~26 KB). When this happens, you completely miss out on the baseline savings that even a generic static dictionary (~18.5 KB) would have provided. Because of this significant drop-off in compression efficiency, the server-side dictionary cache must be highly reliable to deliver consistent, user-visible performance improvements. Eviction should be the exception, not the norm—and the TTL-based expiry model helps here, since dictionaries that have also expired in the browser's dictionary cache can be safely reclaimed without triggering a fallback.

Dictionary TTL is currently in experimental testing in Chromium. Demonstrations like this one can help build the case for bringing TTL back into the standards track—extending Compression Dictionary's reach to the dynamic content that dominates the web.

Try it yourself

The demo is live at delta-compression-demo.onrender.com. Search for anything. The first page shows standard Brotli only. The second page shows all three compression modes side by side. Keep searching to see how the ratios hold up across different queries.

The source code is on GitHub. It's a Flask server, about 500 lines total, and you can run it locally or deploy it with one click. The README walks through the protocol flow and measurement methodology.

I'd love to hear from anyone working on Compression Dictionary Transport (RFC 9842) implementations, content delivery, or web performance measurement. Is this use case relevant to your workload? Does the Dictionary TTL extension solve a problem you've encountered? The more real-world evidence we can gather, the stronger the case for bringing this capability into the standard.


Carlos Mateo Muñoz works on web performance and compression technologies. He co-authored the Chrome for Developers case study on compression dictionaries and works on cross-browser interoperability for IETF standards. You can find him on LinkedIn.
