In the previous part, we saw that a website can load in 15 different ways when visited from the Google results page and how to measure the performance impact of each load type.
In this post, I will share the results of my website's performance measurement, along with my comments.
I assume you are familiar with the differences between various page load types. If you need a refresher, please read the previous post.
This post is part of a series on Signed Exchanges (SXG). To assess SXG's impact, I measured the performance of different page load types; this article summarizes my findings.
SXG is a technology to make your website load faster for Google-referred users. If you want to implement it on your website, start here.
TL;DR
For the impatient, here are the results: the Largest Contentful Paint (LCP) 75th percentiles I measured (or, where the labels say so, estimated; the estimation method is explained later in the text).
I show that it’s possible to go below half a second, but SXG side effects may worsen the experience for some users. HTML-only prefetching improved the performance, but sometimes only slightly.
Your mileage may vary
These results are based on real user data from my specific website, measured from Polish users. Your results may vary significantly based on your website's architecture, user geography, CDN configuration, and other factors. The patterns I observed, particularly the desktop TTFB issues, may be unique to my setup.
Methodology
Measured pages
I share my findings focusing on a specific section of my website: the vendor index page (similar to a product index page in e-commerce). While I measured other sections as well, including them here would add length and complexity without providing significant additional value.
The vendor index page receives significant traffic from Google and has strong performance metrics, making it an ideal candidate for comparison testing.
Chosen page load types
The results include all the page load types I identified in the previous post, except for types related to:
Accelerated Mobile Pages (AMP), as my website doesn’t use it
Google Ads (no ad campaigns for the measured section of the website at the moment of data collection)
Early Hints, because my website uses HTML edge caching (more on this later)
Data collection conditions
I collected data only from visits that met the following conditions. For the explanation of why these specific conditions were necessary, see my previous post.
It’s a Google-referred visit
The user visited the website for the first time (checked with a local storage marker item)
The page has been opened in an existing tab
The browser supported SXG
All of the above data was sent to DebugBear, a monitoring platform.
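To make the collection conditions concrete, here is a simplified sketch of the kind of check that gates data collection. The localStorage key and the heuristics are illustrative placeholders, not my exact production code:

```javascript
// Illustrative sketch of the visit eligibility checks described above;
// the localStorage key and the heuristics are placeholders.
function isEligibleVisit() {
  // 1. Google-referred visit
  const fromGoogle = document.referrer.startsWith('https://www.google.');

  // 2. First visit, tracked with a local storage marker item
  const firstVisit = localStorage.getItem('visitMarker') === null;
  if (firstVisit) localStorage.setItem('visitMarker', '1');

  // 3. Opened in an existing tab: approximated here by excluding
  // prerendered pages (a heuristic, not a complete check)
  const existingTab = !document.prerendering;

  // 4. SXG support is best detected server-side from the Accept request
  // header (application/signed-exchange), so it's omitted in this sketch.
  return fromGoogle && firstVisit && existingTab;
}
```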
Filters applied to collected data
Then I used its filtering functionality to include only:
Users visiting the website from Poland, as Polish users are my target population, so I wanted to understand how they experience website performance. Additionally, filtering by location helps exclude bots that typically visit my website from other countries.
Visits not using cached assets, as the local storage check (described above) did not always work properly.
Visits with a non-zero Time To First Byte (TTFB) for load types not involving prefetching. It's practically impossible to receive the first byte of the response in under 1 millisecond, so I treated zero-TTFB samples as invalid.
Page load types that can be reliably measured, which means excluding SXG redirects. I'll explain this exclusion later.
After the above filtering, I was left with over 14k data points.
Obtained performance metrics
I segmented the data by page load type and device category to generate LCP histograms and calculate 75th percentiles and averages for each combination.
For calculating averages, I removed outliers by excluding visits with LCP over 5 seconds.
I rely on 75th percentiles by default. When using averages, I always explicitly mention it.
Estimations
I estimated LCP for page load types that use SXG redirects:
When estimating averages, I applied a bias correction to the reference average, which I found necessary for accuracy.
For estimating 75th percentiles, I observed a correlation between percentiles and averages. I then leveraged this relationship to calculate missing percentiles from the available averages.
Note that the estimated percentiles were derived from estimated averages, creating a dependency chain in the calculations.
I’ll provide more details later in the text.
Spreadsheet with data
The spreadsheet below contains the LCP data I collected from DebugBear, along with comparison charts. It consists of two worksheets:
- 75th percentiles
- averages
Each worksheet contains a configuration section where you can experiment with the estimation parameters to see how the results change.
Download LCP analysis of page load types
Disclaimer
I’m not a data scientist, but I did my best to ensure the data is reliable and the conclusions sound.
LCP results
Below, you can see the comparison of the speed (LCP) of page load types. It's the same chart as the one at the beginning of this post; I put it here so you can reference it easily while reading my observations.
Desktop vs mobile
Looking at the chart, you can see that half of the page load types work better on desktop, while the other half works better on mobile:
When the page was loaded from Google (SXG On-Demand Load) or prefetched, the desktop won.
When the browser was talking to Cloudflare (Server Load, Edge Cache Load, and both SXG Redirects), mobile came out on top.
I expected desktop users to have a better experience than mobile users because of better connection quality and more CPU power. Why, then, doesn't this hold for all page load types?
TTFB component of LCP
When I dug deeper into the data, I found that for Cloudflare loads, TTFB contributed 36-51% of LCP on desktop, but only 22-33% on mobile. Compare the TTFB histograms below; mobile looks much better:
For prefetched pages, TTFB is zero or near zero, so it's useless for comparisons. But when pages were loaded from the Google SXG cache on demand, TTFB was better on desktop (the opposite of the previous case): it contributed 25% of LCP on desktop and 30% on mobile.
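If you'd like to reproduce this kind of breakdown yourself, here is a minimal sketch using the standard Navigation Timing and PerformanceObserver browser APIs. It's illustrative only; DebugBear collects these metrics with its own instrumentation:

```javascript
// Compute TTFB and its share of LCP with standard browser APIs.
const [nav] = performance.getEntriesByType('navigation');
const ttfb = nav.responseStart; // ms since the navigation started

new PerformanceObserver((list) => {
  const entries = list.getEntries();
  const lcp = entries[entries.length - 1].startTime; // latest LCP candidate
  console.log(`TTFB: ${ttfb.toFixed(0)} ms, LCP: ${lcp.toFixed(0)} ms`);
  console.log(`TTFB share of LCP: ${(100 * ttfb / lcp).toFixed(1)}%`);
}).observe({ type: 'largest-contentful-paint', buffered: true });
```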
Why did waiting for the first byte take longer on desktop than on mobile for Cloudflare loads?
I hypothesize that in Poland, Cloudflare edge servers have better network connectivity with mobile ISPs than with residential ones. This could be because there are only a few major mobile operators, while many residential ISPs exist.
From Cloudflare's perspective as a global infrastructure provider, Poland might be considered a relatively small market, leading them to potentially prioritize peering agreements with the larger operators.
Note that this is specific to my measurements in Poland and may not apply elsewhere.
If you have a better explanation, drop a comment below.
Reference page load type for comparisons
In this post, I compare the P75 LCP of each page load type to that of a baseline on-demand page load from the server:
Mobile: 1.43 seconds
Desktop: 1.82 seconds
The gap between mobile and desktop is 390 ms.
Keep in mind that the desktop LCP in this study is heavily impacted by the increased TTFB, as discussed above. This impact may not be present in different countries and/or in the future. This makes it difficult to draw general conclusions (i.e., unrelated to my specific website) based on comparisons of page load types on desktop and between device categories.
On the other hand, the mobile performance characteristics should be fairly universal.
Below, you can see the LCP histograms with the TTFB impact on desktop clearly visible:
SXG prefetching with subresources is unbeatable
You probably won't be surprised that prefetching the entire website (document and subresources) using SXG is the definitive winner in terms of performance. Compared to the reference on-demand load from the server:
Mobile: -846 ms
Desktop: -1356 ms
As you can see above, I mark LCP improvements with a minus sign (an LCP decrease, which is what we want) and LCP degradations with a plus sign (an LCP increase, which we want to avoid). Absolute values carry no sign, as shown below.
As a result, the LCP is as low as:
Mobile: 584 ms
Desktop: 464 ms
A half-second LCP almost feels unreal, doesn't it? If you'd like more insights like this, subscribe to my newsletter.
Below you will find the LCP histograms. Note that data points above 1.5 seconds are almost non-existent!
Can this be improved even further?
My data shows that for SXG-prefetched pages with subresources, 95% of LCP time is spent on rendering.
For extremely low LCP: use large images or videos freely (they're prefetched anyway) and keep your page structure lean and simple (to minimize rendering). The first won't hurt; the second fixes the real bottleneck.
Second place belongs to HTML-only prefetching
While this may seem obvious, it's worth stating clearly: prefetching improves performance regardless of whether you use Speculation Rules or SXG, even when only the HTML document is prefetched. Compared to the reference on-demand load from the server, HTML-only prefetching is faster by:
Mobile: from -20 to -100 ms
Desktop: from -670 to -740 ms
The mobile results appear subpar, but they align roughly with the results reported by Google itself. The desktop performance improvement is significant, mostly because prefetching is a workaround for Cloudflare’s TTFB issue.
I believe inlining critical subresources, such as important CSS fragments, should make HTML-only prefetching more performant. However, I haven’t implemented it on my website (yet).
Speculation Rules are slightly better than SXG for prefetching HTML
A page prefetched using Speculation Rules loads a bit faster than the one prefetched using SXG without subresources. That’s interesting, since both methods technically do the same thing.
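For reference, HTML-only prefetching with Speculation Rules boils down to a small JSON script tag like this (the URL is a placeholder):

```html
<!-- Ask the browser to prefetch the HTML document of a likely next page. -->
<script type="speculationrules">
{
  "prefetch": [
    { "urls": ["/vendors/"] }
  ]
}
</script>
```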
On the histograms below, you can see that SXG has more extreme samples (below 250 milliseconds and 5+ seconds), while Speculation Rules’ samples are much more concentrated below 1 second.
I suspect the difference may be explained in part by the cryptography-related CPU overhead of SXG.
Mobile histograms
Desktop histograms
Edge caching was worth it
Introducing HTML edge caching lets me skip processing frequently accessed pages on my server. Also, as Cloudflare works as a reverse proxy, these pages don’t need to be transferred between my server and Cloudflare on each request. They are kept on an optimized infrastructure and retrieved in milliseconds.
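For illustration, one way to make HTML eligible for Cloudflare's edge cache is a cache rule that caches HTML, combined with origin response headers along these lines. The directives and TTL below are placeholder assumptions, not necessarily my production configuration:

```http
Cache-Control: public, max-age=0, must-revalidate
CDN-Cache-Control: max-age=86400
```

Here, browsers revalidate on every visit, while Cloudflare's edge keeps the page for a day.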
This is visible in LCP measurements, when comparing on-demand page loads from the server and the edge cache:
Mobile: -120 ms
Desktop: -350 ms
Given that edge caching benefits all traffic, not only Google-referred visits, I find the improvement satisfactory, especially on desktop.
On the histogram above, you can see the absolute LCP values:
Mobile: 1.31 seconds
Desktop: 1.47 seconds
This time, the gap between mobile and desktop is 160 ms, a 2.5x decrease compared to the reference Server Load. I found it's caused by TTFB again, but why is the difference smaller now?
I don’t know. If you have an explanation, drop a comment below.
Proper HTML edge caching makes Early Hints' impact negligible
As I mentioned at the beginning, I didn't include measurements for the Early Hints page load type.
Cloudflare caches the Link HTTP header from the server’s first response and then reuses it to send Early Hints in later responses for the same URL.
However, when the edge cache is in use, not only the Link header but the entire HTTP response is cached. In effect, later responses almost always include the full page directly from the cache. The origin server is never contacted, so there's no gap between sending Early Hints and delivering the complete response. As a result, Early Hints provide no performance benefit.
Because of this, I wasn’t able to collect enough samples of requests that actually went to the origin server, where Early Hints could make a difference.
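For context, here is roughly what the flow looks like when Early Hints do apply: Cloudflare replays a previously seen Link header as a 103 response while the origin is still generating the full page (the stylesheet path is a placeholder):

```http
HTTP/1.1 103 Early Hints
Link: </css/main.css>; rel=preload; as=style

HTTP/1.1 200 OK
Link: </css/main.css>; rel=preload; as=style
Content-Type: text/html
```

With HTML edge caching, the full 200 response arrives from the edge right away, leaving no window in which the 103 could help.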
SXG On-Demand Load impact varies between device categories
When the SXG version of a page isn't prefetched, it must be fetched on demand. The reasons for this were explained in the previous part.
In this scenario, compared to the reference, the LCP difference was as follows:
Mobile: +240 ms
Desktop: -210 ms
Mobile
The slowdown on mobile can be attributed to the network overhead of downloading SXG subresources on demand. Each subresource may require its own certificate file to be fetched, which in the worst case can double the number of files that need to be downloaded. The higher latency of mobile connections makes this effect more visible.
Also, SXG processing requires performing cryptographic operations. Mobile CPUs often have less processing power, which can make the overhead more visible and further contribute to LCP degradation.
Desktop
I attribute the desktop LCP improvement to the TTFB issue, as explained earlier. Normally, I would expect a slight performance degradation.
The dark side of SXG
As you have seen above, loading the page on demand from the Google SXG cache is not optimal, at least on mobile devices. That’s not good for the technology that promised to improve page load speed. But the real problem is much worse.
SXG fallback client-side redirect
It occurs when the page is missing from Google's SXG cache. When the user navigates to the target page, the SXG cache serves a fallback page to the browser. This page contains client-side JavaScript code that redirects the browser to the target page.
It would be interesting to know how much this degrades performance!
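To make this concrete, here is a minimal sketch of what such a fallback page could look like. The actual page served by Google's SXG cache differs, but the mechanism, a tiny HTML document whose only job is a JavaScript redirect, is the same (the target URL is a placeholder):

```html
<!-- Hypothetical SXG fallback page: a tiny document that immediately
     redirects the browser to the target page via client-side JavaScript. -->
<script>location.replace('https://example.com/vendors/');</script>
```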
RUM tools blind you to the SXG performance problems users experience
I collected many samples with detailed performance measurements for page loads redirected from the Google SXG cache. Unfortunately, I had to throw them out entirely.
I noticed they made no sense at all: the redirect seemed to have no impact on LCP. No improvement (that's obvious), but also none of the expected degradation.
Why? The answer is straightforward; however, it took me a while to understand it. The bottom line is that it's impossible to measure this in production. No matter which Real User Monitoring (RUM) solution you use, you won't be able to tell whether you have a performance issue!
Start of navigation bias
When a user is on page A and clicks a standard link to page B, LCP measurement begins at the moment of the click. This way, the LCP value reflects the entire wait time for the largest element on the target page, including network delays, HTTP redirects, and other overhead.
The situation changes when an SXG-fallback intermediary page uses JavaScript to redirect the user to the target page. In this case, the browser cannot distinguish whether the redirect was triggered by the user or automatically. It assumes a user action and starts measuring time only from the moment the client-side redirect occurs—not from the original click on the Google search results page.
This means the interval between the click and the client-side redirect—I'll call this click-to-redirect—is invisible to the browser. And that gap can be significant. In my tests, it ranged anywhere from 50 to 2000 ms, depending on the device, browser, connection type, and likely other factors.
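You can see the bias directly in the browser's own APIs. On the target page, every timestamp is relative to performance.timeOrigin, which is set when the client-side redirect starts the new navigation, not when the user clicked the search result. A minimal sketch:

```javascript
// On the target page after the fallback redirect: startTime counts from
// performance.timeOrigin, which was reset when the JavaScript redirect
// started the navigation. The click-to-redirect gap is simply not in it.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log('LCP (ms since the redirect, not the click):', entry.startTime);
  }
}).observe({ type: 'largest-contentful-paint', buffered: true });
```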
I built an SXG fallback redirect demo so you can try this yourself and see the difference. The demo also lets you simulate SXG and Speculation Rules prefetching, but in my testing, those didn't affect the results.
Estimating the average SXG fallback LCP
The LCP of a page loaded using SXG fallback redirect should be a sum of the LCP of the underlying page load type (Server Load or Edge Cache Load) and the click-to-redirect time.
The HTML of the SXG fallback redirect page is under 350 bytes, so the time it takes for the page to load and start the redirect is almost the same as its TTFB.
If the TTFB of the fallback page is similar to the TTFB of SXG On-Demand Load (which it should be, since both responses are generated by the same system), I could use the already collected data.
I prepared a demo that measures the TTFB of an SXG On-Demand Load. In my tests, I confirmed that this TTFB roughly equals the click-to-redirect time. You can check it yourself here and compare the result with the measurement from the previous demo.
For the initial estimation, I will use averages because they are easier to work with than percentiles.
The average click-to-redirect delay, which should be added to the average LCP of the underlying page load type, is:
Mobile: +403 ms
Desktop: +311 ms
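Put as a formula, the estimate is simply: estimated average LCP = average LCP of the underlying load type + average click-to-redirect delay. A sketch with placeholder underlying averages (the real values live in the spreadsheet):

```javascript
// Estimate the average LCP of an SXG fallback load, as described above.
// The underlying averages are placeholders; see the spreadsheet for real data.
const clickToRedirect = { mobile: 403, desktop: 311 };    // ms, measured
const underlyingAvgLcp = { mobile: 1200, desktop: 1500 }; // ms, hypothetical

const estimatedFallbackAvgLcp = {
  mobile: underlyingAvgLcp.mobile + clickToRedirect.mobile,    // 1603 ms
  desktop: underlyingAvgLcp.desktop + clickToRedirect.desktop, // 1811 ms
};
```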
You can find the estimated absolute average LCP values in the spreadsheet mentioned earlier. Below is a chart comparing the average LCP for each page load type. The conclusions are roughly similar to those for the 75th percentile.
Estimating the 75th percentile LCP for SXG fallback
I see the 75th percentile as the standard way of assessing LCP performance. It's used by the Chrome User Experience Report (CrUX). Therefore, we should estimate it for SXG fallback page load types.
For my data, I found that each page load type's 75th percentile can be approximated by increasing its average by 20-40%. Therefore, I assumed that the missing 75th percentiles for SXG Redirect loads can be estimated the same way. For the exact calculations, see the spreadsheet included earlier.
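Here is that estimation as a sketch. The 1.3 uplift factor is a placeholder from within the 20-40% range I observed; the spreadsheet's configuration section lets you experiment with it:

```javascript
// Approximate a missing 75th percentile from an estimated average by
// applying the average-to-P75 uplift observed on measurable load types.
const avgToP75Factor = 1.3; // placeholder within the observed 20-40% range

const estimateP75 = (estimatedAvgLcp) => estimatedAvgLcp * avgToP75Factor;

// E.g., using the hypothetical mobile estimate from the previous sketch:
console.log(estimateP75(1603)); // ≈ 2084 ms
```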
SXG fallback significantly hurts performance
Below, you can see how the P75 LCP degrades when the page loads using SXG fallback compared to the reference:
For Server Load
For Edge Cache Load
The LCP increase for the Edge Cache Load on desktop may not seem like much, but keep in mind that it erased all the edge caching gains.
In my opinion, the performance degradation for pages loaded using SXG fallback is substantial. And neither Google nor Cloudflare documentation will tell you this.
While these specific performance numbers are from my website, the measurement blind spot I've identified is a fundamental issue that affects all SXG implementations.
Enjoy uncovering hidden insights? I share experiments, lessons learned, and surprising discoveries from the problems I dig into. If you like finding out what others overlook, you’ll enjoy my newsletter.
SEO is not impacted
There’s an interesting side effect of the fact that LCP for SXG fallbacks cannot be measured accurately in production.
Chrome browsers continuously report LCP (along with the other Core Web Vitals) to Google, which then publishes the aggregated data as CrUX.
In my experiment, Chrome reported incomplete LCP data to Google when a page was loaded via an SXG fallback redirect. I throttled the network to 3G and simulated the fallback. The LCP shown in Chrome DevTools’ Performance panel (screenshot below, left) matched what Chrome reported to Google (screenshot below, right). Neither measurement included the click-to-redirect delay.
As you know, LCP is a ranking factor: a good score (≤ 2.5 seconds) should boost your SERP positions. And how does Google get that data? From CrUX.
Now imagine this scenario: your LCP sits right at 2.5 seconds. You enable SXG, but don’t configure it properly. As a result, most of your pages load through a fallback redirect. Your real LCP rises above 2.5 seconds, degrading UX. But Google still sees the optimistic value from CrUX and continues to treat your site as if it had a good LCP—effectively ranking you higher than it should.
The SXG Tradeoff
Website loading speed depends heavily on how pages are delivered. Techniques like SXG and Speculation Rules, combined with edge caching, can dramatically improve LCP. At the same time, the very SXG mechanism that enables sub-second page loads can also introduce scenarios where performance suffers.
This raises an important question: Is it acceptable to sacrifice the experience of some visitors so that others enjoy a much faster site? The answer likely depends on the ratio between those who benefit and those who are negatively affected.
The key questions are:
How many visitors experience degraded performance when SXG is enabled?
What strategies can reduce or eliminate those negative effects?
When we balance the wins against the drawbacks, is SXG’s overall impact on LCP positive or negative?
I’ll explore these questions in the next part of this series.
If you’d like to be notified when it’s published—and get more insights on performance experiments like this—make sure to follow me or subscribe to my newsletter.
Thanks
A special thanks to Michał Pokrywka and Maciej Woźny for their valuable comments.
I'd also like to thank Matt Zeunert and the DebugBear team for providing me with access to their web performance monitoring service.
And thank you for reading. I hope you enjoyed it!
I'm researching Signed Exchanges extensively - connect with me here or on my blog for more insights.