In this article we'll cover why Lighthouse scores alone don't tell the full story, what metrics actually matter at scale, how real user monitoring changes the way you think about performance, and the optimizations that have the most impact on platforms serving millions of users.
Lighthouse is a great tool. I'm not here to tell you to ignore it. But I've seen teams chase a perfect Lighthouse score while their real users were experiencing 4-second load times on mid-range Android devices over a 4G connection.
The score looked great. The experience wasn't.
When you're building for 10 million users, performance stops being about a number in a report. It becomes about real people on real devices with real network conditions. And the gap between a lab score and what your users actually feel is wider than most developers realize.
This article is about closing that gap.
Lighthouse Scores Are Lab Data, Not Reality
Lighthouse runs in a controlled environment. Throttled CPU, simulated network, a clean browser with no extensions, no cached data, no background tabs. That's not how your users browse.
Your users are on a 3-year-old phone with 15 browser tabs open, on a train with patchy network, while your JavaScript is fighting with a background app for CPU time.
This is why a 90+ Lighthouse score can still result in a poor user experience. The lab doesn't lie, but it only tells you part of the truth.
Lab data - what Lighthouse gives you - is useful for catching regressions and tracking trends over time. But it should never be your only signal.
Field data - what your real users experience - is where performance work actually pays off.
The Metrics That Actually Matter at Scale
Core Web Vitals
Google's Core Web Vitals are the closest thing we have to a standardized set of user-centric performance metrics. Three of them matter most:
LCP - Largest Contentful Paint: How long does it take for the largest visible element to render? For most platforms this is a hero image, a video thumbnail or a headline. This is what the user perceives as "the page loaded."
Target: under 2.5 seconds.
INP - Interaction to Next Paint: How quickly does the page respond after a user interaction? Click a button, tap a menu, submit a form - how long before the page visually responds? This replaced FID (First Input Delay) in 2024 and is a much better measure of real interactivity.
Target: under 200 milliseconds.
CLS - Cumulative Layout Shift: How much does the page jump around while loading? Ads loading late, images without dimensions, fonts swapping - these all contribute to CLS. On a content-heavy platform this can quietly destroy the user experience.
Target: under 0.1.
These three metrics directly impact SEO ranking and user retention. At scale, a 0.1 improvement in CLS or a 500ms reduction in LCP translates to measurable engagement and conversion improvements.
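Those targets are the "good" end of Google's published thresholds, which also define "needs improvement" and "poor" bands (4 s for LCP, 500 ms for INP, 0.25 for CLS). A minimal sketch of bucketing RUM samples against those thresholds - the helper and its name are illustrative:

```javascript
// Google's Core Web Vitals thresholds: [good upper bound, poor lower bound].
// LCP and INP are in milliseconds; CLS is a unitless score.
const THRESHOLDS = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
}

// Bucket a single sample into good / needs-improvement / poor.
function rate(metric, value) {
  const [good, poor] = THRESHOLDS[metric]
  if (value <= good) return 'good'
  if (value <= poor) return 'needs-improvement'
  return 'poor'
}
```

So `rate('LCP', 2100)` is 'good', while `rate('INP', 350)` lands in 'needs-improvement' - the bucket Search Console and CrUX report against.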
TTFB - Time to First Byte
Before the browser can render anything, it needs a response from the server. TTFB measures that wait time. High TTFB usually points to server-side issues - slow API responses, no CDN or unoptimized server rendering.
On platforms with a global audience, CDN configuration alone can cut TTFB from 800ms to under 100ms for a significant portion of your users.
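In the field, TTFB can be read straight off the Navigation Timing entry. A minimal sketch - the function takes the entry as an argument so the calculation stays pure and testable; in the browser the entry comes from `performance.getEntriesByType('navigation')[0]`:

```javascript
// TTFB from a Navigation Timing entry: time from the start of the
// navigation until the first byte of the response arrives.
// (This simple version ignores prerender/activationStart adjustments.)
function timeToFirstByte(navEntry) {
  return navEntry.responseStart - navEntry.startTime
}
```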
TTI - Time to Interactive
When can the user actually use the page? Not just see it, but interact with it without the UI freezing. This is where JavaScript bundle size and execution time have the most direct impact.
A page that looks loaded but isn't responding to clicks is one of the most frustrating experiences a user can have.
Real User Monitoring - The Signal You Can't Ignore
If you're only running Lighthouse, you're flying partially blind. Real User Monitoring (RUM) captures performance data from actual user sessions and sends it back to you.
The difference is significant. With RUM you can see:
- How performance varies by device type, browser and geography
- Which pages have the worst real-world LCP or INP
- How performance degrades over time as your codebase grows
- What percentage of your users are experiencing poor performance right now
Tools like Datadog RUM, SpeedCurve, Mux Data or even the free Chrome User Experience Report (CrUX) give you this visibility.
On a platform serving millions of users, even if 5% of your users are experiencing poor performance, that's 500,000 people having a bad time. RUM makes that visible. Lighthouse doesn't.
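That "what percentage of users" question is exactly what RUM data lets you answer. A hypothetical aggregation over collected LCP samples - the helper name and the sampling pipeline are assumptions; real RUM tools compute this for you:

```javascript
// What share of sampled page views exceed the "good" threshold?
// Defaults to the 2.5 s LCP target.
function poorShare(samplesMs, thresholdMs = 2500) {
  if (samplesMs.length === 0) return 0
  const poor = samplesMs.filter(ms => ms > thresholdMs).length
  return poor / samplesMs.length
}
```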
The Optimizations That Actually Move the Needle
1. JavaScript Bundle Size
This is almost always the biggest lever. JavaScript is the most expensive resource on the web - it has to be downloaded, parsed, and executed before it does anything useful.
Code splitting is non-negotiable at scale. Every route should load only the JavaScript it needs.
```javascript
// Instead of importing everything upfront:
// import HeavyComponent from './HeavyComponent'

// Load it only when it's actually rendered:
const HeavyComponent = React.lazy(() => import('./HeavyComponent'))
```
Audit your bundle regularly. Tools like webpack-bundle-analyzer or vite-bundle-visualizer will show you exactly what's in your bundle and where the weight is coming from. You will almost always find something surprising.
Third-party scripts are usually the worst offenders. Analytics, chat widgets, ad scripts - these are often loaded synchronously and block rendering. Load them async or defer them entirely until after the page is interactive.
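One way to "defer entirely": inject the script yourself after the window load event. A sketch with a placeholder URL - the `doc` parameter exists only so the logic can be exercised outside a browser:

```javascript
// Inject a third-party script after first render instead of letting it
// compete with the critical path.
function loadDeferred(src, doc = document) {
  const s = doc.createElement('script')
  s.src = src
  s.async = true
  doc.body.appendChild(s)
  return s
}

// In the page:
// window.addEventListener('load', () => loadDeferred('https://example.com/widget.js'))
```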
2. Image Optimization
Images are the largest assets on most pages. Getting this wrong has a direct impact on LCP.
- Use modern formats. WebP is widely supported and significantly smaller than JPEG or PNG. AVIF is even better where supported.
- Always set explicit width and height on images. This prevents layout shift and helps the browser allocate space before the image loads.
- Use lazy loading for images below the fold.
- Serve appropriately sized images. Don't serve a 2000px wide image to a 400px wide mobile screen.
```html
<img
  src="thumbnail.webp"
  width="400"
  height="225"
  loading="lazy"
  alt="Video thumbnail"
/>
```
For a platform with a large content library, image optimization alone can reduce page weight by 40–60%.
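Serving appropriately sized images usually means generating a srcset from a set of widths. A tiny sketch, assuming an image CDN that resizes via a `?w=` query parameter - that parameter is an assumption, so check your CDN's actual API:

```javascript
// Build a srcset string so the browser can pick the smallest
// sufficient image for the current viewport and pixel density.
function buildSrcset(baseUrl, widths) {
  return widths.map(w => `${baseUrl}?w=${w} ${w}w`).join(', ')
}
```

`buildSrcset('/hero.webp', [400, 800, 1600])` produces `/hero.webp?w=400 400w, /hero.webp?w=800 800w, /hero.webp?w=1600 1600w`, which you drop into the `srcset` attribute alongside a `sizes` hint.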
3. Critical Rendering Path
The browser has to download your HTML, parse it, discover CSS and JavaScript, download those, parse them and then render the page. Every step in that chain is an opportunity to either speed things up or slow things down.
Inline critical CSS - the styles needed to render above-the-fold content - directly in the HTML. This eliminates a render-blocking network request for the initial view.
Preload key resources the browser won't discover until late in the parsing process.
<link rel="preload" as="font" href="/fonts/main.woff2" crossorigin>
<link rel="preload" as="image" href="/hero.webp">
Defer non-critical JavaScript. If a script doesn't need to run before the page is interactive, it shouldn't block rendering.
4. Caching Strategy
I covered this in depth in the previous article in this series, but it's worth mentioning here because caching is one of the highest-impact performance optimizations available to you.
Repeat visitors on a well-cached platform can load pages almost entirely from cache. No network requests for static assets, no server round trips for unchanged resources. The performance improvement for returning users is dramatic.
If you haven't read the caching article yet, it's worth going back to.
5. Reducing Main Thread Work
INP and TTI both suffer when the main thread is busy. JavaScript execution, long tasks, layout recalculations - these all compete for the same thread that handles user interactions.
A few things that help:
Break up long tasks. Any task that takes more than 50ms can cause noticeable jank. Use setTimeout or scheduler.postTask to yield control back to the browser between chunks of work.
Avoid layout thrashing. Reading and writing to the DOM in alternating calls forces the browser to recalculate layout repeatedly. Batch your reads and writes.
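Batching looks like this in practice - a sketch of a hypothetical helper that sets a group of elements to the same height, with all reads in one pass and all writes in another:

```javascript
// Read phase first (measure everything), then write phase (mutate
// everything). Interleaving reads and writes would force the browser
// to recalculate layout on every loop iteration.
function equalizeHeights(elements) {
  const heights = elements.map(el => el.offsetHeight) // reads
  const max = Math.max(...heights)
  elements.forEach(el => { el.style.height = `${max}px` }) // writes
  return max
}
```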
Move heavy computation off the main thread with Web Workers.
```javascript
// Yielding to the browser between heavy tasks
async function processLargeDataset(items) {
  for (let i = 0; i < items.length; i++) {
    process(items[i]) // the per-item work
    if (i % 100 === 0) {
      // Every 100 items, hand the main thread back to the browser
      // so it can respond to user input
      await new Promise(resolve => setTimeout(resolve, 0))
    }
  }
}
```
Performance Budgets - Making It Stick
One of the hardest parts of performance work at scale is keeping improvements from regressing over time. A performance budget solves this.
A performance budget sets explicit limits on metrics like bundle size, LCP or TTI. If a pull request would push you over the budget, it fails the build.
```json
{
  "resourceSizes": [
    { "resourceType": "script", "budget": 300 },
    { "resourceType": "total", "budget": 1000 }
  ],
  "timings": [
    { "metric": "first-contentful-paint", "budget": 1500 },
    { "metric": "interactive", "budget": 3500 }
  ]
}
```

(Sizes are in kilobytes, timings in milliseconds - this is the budget.json format Lighthouse understands.)
This keeps performance on everyone's radar, not just the engineer who cared enough to optimize it once.
What Scale Actually Teaches You About Performance
Here's what I've learned from working on platforms at this size that you don't find in most performance guides:
Device distribution matters more than you think. Your development machine is not representative of your users. Profile on a mid-range Android device and you will find issues you never knew existed.
Geography matters. A platform with a global audience needs a CDN strategy, not just a fast server in one region. Network latency from a distant origin server can add seconds to TTFB for users in certain regions.
Performance degrades gradually. Nobody ships a slow app intentionally. It gets slow one dependency, one feature, one third-party script at a time. Without a budget and regular monitoring, you won't notice until users are already complaining.
The 80/20 rule applies. A small number of pages usually account for the majority of your traffic. Find those pages, measure them obsessively and optimize them first. That's where your performance work will have the most impact.
Final Thoughts
Lighthouse is a tool, not a goal. A green score means you've done the basics right. It doesn't mean your users are having a fast experience.
The teams that get performance right at scale are the ones who measure what their real users experience, set budgets to prevent regression and focus their effort on the optimizations that actually move the needle for their specific platform and audience.
Start with RUM. Find where your real users are struggling. Fix those things first.
The Lighthouse score will follow.
Have thoughts or questions on frontend performance? Drop them in the comments - always happy to discuss.
This article is part of the Frontend at Scale series.