Hello there!
So... I may have gone a little overboard with this one.
Last time we pulled apart how ESLint actually works.
This time, we're dismantling web performance.
Why? Because it's one of the highest-ROI frontend skills you can learn, and one of the few that can't simply be outsourced to a tool.
Fortunately, browsers already hand you the diagnosis.
By the end of this article, you'll know how to read it.
Could this have been a 10-part series?
Absolutely.
Did I make it one?
Nope.
Curious what that 10-part series would've looked like?
Here you go.
- Core Web Vitals: LCP, INP, CLS
- The Critical Rendering Path
- JavaScript costs more than you think
- CSS is not free either
- Images
- Fonts
- The network
- Framework patterns (React/Vue)
- Perceived performance
- Lighthouse and field data: measuring what matters
- The performance budget
- The checklist
Now, the choice is yours: complete it in one sitting, go section by section, or skip straight to the checklist. No judgment.
3 seconds.
That's all you have before visitors start leaving.
Sounds harsh? Reality is even harsher.
53% of mobile users abandon a site that takes longer than 3 seconds to load. (Source: Google's 2016 research)
But you don't need a study to believe it.
How many times have you closed a tab because it took just a little too long?
Your users do exactly the same thing.
What does that mean?
That portfolio you spent weeks perfecting.
That product page you pulled three all-nighters building.
People might never see it.
Not because it's bad. Because it was slow.
A Deloitte study commissioned by Google found that improving mobile load times by just 0.1 seconds increased retail conversions by 8.4%.
That's not a rounding error.
That's revenue.
And that's only half the story.
Google also uses Core Web Vitals as a search ranking signal.
That means poor performance doesn't just cost you users. It also costs you visibility.
Performance isn't just a UX metric. It's a business metric.
Fast Isn't a Feeling. It's Measurable.
Everyone says:
"Make the site faster."
But what does "fast" actually mean?
For the browser, this comes down to three questions:
- Did the user see something? (Loading performance -> LCP)
- Did the page respond when they clicked? (Interaction performance -> INP)
- Did the layout stay stable? (Layout stability -> CLS)
Google spent years figuring out how to measure what users actually experience.
The result? Core Web Vitals.
Before we go any further...
Let's measure this page together.
Open DevTools (Cmd+Option+I / Ctrl+Shift+I) -> Performance
Keep this tab open. We'll keep coming back to it throughout the article.
Chrome DevTools Performance panel: Live Metrics view (see your scores)
Chrome DevTools Performance panel: Record and Reload (see why)
Spend a minute exploring.
I'll wait.
Performance isn't something you memorize. It's something you experience.
These two views tell you what's slow, why it's slow, and which Core Web Vital is paying the price.
That's 90% of performance engineering.
LCP, INP, CLS: What They Actually Measure
Here's what my Live Metrics view showed:
- LCP: 0.18s
- CLS: 0.00
- INP: 48ms
Let me explain what these mean.
LCP: Largest Contentful Paint
When does the user finally see the main content?
Not the spinner. Not the navigation.
The largest visible element in the viewport, because that's usually why they came.
Good -> under 2.5 s
Needs work -> 2.5 s to 4 s
Poor -> over 4 s
INP: Interaction to Next Paint
How quickly does the page react when someone clicks, taps, or types?
Not scrolling. Not hovering.
Good -> under 200 ms
Needs work -> 200 ms to 500 ms
Poor -> over 500 ms
Under 200ms, the interaction feels instant.
Past 500ms, people start wondering if the click even worked.
CLS: Cumulative Layout Shift
How much does the page unexpectedly move while someone is using it?
Ever clicked a button just as the page shifted?
You've experienced CLS.
If a late-loading banner affects 50% of the viewport and pushes content 20% of the viewport height:
0.5 (impact) × 0.2 (distance) = 0.1 shift score
Every unexpected layout shift adds to your CLS score.
Good -> under 0.1
Needs work -> 0.1 to 0.25
Poor -> over 0.25
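The thresholds above are simple enough to encode. Here's a minimal sketch: `THRESHOLDS`, `rate`, and `layoutShiftScore` are hypothetical helpers, not browser APIs, and treating a value that sits exactly on a boundary as the better bucket is an assumption of this sketch.

```javascript
// Threshold pairs: [upper bound for "good", upper bound for "needs work"].
// LCP and INP are in milliseconds; CLS is unitless.
const THRESHOLDS = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
}

// Classify a measured value. Boundary values fall into the better bucket
// (an assumption of this sketch).
function rate(metric, value) {
  const [good, needsWork] = THRESHOLDS[metric]
  if (value <= good) return 'good'
  if (value <= needsWork) return 'needs work'
  return 'poor'
}

// Score of a single layout shift: impact fraction x distance fraction.
function layoutShiftScore(impactFraction, distanceFraction) {
  return impactFraction * distanceFraction
}

console.log('LCP:', rate('LCP', 180)) // the 0.18s Live Metrics reading
console.log('INP:', rate('INP', 48))
console.log('CLS:', rate('CLS', layoutShiftScore(0.5, 0.2))) // the banner example
```

Feed it your own Live Metrics numbers to see which bucket each one lands in.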
All Three, In One Session
These metrics don't happen in isolation.
They happen on the same page. To the same user. Often within a few seconds.
Watch it play out:
0ms -> Browser sends request
300ms -> HTML received, parser starts building the DOM
600ms -> Browser discovers render-blocking CSS file.
Rendering is blocked.
Nothing is painted.
User sees a blank screen.
900ms -> CSS loads. Rendering can begin.
1200ms -> Hero image starts downloading.
2400ms -> Hero image finishes rendering.
<- LCP
2600ms -> User taps the menu.
2601ms -> A 400ms JavaScript task is already running.
Browser can't respond yet.
3000ms -> JS finishes. The click is finally processed.
<- INP
3100ms -> Menu opens. User starts reading.
3300ms -> Font finishes loading.
Text shifts down.
3500ms -> An ad appears.
Content shifts again.
User taps the wrong button.
<- CLS
Three metrics.
One page load.
One user who had a genuinely terrible experience.
The Browser Has a Pipeline. Your Resources Keep Interrupting It.
In the last section we measured LCP, INP, and CLS.
Those metrics aren't random.
They all come from the same pipeline the browser uses to load a page.
Two rules govern that pipeline.
Rule #1: No DOM + CSSOM, no paint.
HTML builds the DOM.
CSS builds the CSSOM.
The browser needs both before it can paint a single pixel.
No CSSOM.
No paint.
Just a blank screen.
That's the Critical Rendering Path.
Here's the whole pipeline, start to finish:
The thing is, you don't have to wait for the entire stylesheet before the first paint happens.
Only the CSS for what's visible right now actually matters.
Everything below the fold can wait.
That's critical CSS.
<!-- 🔴 Browser waits for the whole file before it can paint -->
<link rel="stylesheet" href="styles.css" />
<!-- 🟢 Critical styles inline. The rest loads without blocking. -->
<style>
/* just enough CSS for what's visible right away */
.hero { ... }
</style>
<link rel="stylesheet" href="styles.css" media="print" onload="this.media='all'" />
That media="print" part isn't really about printing.
It tricks the browser into treating the file as non-blocking.
Once it finishes loading, onload flips it back to media="all".
No network round-trip blocking your first paint.
One more thing: add a <noscript> fallback too, in case JavaScript is off.
<!-- Without JS, the onload above never fires, so this stylesheet would be -->
<!-- stuck on media="print" forever. The noscript version gives JS-less -->
<!-- browsers a normal, blocking stylesheet instead. -->
<noscript>
<link rel="stylesheet" href="styles.css" />
</noscript>
Rule #2: Without defer or async, the parser waits.
A <script> without defer or async is parser-blocking, though what that actually costs you depends on where the tag sits, as you'll see below.
The browser pauses HTML parsing, executes the script, then resumes where it left off.
Watch what happens when you load the same script three different ways.
Here's what it looks like in DevTools:
Those markers tell the story of the page load.
| Marker | Full name | What it means | Time |
|---|---|---|---|
| FCP | First Contentful Paint | First visible content appears | 167.66ms |
| DCL | DOMContentLoaded | HTML finished parsing | 321.39ms |
| L | onLoad | Initial resources finished loading | 697.21ms |
| LCP | Largest Contentful Paint | Largest visible element rendered | 861.26ms |
Notice something.
The browser isn't "done" once.
It's done in stages.
Timeline comparing script loading:
Same FCP. Very different LCP, often the difference between passing and failing.
Rule of thumb:
- defer -> for app code.
- async -> for independent scripts like analytics, ads, widgets, API calls, or anything that doesn't depend on the DOM.
One Exception That Trips Up Almost Everyone
async and defer only work on external scripts. Inline scripts ignore both.
<!-- ✅ Works -->
<script src="app.js" defer></script>
<!-- ❌ defer ignored: no src -->
<script defer>
document.getElementById('btn').addEventListener('click', doSomething)
</script>
Two good options if you must put an inline script in <head>.
Option 1: Wait for the DOM.
<script>
document.addEventListener('DOMContentLoaded', () => {
document.getElementById('btn').addEventListener('click', doSomething)
})
</script>
Think of it as the inline equivalent of putting your script at the end of <body>.
Option 2: Use modules (deferred automatically):
<script type="module">
document.getElementById('btn').addEventListener('click', doSomething)
</script>
Here's the full picture.
| Script | Best placement | Why |
|---|---|---|
| External, no attribute | End of </body> | Works, but starts downloading late |
| External + defer | <head> | Best choice: parallel download, safe execution |
| External + async | <head> | Independent scripts only |
| Inline, no wrapper | End of </body> | async and defer are ignored |
| Inline + DOMContentLoaded | <head> | Waits for the DOM |
| Inline + type="module" | <head> | Deferred automatically |
Then Why Does Everyone Still Teach "Put JavaScript at the Bottom"?
I wondered the same thing until I looked into the history, and found something interesting.
It was the right advice in 2007. It just never got updated.
Back then, defer existed, but browser support was unreliable.
Internet Explorer interpreted it differently.
Firefox had bugs.
Putting scripts at the end of </body> was the only reliable solution.
Steve Souders' 2007 book High Performance Web Sites popularized the pattern. From there the advice spread through books, tutorials, and bootcamps, and it never really changed.
Now, every modern browser implements defer consistently.
The workaround is no longer necessary.
Let's leave 2007 behind, shall we?
JavaScript Costs More Than You Think
Blocking paint is only half the story.
The other half is JavaScript.
It's more than just a download.
A 200KB image and a 200KB JavaScript file might be the same size.
They're nowhere near the same cost.
Image (200KB): download -> decode -> done
JavaScript (200KB): download -> parse -> compile -> execute -> (blocks interactions)
Real-world cost? Up to 30% of a page's total load time can go purely into JavaScript execution.
Not because kilobytes are evil.
Because the browser has to process every byte before it can run.
Optimizing JavaScript boils down to two goals:
- Ship less JavaScript.
- Run what you ship without blocking the main thread.
Goal #1: Ship Less JavaScript
The fastest code is the code users never download.
Code Splitting: Ship Only What the User Needs
Most apps ship one giant bundle.
Every route.
Every feature.
Every component.
Even though most users never touch most of it.
Code splitting fixes that.
// Static import: downloads immediately
import { initMap } from './map.js'
// Dynamic import: downloads only when needed
document.getElementById('show-map').addEventListener('click', async () => {
const { initMap } = await import('./map.js')
initMap(document.getElementById('map-container'))
})
// Route-based splitting
const routes = {
'/dashboard': () => import('./pages/dashboard.js'),
'/profile': () => import('./pages/profile.js'),
'/settings': () => import('./pages/settings.js'),
}
The dashboard never downloads for the users who never open it.
That's not optimization.
It's simply not wasting bandwidth.
React, Vue, and Angular all build on the same idea.
Under the hood, they're just using dynamic import().
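How a route map like the one above gets consumed can be sketched in a few lines. This is a hypothetical loader, not how any particular framework does it: `createRouteLoader` and the stand-in loaders are invented for illustration, and in a real app each loader would be a dynamic import such as `() => import('./pages/dashboard.js')`.

```javascript
// Wraps a { path: loaderFn } map. Each loader runs at most once; repeat
// visits reuse the same promise instead of requesting the chunk again.
function createRouteLoader(routes) {
  const cache = new Map()
  return function load(path) {
    const loader = routes[path]
    if (!loader) return Promise.reject(new Error(`Unknown route: ${path}`))
    if (!cache.has(path)) {
      // Drop failed loads from the cache so a later visit can retry.
      const pending = loader().catch(error => {
        cache.delete(path)
        throw error
      })
      cache.set(path, pending)
    }
    return cache.get(path)
  }
}

// Stand-in loaders so the sketch runs anywhere (assumption: real ones
// would be dynamic imports of page modules).
let dashboardLoads = 0
const load = createRouteLoader({
  '/dashboard': async () => {
    dashboardLoads++
    return { render: () => 'dashboard' }
  },
  '/profile': async () => ({ render: () => 'profile' }),
})

load('/dashboard')
  .then(() => load('/dashboard')) // second visit is served from the cache
  .then(page => console.log(page.render(), 'loader calls:', dashboardLoads))
```

The cache is the part worth copying: a user who bounces between two routes pays for each chunk once.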
See it yourself.
Open DevTools -> Network -> JS on dev.to.
Fourteen JavaScript files.
None over 200KB.
Fourteen focused chunks beat one massive bundle.
Tree Shaking: Remove What You Don't Use
Install lodash. Use one function. The whole library ships anyway.
That's because lodash uses CommonJS, and bundlers can't tree-shake it.
lodash-es is the same library, rewritten as ES modules.
Switch to lodash-es and use a named import:
// 🔴 Ships ALL of lodash (~72KB minified)
// CommonJS: bundlers can't tree-shake this
import _ from 'lodash'
const val = _.get(obj, 'a.b.c')
// 🟢 Ships ONLY get (~5KB minified)
// ES modules: bundler strips everything you never imported
import { get } from 'lodash-es'
const val = get(obj, 'a.b.c')
One switch. 72KB -> ~5KB.
Want to find where this is happening in your own project?
npx vite-bundle-visualizer # see your full bundle as a treemap
Bundle analysis almost always reveals something that doesn't belong.
Choose Smaller Libraries
The fix isn't always how you import. Sometimes it's what you import.
Open bundlephobia.com.
Search moment. Then search date-fns.
Same job. Very different cost.
- Moment: 75.4KB minified
- date-fns: 17.1KB minified
That's roughly 1.5s vs 342ms on Slow 3G.
Moment became so large that its own maintainers recommend not using it for new projects.
Not because it's broken. Because it no longer makes sense.
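The Slow 3G figures above are plain arithmetic. A minimal sketch, assuming a throughput of roughly 50 KB/s (about 400 kbit/s, an assumption that happens to reproduce the numbers above) and ignoring latency, parsing, and compression. `estimateDownloadMs` is a hypothetical helper.

```javascript
const SLOW_3G_KB_PER_SECOND = 50 // assumed: roughly 400 kbit/s

// Transfer time only: no latency, no parse or compile cost.
function estimateDownloadMs(sizeKB, kbPerSecond = SLOW_3G_KB_PER_SECOND) {
  return Math.round((sizeKB / kbPerSecond) * 1000)
}

console.log('moment:  ', estimateDownloadMs(75.4), 'ms')
console.log('date-fns:', estimateDownloadMs(17.1), 'ms')
```

Swap in your own bundle sizes before you add a dependency. The gap is usually easier to argue about in milliseconds than in kilobytes.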
Goal #2 β Keep the Main Thread Free
Shipping less JavaScript gets it to the browser faster.
But whatever you ship still has to execute on the main thread.
Keep it free.
Long Tasks: Stop Freezing the Page
The main thread can only do one thing at a time.
If JavaScript keeps the main thread busy for more than 50ms, everything else waits.
Clicks lag.
Scrolling freezes.
Animations drop frames.
That's a Long Task.
// 🔴 Blocks the main thread
function processLargeDataset(data) {
return data.map(item => expensiveCalculation(item))
}
// 🟢 Gives the browser time to breathe
async function processLargeDataset(data) {
const results = []
for (let i = 0; i < data.length; i++) {
results.push(expensiveCalculation(data[i]))
if (i % 50 === 0) {
await new Promise(resolve => setTimeout(resolve, 0))
}
}
return results
}
setTimeout(resolve, 0) isn't making your code faster.
It's giving the browser a chance to handle pending input before continuing.
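Yielding every 50 items is a guess about how long each item takes. One alternative is to yield on a time budget instead. The sketch below assumes a 50ms budget to match the Long Task threshold; `yieldToMain` and `mapInChunks` are hypothetical helper names. It uses `scheduler.yield()` where the browser provides it and falls back to the same `setTimeout` trick otherwise.

```javascript
// Hand control back to the event loop. Prefers scheduler.yield() when the
// environment has it, otherwise falls back to a zero-delay timeout.
function yieldToMain() {
  if (typeof scheduler !== 'undefined' && typeof scheduler.yield === 'function') {
    return scheduler.yield()
  }
  return new Promise(resolve => setTimeout(resolve, 0))
}

// Like Array.prototype.map, but pauses whenever the current slice of work
// has used up its time budget.
async function mapInChunks(items, fn, budgetMs = 50) {
  const results = []
  let sliceStart = performance.now()
  for (const item of items) {
    results.push(fn(item))
    if (performance.now() - sliceStart >= budgetMs) {
      await yieldToMain()
      sliceStart = performance.now()
    }
  }
  return results
}

mapInChunks([1, 2, 3, 4], n => n * n).then(squares => console.log(squares))
```

Cheap items run in big batches, expensive items in small ones, and no single slice runs much past the budget unless one item alone takes longer than that.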
Feel the difference yourself.
Now inspect what's happening.
Open the CodePen -> Inspect -> Performance -> Record -> Reload.
Look for the red diagonal stripe in Main thread.
Every task longer than 50ms gets flagged.
The yielding version doesn't.
Breaking work into chunks fixes many long tasks.
Sometimes, though...
The work itself is the problem.
Web Workers: Move Heavy Work Off the Main Thread
Some work simply doesn't belong on the main thread.
Things like:
- Image processing.
- Large datasets.
- Complex calculations.
Move it to a Worker.
// worker.js: separate thread, never blocks UI
self.onmessage = ({ data }) => {
const result = heavyCalculation(data)
self.postMessage(result)
}
// main.js: page stays responsive
const worker = new Worker('./worker.js')
worker.postMessage(largeDataset)
worker.onmessage = ({ data }) => {
console.log('Done:', data)
}
Here's what that separation looks like, with two threads and one message-passing bridge:
Turn the Worker OFF.
Keep pressing + until the particles begin to stutter.
Turn it back ON.
Same computation.
Different thread.
You can feel it, and the FPS counter backs you up.
In the demo, the main thread drops from 60fps to near-zero during heavy computation. The Worker keeps it at 60.
Want proof beyond feel? Open the demo in a new tab, start a Performance recording, and toggle the Worker a few times before you stop.
A Worker track shows up the moment it's switched on, and the Main thread goes quiet right alongside it. Switch it off, and the work lands straight back on the Main thread, at the same moment the frame rate and CPU graph start complaining.
Nothing became faster.
The work simply stopped blocking the UI.
That's the entire point of Web Workers.
Before using one, remember three things.
- Workers can't access the DOM. No document, no querySelector, no element event listeners. They process data and communicate through postMessage.
- Messaging has a cost. Every message crosses a thread boundary. For tiny updates, that overhead can outweigh the benefit.
- Threads aren't free. Workers consume memory and scheduling resources. A few heavy Workers are great. Dozens of tiny ones usually aren't.
The browser will never create a Worker for you.
If you want another thread, you have to ask for one.
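Once you've asked for one, the message passing gets easier to live with behind a promise. A minimal sketch: `runInWorker` is a hypothetical helper, and the fake worker below only exists so the example runs outside a browser. In a page you'd pass `new Worker('./worker.js')` instead.

```javascript
// Sends one payload to a worker and resolves with its first reply.
// `worker` is anything with postMessage/onmessage/onerror, so a real
// Worker fits, and so does the fake used below.
function runInWorker(worker, payload) {
  return new Promise((resolve, reject) => {
    worker.onmessage = ({ data }) => resolve(data)
    worker.onerror = event => reject(new Error(event.message || 'Worker failed'))
    worker.postMessage(payload)
  })
}

// Fake worker (assumption, for illustration only): it sums the numbers
// asynchronously and replies the way worker.js would via self.postMessage.
const fakeWorker = {
  onmessage: null,
  onerror: null,
  postMessage(numbers) {
    setTimeout(() => {
      const sum = numbers.reduce((total, n) => total + n, 0)
      this.onmessage({ data: sum })
    }, 0)
  },
}

runInWorker(fakeWorker, [1, 2, 3, 4]).then(sum => console.log('Done:', sum))
```

This version handles one request at a time per worker, because each call overwrites `onmessage`. Concurrent requests would need message ids, which is left out to keep the sketch short.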
Passive Listeners: One Word. Instant Win.
Heavy JavaScript isn't the only thing that blocks the main thread.
Sometimes the browser is waiting...
For your event listener.
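When you scroll or touch the screen, the browser has to answer one question first:
"Is this listener about to call preventDefault()?"
If it might, scrolling has to wait.
// `scroller` stands in for any scrollable element on your page.
// Touch and wheel listeners are the ones that can cancel a scroll;
// the scroll event itself isn't cancelable.
// 🔴 Browser waits before scrolling
scroller.addEventListener('touchmove', updateScrollPosition)
// 🟢 Browser scrolls immediately
scroller.addEventListener('touchmove', updateScrollPosition, {
passive: true,
})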
passive: true is a promise:
"I won't cancel scrolling."
That lets the browser scroll immediately instead of waiting for JavaScript.
One word.
Instantly smoother scrolling.
Chrome even warns you when you forget.
Catch it in the act: Open the CodePen link -> DevTools -> Console -> interact with the hamburger menu. You'll see the warning. Uncomment the passive version and it disappears.
CodePen link: open it here
The JavaScript You Didn't Write
So far we've been fixing your JavaScript.
Sometimes the slowest JavaScript isn't yours at all.
It's the third-party scripts you embedded.
Here's the cost:
- Downloads from a server you don't control.
- Runs on your user's device.
- Competes for the same CPU and main thread your own code needs.
Start with how it gets requested in the first place.
<!-- 🔴 Blocks HTML parsing -->
<script src="https://analytics.example.com/track.js"></script>
<!-- 🟢 Downloads in parallel -->
<script src="https://analytics.example.com/track.js" async></script>
That fixes when it loads.
It doesn't fix what happens once it's there.
Open DevTools.
Go to More tools -> Coverage.
Record a page load.
That screenshot tells a painful story.
97% of one script never executed.
It still downloaded, parsed, compiled, and competed for CPU time.
For code that never ran.
async only changes when a script shows up. It does nothing about how much of it was dead weight to begin with.
Every third-party script is someone else's code spending your user's battery, bandwidth, and main-thread time.
Treat each one like a dependency.
Because that's exactly what it is.
For the expensive embeds (YouTube, Maps, chat widgets), async isn't enough.
Don't load them at all.
Not until someone actually wants them.
<div class="video-facade" onclick="loadPlayer(this)" data-id="dQw4w9WgXcQ">
<img src="thumbnail.webp" alt="Play video" width="640" height="360" />
<button>▶ Play</button>
</div>
The fastest third-party script is the one that never loads.
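The facade markup calls `loadPlayer(this)` but never defines it. One way it might look, as a sketch under assumptions: the youtube-nocookie.com embed host, `autoplay=1`, and the `doc` parameter are choices made for this example, not something the facade pattern requires.

```javascript
// Build the embed URL for a video id. The privacy-enhanced host and
// autoplay=1 are assumptions of this sketch.
function buildEmbedUrl(videoId) {
  return `https://www.youtube-nocookie.com/embed/${encodeURIComponent(videoId)}?autoplay=1`
}

// Swap the facade's thumbnail and button for the real player.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised outside a browser.
function loadPlayer(facade, doc = document) {
  const iframe = doc.createElement('iframe')
  iframe.src = buildEmbedUrl(facade.dataset.id)
  iframe.width = '640'
  iframe.height = '360'
  iframe.title = 'Video player'
  iframe.allow = 'autoplay; encrypted-media; picture-in-picture'
  iframe.allowFullscreen = true
  facade.replaceChildren(iframe) // thumbnail and button are removed
}
```

Until the click, the page pays for one thumbnail image. The player's own JavaScript only starts downloading once the iframe exists.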
CSS Is Not Free Either
Think you're done?
JavaScript isn't the only thing the browser has to process.
You already met CSS once, blocking your first paint back in the Critical Rendering Path.
That was the loading cost. This one's different.
CSS.
Every developer learns CSS as a stylesheet to build beautiful websites.
Does the browser see it the same way?
It doesn't.
The browser treats CSS as layout instructions.
Every time you change an element's size or position, it has to answer one question:
"Did this affect anything else?"
Sometimes the answer is one element.
Sometimes it's the entire page.
That's a reflow.
Reflows cost you frames.
The classic mistake looks harmless. Say you're adding 24px of extra height to every card in a row:
// 🔴 Read, write, read, write: once per card, every card
cards.forEach(card => {
const height = card.offsetHeight
card.style.height = height + 24 + 'px'
})
// 🟢 Read everything first, then write everything
// Array.from also works when cards is a NodeList, which has no .map()
const heights = Array.from(cards, card => card.offsetHeight)
cards.forEach((card, i) => {
card.style.height = heights[i] + 24 + 'px'
})
Reading layout (offsetHeight) immediately after writing layout (style.height) forces the browser to stop and recalculate synchronously.
Do it inside a loop and you pay that cost every iteration.
That's called layout thrashing.
It's common in card grids, feed lists, accordions: anything where you read layout and write styles in the same loop.
Two demos to run:
Animation: hit Start. Watch the FPS counters. margin-top drops frames. transform stays at 60.
Resize 50 Elements: hit Run thrashing, then Run batched. The ms timers show the real cost. Same 50 elements. Same result. Completely different price.
Animate Without Making the Browser Sweat
Want to animate something?
Not every CSS property costs the same.
margin-top vs transform
One keeps the browser busy.
The other barely touches it.
/* 🔴 Layout recalculates every frame */
.card {
margin-top: -4px;
}
/* 🟢 GPU moves the layer */
.card {
transform: translateY(-4px);
}
Identical visual result.
Completely different cost.
margin-top changes layout.
Every frame, the browser recalculates positions before it can paint.
transform skips layout entirely.
The GPU compositor moves the layer without touching the DOM.
Hit Start. The animation looks the same. Watch the CPU meters instead.
One rule for every animation:
transform and opacity are the two safest properties the GPU compositor can animate without touching the main thread.
Everything else triggers layout, paint, or both.
CSS Containment: Tell the Browser What Can't Change
Imagine every component had invisible walls around it.
That's exactly what contain does.
Changes inside stay inside.
.card {
contain: layout; /* changes inside don't affect outside */
}
.widget {
contain: strict; /* fully isolated from the rest of the page */
}
Every contained component becomes an island: when something changes inside it, the browser doesn't have to recheck the rest of the page.
On a page with hundreds of cards, that's the difference between the browser rechecking one card's layout and rechecking all of them.
Real-world A/B tests back this up. Applying containment to product tiles on a high-traffic e-commerce category page cut INP by ~120ms on mobile. In prototype testing, adding DOM elements on interaction dropped rendering work from 732ms to 54ms.
Skip Rendering What Nobody Can See
Why render content that's still three screens away?
You don't have to.
.article-section {
content-visibility: auto;
contain-intrinsic-size: auto 600px;
}
content-visibility: auto tells the browser:
"Ignore this until it's about to enter the viewport."
No layout.
No paint.
No compositing.
Not until the user gets there.
It can cut rendering work by 50% or more. One real page dropped from 232ms to 30ms. That's a 7x improvement.
contain-intrinsic-size reserves space while the section is skipped, preventing layout shifts when it finally renders.
Choose a value close to the section's real height.
Too far off, and the page will jump when the content appears.
Chrome, Firefox, and Safari all support it today. Anything older simply ignores it and renders everything normally.
Safe to ship.
CSS optimized.
The rendering pipeline is doing as little work as possible.
But there's one resource that's usually bigger than all of it combined.
Images.
Images Are Probably Your Biggest Problem
Let's prove it.
Open DevTools -> Network -> Reload -> All -> sort by Size.
This is a blog post. Code blocks. A handful of screenshots.
Nothing like a portfolio, an e-commerce site, or Instagram.
And still, images dominate the top of that list.
Now imagine an online store.
Or a travel website.
Or a social media feed.
Odds are the top of that list isn't JavaScript.
It never is.
It's the thing everyone looks at, and nobody optimizes.
Four decisions. That's all it takes.
1. Choose the Right Format
Format matters more than compression settings.
| Format | Best for |
|---|---|
| JPEG | Photos (fallback) |
| PNG | Transparency (fallback) |
| WebP | Default choice: 25-34% smaller than JPEG |
| AVIF | Best compression: about 50% smaller than JPEG |
| SVG | Icons, logos, illustrations |
Let the browser choose the best format it supports.
<picture>
<source srcset="hero.avif" type="image/avif" />
<source srcset="hero.webp" type="image/webp" />
<img src="hero.jpg" alt="Hero image" width="1200" height="630" />
</picture>
Modern browsers get AVIF.
Older ones fall back automatically.
No JavaScript required.
2. Always Tell the Browser the Size
Always set width and height. These two attributes fix one of the most common causes of CLS.
<img src="hero.webp" width="1200" height="630" alt="Hero" />
Without them, the browser has no idea how much space to reserve.
The image loads.
Everything moves.
That's layout shift.
For your LCP image specifically, add fetchpriority="high". It tells the browser: "Download this before less important images."
<img src="hero.webp" fetchpriority="high" width="1200" height="630" alt="Hero" />
3. Don't Send a 4K Image to a Phone
A phone doesn't need desktop-sized images.
srcset lets the browser choose the right one.
<img
src="hero-800.jpg"
srcset="hero-400.jpg 400w, hero-800.jpg 800w, hero-1200.jpg 1200w, hero-2400.jpg 2400w"
sizes="(max-width: 600px) 100vw,
(max-width: 1200px) 50vw,
1200px"
alt="Hero"
/>
A phone downloads the 400px version.
A desktop downloads the larger one.
Same image.
Different file.
Less wasted bandwidth.
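How the browser gets from `srcset` and `sizes` to one file can be approximated in a few lines. This is a simplified sketch, not the actual selection algorithm: browsers are free to weigh other signals such as cached copies or data-saver settings, and `pickCandidate` is a hypothetical name.

```javascript
// Pick the smallest candidate that still covers the rendered slot.
// slotCssPx: the width resolved from `sizes`; dpr: device pixel ratio.
function pickCandidate(candidateWidths, slotCssPx, dpr = 1) {
  const needed = slotCssPx * dpr
  const sorted = [...candidateWidths].sort((a, b) => a - b)
  // Nothing big enough? Fall back to the largest file available.
  return sorted.find(width => width >= needed) ?? sorted[sorted.length - 1]
}

const widths = [400, 800, 1200, 2400] // the `w` descriptors from the srcset above

console.log(pickCandidate(widths, 375, 1)) // 375px-wide phone, 1x screen
console.log(pickCandidate(widths, 375, 3)) // same phone width, 3x screen
console.log(pickCandidate(widths, 1200, 1)) // wide desktop, 1x screen
```

Note the role of pixel density: under this model only a 1x phone ends up with the 400px file, while a 3x phone at the same CSS width asks for a much larger one. That's the reason to list several widths instead of just "mobile" and "desktop".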
4. Don't Download Images Nobody Can See
Images users can't see yet don't need to load immediately.
<img src="product.webp" loading="lazy" width="400" height="300" alt="Product" />
One attribute.
Often a 30-50% cut to initial page weight on an image-heavy page, sometimes more.
There is one important exception: Never lazy-load your LCP image.
If the hero is above the fold, delaying it delays your LCP.
Four decisions. One theme.
- Smaller files.
- Fewer pixels.
- Fewer downloads.
That's why images are usually the biggest performance win.
A Real Example: dev.to Is Already Doing This
Every image you upload gets stored on S3 at full resolution.
That's not what your readers download.
Cloudflare intercepts every request, resizes the image, converts the format, and sends the smaller version. The original never changes. The user never sees it.
The URL tells the whole story:
You uploaded:
https://dev-to-uploads.s3.us-east-2.amazonaws.com/uploads/articles/your-image.gif
Your readers get:
https://media2.dev.to/dynamic/image/width=1000,height=420,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1ydkr1d4qmhxyms63lpx.gif
width=1000,height=420. format=auto. fit=cover.
Resize. Convert. Crop. All at the edge, before a single byte reaches your reader.
Same image. Two URLs. Here's what that costs:
| | Size | Time |
|---|---|---|
| CDN (what readers get) | 147 kB | 212ms |
| S3 original (what you uploaded) | 22,759 kB | 10.13s |
That's 155× smaller and 48× faster.
Not from any code change. From a URL.
dev.to is doing automatically what this section teaches you to do manually.
They do it for images.
They do it for embeds too.
Every CodePen, every YouTube embed: none of them load until you scroll to them.
Open DevTools. Watch the Network tab. Scroll slowly down the page.
You'll see requests fire at the exact moment each embed enters the viewport. Not a second before.
That "Queued at 18.7 min" is the exact moment I scrolled to the embed.
Not at page load. Not before. The second it entered the viewport, the request fired.
And it resolved in 47 microseconds. Cloudflare already had it cached at the edge.
That's intersection observer lazy loading for iframes. Same principle as loading="lazy" on images. dev.to just applies it to everything.
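The same idea fits in a small helper. This is a sketch of the general technique, not dev.to's actual code: the `data-src` convention, the 200px `rootMargin`, and the `lazyLoadEmbeds` name are all assumptions. The `Observer` parameter exists only so the function can run outside a browser.

```javascript
// Start loading each embed only once it gets close to the viewport.
// Assumes markup like <iframe data-src="..."></iframe> (hypothetical
// convention): the real URL waits in data-src until the element is near.
function lazyLoadEmbeds(elements, Observer = IntersectionObserver) {
  const observer = new Observer(
    (entries, obs) => {
      for (const entry of entries) {
        if (!entry.isIntersecting) continue
        entry.target.src = entry.target.dataset.src
        obs.unobserve(entry.target) // load once, then stop watching
      }
    },
    { rootMargin: '200px' } // assumed head start before the embed is visible
  )
  for (const element of elements) observer.observe(element)
  return observer
}

// In a page:
// lazyLoadEmbeds(document.querySelectorAll('iframe[data-src]'))
```

For plain iframes, `loading="lazy"` gets you most of this with no script. The observer version earns its keep when the embed needs setup code to run at the moment it appears.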
Fonts Cost More Than You Think
Images delay pixels.
JavaScript delays interaction.
Custom fonts delay reading.
The browser reaches your text.
Your font isn't there yet.
On a cold connection, Google Fonts alone can add 300ms to your LCP before a single character appears.
Now it has a few choices.
- FOIT (Flash of Invisible Text): hide the text until the font finishes downloading.
- FOUT (Flash of Unstyled Text): show a fallback font, then swap when the custom font arrives.
- font-display: optional gives the custom font a brief window to show up; if it isn't ready in time, the browser sticks with the fallback for the rest of the visit and skips the swap entirely.
None of them are perfect.
The goal is to make the compromise invisible.
font-display: swap chooses readability over visual perfection.
The user starts reading immediately.
Press Play and watch all three strategies side by side.
@font-face {
font-family: 'Inter';
src: url('/fonts/inter.woff2') format('woff2');
font-display: swap;
}
Use woff2. It's smaller than older formats and supported by every modern browser.
One catch: If your fallback font has different metrics, swapping fonts causes layout shift. That's textbook CLS.
Properties like size-adjust, ascent-override, and descent-override in the @font-face rule let you tune the fallback so the swap barely moves.
You can also give the browser a head start.
<link rel="preload" href="/fonts/inter.woff2" as="font" type="font/woff2" crossorigin />
The font starts downloading before CSS asks for it.
Or, hear me out, use no custom font at all:
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
}
System fonts are already available on the user's device.
Zero downloads. Zero waiting.
GitHub, Notion, dev.to: all use system fonts. So why can't you?
The fastest font is the one you never download.
Three chapters, one rule.
The fastest JavaScript never downloads.
The fastest image is the smallest one.
The fastest font is already installed.
The next bottleneck follows the same rule.
Except now... The delay isn't on your page.
It's on the wire.
The Network Is Not Your Friend, Until You Work With It
You've trimmed the JavaScript. Optimized the CSS. Shrunk the images. Fixed the fonts.
The bottleneck moves to the network.
TTFB: The Delay Before Anything Happens
Before the browser can download CSS, discover JavaScript or request your hero image...
It waits for the server to start responding.
That's Time to First Byte (TTFB).
Good -> under 800ms
Needs work -> 800ms to 1.8s
Poor -> over 1.8s
If your TTFB is two seconds...
Everything else starts two seconds late.
Frontend optimizations haven't even begun.
It might sound like a backend concern, until you see what it cost Amazon.
Every 100ms of added latency cost Amazon 1% in sales.
Not 1% of a bad quarter.
1% of revenue, every single time, from a tenth of a second.
At Amazon's scale today, that 1% is worth billions of dollars a year.
Clock it: DevTools -> Network -> Reload -> Click the HTML document -> Timing.
The "Waiting for server response" bar is your TTFB.
Over 800ms means your server is now the bottleneck.
No amount of frontend optimization can hide it.
Four common causes, all fixable:
- Slow database queries
- No caching
- No CDN
- Servers too far from the user
A CDN (Content Delivery Network) solves the distance problem.
Instead of serving every request from one origin server, it serves files from an edge location close to the user. That's also why you'll often see region-specific domains like amazon.in and amazon.com alongside CDNs and regional data centers: shorter distances mean lower latency.
For global audiences, that's often a 200-400ms win before you've touched a line of code.
Platforms like Vercel, Netlify and Cloudflare already do this automatically.
If your server lives in one region while your users live everywhere...
A CDN is usually the highest-ROI optimization you can make.
Gzip, Brotli and Zstd: Free Bandwidth
Before files leave your server...
Compress them.
Original JS: 500 KB
After minify: 200 KB (bundler handles this)
After gzip: 60 KB (server handles this, 70% off)
After Brotli: 50 KB (15-25% better than gzip)
After zstd: 48 KB (newer than Brotli, what dev.to uses)
Minification removes unnecessary characters.
Compression removes unnecessary bytes.
Different job. Much bigger payoff.
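You can watch the effect locally with Node's built-in zlib module. A rough sketch: the sample text is a made-up stand-in and far more repetitive than a real bundle, so expect better ratios here than in production. The zstd call is guarded because only newer Node.js releases include it.

```javascript
const zlib = require('node:zlib')

// Stand-in for a minified bundle (assumption: real code repeats less).
const source = Buffer.from(
  'export function add(a, b) { return a + b }\n'.repeat(2000)
)

const sizes = {
  original: source.length,
  gzip: zlib.gzipSync(source).length,
  brotli: zlib.brotliCompressSync(source).length,
}

// zstd bindings only exist in newer Node.js releases, so check first.
if (typeof zlib.zstdCompressSync === 'function') {
  sizes.zstd = zlib.zstdCompressSync(source).length
}

for (const [name, bytes] of Object.entries(sizes)) {
  const percent = ((bytes / sizes.original) * 100).toFixed(1)
  console.log(`${name.padEnd(8)} ${bytes} bytes (${percent}% of original)`)
}
```

Point it at one of your own build outputs with `fs.readFileSync` to get numbers that mean something for your project.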
Most CDNs enable compression automatically. Check yours:
Peek at the headers: DevTools -> Network -> Reload -> click any .js file -> Response Headers -> look for Content-Encoding.
- br -> Brotli
- gzip -> Gzip
- zstd -> Zstandard
Any of them means compression is active.
dev.to already serves assets with zstd. If a platform serving millions of developers made the switch, the tooling is mature enough for your project too.
No Content-Encoding header?
It means your server is sending raw files, and going by the numbers above you're shipping roughly three times the bytes you need to on every request.
If you're using Vercel, Netlify, or Cloudflare... you're probably already covered.
No code change. Just a config line.
Most sites don't bother. That's why it's still a win.
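On your own server, it's almost trivial, for gzip at least.
# Nginx
gzip on;
# gzip only compresses text/html by default, so list the other text types
gzip_types text/css text/javascript application/javascript application/json image/svg+xml;
# brotli needs the separate ngx_brotli module
brotli on;
brotli_types text/css text/javascript application/javascript application/json image/svg+xml;
# Apache (mod_deflate)
AddOutputFilterByType DEFLATE text/html text/css application/javascript image/svg+xml application/json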
Gzip ships with nginx out of the box. Brotli doesn't: brotli on; will fail to start nginx unless you've already compiled in the separate ngx_brotli module (or you're on NGINX Plus, or using a distro package like nginx-extras on Debian/Ubuntu that bundles it in for you).
As for Zstandard (zstd), support depends on your server or CDN. Many modern CDNs already enable it automatically, while self-hosted servers often require additional modules or newer server software.
Preload, Prefetch, Preconnect: Know the Difference
These three are easy to confuse.
They solve completely different problems.
- Preload -> I need this on this page.
- Prefetch -> I'll probably need this on the next page.
- Preconnect -> I'm about to talk to this server, just open the connection so it's ready when I knock.
<!-- Preload: I need this RIGHT NOW on the current page -->
<link rel="preload" href="/hero.webp" as="image" />
<link rel="preload" href="/fonts/font.woff2" as="font" crossorigin />
<!-- Prefetch: I'll probably need this when the user navigates NEXT -->
<link rel="prefetch" href="/about.js" />
<!-- Preconnect: warm up the connection to this domain -->
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://cdn.myapp.com" crossorigin />
<!-- dns-prefetch: a cheaper preconnect that only resolves DNS -->
<link rel="dns-prefetch" href="https://analytics.myapp.com" />
Use the wrong one and you waste bandwidth. Skip them entirely and you waste time.
| Hint | Use it for |
|---|---|
| preload | Hero image, critical font, above-the-fold CSS |
| prefetch | Likely next page |
| preconnect | Third-party origins you'll definitely contact |
| dns-prefetch | Third-party origins you might contact |
One <link rel="preconnect"> pays the DNS, TCP, and TLS cost up front, before the first request to that origin needs it. That's typically 100–300ms saved per domain.
One warning: preload is a priority override. Every preloaded resource jumps toward the front of the queue.
Preload ten things and you've prioritized nothing.
Cache Everything That Doesn't Change
The browser already has an excellent cache.
Most sites barely use it.
Cache-Control: no-cache -> always validate with server
Cache-Control: max-age=31536000 -> cache for one year
Cache-Control: max-age=31536000, immutable -> cache for one year, skip revalidation while fresh
The pattern never changes.
HTML files -> Cache-Control: no-cache
(always check for new HTML, because it references updated asset URLs)
JS/CSS/Images -> Cache-Control: max-age=31536000, immutable
(cached long-term; the content hash in the filename changes on deploy)
Modern bundlers already fingerprint assets:
app.js -> app.a3f5b2c.js
New deploy, new hash.
Same hash, instant cache hit.
The browser downloads it once.
Everything else stays cached forever.
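The whole policy boils down to one decision per URL. A rough sketch, assuming your bundler emits names like app.a3f5b2c.js (a hex hash of at least 7 characters before the extension); the regex and the function name `cacheControlFor` are illustrative, so adjust them to your own build output:

```javascript
// Hypothetical helper: pick a Cache-Control value from a request path.
// Assumes fingerprinted files look like name.<hex hash>.ext (e.g. app.a3f5b2c.js).
const HASHED_ASSET = /\.[0-9a-f]{7,}\.(js|css|png|jpe?g|webp|avif|svg|woff2)$/i

function cacheControlFor(path) {
  if (HASHED_ASSET.test(path)) {
    // Content hash in the name: a new deploy produces a new URL, so cache hard.
    return 'max-age=31536000, immutable'
  }
  // HTML and anything without a hash: always revalidate with the server.
  return 'no-cache'
}

console.log(cacheControlFor('/assets/app.a3f5b2c.js'))
console.log(cacheControlFor('/index.html'))
```

Anything without a hash falls back to no-cache on purpose: a long max-age on a URL that never changes is how stale assets get stuck in browsers.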
Here's what that looks like on a repeat visit.
🔧 Watch it stick: DevTools -> Network -> reload twice -> watch the Size column.
Cached resources return instantly: 0ms.
Network requests still take: 371ms to over 1 second.
Same files.
Zero download.
That's exactly what immutable buys you.
Configuration depends on your host.
# Netlify: netlify.toml
[[headers]]
  for = "/assets/*"
  [headers.values]
    Cache-Control = "max-age=31536000, immutable"
[[headers]]
  for = "/*.html"
  [headers.values]
    Cache-Control = "no-cache"
// Vercel: vercel.json
{
  "headers": [
    {
      "source": "/assets/(.*)",
      "headers": [{ "key": "Cache-Control", "value": "max-age=31536000, immutable" }]
    }
  ]
}
One deploy. One download. Every visit after that is free.
Back/Forward Cache: The Fastest Navigation You'll Ever Ship
What's faster than caching files?
Not loading the page at all.
That's exactly what the Back/Forward Cache (bfcache) does.
Press Back.
The browser restores the entire page from memory.
- No network.
- No rendering.
- No JavaScript startup.
- Usually under 100ms.
Most developers break it without realizing.
The biggest culprit?
unload.
// 🔴 Disables bfcache
window.addEventListener('unload', cleanup)
// 🟢 Compatible with bfcache
window.addEventListener('pagehide', cleanup)
Many analytics libraries still register unload handlers behind the scenes.
One script.
Your entire site loses bfcache.
Check yours.
🔧 DevTools -> Application -> Back/Forward Cache -> Run Test
Passing: dev.to
Failing: CodePen
The report tells you exactly what prevented caching.
Fix what it reports.
One exception: banking, payments, medical portals, or anything else showing sensitive data.
In those cases, restoring an old page from memory may be the wrong trade-off.
Security wins.
For everything else... bfcache should work by default.
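To see whether real navigations are actually restored from bfcache, the standard signal is the pageshow event's persisted flag. A minimal sketch; `trackPageLifecycle` and its callback names are hypothetical, and `target` is anything with addEventListener (in a browser, `window`):

```javascript
// Hypothetical helper: wire page lifecycle handlers without using `unload`.
function trackPageLifecycle(target, { onRestore, onHide }) {
  // pageshow fires every time the page is shown;
  // event.persisted is true only when it came back from bfcache.
  target.addEventListener('pageshow', event => {
    if (event.persisted) onRestore()
  })
  // pagehide covers the cleanup that `unload` used to do, without blocking bfcache.
  target.addEventListener('pagehide', () => onHide())
}

// Browser usage (assumes your own `analytics` and `cleanup` exist):
// trackPageLifecycle(window, {
//   onRestore: () => analytics.track('bfcache-restore'),
//   onHide: cleanup,
// })
```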
One Move, Five Times
Every optimization in this chapter does exactly the same thing.
It avoids work.
- A CDN avoids the long trip to your origin server.
- Compression avoids sending unnecessary bytes.
- Preconnect avoids waiting for a connection to open.
- Caching avoids downloading files twice.
- bfcache avoids loading the page altogether.
The fastest request isn't the one you optimize.
It's the one the browser never has to make.
Framework Patterns That Actually Move the Needle
Everything so far was browser fundamentals.
Now let's look at what your framework is already doing for you.
Here's what many developers miss.
React and Vue aren't fast because they're frameworks.
They're fast because they avoid unnecessary work.
Out of the box:
- React batches state updates (automatically in every context since React 18), so multiple setState() calls become a single render.
- Vue batches reactive updates and flushes DOM changes together on the next tick.
- Both diff a virtual DOM before touching the real DOM.
Less work.
Fewer DOM updates.
Better performance.
Most applications never need more than that.
The techniques below are for when the defaults stop being enough.
TLDR, what this chapter covers:
React and Vue already optimize rendering by default. This chapter covers three things for when that's not enough: memoization, conditional rendering (v-if vs v-show), and virtualization.
Angular, Svelte, or Solid? Same ideas, different APIs. I don't have enough production experience to cover them accurately.
Skip to Perceived Performance
Stop Re-rendering What Didn't Change
The fastest render... is the one that never happens.
React skips re-rendering with React.memo(): it wraps a component and bails out if a shallow comparison shows the props haven't changed.
Vue handles this differently. Its reactivity system is more granular by default: components only re-render when their reactive dependencies change. For expensive computed values, computed() caches the result and only recalculates when its dependencies change.
// 🟢 React: skip re-rendering when props are the same
const DataGrid = React.memo(function DataGrid({ rows }) {
  return rows.map(row => <Row key={row.id} data={row} />)
})
// 🔴 React: comparison overhead outweighs the benefit
const Badge = React.memo(function Badge({ label }) {
  return <span>{label}</span>
})
// 🟢 Vue: cached until dependencies change
const total = computed(() => cart.items.reduce((sum, item) => sum + item.price, 0))
// 🔴 Vue: a plain function recalculates on every render
const getTotal = () => cart.items.reduce((sum, item) => sum + item.price, 0)
The part most tutorials skip: memoization is not free.
Every memoized component first compares its inputs before deciding to skip rendering.
If the component is cheap, that comparison can cost more than rendering itself.
Good candidates are:
- Large lists
- Heavy components
- Expensive calculations
- Deep trees with stable props
Everything else?
Measure first.
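To make that comparison cost concrete, here is a sketch of a shallow props check. It is similar in spirit to what React.memo does by default, not React's actual source, and `shallowEqualProps` is a name chosen for this example:

```javascript
// Sketch of a shallow props comparison: one identity check per prop.
function shallowEqualProps(prev, next) {
  const prevKeys = Object.keys(prev)
  const nextKeys = Object.keys(next)
  if (prevKeys.length !== nextKeys.length) return false
  // Values are compared by identity (Object.is), never by contents.
  return prevKeys.every(
    key => Object.prototype.hasOwnProperty.call(next, key) && Object.is(prev[key], next[key])
  )
}

const rows = [{ id: 1 }]
const sameReference = shallowEqualProps({ rows }, { rows }) // true
const freshCopy = shallowEqualProps({ rows }, { rows: [...rows] }) // false
console.log({ sameReference, freshCopy })
```

The second result is the usual trap: pass a freshly built array or object on every render and the memoized component pays for the comparison and then re-renders anyway.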
v-if vs. v-show β Don't Default to One
Vue has an optimization most developers get wrong.
<!-- 🔴 Destroyed and recreated on every toggle -->
<ExpensiveModal v-if="isOpen" />
<!-- 🟢 Stays mounted, only visibility toggles -->
<ExpensiveModal v-show="isOpen" />
Rarely shown?
-> Use v-if. The component doesn't mount until it's needed.
Frequently toggled?
-> Use v-show. It stays in the DOM, just hidden.
Don't guess.
React DevTools Profiler and Vue Devtools will tell you exactly where your renders are coming from.
Virtualization: The DOM Was Never Meant for 10,000 Rows
Imagine a city-select dropdown with 10,000 options.
The user can only see 15 at a time.
The other 9,985:
- Still consuming memory.
- Still participating in layout.
- Still slowing everything down.
Don't render them.
Render only what's visible.
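// React (react-window 1.x API)
import { FixedSizeList as List } from 'react-window'
// height and width are both required; width can be a string for vertical lists
;<List height={600} width="100%" itemCount={rows.length} itemSize={50}>
  {({ index, style }) => <Row style={style} data={rows[index]} />}
</List>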
<!-- Vue -->
<RecycleScroller :items="rows" :item-size="50" v-slot="{ item }">
<Row :data="item" />
</RecycleScroller>
Ten thousand rows in memory.
Fifteen rows in the DOM.
That's virtualization.
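Under the hood, the core of a fixed-height virtual list is one small calculation. A sketch of that idea, not the actual code of react-window or vue-virtual-scroller; the function name and the `overscan` parameter are assumptions for illustration:

```javascript
// Which row indexes need to exist in the DOM for the current scroll position?
function visibleRange({ scrollTop, viewportHeight, itemSize, itemCount, overscan = 2 }) {
  // First row that is at least partly inside the viewport.
  const first = Math.floor(scrollTop / itemSize)
  // Last row that is at least partly inside the viewport.
  const last = Math.ceil((scrollTop + viewportHeight) / itemSize) - 1
  return {
    // Render a few extra rows on each side so fast scrolling doesn't flash blanks.
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount - 1, last + overscan),
  }
}

const range = visibleRange({ scrollTop: 1000, viewportHeight: 600, itemSize: 50, itemCount: 10000 })
console.log(range)
```

Each rendered row is then absolutely positioned at index * itemSize inside a container whose height is itemCount * itemSize, which keeps the scrollbar honest.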
Perceived Performance: Making It Feel Fast Before It Is
Technical performance improvements make pages faster.
Perceived performance makes them feel even faster.
Those aren't always the same thing.
A 1.5-second API request behind a spinner feels slow.
The same 1.5 seconds with an optimistic update feels instant.
The network didn't change.
The experience did.
🔧 Race them: Which one feels fastest?
That's perceived performance.
Optimistic Updates: Respond Before the Server Does
Most applications wait for server confirmation before updating the UI.
Optimistic updates reverse that order.
The UI updates immediately.
The server catches up afterward.
function likePost(postId) {
setLiked(true)
setLikeCount(count => count + 1)
api.likePost(postId).catch(() => {
setLiked(false)
setLikeCount(count => count - 1)
})
}
If the request fails... Roll it back.
GitHub stars.
Notion edits.
They all update before the server responds.
In Instagram's earlier days, a single Justin Bieber post could pull in millions of simultaneous like-taps within seconds, more than enough to choke a database trying to keep an accurate count in real time.
That's a backend problem. It needed its own backend fix.
But it also exposed a frontend one: every tap was a person waiting on a network round-trip just to see a number move.
Optimistic updates fixed that part.
The number you see immediately after pressing ❤️ is often an optimistic estimate that eventually catches up.
It feels instant because your tap isn't waiting for the database anymore.
Stale While Revalidate: Show Something Now
Returning users shouldn't stare at another loading spinner.
They've already seen this data.
Show the cached version immediately.
Refresh it quietly in the background.
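// TanStack Query (SWR offers the same pattern through useSWR)
import { useQuery } from '@tanstack/react-query'
const { data } = useQuery({
  queryKey: ['dashboard'],
  queryFn: fetchDashboardData,
  staleTime: 30_000, // treat cached data as fresh for 30 seconds
})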
First visit -> normal load.
Every next visit -> instant content from cache.
Silent refresh.
Libraries like TanStack Query and SWR make this almost effortless.
You see it in the Instagram feed too: the content from your last visit appears instantly while fresh posts load in the background.
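Strip away the library and the decision behind stale-while-revalidate is small. A library-independent sketch, assuming a plain Map as the cache; `readThroughCache` and `storeResult` are hypothetical names, and this is not how TanStack Query or SWR are implemented internally:

```javascript
// `cache` is a Map of key -> { data, fetchedAt }, with timestamps in milliseconds.
function readThroughCache(cache, key, now, staleTimeMs) {
  const entry = cache.get(key)
  if (!entry) {
    // Nothing cached yet: the caller shows a loading state and fetches.
    return { data: undefined, shouldFetch: true }
  }
  // Cached data is returned immediately; refetch in the background only if stale.
  const isStale = now - entry.fetchedAt > staleTimeMs
  return { data: entry.data, shouldFetch: isStale }
}

function storeResult(cache, key, data, now) {
  cache.set(key, { data, fetchedAt: now })
}

const dashboardCache = new Map()
storeResult(dashboardCache, 'dashboard', { visits: 42 }, 0)
const freshRead = readThroughCache(dashboardCache, 'dashboard', 10_000, 30_000)
const staleRead = readThroughCache(dashboardCache, 'dashboard', 40_000, 30_000)
console.log({ freshRead, staleRead })
```

Both reads return data right away. Only the second one, 40 seconds after the fetch, also asks for a background refresh.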
Prefetch Before the Click
Remember the prefetch hint from a few chapters back? That was a guess about the next page.
People rarely click instantly.
Most hover for 200–300ms first.
That's free time. Use it.
document.querySelectorAll('a[data-prefetch]').forEach(link => {
  link.addEventListener(
    'mouseenter',
    () => {
      // Ask the browser to fetch the link target at low priority
      const hint = document.createElement('link')
      hint.rel = 'prefetch'
      hint.href = link.href
      document.head.appendChild(hint)
    },
    { once: true } // one hint per link is enough
  )
})
By the time the click happens...
Part of the next page is already downloading.
Next.js does this automatically for internal links.
Skeletons Beat Spinners
A spinner tells the user...
"Something is happening."
A skeleton tells them...
"This is what you're waiting for. The content is almost here."
The loading time is identical.
An often-cited figure, attributed to research by Viget, says users rate skeleton screens as about 20% faster than spinners, even when the actual load time is identical.
Not because they are.
Because uncertainty feels slower than progress.
A different kind of win this time.
None of these techniques make the server faster.
They change when the user experiences the wait.
Sometimes that's enough.
Because performance isn't only measured in milliseconds.
It's measured in how long the wait feels.
Lighthouse Is Where You Start. Field Data Is Where You Trust.
You've spent the whole article learning what to optimize.
Now let's learn how to find it.
Start with Lighthouse.
Run it before every release.
🔧 Run it now: DevTools -> Lighthouse -> Mobile -> Analyze page load.
Ignore the score.
The real value is underneath.
- FCP.
- LCP.
- TBT.
- CLS.
- Speed Index.
Each one points to a specific bottleneck.
Scroll a little further.
Lighthouse doesn't just tell you something is slow.
It tells you why.
- Render-blocking resources.
- Unused JavaScript.
- Long Tasks.
- Broken bfcache.
Prioritized. Estimated. Actionable.
One caveat: that 81 score came from one controlled test.
- One device.
- One network.
- One page load.
Your users don't browse under laboratory conditions.
They browse:
- On trains.
- In elevators.
- On overloaded mobile networks.
That's why field data exists.
🔧 Check it yourself: Go to PageSpeed Insights (pagespeed.web.dev), paste any URL, then look for "Discover what your real users are experiencing."
Notice something.
The top section isn't Lighthouse.
It's CrUX.
- Real Chrome users.
- Real devices.
- The last 28 days.
The score underneath is still Lighthouse.
Same page.
Different measurement.
In this example:
- "This URL" measures only the homepage.
- "Origin" aggregates every page on dev.to.
Same site.
Different data.
Different conclusions.
Lighthouse tells you what's possible.
Field data tells you what's happening.
When they disagree... Trust the field.
Once you've found the problem, you've already learned how to investigate it.
Open the Performance panel. -> Record. -> Find the bottleneck. -> Fix it.
Finally, measure your own users:
import { onLCP, onINP, onCLS } from 'web-vitals'
onLCP(metric => analytics.track('LCP', metric.value))
onINP(metric => analytics.track('INP', metric.value))
onCLS(metric => analytics.track('CLS', metric.value))
Send those metrics to whatever analytics platform you already use.
Then watch the p75: not the average, but the point where 75% of visits were this fast or faster.
That's the number Google uses for Core Web Vitals scoring.
And it's the one your users actually experience.
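If your analytics tool doesn't report percentiles, p75 is easy to compute yourself. A sketch using the nearest-rank method, which is one common definition and not necessarily the exact one your tool or CrUX uses; the sample values are invented:

```javascript
// Nearest-rank percentile: the smallest sample that at least `p` percent
// of all samples are less than or equal to.
function percentile(samples, p) {
  if (samples.length === 0) return undefined
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(rank, 1) - 1]
}

// Made-up LCP samples in milliseconds, including one very slow visit.
const lcpSamplesMs = [1200, 1500, 1800, 2100, 2400, 3000, 4200, 9000]
const averageLcp = lcpSamplesMs.reduce((sum, value) => sum + value, 0) / lcpSamplesMs.length
const p75Lcp = percentile(lcpSamplesMs, 75)
console.log({ averageLcp, p75Lcp })
```

Here the average is 3150ms, dragged up by the single 9-second visit, while the p75 is 3000ms: six of the eight visits were that fast or faster.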
One more tool: WebPageTest.
Lighthouse covers most days. WebPageTest goes deeper: filmstrip, waterfall, real device in a specific city on a throttled connection.
Same site. Same browser. Same connection speed. Tested from two cities on opposite sides of the planet:
Tokyo and Columbus, Ohio, and the numbers barely move. That's a CDN doing its job.
Full reports: Tokyo run · Columbus run
The Performance Budget: Make It Impossible to Regress
Here's the problem: you fix performance, the numbers look great, then it quietly rots.
Performance doesn't usually break overnight.
It leaks.
One dependency.
One analytics script.
One feature.
Sprint after sprint.
Your Lighthouse score is 92 today.
Six months later it's 58.
Nobody remembers when it happened.
A performance budget fixes that.
It's a contract enforced by CI.
// lighthouserc.js
module.exports = {
  ci: {
    assert: {
      assertions: {
        'categories:performance': ['error', { minScore: 0.85 }],
        'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
        'total-blocking-time': ['error', { maxNumericValue: 200 }],
        'cumulative-layout-shift': ['error', { maxNumericValue: 0.1 }],
        'resource-summary:script:size': ['error', { maxNumericValue: 300000 }],
      },
    },
  },
}
Five limits. Miss any one, and CI flags it before a reviewer has to.
Wire it into your pipeline, and performance stops depending on someone remembering to check it.
Cross the limit.
The pipeline fails.
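The mechanism behind a budget gate is nothing exotic. A toy sketch of the comparison step, reusing three of the limits from the config above; this is not how Lighthouse CI is implemented, and `findBudgetViolations` plus the measured values are invented for illustration:

```javascript
// Limits copied from the config above (ms, ms, unitless).
const budget = {
  'largest-contentful-paint': 2500,
  'total-blocking-time': 200,
  'cumulative-layout-shift': 0.1,
}

// Hypothetical helper: list every metric that exceeds its limit.
// Metrics missing from `measured` are skipped rather than flagged.
function findBudgetViolations(measured, limits) {
  return Object.entries(limits)
    .filter(([metric, max]) => measured[metric] > max)
    .map(([metric, max]) => `${metric}: ${measured[metric]} > ${max}`)
}

const violations = findBudgetViolations(
  { 'largest-contentful-paint': 3100, 'total-blocking-time': 150, 'cumulative-layout-shift': 0.1 },
  budget
)
// A CI script would exit with a non-zero code when this list is not empty.
console.log(violations)
```

Only LCP is reported here. A value that lands exactly on its limit, like the CLS of 0.1, still passes.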
The Uncomfortable Part Nobody Puts in the Tutorial
Every optimization you've seen has trade-offs.
- Lazy-loading your hero image delays your LCP.
- Wrapping every component in memo adds comparison overhead.
- Preloading everything means nothing is truly prioritized.
- SSR improves LCP but introduces hydration work.
- Over-splitting bundles creates unnecessary network overhead.
Performance advice isn't universal. It's contextual.
Before changing anything, ask one question:
Am I solving a problem I've actually measured, for users who actually have it?
If you can't answer that, it's not optimization.
That's a guessing game.
The most expensive optimization is the one that improves a metric nobody was watching.
The Checklist You'll Actually Use
One article. Twelve chapters. Hundreds of decisions.
Here's all of it in one place.
Server
☐ Keep TTFB under 800 ms (imp)
☐ Serve through a CDN (imp)
☐ Enable Brotli or Gzip or Zstd (imp)
Browser Rendering
☐ Remove render-blocking CSS and JS (imp)
☐ Use defer for app scripts (imp)
☐ Inline critical CSS
JavaScript
☐ Code split by route (imp)
☐ Tree shake dependencies (imp)
☐ Analyze bundles before release
☐ Use passive listeners
☐ Load third-party scripts with async (imp)
☐ Move heavy work to Web Workers
☐ Break up long tasks (imp)
CSS
☐ Animate transform and opacity only (imp)
☐ Avoid layout thrashing
☐ Use CSS containment
☐ Use content-visibility for off-screen content
Images
☐ Prefer AVIF or WebP (imp)
☐ Always set width and height (imp)
☐ fetchpriority="high" for the LCP image (imp)
☐ Lazy-load below the fold (imp)
☐ Use srcset
Fonts
☐ Preload critical fonts (imp)
☐ Use font-display: swap (imp)
☐ Prefer system fonts when possible
Network
☐ Preconnect to third-party origins
☐ Prefetch likely next pages
☐ Cache hashed assets aggressively (imp)
☐ Keep bfcache working (imp)
Framework
☐ Memoize expensive components only (imp)
☐ Virtualize long lists
Perceived Performance
☐ Optimistic updates
☐ Stale-while-revalidate
☐ Prefetch on hover
☐ Skeleton screens (imp)
Measurement
☐ Lighthouse CI (imp)
☐ web-vitals in production (imp)
☐ Track p75 field data (imp)
☐ Verify with WebPageTest
Not every item applies to every project.
But every item you skip should be a choice you made, not an oversight.
The Takeaway
Performance isn't about making browsers faster.
It's about giving them less work to do.
Less JavaScript to parse.
Less CSS to calculate.
Fewer images to download.
Fewer layouts to recalculate.
Fewer requests to make.
Every optimization in this article follows the same idea.
Avoid work.
Because every piece of work you remove... is time your user gets back.
Three seconds.
You don't earn them back with one optimization.
You earn them back a hundred milliseconds at a time.

Top comments (3)
How do you handle cases where initial load time exceeds 3 seconds due to heavy computation? I'm following your posts for more web perf insights, would love to hear your thoughts on this.
Hi! Initial load time increases because your computation keeps the main thread busy.
Move it to a Web Worker to free up the main thread: the page stays responsive and the main thread can get on with everything else (DOM and CSSOM construction, JS execution, hydration, rendering, etc.).
But if your page depends on that computation's result, moving it alone won't fully fix things.
Pair it with a skeleton loader or cached data until the new result arrives. That way, the user isn't staring at a blank screen until your data is loaded.
If you can share what the heavy computation logic actually is, I might be able to give you more specific detail.
One caveat worth adding next to the media="print" stylesheet trick: it defers the whole file, so if any of the above-the-fold styles live in there instead of your inline critical block, you get a flash of unstyled hero before it swaps in. In practice that means being pretty strict about what counts as critical CSS, which is the hard part nobody enjoys. The section that earns the length though is the single timeline where LCP, INP, and CLS all fire in one load. Seeing them as one story instead of three separate scores is the thing most people miss.