DEV Community

ZeeshanAli-0704


Frontend System Design: Network Optimization – Performance

Network Optimization – Frontend Performance

Network optimization focuses on reducing the number, size, and latency of HTTP requests. Even with perfectly optimized assets, poor network strategy can bottleneck performance.



HTTP 2 and HTTP 3

Concept:
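HTTP/1.1 handles one request at a time per connection, so a slow response blocks everything queued behind it (head-of-line blocking). HTTP/2 removes this bottleneck at the application layer by multiplexing many requests over a single connection, and HTTP/3 also removes it at the transport layer by running over QUIC instead of TCP.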

Evolution:

Feature | HTTP/1.1 | HTTP/2 | HTTP/3
Multiplexing | No (one request at a time per connection) | Yes (many streams) | Yes (many streams)
Header compression | No | HPACK | QPACK
Server Push | No | Yes (disabled by default since Chrome 106) | Spec allows it, but no browser implements it
Transport | TCP | TCP | QUIC (UDP-based)
Head-of-line blocking | Per connection | At the TCP layer (a lost packet stalls every stream) | None across streams (loss affects only its own stream)
Connection setup | TCP + TLS (2-3 RTT) | TCP + TLS (2-3 RTT) | 1 RTT (0 RTT on resumption)

Note on Server Push: HTTP/2 Server Push was designed to let the server proactively send resources before the browser requested them. However, Chrome disabled Server Push by default in Chrome 106 (2022) because of low adoption and implementation complexity, so it can no longer be relied on in practice. Use <link rel="preload"> or 103 Early Hints (covered in its own section later in this article) instead.

Impact on optimization strategy:

In HTTP/1.1 era:

  • Concatenate files (fewer requests better)
  • Use CSS sprites
  • Domain sharding (multiple domains for parallel downloads)

In HTTP/2+ era:

  • Smaller, granular files are fine (multiplexing handles many requests)
  • CSS sprites are less necessary
  • Domain sharding actually hurts (each extra domain needs its own connection and handshake, and its requests cannot share the multiplexed one)
  • Focus on reducing total bytes, not total requests
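
To see why request count matters less under HTTP/2, here is a rough back-of-the-envelope model. It is only a sketch: it assumes each request costs one round trip, ignores bandwidth, handshakes, and server time, and assumes the common browser limit of about six HTTP/1.1 connections per origin. `estimateFetchTimeMs` and its parameters are illustrative names, not part of any API.

```javascript
// Rough model: requests are served in "waves", one wave per round trip.
// HTTP/1.1: each connection handles one request at a time, so a wave
// carries at most `connections` requests.
// HTTP/2+: one connection multiplexes all requests into a single wave
// (assuming the server's concurrent-stream limit is not hit).
function estimateFetchTimeMs({ requests, rttMs, protocol, connections = 6 }) {
  const perWave = protocol === 'http/1.1' ? connections : requests;
  const waves = requests === 0 ? 0 : Math.ceil(requests / perWave);
  return waves * rttMs;
}

const page = { requests: 30, rttMs: 100 };
console.log(estimateFetchTimeMs({ ...page, protocol: 'http/1.1' })); // 500
console.log(estimateFetchTimeMs({ ...page, protocol: 'h2' }));       // 100
```

In this model the HTTP/2 total no longer grows with the number of files, which is why the focus shifts from request count to total bytes.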

Verification:

// Check HTTP protocol version in browser DevTools
// Network tab → Right-click header → Enable "Protocol" column
// Look for h2 (HTTP/2) or h3 (HTTP/3)

// Performance API check
const entries = performance.getEntriesByType('resource');
entries.forEach(entry => {
  console.log(entry.name, entry.nextHopProtocol);
  // Example output: "https://example.com/script.js" "h2" (or "h3"); entry.name is the full URL
});
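
On a page with many resources, a per-protocol summary is easier to read than one log line per entry. The helper below is a minimal sketch; `countByProtocol` is a hypothetical name, and it accepts any array of objects shaped like `PerformanceResourceTiming` so it can be tried outside the browser as well.

```javascript
// Count resources per negotiated protocol ("h2", "h3", "http/1.1", ...).
function countByProtocol(entries) {
  const counts = {};
  for (const entry of entries) {
    // nextHopProtocol is an empty string when the browser withholds it
    // (for example, cross-origin resources without Timing-Allow-Origin).
    const protocol = entry.nextHopProtocol || 'unknown';
    counts[protocol] = (counts[protocol] || 0) + 1;
  }
  return counts;
}

// In a browser console:
//   console.table(countByProtocol(performance.getEntriesByType('resource')));

// Outside the browser, sample data shows the shape of the result:
const sample = [
  { name: 'https://example.com/app.js', nextHopProtocol: 'h2' },
  { name: 'https://example.com/app.css', nextHopProtocol: 'h2' },
  { name: 'https://cdn.example.com/lib.js', nextHopProtocol: 'h3' },
  { name: 'https://other.example/pixel.gif', nextHopProtocol: '' },
];
console.log(countByProtocol(sample)); // { h2: 2, h3: 1, unknown: 1 }
```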



Resource Hints (preload, prefetch, preconnect, dns prefetch)

Concept:
Resource hints tell the browser about resources it will need soon, allowing it to start fetching or connecting earlier than it normally would.

Four resource hints:

Hint | Purpose | Priority | When to Use
dns-prefetch | Resolve DNS for a domain | Low | Third-party domains
preconnect | DNS + TCP + TLS handshake | Medium | Critical third-party origins
preload | Download a specific resource now | High | Critical current-page resources
prefetch | Download a resource for the next page | Low | Next-page navigation resources

Example – All four in practice:

<head>
  <!-- DNS Prefetch: resolve domain name early -->
  <link rel="dns-prefetch" href="https://analytics.example.com">

  <!-- Preconnect: full connection setup to critical third-party -->
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://cdn.example.com" crossorigin>

  <!-- Preload: fetch critical resources for THIS page immediately -->
  <link rel="preload" href="/fonts/main.woff2" as="font" type="font/woff2" crossorigin>
  <link rel="preload" href="/css/critical.css" as="style">
  <link rel="preload" href="/images/hero.webp" as="image">

  <!-- Prefetch: fetch resources for NEXT likely page -->
  <link rel="prefetch" href="/next-page/bundle.js" as="script">
  <link rel="prefetch" href="/next-page/data.json" as="fetch">
</head>

Timeline comparison:

Without hints:
  HTML → Parse → Discover CSS → Download CSS → Parse CSS → Discover font → Download font
                                                                            ^^^^ Very late

With preconnect + preload:
  HTML → Preconnect to CDN + Preload font (parallel with everything)
      → Parse → Discover CSS → Download CSS (connection already open!)
      → Font already downloading or downloaded
  Savings: often hundreds of milliseconds per late-discovered resource (illustrative; depends on RTT and connection setup cost)

Best practices:

  • Limit preconnect to 2-4 critical origins (each one costs CPU)
  • Preload only resources needed within first 3 seconds
  • Prefetch only highly likely next-page resources
  • dns-prefetch is cheap, use for all known third-party domains
  • Over-using preload can hurt – it competes with other critical resources
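
The rules of thumb above can be turned into a small lint step. The following is a sketch under stated assumptions: `auditResourceHints` is a hypothetical helper, the hint objects are a made-up config shape, and the default limit of four preconnects simply mirrors the "2-4 critical origins" guideline. It also checks two common mistakes: a preload without `as`, and a font preload without `crossorigin` (fonts are fetched in CORS mode, so the preloaded copy would otherwise go unused).

```javascript
// Check a list of hints against the guidelines and return warnings.
function auditResourceHints(hints, { maxPreconnect = 4 } = {}) {
  const warnings = [];
  const preconnects = hints.filter(h => h.rel === 'preconnect');
  if (preconnects.length > maxPreconnect) {
    warnings.push(`${preconnects.length} preconnects; keep it to ${maxPreconnect} or fewer`);
  }
  for (const hint of hints) {
    if (hint.rel !== 'preload') continue;
    if (!hint.as) {
      warnings.push(`preload ${hint.href} is missing "as"`);
    } else if (hint.as === 'font' && !hint.crossorigin) {
      warnings.push(`font preload ${hint.href} needs "crossorigin"`);
    }
  }
  return warnings;
}

console.log(auditResourceHints([
  { rel: 'preload', href: '/fonts/main.woff2', as: 'font' },
  { rel: 'preload', href: '/css/critical.css' },
  { rel: 'preconnect', href: 'https://cdn.example.com', crossorigin: true },
]));
// [
//   'font preload /fonts/main.woff2 needs "crossorigin"',
//   'preload /css/critical.css is missing "as"'
// ]
```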



Caching Strategies

Concept:
Caching stores responses locally so repeat visits or requests do not require network round-trips. An effective caching strategy can make second visits near-instant.

Two complementary mechanisms:

1. Browser Cache (HTTP Cache Headers):

Cache-Control: public, max-age=31536000, immutable
Directive | Meaning
public | Any cache (browser, CDN) can store
private | Only the browser can store (user-specific data)
max-age=N | Response is fresh for N seconds
immutable | Content will never change while fresh (skip revalidation)
no-cache | May be stored, but must be revalidated with the server before each use
no-store | Never store (sensitive data)
stale-while-revalidate=N | Serve stale for up to N seconds past max-age while revalidating in the background

Recommended cache strategy by file type:

HTML pages:
  Cache-Control: no-cache
  (Always check for latest version, server may return 304 Not Modified)

CSS/JS with hash in filename (main.a1b2c3.js):
  Cache-Control: public, max-age=31536000, immutable
  (Cache for a year, effectively forever: the filename changes when the content changes)

Images:
  Cache-Control: public, max-age=86400
  (Cache for 1 day, or use immutable with hashed filenames)

API responses:
  Cache-Control: private, max-age=0, must-revalidate
  (User-specific, always fresh)

Fonts:
  Cache-Control: public, max-age=31536000, immutable
  (Fonts rarely change)
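
One way to apply the per-file-type strategy above is a single function that maps a URL path to a header value. This is a minimal sketch: `cacheControlFor` is a hypothetical helper, the `/api/` prefix follows the examples in this article, and the hash pattern (a dot, six or more hex characters, a dot) is an assumption about how the build tool names files.

```javascript
// Assumed naming scheme for content-hashed files, e.g. main.a1b2c3.js
const HASHED = /\.[0-9a-f]{6,}\./i;

// Map a URL path to a Cache-Control value, following the strategy above.
function cacheControlFor(path) {
  if (path.startsWith('/api/')) return 'private, max-age=0, must-revalidate';
  if (HASHED.test(path)) return 'public, max-age=31536000, immutable';
  if (/\.(woff2?|ttf|otf)$/.test(path)) return 'public, max-age=31536000, immutable';
  if (/\.(png|jpe?g|gif|webp|avif|svg)$/.test(path)) return 'public, max-age=86400';
  return 'no-cache'; // HTML and anything unrecognized: revalidate before use
}

console.log(cacheControlFor('/js/main.a1b2c3.js')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/images/hero.webp'));  // public, max-age=86400
console.log(cacheControlFor('/index.html'));        // no-cache
console.log(cacheControlFor('/api/user'));          // private, max-age=0, must-revalidate
```

Defaulting to no-cache keeps unknown files correct at the cost of a revalidation round trip, which is usually the safer failure mode.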

2. ETag / Last-Modified (Conditional Requests):

First request:
  Server → ETag: "abc123"
  Browser caches response

Second request:
  Browser → If-None-Match: "abc123"
  Server checks: file unchanged → 304 Not Modified (no body, saves bandwidth)
  Server checks: file changed → 200 OK with new content
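
The exchange above can be sketched on the server side as follows. This is an illustration, not a production implementation: `etagFor` and `respond` are hypothetical names, the ETag is derived from a truncated SHA-256 of the body (one possible choice), and request headers are assumed to use lowercase keys as Node's `http` module provides them.

```javascript
const { createHash } = require('node:crypto');

// Derive an ETag from the response body (quoted, as the header requires).
function etagFor(body) {
  return `"${createHash('sha256').update(body).digest('hex').slice(0, 16)}"`;
}

// Decide between 200 and 304 for a GET request.
function respond(requestHeaders, body) {
  const etag = etagFor(body);
  const ifNoneMatch = requestHeaders['if-none-match'] || '';
  // If-None-Match may list several tags and uses weak comparison,
  // so a leading W/ is stripped before comparing.
  const candidates = ifNoneMatch.split(',').map(tag => tag.trim().replace(/^W\//, ''));
  if (candidates.includes(etag) || candidates.includes('*')) {
    return { status: 304, headers: { ETag: etag }, body: null };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

const first = respond({}, 'body { color: red }');
const second = respond({ 'if-none-match': first.headers.ETag }, 'body { color: red }');
const third = respond({ 'if-none-match': first.headers.ETag }, 'body { color: blue }');
console.log(first.status, second.status, third.status); // 200 304 200
```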



Compression (Gzip, Brotli)

Concept:
Text-based assets (HTML, CSS, JS, JSON, SVG) are compressed on the server before transmission. The browser decompresses them automatically. This typically reduces transfer size by 60-90%, depending on how repetitive the content is.

Compression comparison:

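Algorithm | Typical Compression Ratio (text) | Speed | Browser Support
None | 1x (baseline) | N/A | Universal
Gzip | Roughly 3-6x smaller | Fast | Universal
Brotli | Roughly 4-8x smaller (often 15-25% smaller than Gzip) | Slower to compress at high levels | 96%+ of browsers (HTTPS only)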

Typical savings:

Asset | Original | Gzip | Brotli
React bundle | 200KB | 55KB | 45KB
CSS file | 100KB | 18KB | 15KB
JSON API response | 50KB | 8KB | 6KB
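
Figures like these are easy to check against your own assets with Node's built-in `zlib` module. The sketch below compresses a generated, repetitive JSON payload; the payload is sample data chosen for illustration, so the printed ratios are not a benchmark and will differ for real files.

```javascript
const zlib = require('node:zlib');

// Repetitive JSON stands in for a typical API payload (sample data only).
const payload = Buffer.from(JSON.stringify(
  Array.from({ length: 500 }, (_, i) => ({ id: i, name: `user-${i}`, active: i % 2 === 0 }))
));

const gzip = zlib.gzipSync(payload, { level: 9 });
const brotli = zlib.brotliCompressSync(payload, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 },
});

// Print each size and the saving relative to the uncompressed payload.
for (const [label, buf] of [['original', payload], ['gzip', gzip], ['brotli', brotli]]) {
  const pct = ((1 - buf.length / payload.length) * 100).toFixed(1);
  console.log(`${label}: ${buf.length} bytes (${pct}% smaller)`);
}
```

Swapping the generated payload for `fs.readFileSync` of a real bundle gives numbers that are directly comparable to the table.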

Server configuration examples:

# Nginx – Enable both Brotli and Gzip
# Brotli (preferred); requires the third-party ngx_brotli module
brotli on;
# text/html is always compressed, so it is not listed in *_types
brotli_types text/css application/javascript application/json image/svg+xml;
brotli_comp_level 6;

# Gzip (fallback)
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;
gzip_min_length 256;

How browser negotiation works:

Browser sends:
  Accept-Encoding: br, gzip, deflate

Server responds with best match:
  Content-Encoding: br       (if Brotli supported)
  Content-Encoding: gzip     (fallback if not)
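
The negotiation step can be sketched as a small function. This is a simplified illustration rather than a complete implementation of the HTTP rules: `negotiateEncoding` is a hypothetical name, the server's preference order (Brotli, then Gzip) decides between acceptable encodings, q-values are only used to detect `q=0`, and the `*` wildcard is ignored.

```javascript
// Pick the best encoding the client accepts, in the server's preferred order.
function negotiateEncoding(acceptEncoding = '', supported = ['br', 'gzip']) {
  const accepted = new Map();
  for (const part of acceptEncoding.split(',')) {
    const [name, ...params] = part.trim().toLowerCase().split(';');
    if (!name) continue;
    // "br;q=0" means the client explicitly refuses that encoding.
    const q = params.map(p => p.trim()).find(p => p.startsWith('q='));
    accepted.set(name.trim(), q ? parseFloat(q.slice(2)) : 1);
  }
  const match = supported.find(enc => (accepted.get(enc) || 0) > 0);
  return match || 'identity'; // identity = send uncompressed
}

console.log(negotiateEncoding('br, gzip, deflate')); // "br"
console.log(negotiateEncoding('gzip, deflate'));     // "gzip"
console.log(negotiateEncoding('br;q=0, gzip'));      // "gzip"
console.log(negotiateEncoding(''));                  // "identity"
```

Whatever encoding is chosen, the response should also carry `Vary: Accept-Encoding` so shared caches keep the variants apart.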

Build-time pre-compression (Webpack):

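const zlib = require('zlib');
const CompressionPlugin = require('compression-webpack-plugin');

module.exports = {
  plugins: [
    // Gzip
    new CompressionPlugin({
      algorithm: 'gzip',
      filename: '[path][base].gz',
      test: /\.(js|css|html|svg|json)$/,
    }),
    // Brotli (quality is set through zlib params, not a `level` option)
    new CompressionPlugin({
      algorithm: 'brotliCompress',
      filename: '[path][base].br',
      test: /\.(js|css|html|svg|json)$/,
      compressionOptions: {
        params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 },
      },
    }),
  ],
};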

Benefits of pre-compression:

  • Server serves pre-compressed files directly, with no per-request compression CPU cost (in Nginx this needs gzip_static on; and, with ngx_brotli, brotli_static on;)
  • Can use maximum Brotli level 11 (too slow for real-time, but fine for build)
  • Gzip as fallback for older clients



CDN (Content Delivery Network)

Concept:
A CDN distributes copies of your assets to servers spread across the globe (edge locations). Users fetch assets from the nearest edge server instead of your origin server, reducing latency dramatically.

How CDN reduces latency:

Without CDN:
  User in Tokyo → Request to origin in US → 200ms RTT
  Worst case, 10 resources fetched one after another = 200ms x 10 = 2000ms

With CDN:
  User in Tokyo → Request to CDN edge in Tokyo → 10ms RTT
  Same 10 resources fetched one after another = 10ms x 10 = 100ms
  (Parallel connections and HTTP/2 multiplexing shrink both totals, but the 20x RTT gap remains)

What to serve from CDN:

  • Static assets: CSS, JS, images, fonts, videos
  • Build artifacts with content hashes
  • Third-party libraries

What NOT to serve from CDN:

  • HTML pages that change frequently (or use short cache + stale-while-revalidate)
  • User-specific or authenticated content
  • Real-time API endpoints

CDN configuration example (Cloudflare):

Page Rules:
  *.example.com/static/*
    Cache Level: Cache Everything
    Edge Cache TTL: 1 month
    Browser Cache TTL: 1 year

  example.com/*.html
    Cache Level: Standard
    Edge Cache TTL: 10 minutes

Multi-CDN strategy:

  • Use primary CDN for your assets
  • Use specialized Image CDN for image optimization
  • Use separate CDN for video streaming
  • Implement failover between CDNs



Bundle Splitting and Code Splitting

Concept:
Instead of sending one massive JavaScript bundle, split it into smaller chunks that load on demand. Users only download the code they actually need for the current page or interaction.

Types of splitting:

Strategy | What It Does | Example
Vendor splitting | Separate third-party libraries | React, Lodash → vendor.js
Route splitting | Separate code per page/route | /home → home.js, /about → about.js
Component splitting | Lazy load heavy components | Modal, Chart → separate chunks

Webpack vendor splitting:

// webpack.config.js
module.exports = {
  optimization: {
    splitChunks: {
      chunks: 'all',
      cacheGroups: {
        vendor: {
          test: /[\\/]node_modules[\\/]/,
          name: 'vendor',
          chunks: 'all',
          priority: 10,
        },
        common: {
          minChunks: 2,
          name: 'common',
          chunks: 'all',
          priority: 5,
        },
      },
    },
  },
};

React route-based code splitting:

import { lazy, Suspense } from 'react';
import { BrowserRouter, Routes, Route } from 'react-router-dom';

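// Placeholder fallback shown while a route chunk downloads (replace with your own)
const PageLoader = () => <p>Loading...</p>;

// Each route loads its own chunk
const Home = lazy(() => import('./pages/Home'));
const Dashboard = lazy(() => import('./pages/Dashboard'));
const Settings = lazy(() => import('./pages/Settings'));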

function App() {
  return (
    <BrowserRouter>
      <Suspense fallback={<PageLoader />}>
        <Routes>
          <Route path="/" element={<Home />} />
          <Route path="/dashboard" element={<Dashboard />} />
          <Route path="/settings" element={<Settings />} />
        </Routes>
      </Suspense>
    </BrowserRouter>
  );
}

Result (chunk names and sizes are illustrative; actual names depend on the bundler configuration):

  • / loads home.[hash].js (~30KB)
  • /dashboard loads dashboard.[hash].js (~80KB)
  • /settings loads settings.[hash].js (~20KB)
  • User visiting only / never downloads dashboard or settings code

Component-level splitting:

import { lazy, Suspense, useState } from 'react';

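// Placeholder fallback shown while the chart chunk downloads (replace with your own)
const ChartSkeleton = () => <div>Loading chart...</div>;

// Heavy chart library (~200KB) loaded only when user opens analytics
const AnalyticsChart = lazy(() => import('./AnalyticsChart'));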

function Dashboard() {
  const [showAnalytics, setShowAnalytics] = useState(false);

  return (
    <div>
      <h1>Dashboard</h1>
      <button onClick={() => setShowAnalytics(true)}>Show Analytics</button>

      {showAnalytics && (
        <Suspense fallback={<ChartSkeleton />}>
          <AnalyticsChart />
        </Suspense>
      )}
    </div>
  );
}
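
A common refinement of component-level splitting is to start downloading the chunk slightly before it is needed, for example when the user hovers over the button. The wrapper below is a framework-independent sketch; `createLazyModule` is a hypothetical helper, and the async function in the usage part stands in for a real `() => import('./AnalyticsChart')`.

```javascript
// Wrap a dynamic import so the chunk is requested at most once and can be
// warmed up early, before it is actually rendered.
function createLazyModule(importFn) {
  let promise = null;
  const load = () => {
    if (!promise) {
      promise = importFn().catch(err => {
        promise = null; // allow a retry after a failed download
        throw err;
      });
    }
    return promise;
  };
  // preload() starts the download and deliberately ignores failures;
  // a later load() call will retry and report the error.
  return { load, preload: () => { load().catch(() => {}); } };
}

// Usage sketch with a stand-in loader that counts how often it runs.
let downloads = 0;
const chart = createLazyModule(async () => {
  downloads += 1;
  return { default: 'AnalyticsChart module' };
});

chart.preload();                       // e.g. onMouseEnter of the button
chart.load().then(mod => {
  console.log(mod.default, downloads); // "AnalyticsChart module" 1
});
```

Because `load` returns a promise for the module, it should be usable as the loader passed to React's `lazy`, so the hover preload and the lazy component share one download.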



Service Workers and Offline Caching

Concept:
A Service Worker is a JavaScript file that runs in the background, separate from the web page. It intercepts network requests and can serve cached responses, enabling offline functionality and instant repeat loads.

Service Worker lifecycle:

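1. Register → Page calls register(); the browser downloads and parses the SW script
2. Install → SW caches critical assets (precaching)
3. Activate → SW cleans up old caches and takes control of pages loaded afterwards (or of open pages right away if it calls clients.claim())
4. Fetch → SW intercepts network requests from the pages it controls and decides: cache or network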

Basic Service Worker registration:

// main.js – Register service worker
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js')
    .then(reg => console.log('SW registered:', reg.scope))
    .catch(err => console.error('SW registration failed:', err));
}

Service Worker with caching strategies:

// sw.js
const CACHE_NAME = 'app-v1';
const PRECACHE_ASSETS = [
  '/',
  '/css/main.css',
  '/js/app.js',
  '/images/logo.svg',
  '/offline.html',
];

// Install: precache critical assets
self.addEventListener('install', event => {
  event.waitUntil(
    caches.open(CACHE_NAME).then(cache => cache.addAll(PRECACHE_ASSETS))
  );
});

// Activate: clean old caches
self.addEventListener('activate', event => {
  event.waitUntil(
    caches.keys().then(keys =>
      Promise.all(keys.filter(k => k !== CACHE_NAME).map(k => caches.delete(k)))
    )
  );
});

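// Fetch: serve from cache, fall back to network
self.addEventListener('fetch', event => {
  // Only GET requests can be stored in the Cache API; let the rest pass through
  if (event.request.method !== 'GET') return;

  event.respondWith(
    caches.match(event.request).then(cached => {
      if (cached) return cached;               // Cache hit
      return fetch(event.request).then(response => {
        // Cache successful same-origin responses for next time (never error pages)
        if (response.ok && response.type === 'basic') {
          const clone = response.clone();
          event.waitUntil(
            caches.open(CACHE_NAME).then(cache => cache.put(event.request, clone))
          );
        }
        return response;
      });
    }).catch(() => {
      // Offline fallback: the offline page only makes sense for page navigations
      if (event.request.mode === 'navigate') return caches.match('/offline.html');
      return Response.error();
    })
  );
});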

Common caching strategies:

Strategy | Behavior | Best For
Cache First | Check cache → fallback to network | Static assets, fonts, images
Network First | Check network → fallback to cache | API data, HTML pages
Stale While Revalidate | Serve cache immediately → update cache from network | Frequently updated content
Cache Only | Only cache, never network | Precached app shell
Network Only | Only network, never cache | Real-time data, auth
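
The first three strategies in the table can be sketched as plain functions. To keep the example runnable outside a Service Worker, the cache and the network are injected: `cache` is anything with `get` and `set` (a `Map` here, standing in for a `Cache` object and its `match`/`put` methods), and `fetchFn` stands in for `fetch`. The function names are illustrative and are not the Workbox API.

```javascript
async function cacheFirst(key, cache, fetchFn) {
  const cached = await cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = await fetchFn(key);
  await cache.set(key, fresh);
  return fresh;
}

async function networkFirst(key, cache, fetchFn) {
  try {
    const fresh = await fetchFn(key);
    await cache.set(key, fresh);
    return fresh;
  } catch (err) {
    const cached = await cache.get(key);
    if (cached !== undefined) return cached;
    throw err; // network failed and nothing is cached
  }
}

async function staleWhileRevalidate(key, cache, fetchFn) {
  const cached = await cache.get(key);
  // Always start a refresh; its result is served by the *next* call.
  const refresh = fetchFn(key).then(async fresh => {
    await cache.set(key, fresh);
    return fresh;
  });
  if (cached !== undefined) {
    refresh.catch(() => {}); // a failed background refresh must not surface
    return cached;
  }
  return refresh; // nothing cached yet: wait for the network
}

// Demo with a Map standing in for the cache and a fake network.
const cache = new Map([['/feed', 'v1']]);
const network = async () => 'v2';

staleWhileRevalidate('/feed', cache, network).then(async first => {
  const second = await staleWhileRevalidate('/feed', cache, network);
  console.log(first, second); // "v1" "v2"
});
```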

Workbox (Google's SW library) example:

// sw.js using Workbox
import { precacheAndRoute } from 'workbox-precaching';
import { registerRoute } from 'workbox-routing';
import { CacheFirst, StaleWhileRevalidate, NetworkFirst } from 'workbox-strategies';
import { ExpirationPlugin } from 'workbox-expiration';

// Precache build assets
precacheAndRoute(self.__WB_MANIFEST);

// Cache images: Cache First, expire after 30 days
registerRoute(
  ({ request }) => request.destination === 'image',
  new CacheFirst({
    cacheName: 'images',
    plugins: [new ExpirationPlugin({ maxEntries: 100, maxAgeSeconds: 30 * 24 * 60 * 60 })],
  })
);

// Cache API: Network First
registerRoute(
  ({ url }) => url.pathname.startsWith('/api/'),
  new NetworkFirst({ cacheName: 'api-cache' })
);

// Cache CSS/JS: Stale While Revalidate
registerRoute(
  ({ request }) => request.destination === 'style' || request.destination === 'script',
  new StaleWhileRevalidate({ cacheName: 'static-resources' })
);


103 Early Hints

Concept:
103 Early Hints is an HTTP status code that allows the server to send preload and preconnect hints to the browser before the final response is ready. While the server is still computing the full response (querying a database, rendering HTML), it can immediately tell the browser to start fetching critical resources.

The problem without Early Hints:

Browser ──── GET /page ────────────► Server
                                      │
                                      │  Processing... (500ms)
                                      │  DB query, render HTML
                                      │
Browser ◄──── 200 OK + HTML ──────── Server
  │
  │  NOW discovers <link rel="preload" href="/style.css">
  │  NOW discovers <link rel="preconnect" href="https://fonts.googleapis.com">
  │
  │── GET /style.css ─────────────►
  │── DNS+TCP+TLS to fonts.googleapis.com ►

With 103 Early Hints:

Browser ──── GET /page ────────────► Server
                                      │
Browser ◄── 103 Early Hints ──────── Server (sent immediately!)
  │   Link: </style.css>; rel=preload; as=style
  │   Link: <https://fonts.googleapis.com>; rel=preconnect
  │
  │── GET /style.css ─────────────►  (starts NOW, in parallel with server processing)
  │── DNS+TCP+TLS to fonts.googleapis.com ► (starts NOW)
  │                                   │
  │                                   │  Server still processing... (500ms)
  │                                   │
Browser ◄──── 200 OK + HTML ──────── Server
  │
  │  style.css already downloaded! ✅
  │  fonts.googleapis.com already connected! ✅

Savings: up to roughly the server's processing time (500ms in this example) off critical resource loading, because the browser gets a head start during time it would otherwise spend waiting.

Server implementation (Node.js):

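const express = require('express');
const app = express();

// fetchFromDatabase() and renderPage() are placeholders for your own code
app.get('/page', async (req, res) => {
  // Send 103 Early Hints immediately (res.writeEarlyHints needs Node.js 18.11+)
  res.writeEarlyHints({
    link: [
      '</css/main.css>; rel=preload; as=style',
      '</fonts/body.woff2>; rel=preload; as=font; crossorigin',
      '<https://api.example.com>; rel=preconnect',
    ],
  });

  try {
    // Now do the slow work
    const data = await fetchFromDatabase();
    const html = renderPage(data);
    res.status(200).send(html);
  } catch (err) {
    res.status(500).send('Internal Server Error');
  }
});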
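
Hand-writing Link header strings is error-prone, so it can help to generate them from a list of critical resources. The helper below is a sketch; `toEarlyHintLinks` is a hypothetical name and the input shape is made up for illustration. It restricts `rel` to preload and preconnect, the two hint types this section describes for 103 responses, and adds `crossorigin` to font preloads.

```javascript
// Format resources as Link header values for a 103 response.
function toEarlyHintLinks(resources) {
  return resources.map(({ href, rel, as, crossorigin }) => {
    if (rel !== 'preload' && rel !== 'preconnect') {
      throw new Error(`Unsupported rel for Early Hints: ${rel}`);
    }
    const parts = [`<${href}>`, `rel=${rel}`];
    if (as) parts.push(`as=${as}`);
    if (crossorigin || as === 'font') parts.push('crossorigin');
    return parts.join('; ');
  });
}

const links = toEarlyHintLinks([
  { href: '/css/main.css', rel: 'preload', as: 'style' },
  { href: '/fonts/body.woff2', rel: 'preload', as: 'font' },
  { href: 'https://api.example.com', rel: 'preconnect' },
]);
console.log(links);
// [
//   '</css/main.css>; rel=preload; as=style',
//   '</fonts/body.woff2>; rel=preload; as=font; crossorigin',
//   '<https://api.example.com>; rel=preconnect'
// ]
```

The resulting array can be passed as the `link` value in a `res.writeEarlyHints({ link: links })` call.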

Nginx configuration:

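# nginx does not turn add_header Link into a 103 response; http2_push_preload
# belonged to HTTP/2 Server Push and was removed in nginx 1.25.1.
# Since nginx 1.29.0, the early_hints directive forwards a 103 sent by the upstream app.
# Sketch only: "backend" is a placeholder upstream; verify against your nginx version.
location / {
  early_hints 1;
  proxy_pass http://backend;
}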

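Browser support: Chrome 103+ and Edge 103+ act on Early Hints, but only for navigation requests served over HTTP/2 or HTTP/3. Firefox and Safari support is newer and more limited, so check current compatibility data before relying on it.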

When to use 103 Early Hints:

Scenario | Use 103? | Why
Server takes > 200ms to generate HTML | ✅ Yes | Browser wastes time waiting, so give it a head start
Pages with known critical resources | ✅ Yes | CSS, fonts, key API endpoints are predictable
Static HTML served from CDN | ❌ No | Response is fast enough, no idle time to exploit
Highly dynamic resource URLs | ❌ No | Server doesn't know which resources to hint
103 Early Hints vs Server Push: Early Hints replaced Server Push as the recommended way to get resources to the browser early. Unlike Server Push (which Chrome no longer supports), Early Hints only tells the browser what to fetch; the browser retains full control and can skip resources it already has cached.



Stale While Revalidate Deep Dive

Concept:
The stale-while-revalidate Cache-Control directive allows the browser to immediately serve a cached (stale) response while simultaneously fetching a fresh copy from the network in the background. The next request gets the updated version.

This is one of the most powerful caching directives because it gives you instant loads (from cache) with eventual freshness (background update).

How it works:

Cache-Control: max-age=600, stale-while-revalidate=3600

Timeline:
┌─────────────────┬──────────────────────────────┬──────────────┐
│  0 – 600s       │  600s – 4200s                │  After 4200s │
│  (max-age)      │  (stale-while-revalidate)    │              │
│                 │                              │              │
│  Serve from     │  Serve stale IMMEDIATELY     │  Must fetch  │
│  cache directly │  + revalidate in background  │  from network│
│  (no network)   │  (user gets instant response)│  (slow)      │
└─────────────────┴──────────────────────────────┴──────────────┘
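
The three windows in the timeline reduce to two comparisons on the age of the cached response. The sketch below uses hypothetical helper names (`parseCacheControl`, `cacheState`) and a deliberately simple header parser that handles only comma-separated `key` or `key=number` directives.

```javascript
// Parse "max-age=600, stale-while-revalidate=3600" into { 'max-age': 600, ... }.
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [key, value] = part.trim().toLowerCase().split('=');
    if (key) directives[key] = value === undefined ? true : Number(value);
  }
  return directives;
}

// Classify a cached response by its age in seconds.
function cacheState(header, ageSeconds) {
  const d = parseCacheControl(header);
  const maxAge = typeof d['max-age'] === 'number' ? d['max-age'] : 0;
  const swr = typeof d['stale-while-revalidate'] === 'number' ? d['stale-while-revalidate'] : 0;
  if (ageSeconds < maxAge) return 'fresh';                        // serve from cache
  if (ageSeconds < maxAge + swr) return 'stale-while-revalidate'; // serve stale + refresh
  return 'expired';                                               // wait for the network
}

const header = 'max-age=600, stale-while-revalidate=3600';
console.log(cacheState(header, 120));  // "fresh"
console.log(cacheState(header, 900));  // "stale-while-revalidate"
console.log(cacheState(header, 5000)); // "expired"
```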

Practical examples:

# Public API responses – fresh for 1 minute, serve stale for up to 1 hour while revalidating
Cache-Control: public, max-age=60, stale-while-revalidate=3600

# Static assets without hashed filenames – fresh for 1 day, serve stale for up to 7 days while revalidating
Cache-Control: public, max-age=86400, stale-while-revalidate=604800

# User profile data – short freshness, but cached response OK briefly
Cache-Control: private, max-age=30, stale-while-revalidate=300

Server configuration (Nginx):

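location /api/ {
  proxy_pass http://backend;            # placeholder upstream
  proxy_cache api_cache;                # placeholder zone, defined via proxy_cache_path (not shown)
  proxy_cache_valid 200 60s;
  proxy_cache_use_stale updating;       # nginx itself serves stale while it refreshes
  proxy_cache_background_update on;
  add_header Cache-Control "public, max-age=60, stale-while-revalidate=3600";
}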

CDN configuration (Cloudflare, Vercel):

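Many CDNs also honor stale-while-revalidate at the edge layer, meaning the CDN serves stale content to users while fetching a fresh copy from your origin in the background. Support and exact semantics vary by provider (some key edge caching off s-maxage rather than max-age), so check your CDN's documentation.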

# Vercel (vercel.json)
{
  "headers": [
    {
      "source": "/api/(.*)",
      "headers": [
        {
          "key": "Cache-Control",
          "value": "public, max-age=60, stale-while-revalidate=3600"
        }
      ]
    }
  ]
}

When to use stale-while-revalidate:

Resource Type | Recommended Value | Why
API responses (feeds, listings) | max-age=60, stale-while-revalidate=3600 | Instant load, background refresh
CMS/blog content | max-age=3600, stale-while-revalidate=86400 | Content changes rarely, instant revisits
User avatars / profile images | max-age=300, stale-while-revalidate=86400 | Rarely change, OK to serve stale briefly
Auth / payment endpoints | Never use | Always needs fresh data
Hashed static assets (main.a1b2c3.js) | Not needed | Use immutable instead, since content never changes per URL

Key insight for interviews: stale-while-revalidate at the HTTP layer follows the same philosophy as the useSWR hook from the SWR library and the staleTime option in TanStack Query: return cached data instantly, refresh in the background.



Key Takeaways

  • Leverage HTTP/2+ multiplexing (many small files are fine, avoid bundling everything)
  • Use resource hints strategically: preconnect for critical origins, preload for critical assets, prefetch for next-page resources
  • Cache aggressively with content-hashed filenames (max-age=31536000, immutable)
  • Enable Brotli compression (Gzip as fallback) – 60-90% smaller text assets
  • Serve static assets from CDN edge locations for minimal latency
  • Split bundles by vendor, route, and component – users download only what they need
  • Use Service Workers for offline support, instant repeat loads, and background sync

Performance Metrics Impact

Optimization LCP FCP CLS TTI TTFB
HTTP/2-3 + + + ++
Resource hints ++ ++ + +
Caching +++ +++ +++ +++
Compression + + + ++
CDN ++ ++ + +++
Code splitting +++
Service Workers +++ +++ +++ +++

+++ = Major impact, ++ = Moderate impact, + = Minor impact



More Details:

Get all articles related to system design
Hashtag: SystemDesignWithZeeshanAli


Git: https://github.com/ZeeshanAli-0704/front-end-system-design
