Muhammad Hamid Raza

How Web Browsers Work — The Phenomenon Behind Every Web Experience

Every time you type a URL and hit Enter, a small miracle unfolds in milliseconds. Most people never think twice about it. Developers should.


1. Introduction — More Than a Window

Open Chrome. Type a URL. Press Enter. A page appears.

Simple, right?

Not even close.

Behind that instant result is one of the most complex pieces of software ever written. A modern web browser is simultaneously a network client, a rendering engine, a JavaScript runtime, a security sandbox, and a mini operating system — all working in perfect coordination, every single time you browse.

The Chromium codebase behind Google Chrome is commonly estimated at over 35 million lines of code. Firefox has been actively developed for more than two decades. These aren't simple apps; they're engineering marvels disguised as everyday tools.

In this post, we're going to pull back the curtain completely. From the moment you hit Enter to the pixel appearing on your screen — this is how browsers actually work.


2. Step One — Finding the Server (DNS Resolution)

Before anything loads, the browser needs to answer one question: Where is this website?

Computers don't understand domain names like hamidrazadev.com. They speak in IP addresses like 104.21.45.78. The system that translates between the two is called DNS — Domain Name System.

Here's the lookup chain:

  1. Browser Cache: Has it visited this domain recently? If yes, use the saved IP.
  2. OS Cache: Ask the operating system's own DNS cache.
  3. Router Cache: Ask the local network router.
  4. DNS Resolver: Ask the configured recursive resolver, usually your internet provider's DNS server or a public one.
  5. Root → TLD → Authoritative Nameserver: On a cache miss, the resolver walks the global DNS hierarchy on the browser's behalf, querying each level in turn until the correct IP is returned.

When a nearby cache has the answer, this takes a few milliseconds or less; a full trip through the hierarchy can take tens to hundreds of milliseconds. Either way, it's the invisible foundation of every page load.
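
The lookup chain above can be sketched as a stack of caches with a slow fallback. This is a minimal illustration, not how any real browser stores DNS data: the dictionaries, the `resolve` function, and the `fake_resolver` stand-in are all hypothetical, and TTL expiry is left out.

```python
from typing import Callable


def resolve(hostname: str,
            caches: list,
            query_resolver: Callable[[str], str]) -> tuple:
    """Return (ip, source), checking each cache layer before the resolver."""
    for level, cache in enumerate(caches):
        ip = cache.get(hostname)
        if ip is not None:
            return ip, f"cache level {level}"
    # Every cache missed: fall through to the (slow) recursive resolver.
    ip = query_resolver(hostname)
    # Populate every layer so the next lookup is answered locally.
    for cache in caches:
        cache[hostname] = ip
    return ip, "resolver"


def fake_resolver(hostname: str) -> str:
    # Stand-in for the real resolver, which would query root, TLD and
    # authoritative nameservers over the network.
    return "104.21.45.78"


# Hypothetical layers, nearest first: browser, OS, router.
browser_cache, os_cache, router_cache = {}, {}, {}
layers = [browser_cache, os_cache, router_cache]

first = resolve("hamidrazadev.com", layers, fake_resolver)
second = resolve("hamidrazadev.com", layers, fake_resolver)
print(first)   # answered by the resolver
print(second)  # answered by the nearest cache
```

The second call never reaches the resolver, which is the whole point of the chain: each layer exists to avoid the slower one behind it.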

Connecting to the Server — TCP Handshake

With the IP address in hand, the browser opens a TCP connection using a three-way handshake (HTTP/3 skips TCP and runs over QUIC on UDP instead):

Client  →  SYN         →  Server   (I want to connect)
Client  ←  SYN-ACK     ←  Server   (Acknowledged, go ahead)
Client  →  ACK         →  Server   (Connection established)

For https:// sites (which is basically everything today), a TLS handshake follows immediately after: the server presents its certificate, the browser verifies the server's identity against its trusted certificate authorities, and both sides agree on keys for an encrypted channel. This is what lets the browser mark the connection as secure in your address bar.

Only after all of this does the browser send the actual request:

GET / HTTP/1.1
Host: hamidrazadev.com
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, deflate, br
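
To make the sequence concrete, here is a rough sketch of the same steps using Python's standard `socket` and `ssl` modules. The helper names `build_request` and `fetch` are made up for this example, and `Accept-Encoding` is deliberately left out so the response would arrive uncompressed. A real browser does far more (connection reuse, HTTP/2, redirects), so treat this as an illustration only.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """Serialize a minimal HTTP/1.1 GET request; every line ends in CRLF."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: text/html,application/xhtml+xml",
        "Connection: close",   # ask the server to close when it is done
        "",                    # the blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/") -> bytes:
    """TCP connect, TLS handshake, send the request, read until close."""
    context = ssl.create_default_context()  # verifies the server certificate
    # create_connection performs the TCP three-way handshake.
    with socket.create_connection((host, 443), timeout=10) as tcp:
        # wrap_socket performs the TLS handshake on top of the TCP connection.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while chunk := tls.recv(4096):
                chunks.append(chunk)
    return b"".join(chunks)


print(build_request("hamidrazadev.com").decode("ascii"))
```

Calling `fetch("hamidrazadev.com")` would need network access, so the example only prints the request bytes.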

3. Step Two — The Server Responds

The server processes the request and sends back an HTTP response. This includes:

  • Status Code: 200 OK, 301 Moved Permanently, 404 Not Found, 500 Internal Server Error
  • Response Headers: Content type, caching rules, security policies, cookies
  • Response Body: The actual HTML markup

One of the most important things to understand here: the browser does not wait for the full HTML to download before it starts working. It reads the response as a stream, processing bytes as they arrive over the network. This is called streaming parsing — and it's the reason you sometimes see a page begin rendering before it's fully loaded.
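
Python's built-in `html.parser.HTMLParser` happens to work the same incremental way, so it makes a handy stand-in for showing the idea. In this sketch the `TagLogger` class and the three chunks are invented for illustration; a browser's parser is far more elaborate, but the feed-as-you-go pattern is the same.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records each start tag as soon as the parser has seen all of it."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)


parser = TagLogger()
# Pretend these chunks arrived from the network one after another.
# Note that chunk boundaries fall in the middle of tags.
chunks = ["<html><bo", "dy><h1>Hi</h1><p", ">Still loading"]
progress = []
for chunk in chunks:
    parser.feed(chunk)                 # parse whatever has arrived so far
    progress.append(list(parser.seen))

for step, tags in enumerate(progress, start=1):
    print(f"after chunk {step}: {tags}")
```

After the first chunk the parser already knows about `<html>`, while the half-received `<bo` waits in a buffer until the rest of the tag shows up.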

Modern browsers also take advantage of HTTP/2 and HTTP/3, which multiplex many resources (HTML, CSS, JS, images) over a single connection at the same time. That is a massive leap over HTTP/1.1, where each connection carries one request at a time and browsers compensate by opening several connections per host.


4. Step Three — Parsing HTML into the DOM

As HTML bytes arrive, the browser's HTML Parser gets to work. Its job is to convert raw markup text into an in-memory tree called the DOM (Document Object Model).

Think of it as a tree:

Document
└── <html>
    ├── <head>
    │   ├── <title>My Site</title>
    │   └── <link rel="stylesheet" href="styles.css">
    └── <body>
        ├── <header>
        │   └── <nav>...</nav>
        ├── <main>
        │   └── <article>...</article>
        └── <footer>...</footer>

Every HTML element becomes a node. Every node except the root Document has a parent, and each has zero or more children and siblings. This tree is what JavaScript manipulates at runtime (adding, removing, and modifying nodes) to create dynamic UIs.
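
Here is a small sketch of that tree-building idea, again leaning on Python's `html.parser`. The `Node` and `TreeBuilder` classes are hypothetical, the list of void elements is partial, and the code assumes well-formed markup; real browsers follow a much longer algorithm with error recovery.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    # Elements that never have children or a closing tag (partial list).
    VOID = {"link", "img", "meta", "br", "input", "hr"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        if tag not in self.VOID:
            self.current = node          # descend into the new element

    def handle_endtag(self, tag):
        # Assumes well-formed input: every end tag closes the current node.
        if self.current.parent is not None:
            self.current = self.current.parent


def dump(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><head><title>My Site</title>"
             "<link rel='stylesheet' href='styles.css'></head>"
             "<body><main><article></article></main></body></html>")
print("\n".join(dump(builder.root)))
```

Each `Node` keeps a reference to its parent and a list of children, which is all a tree walk needs.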

Parser Blocking — The Hidden Performance Trap

Parsing is not always a straight line. Some resources block the parser, some block rendering, and some block nothing:

  • <link rel="stylesheet">: Triggers a CSS file fetch. Parsing continues, but rendering is blocked until the CSS is downloaded and parsed (to avoid a flash of unstyled content), and scripts that come after it wait for it too.
  • <script> (without async/defer): The parser stops completely to download and execute the JavaScript before continuing.
  • <img>, <video>: Non-blocking; parallel fetches are fired off but don't halt the parser.

This is exactly why developers are told to put <script> tags at the bottom of the body, or use defer — to avoid stalling the parser mid-document.
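
As a quick reference, the rules for external scripts can be written as a tiny decision function. The function name and return labels are made up for this sketch, and it only covers scripts with a `src`; it also relies on the fact that `type="module"` scripts are deferred by default.

```python
def script_loading(attrs: dict) -> str:
    """Classify how an external <script> interacts with the HTML parser."""
    if "async" in attrs:
        # Fetched in parallel, executed as soon as it arrives.
        return "async"
    if "defer" in attrs or attrs.get("type") == "module":
        # Fetched in parallel, executed after parsing finishes.
        return "defer"
    # Parser pauses to fetch and execute the script.
    return "blocking"


examples = [
    {"src": "app.js"},
    {"src": "app.js", "defer": ""},
    {"src": "analytics.js", "async": ""},
    {"src": "main.js", "type": "module"},
]
for attrs in examples:
    print(attrs, "->", script_loading(attrs))
```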


5. Step Four — Building the CSSOM

While the HTML parser builds the DOM, a parallel process handles your CSS — constructing the CSSOM (CSS Object Model).

Every style rule, from every source (<style> tags, external stylesheets, inline styles) is parsed and organized into its own tree. The browser then applies the cascade algorithm to determine which styles actually apply to each element, resolving conflicts using:

  1. Origin: Browser defaults vs. developer styles vs. user overrides
  2. Specificity: #id > .class > element selector
  3. Order of appearance: Later rules override earlier ones at equal specificity
  4. Inheritance: When no rule sets a property, some properties (like font-family) inherit from parent to child automatically
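
Specificity and source order are easy to model. The sketch below is a simplification under stated assumptions: it only understands #id, .class, :pseudo-class and element selectors (no attribute selectors, :not(), :is() or pseudo-elements), and both `specificity` and `winner` are hypothetical helper names.

```python
import re


def specificity(selector: str) -> tuple:
    """Return (ids, classes, elements) for a simple selector."""
    ids = len(re.findall(r"#[\w-]+", selector))
    # Classes and single-colon pseudo-classes count in the same bucket.
    classes = len(re.findall(r"\.[\w-]+|(?<!:):[\w-]+", selector))
    # Element names appear at the start or right after a combinator.
    elements = len(re.findall(r"(?:^|[\s>+~])([a-zA-Z][\w-]*)", selector))
    return ids, classes, elements


def winner(rules: list) -> str:
    """Pick the winning value from (selector, value) pairs in source order."""
    # Highest specificity wins; on a tie, the later rule (higher index) wins.
    best = max(enumerate(rules),
               key=lambda item: (specificity(item[1][0]), item[0]))
    return best[1][1]


rules = [("p", "black"), (".intro", "blue"), ("#lead", "red"), (".intro", "green")]
print(specificity("ul.menu li a:hover"))
print(winner(rules))
```

Python compares tuples left to right, which matches how specificity is compared: one ID outweighs any number of classes.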

Once both the DOM and CSSOM are complete, the browser merges them into the Render Tree: a new tree containing only the visible elements, each annotated with its final computed styles.

Elements with display: none are excluded entirely. Elements with visibility: hidden remain in the tree (they still occupy space) but are not painted.
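
The difference between the two properties is easiest to see in code. This sketch uses plain dictionaries as a stand-in for DOM nodes with already-computed styles; the shape of those dictionaries and the `build_render_tree` name are assumptions for illustration, and style inheritance is not modeled.

```python
from typing import Optional


def build_render_tree(node: dict) -> Optional[dict]:
    """Copy a styled DOM subtree, dropping display:none branches."""
    style = node.get("style", {})
    if style.get("display") == "none":
        return None   # the node and its whole subtree are left out
    children = [build_render_tree(child) for child in node.get("children", [])]
    return {
        "tag": node["tag"],
        # visibility:hidden keeps its box in layout but draws nothing
        "painted": style.get("visibility") != "hidden",
        "children": [child for child in children if child is not None],
    }


dom = {
    "tag": "body",
    "children": [
        {"tag": "nav", "style": {"display": "none"}, "children": [{"tag": "a"}]},
        {"tag": "p", "style": {"visibility": "hidden"}},
        {"tag": "h1"},
    ],
}
render_tree = build_render_tree(dom)
print([(child["tag"], child["painted"]) for child in render_tree["children"]])
```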


6. Step Five — The Rendering Pipeline

With the render tree ready, the browser enters the rendering pipeline — the process of turning data into actual pixels. This happens in four distinct stages:

1. Layout (Reflow)

The browser calculates the exact position and dimensions of every element on the page. This involves:

  • Box model calculations (margin, border, padding, content width/height)
  • Flexbox and CSS Grid algorithms
  • Text wrapping and line-breaking
  • Relative and absolute positioning

Layout is computed from the root of the render tree downward. Changing a single element's size can trigger a reflow of its ancestors, descendants, and siblings. This is why layout thrashing (JavaScript that alternates between writing styles and reading layout values, forcing a fresh layout each time) is a serious performance concern.
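
A toy block layout shows why one size change ripples outward. This is a deliberately tiny model, assuming every box is a block that fills the available width and stacks vertically; margin collapsing, inline content, flexbox and grid are all ignored, and the `Box` class is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    height: float = 0.0      # content height (computed for containers)
    margin: float = 0.0
    padding: float = 0.0
    border: float = 0.0
    children: list = field(default_factory=list)
    x: float = 0.0           # position of the border box, set by layout()
    y: float = 0.0
    width: float = 0.0       # width of the border box, set by layout()


def layout(box: Box, x: float, y: float, available_width: float) -> float:
    """Position box and its descendants; return the outer height it uses."""
    edge = box.margin + box.border + box.padding
    box.x, box.y = x + box.margin, y + box.margin
    box.width = available_width - 2 * box.margin
    content_y = y + edge
    if box.children:
        cursor = content_y
        for child in box.children:
            # Each child starts where the previous one ended.
            cursor += layout(child, x + edge, cursor, available_width - 2 * edge)
        box.height = cursor - content_y
    return box.height + 2 * edge


first, second = Box(height=50, margin=5), Box(height=30)
root = Box(padding=10, children=[first, second])

total_before = layout(root, 0, 0, 800)
y_before = second.y

first.height = 100                     # one element grows...
total_after = layout(root, 0, 0, 800)  # ...and everything must be laid out again
print(y_before, "->", second.y, "| page height", total_before, "->", total_after)
```

Growing the first child pushes its sibling down and makes the parent taller, which is the ripple effect a reflow has to handle.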

2. Paint

The browser walks the render tree and records paint operations — drawing text, colors, borders, box shadows, images — layer by layer. This stage produces a list of draw calls, not actual pixels yet.

Modern browsers split the page into multiple layers (think: transparent sheets stacked together). Elements that animate or change frequently are promoted to their own layer so they can be updated independently without repainting the rest of the page.

3. Rasterization

Paint commands are converted to actual pixels — either on the CPU or, more commonly, the GPU. The page is broken into tiles and rasterized in parallel for speed.

4. Composite

The GPU combines all rasterized layers into the final frame displayed on screen. Changes to CSS properties like transform and opacity can usually be handled entirely at this stage when the element has its own layer, which is why they're the gold standard for performant animations: they skip Layout and Paint.
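
One way to keep the four stages straight is to ask where a change enters the pipeline, since everything after that point has to run again. The table below is an illustrative and far from exhaustive assumption, and it presumes the animated element already sits on its own layer.

```python
PIPELINE = ["layout", "paint", "raster", "composite"]

# Earliest stage that a change to each property invalidates (illustrative).
FIRST_STAGE = {
    "width": "layout",
    "margin": "layout",
    "color": "paint",
    "box-shadow": "paint",
    "transform": "composite",
    "opacity": "composite",
}


def stages_rerun(css_property: str) -> list:
    """Return the pipeline stages that must run after the property changes."""
    start = PIPELINE.index(FIRST_STAGE[css_property])
    return PIPELINE[start:]


for prop in ("width", "color", "transform"):
    print(f"{prop}: {' -> '.join(stages_rerun(prop))}")
```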

The full pipeline looks like this:

JavaScript → Style Recalc → Layout → Paint → Composite
     ↑                                              ↓
  DOM/CSSOM Changes                         Frame on Screen

Browsers aim to match the display's refresh rate, typically 60 frames per second. That's one frame every 16.67ms (about 8.33ms on a 120Hz display). Miss that budget, and you get visible jank.
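
The budget is simple arithmetic, and so is the cost of blowing it. In this sketch `missed_frames` is a hypothetical helper that assumes the work starts right at a frame boundary and blocks the main thread the whole time.

```python
import math


def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def missed_frames(work_ms: float, refresh_hz: float = 60.0) -> int:
    """Number of frames skipped while work_ms of blocking work runs."""
    slots_used = math.ceil(work_ms / frame_budget_ms(refresh_hz))
    return max(0, slots_used - 1)


print(round(frame_budget_ms(60), 2))    # budget at 60 Hz
print(round(frame_budget_ms(120), 2))   # budget at 120 Hz
print(missed_frames(10), missed_frames(40))
```

A 40ms task at 60Hz spans three frame slots, so two frames never get drawn.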


7. Step Six — JavaScript Execution

No browser discussion is complete without the JavaScript engine — arguably the most complex component of all.

Different browsers ship different engines:

  • V8 — Chrome, Edge, Node.js
  • SpiderMonkey — Firefox
  • JavaScriptCore (Nitro) — Safari

They all follow a similar pipeline:

Source Code → AST → Bytecode → JIT Machine Code
  1. Parsing: JS source is parsed into an Abstract Syntax Tree (AST)
  2. Bytecode Compilation: The AST is compiled to bytecode, which an interpreter (e.g., V8's Ignition) then executes
  3. JIT Optimization: Frequently-executed ("hot") code paths are recompiled to highly optimized native machine code (e.g., V8's TurboFan)
  4. Deoptimization: If assumptions change (e.g., a variable's type changes), the JIT compiler throws away optimized code and falls back to bytecode
  5. Garbage Collection: Unreachable memory is automatically freed using a generational GC algorithm
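
You can poke at the first two stages of a very similar pipeline without leaving Python, because CPython also goes from source to AST to bytecode. To be clear, this is an analogy: it uses CPython's `ast` and `dis` modules, not V8, and it shows no optimizing JIT tier.

```python
import ast
import dis

source = "def add(a, b):\n    return a + b\n"

tree = ast.parse(source)                    # 1. source code -> AST
code = compile(tree, "<example>", "exec")   # 2. AST -> bytecode
namespace = {}
exec(code, namespace)                       # 3. the interpreter runs the bytecode
add = namespace["add"]

print(type(tree.body[0]).__name__)          # the AST node for the function
dis.dis(add)                                # list the bytecode instructions
print(add(2, 3), add("a", "b"))
```

The last line hints at why JIT compilers have to speculate: the same `add` works on numbers and strings, so optimized machine code for one type has to be thrown away when the other shows up.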

The Event Loop — Why JS Doesn't Freeze

JavaScript is single-threaded — one call stack, one execution context at a time. Yet it handles async operations (network requests, timers, events) without blocking. How?

The Event Loop:

Call Stack → empty? → Check Microtask Queue → Check Macrotask Queue → Push callback → Repeat
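  • Long-running browser operations (fetch, setTimeout, DOM events) are handled by Web APIs implemented natively in the browser, outside the JS thread.
  • When they complete, their callbacks enter a queue: promise reactions go to the microtask queue, while timers and events go to the macrotask queue.
  • The event loop pushes a callback onto the call stack only when the stack is empty, draining all microtasks before it takes the next macrotask.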

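This is why waiting on a network request with async/await or Promises doesn't freeze the UI: the callback is deferred until the result is ready and the call stack is free. CPU-heavy JavaScript still blocks the thread, no matter how it is scheduled.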
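
The queue ordering is easier to trust once you've simulated it. Below is a toy event loop with two queues; the `api` dictionary and its `set_timeout` and `queue_microtask` entries are invented stand-ins for `setTimeout(fn, 0)` and `Promise.resolve().then(fn)`, and timers, rendering and I/O are left out.

```python
from collections import deque


def run_event_loop(script) -> list:
    """Run script as the first task and return the order of log calls."""
    log = []
    macrotasks = deque()
    microtasks = deque()
    api = {
        "log": log.append,
        "set_timeout": macrotasks.append,       # like setTimeout(fn, 0)
        "queue_microtask": microtasks.append,   # like Promise.resolve().then(fn)
    }

    macrotasks.append(script)
    while macrotasks:
        task = macrotasks.popleft()
        task(api)                        # run one task to completion
        while microtasks:                # then drain every pending microtask
            microtasks.popleft()(api)
    return log


def script(api):
    api["log"]("script start")
    api["set_timeout"](lambda api: api["log"]("timeout"))
    api["queue_microtask"](lambda api: api["log"]("promise"))
    api["log"]("script end")


order = run_event_loop(script)
print(order)
```

Even though the timeout was scheduled first, the promise-style callback runs before it, because microtasks are drained as soon as the current task finishes.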


8. The Bigger Picture — Security, Caching & Performance

Process Isolation & The Sandbox

Modern browsers run web content in sandboxed OS processes, typically one per tab or per site. Even if malicious JavaScript runs in one tab, the sandbox is designed to keep it from accessing:

  • Another tab's DOM or data
  • Your file system
  • OS-level resources

This is enforced at the OS level — not just by browser rules. A crashed tab crashes its process, not the entire browser.

Key security mechanisms include:

  • Same-Origin Policy (SOP) — JavaScript on site-a.com cannot read data from site-b.com
  • Content Security Policy (CSP) — Server-defined rules about which scripts and resources are allowed to load
  • HTTPS + HSTS — Enforces encrypted connections and prevents downgrade attacks
  • Cross-Origin Resource Sharing (CORS) — Controlled exceptions to SOP for trusted cross-origin requests
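
The Same-Origin Policy hinges on one comparison: two URLs share an origin only if scheme, host, and port all match. Here is a minimal sketch using `urllib.parse`; the `origin` and `same_origin` helpers are hypothetical, and special cases such as file: URLs or sandboxed iframes are ignored.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    # Fill in the scheme's default port when the URL does not name one.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 0)
    return parts.scheme, parts.hostname or "", port


def same_origin(url_a: str, url_b: str) -> bool:
    return origin(url_a) == origin(url_b)


print(same_origin("https://site-a.com/page", "https://site-a.com:443/api"))
print(same_origin("https://site-a.com", "https://site-b.com"))
print(same_origin("https://site-a.com", "http://site-a.com"))
print(same_origin("https://site-a.com", "https://api.site-a.com"))
```

Note that a subdomain or a switch from https to http is enough to make two URLs cross-origin.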

Caching — Speed Through Memory

Browsers cache aggressively to eliminate redundant work:

Cache Type            Lifetime                    Speed
Memory Cache          Tab session                 Instant
Disk Cache            Controlled by HTTP headers  Fast
Service Worker Cache  Programmatic / Persistent   Flexible

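The Cache-Control header gives servers fine-grained control over how long a resource may be reused, while ETag and Last-Modified let the browser cheaply check whether an expired copy is still valid (a 304 Not Modified response means no re-download).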
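
As a rough idea of the decision a cache makes, here is a sketch that parses Cache-Control and checks freshness. It is a simplification under stated assumptions: only max-age, no-cache and no-store are considered, quoted values containing commas are not handled, and the helper names are made up.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value or None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        key, _, value = part.partition("=")
        directives[key.strip()] = value.strip().strip('"') or None
    return directives


def is_fresh(header: str, age_seconds: float) -> bool:
    """True if a cached copy may be reused without contacting the server."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return False    # must not store, or must revalidate first
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and age_seconds < int(max_age)


print(parse_cache_control("public, max-age=3600"))
print(is_fresh("public, max-age=3600", age_seconds=120))
print(is_fresh("public, max-age=3600", age_seconds=4000))
```

When `is_fresh` returns False for a stored copy, that is the moment ETag or Last-Modified come into play for revalidation.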

Modern Performance Techniques

  • <link rel="preload"> — Fetch critical resources (fonts, hero images) before the parser discovers them
  • <link rel="dns-prefetch"> — Resolve third-party domains early
  • loading="lazy" — Defer off-screen image/iframe loads
  • async / defer on scripts — Avoid parser blocking
  • HTTP/3 + QUIC — Faster connection establishment, especially on lossy mobile networks
  • Brotli Compression — Smaller transfer sizes than gzip for text-based assets

Final Thoughts — Know Your Runtime

Every page load you've ever experienced — every click, every scroll, every animation — was the product of this entire pipeline running in perfect orchestration, usually in under a second.

As a developer, understanding this isn't just trivia. It's leverage.

It's the difference between a 4-second load and a 0.9-second one. Between janky scroll and buttery 60fps. Between an XSS vulnerability and a hardened app. Between a developer who writes code and one who understands the system their code runs in.

The browser is your runtime. Respect it. Understand it. And build for it with intention.
