We use web browsers every day. When we type in a URL, the page appears within seconds, but behind the scenes the browser runs through a long chain of steps, many of them in parallel.
In this article, we’ll break down the flow from DNS resolution to DOM tree construction, rendering, and compositing. We’ll also clearly explain the often-confused difference between the DOM (API specification) and the DOM tree (data structure).
Browser Architecture and the First Communication
A browser is not just an “app that displays pages.” It consists of multiple components working together. The main components are as follows:
User Interface
The part the user directly interacts with:
- Address bar, back/forward buttons, tab switching, bookmark management, etc.
- It doesn’t directly participate in internal processing, but it passes user actions to the browser engine.
Browser Engine
The “conductor” that coordinates the whole system:
- Bridges user actions (URL input, scrolling, etc.) to the respective modules.
- Instructs the rendering engine to “parse the retrieved HTML and display it.”
Networking Module
The “communication clerk” of the browser, responsible for internet communication.
When you enter a URL, the networking module is the first to start working. It proceeds as follows:
- DNS Resolution
  - Extracts the hostname (e.g., example.com) from the URL and converts it into an IP address.
  - If the answer is already cached, it’s used. Otherwise, a resolver queries step by step: root DNS → TLD DNS → authoritative DNS to obtain the final IP.
  👉 The networking module translates the “human-readable address (hostname)” into the “machine-usable address (IP address).”
- TCP/TLS Connection
  - Once the IP address is known, the networking module establishes a TCP connection with the server.
  - If using HTTPS, it also performs a TLS handshake to set up secure, encrypted communication.
- HTTP Request and Response
  - After the connection is established, the networking module sends an HTTP request to the server.
  - The server responds with the HTML. As the browser discovers the CSS, JavaScript, images, and other resources that the HTML references, it requests those as well.
- Hand-off to the Rendering Engine
  - The networking module’s role is to pass the received HTML, CSS, etc., to the rendering engine.
  - From here, DOM tree construction begins.
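The first two steps above (extracting the hostname, then resolving it cache-first) can be sketched in a few lines. This is a toy model rather than how any browser actually implements it: `dnsCache`, `resolveHost`, and `fakeLookup` are hypothetical names, and the lookup returns a fixed documentation-range address instead of querying real DNS servers.

```javascript
// Toy model of the first networking steps: extract the hostname, then
// resolve it cache-first. `dnsCache` and `resolveHost` are hypothetical
// names; a real browser delegates lookups to the OS or a DNS resolver.
const dnsCache = new Map();

// Pull the hostname out of a URL string using the standard URL API.
function extractHostname(urlString) {
  return new URL(urlString).hostname;
}

// Return a cached address if present; otherwise call `lookup` (a stand-in
// for the root → TLD → authoritative query chain) and remember the answer.
function resolveHost(hostname, lookup) {
  if (dnsCache.has(hostname)) {
    return { address: dnsCache.get(hostname), fromCache: true };
  }
  const address = lookup(hostname);
  dnsCache.set(hostname, address);
  return { address, fromCache: false };
}

// Stand-in lookup that returns a fixed documentation address (RFC 5737).
const fakeLookup = () => "192.0.2.1";

const host = extractHostname("https://example.com/articles?id=1");
const first = resolveHost(host, fakeLookup);
const second = resolveHost(host, fakeLookup);
console.log(host, first.fromCache, second.fromCache); // example.com false true
```

The second call is answered from the cache, which is why repeat visits to the same host skip the full query chain.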
Rendering Engine
The core component that parses HTML and CSS, and paints them onto the screen.
- Chrome/Edge → Blink
- Safari → WebKit
- Firefox → Gecko
JavaScript Engine
The component that interprets and executes JavaScript.
- Manipulates the DOM tree through the DOM API.
- Handles event processing and asynchronous communication (e.g., fetch).
- Examples: V8 (Chrome/Edge), SpiderMonkey (Firefox), JavaScriptCore (Safari).
Data Storage
Stores data for users and websites.
- Cookies: Session management, login information.
- LocalStorage / SessionStorage: Key-value storage accessible from JavaScript.
- IndexedDB: A database for handling large-scale data.
- Cache: Stores images and HTML for faster reloads.
Understanding the DOM
“What exactly is the DOM? And how is it different from the DOM tree?”
This is a common source of confusion for many developers.
DOM = API Specification
- DOM (Document Object Model) is an API specification for manipulating documents from programs.
- Methods like document.getElementById() and document.querySelector() in JavaScript are part of the DOM API.
- In other words, DOM is the “system that lets the browser expose the document as objects.” It’s closer to a set of rules (a spec) than an actual data structure.
DOM Tree = Actual Data Structure
- The DOM tree is what the browser (rendering engine) builds by parsing HTML.
- It has a hierarchical tree structure:
  - <html> is the root element
  - <head> and <body> are its children
  - Elements such as <p> sit deeper in the tree, and text nodes form the leaves
DOM Manipulation = DOM Tree Manipulation
- When developers talk about “manipulating the DOM,” what they’re really doing is using the DOM API to modify the DOM tree.
- Example:
document.body.appendChild(document.createElement("div"));
This creates a new node and attaches it to the tree.
👉 The key takeaway: DOM is the concept (API spec), while the DOM tree is the actual data structure.
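To make the distinction concrete, here is a minimal sketch of a tree that is separate from the method used to change it. `ToyNode` is a hypothetical class written for illustration. Real DOM nodes are implemented natively by the browser, and the real appendChild does more (for example, it first detaches a node that already has a parent).

```javascript
// Toy stand-in for a DOM tree node. `ToyNode` is a hypothetical class for
// illustration; real browsers implement nodes natively per the DOM spec.
class ToyNode {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.children = [];
  }

  // The "API": a method that mutates the underlying tree structure.
  appendChild(child) {
    child.parent = this;
    this.children.push(child);
    return child;
  }
}

// Build html > body, then attach a div the way appendChild would.
const html = new ToyNode("html");
const body = html.appendChild(new ToyNode("body"));
const div = body.appendChild(new ToyNode("div"));

console.log(div.parent.name);      // body
console.log(html.children.length); // 1
```

The `parent` and `children` fields are the data structure. The `appendChild` method is the API that modifies it.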
DOM Tree Construction Process
So how is the DOM tree actually built?
The rendering engine processes the received HTML as follows:
- Receive HTML
  - HTML file data arrives from the networking module.
- Tokenization
  - Breaks HTML into tokens (building blocks).
  - Example: <p>Hello</p> → start tag <p>, text Hello, end tag </p>.
- Parsing
  - Analyzes tokens according to grammar rules, generating corresponding objects.
- Node Creation
  - Tag → Element node
  - Text → Text node
  - Comment → Comment node
- Tree Construction
  - Connects nodes into a parent-child hierarchy to complete the DOM tree.
  - When the HTML has been fully parsed and the DOM tree is ready, the DOMContentLoaded event fires.
  - At this point, developers can safely manipulate the DOM with JavaScript.
👉 Note: The DOM tree is built incrementally as the HTML stream arrives, not all at once.
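The tokenization and tree construction steps can be sketched with a toy parser. This is a deliberately simplified illustration, not the HTML parsing algorithm browsers follow: it assumes plain tags and text only, with no attributes, comments, void elements, or error recovery, and `tokenize` and `buildTree` are hypothetical helper names.

```javascript
// Toy tokenizer and tree builder. Handles only plain tags and text:
// no attributes, comments, void elements, or error recovery, all of
// which the real HTML parsing algorithm must deal with.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)>|([^<]+)/gi;
  for (const match of html.matchAll(pattern)) {
    if (match[1]) tokens.push({ type: "endTag", name: match[1] });
    else if (match[2]) tokens.push({ type: "startTag", name: match[2] });
    else if (match[3].trim()) tokens.push({ type: "text", value: match[3] });
  }
  return tokens;
}

// Turn tokens into a tree using a stack of currently open elements.
function buildTree(tokens) {
  const root = { type: "document", children: [] };
  const open = [root];
  for (const token of tokens) {
    const parent = open[open.length - 1];
    if (token.type === "startTag") {
      const element = { type: "element", name: token.name, children: [] };
      parent.children.push(element);
      open.push(element); // later tokens become children of this element
    } else if (token.type === "endTag") {
      if (open.length > 1) open.pop(); // close the current element
    } else {
      parent.children.push({ type: "text", value: token.value });
    }
  }
  return root;
}

const tokens = tokenize("<p>Hello</p>");
const tree = buildTree(tokens);
console.log(tokens.map((t) => t.type).join(", ")); // startTag, text, endTag
console.log(tree.children[0].name);                // p
```

Because `buildTree` consumes tokens one at a time, the same loop could run as chunks of HTML arrive, which mirrors the incremental construction noted above.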
Rendering Engine and the DOM
Now that we understand what the DOM is and how the DOM tree is built, let’s look at its relationship with the rendering engine.
Rendering Engine Responsibilities
- Parse HTML to build the DOM tree.
- Parse CSS to build the CSSOM tree.
- Combine them to create the render tree.
- Perform layout, paint, and compositing to display the page.
Relationship with the DOM
- The DOM tree is a product of the rendering engine.
- Developers don’t directly control the rendering engine; instead, they manipulate the DOM tree via the DOM API.
- Changes to the DOM tree trigger the rendering engine to update the screen.
👉 Think of it like this:
- Rendering Engine = Factory
- DOM Tree = Product made by the factory
- DOM API = Remote control to operate the product
Rendering Pipeline Overview
A DOM tree alone isn’t enough to display the page. With CSS added, the rendering engine proceeds as follows:
- Build the DOM tree
- Build the CSSOM tree
- Generate the Render tree (DOM + CSSOM, excluding invisible elements)
- Layout: Calculate positions and sizes
- Paint: Convert styles into pixel instructions
- Composite: Merge multiple layers on the GPU and display the final result
👉 This is the browser’s rendering pipeline.
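The render tree step (DOM + CSSOM, minus invisible elements) can be sketched as a recursive filter. This is a simplified illustration under stated assumptions: `buildRenderTree` is a hypothetical function, and the `styles` object stands in for the CSSOM by mapping element names directly to computed styles, whereas real engines match selectors and resolve the cascade.

```javascript
// Toy render-tree builder. `styles` plays the role of the CSSOM here: a
// hypothetical map from element name to computed style. Real engines
// match selectors and resolve the cascade instead.
function buildRenderTree(node, styles) {
  const style = styles[node.name] || {};
  if (style.display === "none") return null; // dropped with its subtree
  const children = (node.children || [])
    .map((child) => buildRenderTree(child, styles))
    .filter((child) => child !== null);
  return { name: node.name, style, children };
}

const domTree = {
  name: "body",
  children: [
    { name: "h1", children: [] },
    { name: "script", children: [] },
    { name: "p", children: [] },
  ],
};
const styles = {
  script: { display: "none" },
  p: { color: "gray" },
};

const renderTree = buildRenderTree(domTree, styles);
console.log(renderTree.children.map((c) => c.name).join(", ")); // h1, p
```

The script element is still present in the DOM tree. It is only absent from the render tree, which is why layout and paint never see it.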
Compositing in Detail
Compositing is the final stage of rendering.
Layering System
- The page isn’t drawn on a single canvas; it’s split into multiple layers.
- Example:
- Background → Layer 1
- Body text → Layer 2
- Fixed header → Layer 3
- CSS animation element → Layer 4
Why Compositing Matters
- Redrawing the entire page is expensive.
- By managing layers, only the parts that change need to be updated, then everything is merged at the end.
- Crucial for smooth scrolling and animations.
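The idea that only changed layers are redrawn before everything is merged can be modelled with a dirty flag per layer. This is a toy sketch: `Layer` and `compositeFrame` are hypothetical names, and the strings stand in for painted bitmaps. Real engines rasterize tiles and composite them on the GPU.

```javascript
// Toy compositor. Each layer caches its painted output and is repainted
// only when marked dirty. Layer names and the string "bitmaps" are
// placeholders; real engines rasterize tiles and composite on the GPU.
class Layer {
  constructor(name) {
    this.name = name;
    this.dirty = true; // nothing painted yet
    this.paintCount = 0;
    this.bitmap = null;
  }

  paintIfNeeded() {
    if (!this.dirty) return;
    this.paintCount += 1;
    this.bitmap = `${this.name}#${this.paintCount}`;
    this.dirty = false;
  }
}

// Repaint dirty layers, then merge all cached bitmaps in stacking order.
function compositeFrame(layers) {
  layers.forEach((layer) => layer.paintIfNeeded());
  return layers.map((layer) => layer.bitmap).join(" | ");
}

const layers = ["background", "text", "header", "animation"].map(
  (name) => new Layer(name)
);

compositeFrame(layers);      // first frame paints every layer once
layers[3].dirty = true;      // only the animated element changes
const frame = compositeFrame(layers);

console.log(frame);
// background#1 | text#1 | header#1 | animation#2
```

Three of the four layers are reused from the previous frame, which is the saving that layering is meant to provide.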
Developer Optimization Tips
- Use transform and opacity to trigger “composite-only” updates (no reflow/paint).
- Use will-change in CSS to pre-promote elements to their own layers.
- But don’t overuse it: too many layers consume memory.
👉 Understanding compositing is key to smooth animations and performance.
JavaScript and Rendering
JavaScript DOM manipulations directly affect rendering.
Reflow (Layout recalculation)
- Happens when element position/size changes.
- Forces the browser to recalculate layout for the affected part of the tree, which in the worst case is the entire page.
- Example:
element.style.width = "500px";
Repaint (Redraw)
- Happens when only visual properties (color, background, etc.) change.
- No layout recalculation, but pixels are repainted.
- Example:
element.style.backgroundColor = "red";
Composite-only
- transform and opacity changes don’t require reflow or repaint.
- Handled in the compositing stage only → fastest.
👉 Key takeaway:
- Know which operations cause reflow/repaint.
- Prefer composite-only operations whenever possible.
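The three categories can be summarized as a lookup from style property to the pipeline stages it re-runs. The table below is an illustrative assumption covering only the properties mentioned in this article. Actual behavior varies by engine, and transform and opacity skip paint only in favorable cases, such as when the element already has its own layer. `STAGES_BY_PROPERTY` and `classifyChange` are hypothetical names.

```javascript
// Simplified lookup of which pipeline stages a style change re-runs.
// The table is an illustrative assumption: real behavior varies by
// engine, and transform/opacity skip paint only in favorable cases.
const STAGES_BY_PROPERTY = {
  width: ["layout", "paint", "composite"],
  height: ["layout", "paint", "composite"],
  backgroundColor: ["paint", "composite"],
  color: ["paint", "composite"],
  transform: ["composite"],
  opacity: ["composite"],
};

// Map a property to the category name this article uses for it.
function classifyChange(property) {
  const stages = STAGES_BY_PROPERTY[property];
  if (!stages) return "unknown";
  if (stages.includes("layout")) return "reflow";
  if (stages.includes("paint")) return "repaint";
  return "composite-only";
}

console.log(classifyChange("width"));           // reflow
console.log(classifyChange("backgroundColor")); // repaint
console.log(classifyChange("transform"));       // composite-only
```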
Performance Optimization Tips
- Optimize JavaScript loading
  - Use defer / async so scripts don’t block HTML parsing and rendering.
- Lazy load images and videos
  - Use loading="lazy" (supported on <img> and <iframe>) to defer offscreen resources until they’re needed.
- Batch DOM operations
  - Modify elements in bulk instead of one by one.
  - Use DocumentFragment or batch updates.
- Use transform/opacity for animations
  - Avoid triggering layout and paint.
- Avoid unnecessary layers
  - Don’t overuse will-change.
👉 The golden rule: Don’t break the rendering pipeline.
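As one way to batch DOM operations, the sketch below builds every list item inside a DocumentFragment and leaves a single appendChild call to touch the live tree. `buildItems` is a hypothetical helper. It takes the document as a parameter (named `doc` here) rather than relying on the global, and the usage in the comment assumes the page contains a ul element.

```javascript
// Build all list items off-tree in a DocumentFragment, then attach them
// with a single appendChild call. `buildItems` is a hypothetical helper;
// `doc` is expected to be the page's `document` (passed in as a parameter
// so the function does not depend on a global).
function buildItems(doc, labels) {
  const fragment = doc.createDocumentFragment();
  for (const label of labels) {
    const item = doc.createElement("li");
    item.textContent = label;
    fragment.appendChild(item); // not in the live tree yet
  }
  return fragment;
}

// In a page:
//   const list = document.querySelector("ul");
//   list.appendChild(buildItems(document, ["One", "Two", "Three"]));
```

Appending the fragment moves all of its children into the list in one operation, so the live tree is modified once instead of once per item.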
Conclusion
- The DOM is the “API specification,” while the DOM tree is the “actual structure built by the browser.”
- The rendering engine parses HTML → builds DOM tree → builds CSSOM → generates render tree → layout → paint → composite → display.
- Compositing is crucial for performance.
- By understanding how DOM operations affect rendering, developers can build faster, smoother web experiences.
Next time you open a web page, remember: behind the scenes, the entire flow from DNS resolution to compositing is happening. With that in mind, the browser will feel much less like a black box.