Ritam Saha

Posted on Feb 1

How a Browser Works: A Beginner-Friendly Guide to Browser Internals

#webdev #beginners #browser #systemdesign

Introduction: A Simple Action, a Complex Story

You type a URL.
You press Enter.
A webpage appears.

It feels instant. Effortless.

Nothing special, right?

But under the hood, that single key press triggers a carefully coordinated chain of events involving networking, parsing, layout calculations, and rendering logic, usually all within a fraction of a second.

Most beginners assume a browser’s job is simple: “It opens websites.”

That’s not wrong — but it’s wildly incomplete.

A browser is not just a viewer. It’s a system, in some ways comparable to a small operating system: it fetches files, understands languages like HTML and CSS, converts text into structured data, calculates layouts, and finally paints pixels onto your screen.

In this guide, we’ll slow down that moment.

We won’t dive into specifications.
We won’t memorize components.
We’ll follow a story — from typing a URL to seeing the pixels.

When you press Enter, the browser doesn’t jump straight to drawing the page.
It follows a pipeline — a sequence of steps — where each stage prepares data for the next.

Let’s break that system down, and by the end, browsers will feel less like magic and more like well-designed machines doing exactly what they’re supposed to do.

What Is a Browser, Really?

At a high level, a browser is a platform that does four core jobs:

Provides a user interface (tabs, address bar, buttons)
Fetches resources from the network
Understands web languages (HTML, CSS, JavaScript)
Converts those instructions into pixels on your screen

Think of a browser like a restaurant:

The menu and tables are what you interact with.
The kitchen processes raw ingredients.
The chefs follow instructions.
The final dish is what gets served.

The important takeaway:
A browser is not one monolithic program. It is a collection of cooperating components that together translate HTML, CSS, and JavaScript into what you see on the screen.

Main Parts of a Browser

Here are the major players inside a browser:

User Interface

Everything you see in the browser, except the area where the web page is rendered, is the browser’s UI. The address bar, tabs, refresh button, and back button are all part of the browser’s control panel.

What’s important to understand is this:

The UI doesn’t render websites.

It simply accepts your input and hands off the real work to other components.

When you type a URL and press Enter, the UI says:

“Here’s the request. Handle it.”

Browser engine

The browser engine is the brain of the browser. It manages:

Navigation: going back and forward, and reloading
Communication: passing actions between the UI and the rendering engine, and coordinating with disk storage
Session management: keeping track of sessions (from start to end) and of state across different tabs

Examples: Trident (Internet Explorer), EdgeHTML (legacy Edge), Blink (Chrome, Opera, Brave, and current Edge), WebKit (Apple’s Safari), and Gecko (Firefox). In practice these names usually cover the browser engine and the rendering engine together.

Rendering engine

The rendering engine is responsible for making HTML content visible.
It does the heaviest work: parsing HTML and CSS into the DOM and CSSOM, building the render tree, and running layout (reflow) and paint.

It also responds to commands from the browser engine, such as reload.

Networking

Once the browser engine receives your URL, it asks the networking layer to fetch resources. The networking layer handles DNS resolution and requests data from the server over HTTP or HTTPS.

The browser then requests:

HTML files
CSS files
Images, fonts, and more

At this stage, the browser doesn’t care how anything looks.
It’s just collecting raw materials.
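To make the networking step a little more concrete, here is a minimal sketch of the first thing that has to happen: splitting the URL into the pieces needed for the DNS lookup and the HTTP request. It uses Python's standard `urllib.parse`. The function name `describe_url`, the example URL, and the dictionary layout are assumptions for illustration, not how any real browser does it.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes mentioned above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a networking layer would need."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # which protocol to speak
        "host": parts.hostname,    # the name to resolve through DNS
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",  # the resource to request
    }


info = describe_url("https://example.com/blog/post.html")
print(info)
```

The host is what gets handed to DNS, while the scheme decides whether the request goes out over plain HTTP or encrypted HTTPS.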

JS Interpreter

This is the layer responsible for parsing and executing JavaScript code. Some well-known JS engines are V8 (used by Chrome and Node.js), SpiderMonkey (developed by Mozilla and used in Firefox), JavaScriptCore (used by Apple’s Safari and by Bun), and Chakra (used by legacy Microsoft Edge).

UI backend

The UI backend does the actual drawing. It uses the operating system’s graphics and user interface facilities to light up pixels on the screen and to draw basic controls such as buttons, checkboxes, and windows.

The rendering engine computes the layout (reflow), then the paint step produces drawing instructions and hands them to the UI backend.

Disk API

It is used to save data such as the browser cache, cookies, local storage, and session storage.
Data that needs to persist is written to the browser’s own folder on your computer.

That is the high-level overview of what each browser component is responsible for.

How Does a Browser Actually Work?

Now, let’s take a deeper look at how the components work together to render a website. I will start again with the same question: what happens when we type a URL into the address bar and press Enter?

After you enter a URL and press Enter:

The browser engine asks the networking layer to resolve the domain name through DNS.
The networking layer finds the server’s IP address and opens a connection, typically with a TCP handshake (plus a TLS handshake for HTTPS). Once connected, it requests the data using HTTP or HTTPS.
As soon as the first chunk of the response is received, the networking layer passes it to the rendering engine, so parsing can start before the whole file has downloaded.
The rendering engine feeds the HTML to a tokenizer, which converts the characters into tokens, and the parser then builds the document from those tokens.
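The idea that parsing starts as soon as the first data arrives can be sketched with Python's standard `html.parser`, which accepts input piece by piece. The `chunks` list stands in for data arriving from the network, and the class name `StartTagLogger` is made up for this example. A real browser's parser is far more involved.

```python
from html.parser import HTMLParser


class StartTagLogger(HTMLParser):
    """Records start tags as soon as the parser recognizes them."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)


# Pretend these chunks arrive from the network one after another.
chunks = ["<html><body><h1>Hi", "</h1><p>Some te", "xt</p></body></html>"]

parser = StartTagLogger()
progress = []
for chunk in chunks:
    parser.feed(chunk)  # parse whatever has arrived so far
    progress.append(list(parser.seen))
parser.close()

for step, tags in enumerate(progress, start=1):
    print(f"after chunk {step}: {tags}")
```

After the first chunk the parser already knows about `html`, `body`, and `h1`, even though the rest of the page has not arrived yet.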

Now it’s clear how data reaches the browser after you enter the URL. Next, we’ll dig into how the HTML and CSS are parsed, how reflow happens, and ultimately how the display ends up in front of us.

HTML Parsing: From Text to DOM

HTML arrives as plain text.
The browser doesn’t see “headings” or “sections” yet — it sees characters.

So it parses the HTML:

Reads it token by token
Understands tags and nesting
Builds a structured representation called the DOM
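Here is a toy sketch of the "characters to tokens" step. It is only an illustration under heavy assumptions: real tokenizers follow the state machine defined in the HTML specification and handle attributes, comments, and broken markup, none of which this regular expression attempts. The names `TOKEN_RE` and `tokenize` are invented for the example.

```python
import re

# Matches either a tag (<...>) or a run of text between tags.
TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


def tokenize(html: str) -> list:
    """Turn raw HTML text into a flat list of (kind, value) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(html):
        closing, tag, text = match.groups()
        if tag:
            kind = "end" if closing else "start"
            tokens.append((kind, tag.lower()))
        elif text.strip():  # skip whitespace-only text
            tokens.append(("text", text.strip()))
    return tokens


tokens = tokenize("<div><h1>Title</h1><p>Hello</p></div>")
for token in tokens:
    print(token)
```

Notice that the output is still flat. Nesting only appears in the next step, when the tokens are assembled into a tree.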

What Is the DOM?

The Document Object Model (DOM) is a tree-like structure where:

Each HTML element becomes a node
Parent-child relationships are preserved

Think of the DOM as a family tree of elements.
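The family tree idea can be sketched in a few lines. This version leans on Python's `html.parser` for the tokens and keeps a stack of open elements to work out who is whose parent. The `Node`, `DomBuilder`, and `dump` names are hypothetical, and the sketch assumes well-formed markup with no void elements such as `<br>`, which a real parser has to handle.

```python
from html.parser import HTMLParser


class Node:
    """One DOM node: an element with a tag, or a text node."""

    def __init__(self, tag, text=""):
        self.tag = tag
        self.text = text
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


class DomBuilder(HTMLParser):
    """Builds a Node tree, using a stack to track the open elements."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # The new element becomes a child of the innermost open element.
        self.stack.append(self.stack[-1].append(Node(tag)))

    def handle_endtag(self, tag):
        # Naive: assumes every end tag closes the innermost open element.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].append(Node("#text", data.strip()))


def dump(node, depth=0):
    """Return the tree as indented lines, one per node."""
    label = node.text if node.tag == "#text" else node.tag
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


builder = DomBuilder()
builder.feed("<body><h1>Title</h1><p>Hello</p></body>")
builder.close()
print("\n".join(dump(builder.root)))
```

The indentation in the printed output mirrors the nesting of the original HTML, which is exactly the parent-child relationship the DOM preserves.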

CSS Parsing: From Styles to CSSOM

CSS also starts as text.
The browser parses CSS into another structure called the CSSOM (CSS Object Model).
This structure stores:

Selectors
Properties
Values
Cascading and inheritance rules

Important note:

The CSSOM doesn’t decide positions — it only defines how things should look.
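As a rough picture of what "parsing CSS into a structure" means, the sketch below turns a small stylesheet into a list of selectors with their property and value pairs. It only understands flat rules with no nesting, comments, or at-rules, and the names `RULE_RE` and `parse_css` are made up for the example. A real CSSOM also tracks specificity and the cascade, which this leaves out.

```python
import re

# One rule: a selector, then declarations inside curly braces.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(css: str) -> list:
    """Parse flat CSS rules into (selector, {property: value}) pairs."""
    rules = []
    for selector, body in RULE_RE.findall(css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append((selector.strip(), declarations))
    return rules


css = """
p  { color: gray; margin: 8px; }
h1 { color: black; font-size: 32px; }
"""
rules = parse_css(css)
for selector, declarations in rules:
    print(selector, declarations)
```

Nothing here says where the paragraph or heading will sit on the page. The structure only records how they should look.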

Content Sink / DOM mutation observer

You can think of the content sink (a term from Gecko, Firefox’s engine) as a collector and dispatcher. As DOM nodes are parsed, it gets a reference to each new node and sends it to the rendering engine to build the render tree.
The content sink behaves a lot like a DOM mutation observer: when the DOM is updated, it notices the change and the affected part of the render tree is built again. It is not the same thing as the MutationObserver API that page scripts can use.

DOM + CSSOM → Render Tree

Now the browser has:

Structure (DOM)
Styling rules (CSSOM)

It merges them into the Render Tree / Visual Tree. The render tree:

Contains only visible elements
Includes computed styles
Is optimized for layout and painting

You can think of it as:

The DOM dressed in CSS, ready to be drawn.
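A small sketch can show the two key behaviours of this merge: elements that are not displayed are dropped, and some styles flow down from parent to child. Everything here is simplified and hypothetical: the `STYLES` table stands in for an already parsed stylesheet that matches on tag names only, and `INHERITED` lists just one inherited property.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    tag: str
    children: list = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    style: dict
    children: list = field(default_factory=list)


# Hypothetical stylesheet, already parsed: tag selector -> declarations.
STYLES = {
    "body": {"color": "black"},
    "h1": {"font-size": "32px"},
    "aside": {"display": "none"},
}
INHERITED = {"color"}  # properties a child picks up from its parent


def build_render_tree(element, parent_style=None):
    """Return a RenderNode for element, or None if it is not displayed."""
    parent_style = parent_style or {}
    # Start from inherited values, then apply the element's own rules.
    style = {k: v for k, v in parent_style.items() if k in INHERITED}
    style.update(STYLES.get(element.tag, {}))
    if style.get("display") == "none":
        return None  # the whole subtree is left out of the render tree
    node = RenderNode(element.tag, style)
    for child in element.children:
        rendered = build_render_tree(child, style)
        if rendered is not None:
            node.children.append(rendered)
    return node


dom = Element("body", [Element("h1"), Element("aside", [Element("p")]), Element("p")])
tree = build_render_tree(dom)
print([child.tag for child in tree.children])
print(tree.children[0].style)
```

The `aside` and everything inside it are still in the DOM, but they never make it into the render tree, while the `h1` ends up with both its own font size and the color inherited from `body`.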

That completes an important checkpoint: the render tree exists. In the final stretch, the browser takes the render tree nodes and gets them onto the screen. Here are the remaining steps:

Frame Construction

After the render tree is created, the browser constructs frames. Frames do not calculate the final size or position of elements. Instead, they define how each render tree node participates in layout—whether it behaves as a block, inline, flex item, or grid item, and how it interacts with surrounding elements.

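These layout frames are not the same thing as animation frames. Separately, to keep scrolling, animations, and interactions feeling smooth rather than static, the browser aims to draw an updated picture up to 60 times per second on a typical 60 Hz display, redoing only the stages that a change actually affects.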
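The frame construction idea described above, deciding how a node takes part in layout without computing any sizes, might look something like this. The `DEFAULT_DISPLAY` table and the `frame_role` function are assumptions for illustration. Real browsers take default display values from their built-in stylesheet and support many more display types.

```python
# Hypothetical defaults for a few tags; real browsers take these from
# their built-in user-agent stylesheet.
DEFAULT_DISPLAY = {
    "div": "block",
    "p": "block",
    "h1": "block",
    "span": "inline",
    "a": "inline",
}


def frame_role(tag, style, parent_display="block"):
    """Say how a node participates in layout, without computing geometry."""
    if parent_display == "flex":
        return "flex item"
    if parent_display == "grid":
        return "grid item"
    # Otherwise the node's own display value decides: block or inline.
    return style.get("display") or DEFAULT_DISPLAY.get(tag, "inline")


roles = [
    frame_role("p", {}),
    frame_role("span", {}),
    frame_role("div", {}, parent_display="flex"),
    frame_role("span", {"display": "block"}),
]
print(roles)
```

No widths or positions appear anywhere in this step. Those come next, during reflow.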

Reflow (Frame Tree)

This is the step where the browser calculates each element’s width, height, and position (x and y coordinates), taking margin, padding, and so on into account. It is one of the most computationally expensive steps in the whole process.

It takes the frames and the viewport size as input and assigns geometry to each frame. In simple terms, it creates the layout of the website.
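Here is a minimal sketch of that calculation for the simplest case: block elements stacked from top to bottom. It assumes each frame's content height is already known and that margins are the same on all sides, and it ignores margin collapsing, which real CSS layout applies between neighbouring blocks. The `Frame` class and `layout_blocks` function are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    content_height: int  # assumed to be known already, in pixels
    margin: int = 0
    padding: int = 0
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0


def layout_blocks(frames, viewport_width):
    """Stack block frames top to bottom and fill in their geometry."""
    cursor_y = 0
    for frame in frames:
        frame.x = frame.margin
        frame.y = cursor_y + frame.margin
        frame.width = viewport_width - 2 * frame.margin
        frame.height = frame.content_height + 2 * frame.padding
        # The next frame starts below this one, after its bottom margin.
        cursor_y = frame.y + frame.height + frame.margin
    return cursor_y  # total height of the laid-out content


frames = [
    Frame("h1", content_height=40, margin=10),
    Frame("p", content_height=20, margin=10, padding=5),
]
total = layout_blocks(frames, viewport_width=800)
for frame in frames:
    print(frame.name, frame.x, frame.y, frame.width, frame.height)
print("total height:", total)
```

Change `viewport_width` and every width has to be recomputed, which hints at why resizing a window triggers reflow.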

Paint

This is the second-to-last step of rendering the website. It translates the laid-out frames into paint instructions. In simple words, it creates the drawing instructions for the UI backend.
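One way to picture paint instructions is as an ordered list of drawing commands. In the sketch below, the box dictionaries, the command names `fill_rect` and `draw_text`, and the `paint` function are all hypothetical. The point is only that paint produces instructions and does not touch pixels itself.

```python
def paint(boxes):
    """Turn laid-out boxes into an ordered list of drawing commands."""
    display_list = []
    for box in boxes:
        # Backgrounds go first so that text is drawn on top of them.
        if box.get("background"):
            display_list.append(
                ("fill_rect", box["x"], box["y"], box["width"], box["height"], box["background"])
            )
        if box.get("text"):
            display_list.append(("draw_text", box["x"], box["y"], box["text"]))
    return display_list


boxes = [
    {"x": 0, "y": 0, "width": 800, "height": 60, "background": "navy", "text": "Title"},
    {"x": 0, "y": 60, "width": 800, "height": 30, "text": "Hello"},
]
commands = paint(boxes)
for command in commands:
    print(command)
```

The order of the commands matters: swapping the background and the text of the first box would paint the rectangle over the title.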

UI Backend

Finally, the UI backend comes into the picture. It talks to your OS graphics API to light up the pixels on the screen, turning the paint instructions into what you actually see.
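To round off the pipeline, here is a toy stand-in for that last step. Instead of calling a real graphics API, it executes rectangle commands on a tiny grid of characters, where each character plays the role of a pixel. The command format and the `rasterize` function are assumptions made for this example.

```python
def rasterize(commands, width, height):
    """Execute fill_rect commands on a small grid of 'pixels'."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for name, x, y, w, h, mark in commands:
        if name != "fill_rect":
            continue
        # Clip the rectangle so it never draws outside the grid.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                grid[row][col] = mark
    return ["".join(row) for row in grid]


commands = [("fill_rect", 0, 0, 8, 1, "#"), ("fill_rect", 1, 1, 3, 2, "o")]
rows = rasterize(commands, width=8, height=4)
print("\n".join(rows))
```

A real UI backend does the same kind of work at a much larger scale, usually with help from the GPU.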

Conclusion: Browsers Aren’t Magic — They’re Engineering

Browsers feel effortless because they’re incredibly well designed.

They hide complexity behind clean interfaces, parallel processes, and optimized pipelines. What looks like “instant loading” is actually thousands of small, precise decisions executed at high speed.

As a beginner, you don’t need to remember every component or step.
What matters is understanding that: