Ayush Maurya

Posted on Jul 6

How Browsers Parse and Render HTML: From Request to Paint

#javascript #html #webdev #programming

This all started today when I was trying something like this:

// Read, write, read, write: the first write invalidates layout before the second read.
const containerHeight = innerDivHeight.scrollHeight + 'px';
container.style.height = containerHeight;

const container2Height = innerDiv2Height.scrollHeight + 'px';
container2.style.height = container2Height;

Yeah, not the best way to go about it.

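Just out of curiosity, I started digging into how the browser handles such changes and realized that even small DOM mutations can trigger a reflow or repaint. Here the problem is the order: writing a style invalidates the layout, and reading scrollHeight right after that forces the browser to recalculate the layout immediately so it can return an up-to-date value.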

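So, instead of interleaving reads and writes, you should ideally do all the reads first and then all the writes:

// Reads first: the layout is still valid, so nothing has to be recalculated.
const containerHeight = innerDivHeight.scrollHeight + 'px';
const container2Height = innerDiv2Height.scrollHeight + 'px';

// Writes last: the browser can handle both in a single layout pass.
container.style.height = containerHeight;
container2.style.height = container2Height;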
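To make the difference concrete, here is a minimal sketch that counts layout passes in a toy model. FakeLayoutEngine, runInterleaved, and runBatched are names I made up for illustration; they are not browser APIs, and a real engine is far more nuanced about what invalidates layout.

```javascript
// Toy model of a layout engine. Hypothetical, NOT a real browser API.
class FakeLayoutEngine {
  constructor() {
    this.dirty = false;    // true once a style write has invalidated the layout
    this.layoutCount = 0;  // how many times layout had to be recomputed
  }
  writeStyle() {
    this.dirty = true;     // a write only marks the layout as stale
  }
  readGeometry() {
    this.flush();          // a read needs fresh numbers, so stale layout is recomputed now
  }
  flush() {
    if (this.dirty) {
      this.layoutCount += 1;
      this.dirty = false;
    }
  }
}

// read, write, read, write, ... (like the first snippet)
function runInterleaved(n) {
  const engine = new FakeLayoutEngine();
  for (let i = 0; i < n; i++) {
    engine.readGeometry();
    engine.writeStyle();
  }
  engine.flush(); // layout for the next frame
  return engine.layoutCount;
}

// all reads, then all writes (like the second snippet)
function runBatched(n) {
  const engine = new FakeLayoutEngine();
  for (let i = 0; i < n; i++) engine.readGeometry();
  for (let i = 0; i < n; i++) engine.writeStyle();
  engine.flush(); // layout for the next frame
  return engine.layoutCount;
}
```

In this model, runInterleaved(100) performs 100 layout passes while runBatched(100) performs just 1. With only two elements you will never notice the difference, but in a loop over many elements it adds up.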

How HTML Works in the Browser

It all begins when the browser requests the HTML file from the server. As soon as the first bytes arrive, the browser doesn’t wait for the rest: it begins parsing immediately using a streaming parser.
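Here is a rough sketch of what "streaming" means in practice: the scanner below is fed one chunk at a time and reports every tag as soon as it is complete, keeping any unfinished piece in a buffer. StreamingScanner is a hypothetical class for illustration only and has nothing to do with how a real engine is written.

```javascript
// Hypothetical toy scanner: collects complete tags from chunks as they arrive.
class StreamingScanner {
  constructor() {
    this.buffer = ''; // bytes received so far that do not yet form a complete tag
    this.seen = [];   // complete tags, in the order they were found
  }
  write(chunk) {
    this.buffer += chunk;
    let end;
    // Consume every complete tag that is currently in the buffer.
    while ((end = this.buffer.indexOf('>')) !== -1) {
      const start = this.buffer.lastIndexOf('<', end);
      if (start !== -1) this.seen.push(this.buffer.slice(start, end + 1));
      this.buffer = this.buffer.slice(end + 1); // text before the tag is dropped in this toy
    }
  }
}

const scanner = new StreamingScanner();
scanner.write('<html><bo');     // '<bo' is incomplete and stays buffered
scanner.write('dy><h1>Hi</h');  // '<body>' and '<h1>' are now complete
scanner.write('1>');            // completes '</h1>'
```

After the first write, scanner.seen already contains '<html>' even though the rest of the document hasn't arrived yet. That is the whole point of a streaming parser.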

Let’s break this process down.

1. Tokenization and DOM Construction

The browser starts by converting the raw HTML into tokens: start tags (e.g. <div>, <h1>, <p>), end tags, text, and comments. Each start tag token carries the tag name along with its attributes and values.
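As a minimal sketch of the idea, here is a toy tokenizer. The tokenize function and the token shapes are my own assumptions for illustration; it only understands simple tags, double-quoted attributes, and text, while the real HTML tokenizer is a much larger state machine.

```javascript
// Toy tokenizer: simple tags, double-quoted attributes, and text only.
function tokenize(html) {
  const tokens = [];
  // Alternatives: an end tag, a start tag with optional attributes, or a run of text.
  const pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)\s*>|<([a-zA-Z][a-zA-Z0-9]*)((?:\s+[a-zA-Z-]+="[^"]*")*)\s*>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const [, endName, startName, rawAttrs, text] = match;
    if (endName) {
      tokens.push({ type: 'endTag', name: endName.toLowerCase() });
    } else if (startName) {
      const attrs = {};
      for (const [, key, value] of rawAttrs.matchAll(/([a-zA-Z-]+)="([^"]*)"/g)) {
        attrs[key] = value;
      }
      tokens.push({ type: 'startTag', name: startName.toLowerCase(), attrs });
    } else if (text.trim() !== '') {
      tokens.push({ type: 'text', data: text }); // whitespace-only text is skipped
    }
  }
  return tokens;
}

const tokens = tokenize('<div class="box"><p>Hi</p></div>');
```

For this input, tokens holds five entries: a start tag for div (with class="box"), a start tag for p, the text "Hi", and the two end tags.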

Once tokenized, the browser constructs the DOM (Document Object Model) — a tree-like structure representing the hierarchical relationships between elements.
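The tree-building step can be sketched with a stack of open elements. buildTree and the token objects below are hypothetical and assume well-formed input; the real algorithm also has to recover from missing or misnested tags.

```javascript
// Toy tree builder: assumes every start tag has a matching end tag.
function buildTree(tokens) {
  const root = { name: '#document', children: [] };
  const stack = [root]; // elements that are currently open
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === 'startTag') {
      const node = { name: token.name, children: [] };
      parent.children.push(node); // attach to the innermost open element
      stack.push(node);           // and make it the new innermost element
    } else if (token.type === 'endTag') {
      if (stack.length > 1) stack.pop(); // close the innermost element, never the root
    } else if (token.type === 'text') {
      parent.children.push({ name: '#text', data: token.data, children: [] });
    }
  }
  return root;
}

// Hand-written tokens for <div><h1>Title</h1><p>Body</p></div>
const tree = buildTree([
  { type: 'startTag', name: 'div' },
  { type: 'startTag', name: 'h1' },
  { type: 'text', data: 'Title' },
  { type: 'endTag', name: 'h1' },
  { type: 'startTag', name: 'p' },
  { type: 'text', data: 'Body' },
  { type: 'endTag', name: 'p' },
  { type: 'endTag', name: 'div' },
]);
```

The result is a document node with one div child, which in turn has an h1 and a p as children, each holding a text node.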

That’s why when you use JavaScript to select multiple elements with querySelectorAll, you get a NodeList. The NodeList doesn’t copy anything: it is a collection of references to nodes in the DOM that the browser built during this stage.

2. CSSOM & JavaScript Blocking

The DOM is built incrementally as the HTML arrives, but the browser still hasn’t rendered anything visually yet.

CSSOM
The browser parses the CSS in a similar way — raw bytes → tokens → CSSOM (CSS Object Model). The CSSOM is a structured representation of all styling rules.
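A very reduced sketch of "text in, structured rules out" could look like this. parseCss is a hypothetical helper that only handles plain `selector { prop: value; }` blocks (no comments, at-rules, or nesting), and the real CSSOM exposes much richer objects.

```javascript
// Toy CSS parser: plain rule blocks only.
function parseCss(css) {
  const rules = [];
  for (const [, selector, body] of css.matchAll(/([^{}]+)\{([^{}]*)\}/g)) {
    const declarations = {};
    for (const part of body.split(';')) {
      const index = part.indexOf(':');
      if (index === -1) continue; // skip empty pieces such as the one after the last ';'
      const prop = part.slice(0, index).trim();
      const value = part.slice(index + 1).trim();
      if (prop) declarations[prop] = value;
    }
    rules.push({ selector: selector.trim(), declarations });
  }
  return rules;
}

const rules = parseCss('h1 { color: red; font-size: 2em } .hidden { display: none; }');
```

Here rules contains two entries: one for h1 with color and font-size, and one for .hidden with display: none.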

Important point:

CSS blocks rendering, not HTML parsing.

When the parser encounters a <link rel="stylesheet"> or <style>, it keeps parsing the HTML while the stylesheet downloads, but the browser holds back the first render until the CSS has been loaded and parsed.

This happens because layout depends on style, and the browser can’t render elements accurately without knowing how they should look.

JavaScript
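JavaScript blocks the parser.

When the browser hits a <script> tag without async or defer, it pauses HTML parsing, downloads the script, executes it, and only then resumes parsing. A script with defer is downloaded in parallel and runs after parsing has finished, while a script with async is downloaded in parallel and runs as soon as it is available.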

Why? Because JavaScript can manipulate the DOM. So, the browser has to wait to avoid building a tree that JS might change anyway.
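Here is a small simulation of that ordering. simulateParse and the item objects are made up for this sketch, and network timing is ignored, so async scripts (whose run time depends on when the download finishes) are deliberately left out.

```javascript
// Toy model of the parser meeting elements, blocking scripts, and deferred scripts.
function simulateParse(items) {
  const log = [];
  const deferred = [];
  for (const item of items) {
    if (item.kind === 'element') {
      log.push(`parse <${item.name}>`);
    } else if (item.kind === 'script' && item.defer) {
      deferred.push(item.src);                   // downloaded in parallel, run later
      log.push(`queue ${item.src}`);
    } else if (item.kind === 'script') {
      log.push(`pause parser, run ${item.src}`); // parsing stops until the script has run
    }
  }
  log.push('parsing finished');
  for (const src of deferred) log.push(`run ${src}`); // deferred scripts run in document order
  return log;
}

const log = simulateParse([
  { kind: 'element', name: 'h1' },
  { kind: 'script', src: 'a.js' },
  { kind: 'element', name: 'p' },
  { kind: 'script', src: 'b.js', defer: true },
  { kind: 'element', name: 'footer' },
]);
```

In the resulting log, a.js runs between the h1 and the p, while b.js only runs after "parsing finished", even though it appears before the footer in the document.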

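Note: CSS is prioritized over JavaScript when both are blocking resources. A script that comes after a stylesheet won’t run until that stylesheet has been loaded and parsed, because the script might read styles that depend on it.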

3. Render Tree Construction

At this point, the DOM and CSSOM exist independently — they don’t yet know about each other.

The browser engine combines them to build the Render Tree.

Render Tree = DOM (structure) + CSSOM (style)

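It figures out what needs to be rendered and how. Nodes that are never displayed, such as elements with display: none, are left out of the render tree. For everything else, the layout step calculates dimensions, positioning, margins, paddings, etc., using:

Box model
Flex/grid rules
Relationships in the DOM

This is the stage where the actual layout is determined.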
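The "DOM + CSSOM" combination can be sketched like this. buildRenderTree and the stylesByTag lookup are assumptions made for the example: styles are matched by tag name only, whereas a real engine resolves full selectors, the cascade, and inheritance.

```javascript
// Toy render-tree builder: pairs each node with its style and drops display: none subtrees.
function buildRenderTree(node, stylesByTag) {
  const style = stylesByTag[node.name] || {};
  if (style.display === 'none') return null; // the node and its whole subtree are skipped
  const children = [];
  for (const child of node.children || []) {
    const rendered = buildRenderTree(child, stylesByTag);
    if (rendered) children.push(rendered);
  }
  return { name: node.name, style, children };
}

const dom = {
  name: 'body',
  children: [
    { name: 'h1', children: [] },
    { name: 'script', children: [] },
    { name: 'p', children: [{ name: 'span', children: [] }] },
  ],
};
const styles = { h1: { color: 'red' }, script: { display: 'none' } };
const renderTree = buildRenderTree(dom, styles);
```

The script element is in the DOM but not in renderTree, while h1 and p (with its span) carry over together with their styles.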

4. Painting and Compositing

Now comes the painting phase — the browser draws actual pixels on your screen.

Text, colors, borders, shadows — everything is drawn layer by layer.
Each layer is rasterized into a bitmap.
These bitmaps are uploaded to the GPU as textures.
The GPU composites all textures into a single frame to display.
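The compositing step for a single pixel can be sketched as plain arithmetic. blendOver and compositePixel are hypothetical helpers that assume straight (non-premultiplied) alpha and an opaque background; the GPU does the equivalent for every pixel in parallel, and real pipelines often use premultiplied alpha instead.

```javascript
// Blend one RGBA layer pixel over an opaque backdrop pixel ("source over").
function blendOver(backdrop, layer) {
  const a = layer.a; // 0 = fully transparent, 1 = fully opaque
  return {
    r: layer.r * a + backdrop.r * (1 - a),
    g: layer.g * a + backdrop.g * (1 - a),
    b: layer.b * a + backdrop.b * (1 - a),
  };
}

// Composite layers back to front: the first layer is the lowest one.
function compositePixel(background, layers) {
  return layers.reduce(blendOver, background);
}

const white = { r: 255, g: 255, b: 255 };
const halfRed = { r: 255, g: 0, b: 0, a: 0.5 };
const pixel = compositePixel(white, [halfRed]);
```

A half-transparent red layer over a white background gives { r: 255, g: 127.5, b: 127.5 }, which is the pink you would expect.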

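This phase is super sensitive. A change to geometry (width, height, position) triggers a reflow followed by a repaint, a purely visual change such as color triggers only a repaint, and animating transform or opacity can often be handled by the compositor alone. Unoptimized animations that keep triggering reflows can hit performance.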
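As a rough mental model, you can think of each CSS property as entering the pipeline at a certain stage, with everything after that stage having to run again. The table below is a small, commonly cited approximation; EARLIEST_STAGE and stagesFor are names invented for this sketch, and the exact behavior varies by engine and by element.

```javascript
// Approximate earliest pipeline stage a property change affects. Not an official list.
const EARLIEST_STAGE = {
  width: 'layout',
  height: 'layout',
  top: 'layout',
  left: 'layout',
  'font-size': 'layout',
  color: 'paint',
  'background-color': 'paint',
  'box-shadow': 'paint',
  transform: 'composite', // typically compositor-only when the element has its own layer
  opacity: 'composite',
};

const PIPELINE = ['layout', 'paint', 'composite'];

// Returns the stages that have to run again after the given property changes.
function stagesFor(property) {
  const start = EARLIEST_STAGE[property];
  if (!start) return null; // not covered by this toy table
  return PIPELINE.slice(PIPELINE.indexOf(start));
}
```

So stagesFor('height') gives all three stages, stagesFor('color') gives paint and composite, and stagesFor('transform') gives only composite, which is why transform is the usual recommendation for animations.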

The painting happens recursively, from back to front: a parent’s background and borders are painted first, and then its children are painted on top.

Conclusion

All of this — DOM parsing, CSSOM creation, JS blocking, layout calculation, GPU compositing — happens just to render a simple HTML page.

Mind-blowing, right?

For a deeper dive, check out this video on YouTube:
https://www.youtube.com/watch?v=SmE4OwHztCc

If you liked this article, feel free to like and follow for more frontend deep dives.
