
How the browser renders a web page

James Starkie. Originally published at jstar.mx. 7 min read.

My thinking: if I'm going to build websites that are fast and reliable, I need to really understand the mechanics of each step a browser goes through to render a web page, so that each can be considered and optimised during development. This post is a summary of my learnings of the end-to-end process at a fairly high level.


A lot of this is based on the fantastic (and FREE!) Website Performance Optimization course by Ilya Grigorik and Cameron Pittman on Udacity. I'd highly recommend checking it out.

Also very helpful was the article How Browsers Work: Behind the scenes of modern web browsers by Paul Irish and Tali Garsiel. It's from 2011 but many of the fundamentals of how browsers work remain relevant at the time of writing this blog post.

Ok, here we go. The process can be broken down into these main stages:

  1. Start to parse the HTML
  2. Fetch external resources
  3. Parse the CSS and build the CSSOM
  4. Execute the JavaScript
  5. Merge DOM and CSSOM to construct the render tree
  6. Calculate layout and paint

1. Start to parse the HTML

When the browser begins to receive the HTML data of a page over the network, it immediately sets its parser to work to convert the HTML into a Document Object Model (DOM).

The Document Object Model (DOM) is the data representation of the objects that comprise the structure and content of a document on the web.

The first step of this parsing process is to break down the HTML into tokens that represent start tags, end tags, and their contents. From that it can construct the DOM.
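To make the tokenise-then-build idea a bit more concrete, here is a minimal sketch in JavaScript. The function names (tokenize, buildTree) and the node shape are my own assumptions for illustration. A real HTML parser also handles attributes, comments, void elements and error recovery, none of which are covered here.

```javascript
// Hypothetical, heavily simplified tokenizer: only bare tags such as <p> and
// </p> plus text are recognised. Attributes are NOT supported.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z][a-z0-9]*)>|<([a-z][a-z0-9]*)>|([^<]+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) {
      tokens.push({ type: 'endTag', name: match[1].toLowerCase() });
    } else if (match[2]) {
      tokens.push({ type: 'startTag', name: match[2].toLowerCase() });
    } else if (match[3].trim()) {
      tokens.push({ type: 'text', value: match[3].trim() });
    }
  }
  return tokens;
}

// Build a tree from the tokens by keeping a stack of currently open elements.
function buildTree(tokens) {
  const root = { name: '#document', children: [] };
  const stack = [root];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === 'startTag') {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node); // this element stays open until its end tag
    } else if (token.type === 'endTag') {
      if (stack.length > 1) stack.pop(); // close the innermost open element
    } else {
      parent.children.push({ name: '#text', value: token.value });
    }
  }
  return root;
}

const tree = buildTree(tokenize('<html><body><p>Hello</p></body></html>'));
console.log(JSON.stringify(tree, null, 2));
```

Because the tree is built token by token, it can grow as more HTML arrives over the network, which is why the DOM can be constructed incrementally.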

Steps involved in the parsing of HTML by a web browser

2. Fetch external resources

When the parser comes across an external resource like a CSS or JavaScript file, the browser goes off to fetch it. The parser will continue whilst a CSS file is being loaded, although rendering is blocked until that file has been loaded and parsed (more on that in a bit).

JavaScript files are a little different - by default they block parsing of the HTML whilst the JavaScript file is loaded and then parsed. There are two attributes that can be added to script tags to mitigate this: defer and async. Both allow the parser to continue whilst the JavaScript file is loaded in the background, but they operate differently in the way that they execute. More on that in a bit too, but in summary:

defer means that the execution of the file will be delayed until the parsing of the document is complete. If multiple files have the defer attribute, they will be executed in the order that they were discovered in the HTML.

<script type="text/javascript" src="script.js" defer></script>

async means that the file will be executed as soon as it loads, which could be during or after the parsing process, and therefore the order in which async scripts are executed cannot be guaranteed.

<script type="text/javascript" src="script.js" async></script>
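The difference in execution order between defer and async can be sketched with a small simulation. This is not how a browser is implemented. The executionOrder function, the script objects and the timings (in arbitrary time units) are all hypothetical, and synchronous scripts are left out.

```javascript
// Given scripts in document order, work out the order in which they would run.
// Each script is { name, mode: 'defer' | 'async', loadedAt } (made-up shape).
function executionOrder(scripts, parsingDoneAt) {
  // async: runs as soon as its own download finishes.
  const asyncRuns = scripts
    .filter((script) => script.mode === 'async')
    .map((script) => ({ name: script.name, runsAt: script.loadedAt }));

  // defer: waits for parsing to finish, for its own download,
  // and for every deferred script that appeared before it.
  let previousDeferAt = parsingDoneAt;
  const deferRuns = scripts
    .filter((script) => script.mode === 'defer')
    .map((script) => {
      previousDeferAt = Math.max(previousDeferAt, script.loadedAt);
      return { name: script.name, runsAt: previousDeferAt };
    });

  // Array.prototype.sort is stable, so ties keep document order.
  return [...asyncRuns, ...deferRuns]
    .sort((a, b) => a.runsAt - b.runsAt)
    .map((run) => run.name);
}

const order = executionOrder(
  [
    { name: 'a.js', mode: 'defer', loadedAt: 50 },
    { name: 'b.js', mode: 'defer', loadedAt: 20 },
    { name: 'c.js', mode: 'async', loadedAt: 30 },
    { name: 'd.js', mode: 'async', loadedAt: 10 },
  ],
  40 // parsing finishes at t = 40
);
console.log(order); // [ 'd.js', 'c.js', 'a.js', 'b.js' ]
```

Note that b.js downloads before a.js but still runs after it, because deferred scripts keep their document order, whereas the two async scripts simply run in the order they finish loading.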

Preloading resources

As an aside, modern browsers will continue to scan the HTML whilst blocked and 'look ahead' to what external resources are coming up and then download them speculatively. The manner in which they do this varies between different browsers so cannot be relied upon to behave a certain way. In order to mark a resource as important and therefore more likely to be downloaded early in the rendering process, a link tag with rel="preload" can be used.

<link href="style.css" rel="preload" as="style" />
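As a rough illustration of the 'look ahead' idea, the sketch below scans the not-yet-parsed HTML for resource URLs without building any DOM. The scanForResources function is hypothetical and deliberately naive. Real browsers differ in what they scan for and how they prioritise it.

```javascript
// Naive look-ahead scan: collect the first src/href value found on each
// <script>, <link> and <img> tag in the remaining HTML. This ignores rel
// values, single-quoted attributes and much else.
function scanForResources(remainingHtml) {
  const urls = [];
  const tagPattern = /<(script|link|img)\b[^>]*>/gi;
  let tag;
  while ((tag = tagPattern.exec(remainingHtml)) !== null) {
    const attribute = /\b(?:src|href)="([^"]+)"/i.exec(tag[0]);
    if (attribute) urls.push(attribute[1]);
  }
  return urls;
}

const upcoming = scanForResources(
  '<script src="app.js"></script><p>text</p>' +
    '<link rel="stylesheet" href="style.css"><img src="hero.png">'
);
console.log(upcoming); // [ 'app.js', 'style.css', 'hero.png' ]
```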

Fetching CSS and JavaScript resources in a web browser

3. Parse the CSS and build the CSSOM

You may well have heard of the DOM before, but have you heard of the CSSOM (CSS Object Model)? Before I started researching this topic a little while back, I hadn't!

The CSS Object Model (CSSOM) is a map of all CSS selectors and relevant properties for each selector in the form of a tree, with a root node, sibling, descendant, child, and other relationship. The CSSOM is very similar to the Document Object Model (DOM). Both of them are part of the critical rendering path which is a series of steps that must happen to properly render a website.

The CSSOM, together with the DOM, is used to build the render tree, which is in turn used by the browser to lay out and paint the web page.

Similar to HTML files and the DOM, when CSS files are loaded they must be parsed and converted to a tree - this time the CSSOM. It describes all of the CSS selectors on the page, their hierarchy and their properties.

Where the CSSOM differs from the DOM is that it cannot be built incrementally, as rules that appear later in the CSS can override earlier ones, depending on specificity and source order. This is why CSS blocks rendering: until all of the CSS is parsed and the CSSOM built, the browser can't know how each element should be styled and positioned on the screen.
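To show why a partially loaded stylesheet can't be trusted, here is a small sketch of specificity-based resolution. It assumes every rule already matches the element in question and only understands simple selectors made of type, .class and #id parts. The specificity, compareSpecificity and resolve functions are my own illustrative helpers, not a browser API.

```javascript
// Specificity as [ids, classes, types] for very simple selectors only.
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const types = (selector.match(/(^|\s)[a-z][\w-]*/gi) || []).length;
  return [ids, classes, types];
}

// Compare two specificity triples from left to right.
function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Pick the winning value for a property from rules given in source order.
function resolve(rules, property) {
  let winner = null;
  for (const rule of rules) {
    if (!(property in rule.declarations)) continue;
    const spec = specificity(rule.selector);
    // ">=" means a later rule wins when specificity is equal.
    if (winner === null || compareSpecificity(spec, winner.spec) >= 0) {
      winner = { spec, value: rule.declarations[property] };
    }
  }
  return winner ? winner.value : undefined;
}

const rules = [
  { selector: 'p', declarations: { color: 'black' } },
  { selector: '.note', declarations: { color: 'blue' } },
  { selector: 'p', declarations: { color: 'red' } },
];

console.log(resolve(rules.slice(0, 1), 'color')); // black (only the first rule seen)
console.log(resolve(rules, 'color')); // blue (.note beats both p rules)
```

With only the first rule available the answer would be black, but the full stylesheet gives blue, so painting early would risk showing the wrong styles.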

Parsing CSS and building the CSSOM in a web browser

4. Execute the JavaScript

How and when the JavaScript resources are loaded will determine exactly when this happens, but at some point they will be parsed, compiled and executed. Different browsers have different JavaScript engines to perform this task. Parsing JavaScript can be an expensive process in terms of a computer's resources, more so than other types of resource, which is why optimising it is so important in achieving good performance. Check out this fantastic post for a deeper dive into how the JavaScript engine works.

Load events

Once the HTML has been fully parsed, and any synchronous and deferred scripts have been executed, the DOMContentLoaded event is fired on the document. For any scripts that require access to the DOM, for example to manipulate it in some way or listen for user interaction events, it is good practice to first wait for this event before executing them.

document.addEventListener('DOMContentLoaded', (event) => {
    // You can now safely access the DOM
});

After everything else, such as async JavaScript and images, has finished loading, the load event is fired on the window.

window.addEventListener('load', (event) => {
    // The page has now fully loaded
});
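One catch with waiting for DOMContentLoaded is that a script which runs late (an async script, for example) may attach its listener after the event has already fired. A common way around this is to check document.readyState first. The onDomReady helper below is a sketch with a name of my own choosing. It takes the document as a parameter so that it can also be exercised outside a browser.

```javascript
// Run the callback once the DOM is parsed, whether or not DOMContentLoaded
// has already fired. readyState is 'loading' while the HTML is still being
// parsed, then becomes 'interactive' and finally 'complete'.
function onDomReady(doc, callback) {
  if (doc.readyState === 'loading') {
    doc.addEventListener('DOMContentLoaded', callback, { once: true });
  } else {
    callback(); // parsing has already finished
  }
}

// In a page this would be used as:
// onDomReady(document, () => {
//   // You can now safely access the DOM
// });
```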

Timeline of executing JavaScript in a web browser

5. Merge DOM and CSSOM to construct the render tree

The render tree is a combination of the DOM and CSSOM, and represents everything that will be rendered onto the page. That does not necessarily mean all nodes in the render tree will be visually present: for example, nodes with styles of opacity: 0 or visibility: hidden will be included (and content hidden with opacity: 0 may still be read by a screen reader), whereas those set to display: none will not be included. Additionally, tags such as <head> that do not contain any visual information will always be omitted.
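The inclusion rules can be sketched as a simple tree walk. The node shape ({ tag, style, children }), the NON_VISUAL_TAGS list and the buildRenderTree function are assumptions made for this example. Here style stands in for the styles a browser would already have computed from the CSSOM.

```javascript
// Tags that never produce anything visual (an illustrative, incomplete list).
const NON_VISUAL_TAGS = new Set(['head', 'script', 'meta', 'title', 'link', 'style']);

// Copy a DOM-like tree into a render tree, dropping nodes that are not rendered.
function buildRenderTree(domNode) {
  const style = domNode.style || {};
  if (NON_VISUAL_TAGS.has(domNode.tag) || style.display === 'none') {
    return null; // omitted along with all of its descendants
  }
  const children = (domNode.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  return { tag: domNode.tag, style, children };
}

const dom = {
  tag: 'html',
  children: [
    { tag: 'head', children: [{ tag: 'title' }] },
    {
      tag: 'body',
      children: [
        { tag: 'h1' },
        { tag: 'p', style: { display: 'none' } },
        { tag: 'div', style: { visibility: 'hidden' } },
      ],
    },
  ],
};

const renderTree = buildRenderTree(dom);
const renderedBody = renderTree.children[0];
console.log(renderedBody.children.map((child) => child.tag)); // [ 'h1', 'div' ]
```

The hidden div stays in the tree because it still takes up space in the layout, whereas the display: none paragraph and the whole of head are dropped.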

As with JavaScript engines, different browsers have different rendering engines.

Merging the DOM and CSSOM to create a render tree in a web browser

6. Calculate layout and paint

Now that we have a complete render tree the browser knows what to render, but not where to render it. Therefore the layout of the page (i.e. every node's position and size) must be calculated. The rendering engine traverses the render tree, starting at the top and working down, calculating the coordinates at which each node should be displayed.
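Here is a toy version of that top-down traversal. It assumes every node is a full-width block stacked vertically, and that leaf nodes come with a made-up intrinsic height. Real layout (inline content, flexbox, grid, floats and so on) is far more involved, so treat the layout function below as a sketch of the idea only.

```javascript
// Compute a box (x, y, width, height) for every node, starting at the top
// of the tree and working down. Children are stacked one below another.
function layout(node, x, y, width) {
  const box = { tag: node.tag, x, y, width, height: 0, children: [] };
  let cursorY = y;
  for (const child of node.children || []) {
    const childBox = layout(child, x, cursorY, width);
    box.children.push(childBox);
    cursorY += childBox.height; // the next sibling starts below this one
  }
  // Leaves use their hypothetical intrinsic height; containers wrap their children.
  box.height = box.children.length > 0 ? cursorY - y : node.height || 0;
  return box;
}

const pageTree = {
  tag: 'body',
  children: [
    { tag: 'h1', height: 40 },
    { tag: 'p', height: 20 },
  ],
};

const boxes = layout(pageTree, 0, 0, 800); // 800 is an assumed viewport width
console.log(boxes.height); // 60
console.log(boxes.children.map((box) => box.y)); // [ 0, 40 ]
```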

Once that is complete, the final step is to take that layout information and paint the pixels to the screen.

And voila! After all that, we have a fully rendered web page!

Calculating the layout and paint of a web page in a browser

Discussion

Neo

Great post. Loved the images. Thank you.

James Starkie Author

Thanks, really appreciate it!

Nicholas Stimpson

Nice article, but I've long argued that the information in the diagram in section 3 cannot possibly be right. In the absence of a DOM document, (remember, that's not applied until section 5) there's no way to turn body { font-size: 16px } div { font-size: 14px } into a tree structure in which a div rule is a child of a body rule. You could possibly turn body { font-size: 16px } body div { font-size: 14px } into such a tree structure, though personally I have my doubts that browsers actually do this, since I really can't see how doing so would significantly help evaluate the cascade.

The CSSOM is real and is indeed tree structured, but it's a very different sort of tree, where stylesheets are towards the top of the tree and each has rule children, which have selector and declaration-block children. Each declaration block has declaration children which have name and value children.

James Starkie Author

Thanks for the feedback, yes I see what you're saying. It's a great point. I'm happy to admit I'm no expert on the inner workings of the CSSOM. I'll do some more investigation and see if I can update the diagram to better reflect what's happening 👍

Real AI

From what I've read in the Firefox source code, James is right.
You can actually make the CSS tree at #3. Both trees are constructed at the same time. Rules are just rules, and they are attached in the reverse order from what you'd expect. You can actually build the entire rule tree without having the DOM; it's the DOM that does the lookup on the CSSOM when it's time to render.

It's more like the body rule is the child of the div rule: CSS is inverted. The lookup is the inverse of what common sense says it is. It's a literal inverted tree, it actually is.

Nicholas Stimpson

What do you mean by both trees? The DOM tree and the CSSOM tree, or the CSSOM tree and the render tree? Have you got a link to the relevant bit of the Firefox source code?

Aaron Gustafson 🚀🕸

Thanks for putting this together so succinctly!

Over on A List Apart, we have a series that goes into great depth on this as well as how assistive technologies play into it. The roll-up is here: alistapart.com/article/from-url-to...

James Starkie Author

Awesome, thanks, I'll check it out!

Nayden Gochev

How did you create these diagrams? Their style looks SICK as hell :+)

James Starkie Author

Thanks! 😀 I draw them in Photoshop. The typeface is one I made, intending to open source it soon.

Nayden Gochev

awesome work !

Sagar

An interviewer asked me this question and I failed to answer it and to describe exactly how the browser renders a page when a user visits a URL. After reading your article, I'm pretty sure I'll be able to explain it to someone in a better way. I really like your article and have started following you to read similar articles in the future.

Thank you, @james Starkie.

James Starkie Author

Very kind, thank you, glad I could help 😊

attilajakab

Hi @jstarmx , thanks for writing this great post.
Could you please fix the link for "Check out this fantastic post for a deeper dive into how the JavaScript engine works"? It takes me to this same page. Thank you!

James Starkie Author

Ah whoops!! Thanks for the heads up, sorted now 😊

Mandar Badve

Great post. I have one question: in the case of Ajax or a SPA, are all of these steps executed, or is there any difference?

James Starkie Author

Thanks! These steps would still be executed, the differences would come afterwards really. In the case of a SPA, most would only happen once (on initial load), the trade-off being that those steps will likely take longer because there are more assets to load up-front. But the last couple of steps will execute every time the DOM is updated (e.g. navigating to a new 'page') as the layout will need to be recalculated and repainted each time.

Mandar Badve

Makes sense, thanks.

qq449245884

Hello, may I translate your article into Chinese? I would like to share it with more developers in China. I will credit the original author and source.

James Starkie Author

Yes absolutely, please go ahead, that would be great :)

qq449245884

Thank you very much!

Sarghe Dana

Great information!
Thanks!

Thew Dhanat

Another detailed resource is Chrome University

James Starkie Author

That looks really interesting, thanks!

Andre Abtahi

I love the visuals. A fantastic approach to sharing your knowledge!

Henry Boisdequin

Insightful, thanks!

Josias Aurel

Much thanks for writing this article. Very helpful ✨

Sarthak kundra

What a great, concise and informative article! Loved it! <3

Stuart

Hi, James, fantastic post, and thank you for sharing.

z2lai

Very nice chronological summary of how browsers render pages. The topics covered (parse order, blocking vs non-blocking loading and load events) were exactly what I was looking for!

James Starkie Author

Fantastic, glad I could help!

Phong Duong

Thank you. Your post is interesting

Adam Bulmer

Great article, I also find this resource useful for knowing what causes a reflow/layout

James Starkie Author

Thanks Adam! Looks like a very interesting and useful link too, cheers.

Alan Montgomery

Brilliant explanation. Thanks.

Camila Blanc Fick

Awesome work 👏

Hung

Cool, now I know basically how websites are painted. Thanks