
What happens when you type 'google.com' into a browser and press Enter?

Anton Frattaroli ・5 min read

My favorite interview question that I've come across so far is "You type 'google.com' into a browser address bar and hit <Enter>, what happens afterwards?"

Someone could talk for days on end trying to answer that with some form of completeness. How deep will they go? Strictly for fun, I'm going to put my answer here. When I was asked this in an actual interview, I rambled on for a good 10 minutes before they stopped me. And then I kept remembering things I forgot to include even after the interview finished.

I'm going to keep this formatted as a wall of text because that's how it felt to answer this question in conversation.

So What Happens?

The browser is going to analyze the input. Usually if it has a ".com" it won't think you're typing search terms. Once it decides the input must be a URL, it'll check that it has a scheme; if not, it'll add "http://" to the beginning. And since you didn't specify a number of HTTP protocol features, it'll assume defaults, like port 80, GET method and no basic auth.
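A minimal sketch of that decision, following the description above rather than any real browser's code. The function name, the "has a dot and no spaces" rule, and the returned dictionary are all assumptions for illustration; actual address bars use many more heuristics.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two schemes discussed in this post.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_address_bar_input(text: str) -> dict:
    """Rough model of address-bar handling (hypothetical helper)."""
    text = text.strip()
    # Assumption: spaces, or no dot at all, means "these are search terms".
    if " " in text or "." not in text:
        return {"kind": "search", "query": text}
    # No scheme given: prepend "http://", as the paragraph above describes.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    return {
        "kind": "url",
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none was typed.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "method": "GET",
    }
```

Calling `normalize_address_bar_input("google.com")` fills in the scheme, port 80, the "/" path and the GET method, while something like "cat videos" falls through to the search branch.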

Then it'll create an HTTP request and send that. I'm not confident in my low level networking knowledge but if I was I'd say something about the MAC address, TCP packet transfers, dropped packet handling. But anyway, a "google.com" DNS lookup will happen, and if it's not already cached a DNS service will reply with a list of IP addresses, because "google.com" doesn't just have a single IP address. Browsers will pick the first one by default I believe. Not sure if they're regional or how the list works, but I know it's there.
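Roughly, the two pieces mentioned above look like this in Python. This is a sketch, not what a browser runs: `resolve` and `build_get_request` are names I made up, and the header set is a bare minimum I picked. One thing worth noting is that the DNS lookup has to finish before the request can be sent anywhere.

```python
import socket


def resolve(host: str, port: int = 80) -> list:
    """Ask the system resolver for the host's addresses (needs network access)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr[0] is the IP address for both IPv4 and IPv6 results.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def build_get_request(host: str, path: str = "/") -> bytes:
    """Serialize a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept-Encoding: gzip",
        "Connection: close",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

A client following the "pick the first one" guess would connect to `resolve("google.com")[0]` and write the bytes from `build_get_request("google.com")` to that socket.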

So the HTTP request jumps from node to node until it gets to the IP address of google.com's load balancer. It wouldn't last long, Google would respond that you need to be using HTTPS - assuming with a 301 permanent redirect. So it would go all the way back to your browser, the browser would change the scheme to HTTPS, use the default 443 port and resend. This time the TLS handshake would take place between the load balancer and the browser client. Not 100% on how that works but I know the request would tell Google what protocols it supports (TLS 1.0, 1.1, 1.2) and Google would respond with "Let's use 1.2". Then the request gets sent with TLS encryption.
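The "which protocol do we both support" part can be modelled very simply. This is a toy under stated assumptions: the version list is the one from the paragraph above, `negotiate_version` is a hypothetical helper, and a real TLS handshake also agrees on cipher suites and exchanges certificates.

```python
# Oldest to newest, limited to the versions named in the text.
TLS_ORDER = ["TLS 1.0", "TLS 1.1", "TLS 1.2"]


def negotiate_version(client_supported, server_supported):
    """Return the newest version both sides support, or None if there is none."""
    common = [
        version
        for version in TLS_ORDER
        if version in client_supported and version in server_supported
    ]
    return common[-1] if common else None
```

With a client offering all three versions and a server that only accepts "TLS 1.2", the result is "TLS 1.2"; with no overlap the function returns `None`, which in practice would mean a failed handshake.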

I think the next thing Google would do is put it through web application firewall rules on its load balancer to see if it's a malicious request. When it passes, the secure connection has probably been terminated (because PCI-DSS regulations say you don't need to encrypt internal traffic) and the request would get assigned to a pool in their CDN, and the google-side cached homepage will be returned in an HTTP response. Probably pre-gzipped.

Google's response header would be read by the browser, cached according to the response header caching policy, then the body would be un-gzipped. And because it's google it's probably ultra-optimized: minified, likely a lot of pre-rendered content, inlined CSS, JavaScript and images to reduce network requests and the time-to-first-render. But that request will trigger a cascade of other requests, all concurrent because it should be running HTTP/2. While those requests are being made, JavaScript would be parsed, probably not blocking because they used the defer attribute on their tags - or async, I never did read about what those did individually.
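The first two steps there, reading the caching policy and un-gzipping the body, can be sketched like this. Both helper names are mine, and the parsing is deliberately naive (no quoted values, no validation).

```python
import gzip


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into its directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without a value (e.g. "private") are stored as True.
        directives[name] = arg if arg else True
    return directives


def decode_body(headers: dict, body: bytes) -> bytes:
    """Decompress the body if the response says it is gzip-encoded."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body
```

For example, `parse_cache_control("private, max-age=0")` yields a `private` flag and a `max-age` of `"0"`, which a cache would read as "store privately, but revalidate right away".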

But the browser has probably already rendered the search box and is working on the toolbar at the top, which is going to take some extra network requests - I probably already have a cookie or maybe local storage with an OAuth token - or maybe I'm using Chrome and it already knows who I am, and that request with auth gets sent to their Google+ API that tells the Google search page application who I am.

Another request would be sent to get my avatar image. At this point they've already browser-sniffed to see if I wasn't using Chrome, in which case they would have popped-in a tooltip to tell me that Chrome is awesome and I should be using that instead of anything else.

I think it would quiet down at that point. All taking place in a fraction of a second.

What is observably different?

Let's look up the DNS:

DNS lookup response for google.com

  • I know I had previously seen google.com coming back with multiple IP addresses, but that doesn't seem to be the case anymore. Seems that they used to use round-robin but don't anymore. This StackOverflow question covers it. I had forgotten it was called round-robin.
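For anyone else who forgot what round-robin DNS means: the server keeps several addresses for one name and rotates their order on each answer, so clients that take the first entry get spread across machines. A small simulation, with a made-up class name and documentation-range IP addresses rather than anything Google uses:

```python
from collections import deque


class RoundRobinZone:
    """Toy DNS zone that rotates its address list on every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        # Return the current ordering, then move the first address to the back
        # so the next query sees a different address first.
        snapshot = list(self._addresses)
        self._addresses.rotate(-1)
        return snapshot
```

Three consecutive calls to `answer()` on a zone with three addresses each start with a different address, and the fourth call wraps back around to the first.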

Network Layers...

In a formally structured answer, you'd probably reference the OSI Model, which I know of but am not well versed in. After looking it up, I take it network layering maps like this:

  7. Application - The logic initiating requests
  6. Presentation - HTTP
  5. Session - TLS
  4. Transport - TCP
  3. Network - packet routing (IP)
  2. Data link - frames (which seem to be packet containers)
  1. Physical - bitstreams
  • I missed that in TLS they exchange certificates after agreeing on a protocol.
  • Networking isn't my strongest arena.
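One way to picture the layering is as nesting: each lower layer wraps whatever the layer above handed it. The sketch below is only a string-based illustration. The `encapsulate` helper is hypothetical, and Ethernet as the data link is an assumption (it could just as well be Wi-Fi).

```python
def encapsulate(payload: str) -> str:
    """Wrap an application payload in each lower layer, innermost first."""
    for layer in ["TLS", "TCP", "IP", "Ethernet"]:
        payload = f"{layer}[{payload}]"
    return payload
```

Running `encapsulate("HTTP GET /")` gives `Ethernet[IP[TCP[TLS[HTTP GET /]]]]`, which matches the idea of frames being containers for packets.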

Open google.com in my browser, disable cache:

Network tab of browser dev tools

  • I missed the host name canonicalization - which was a 301.
  • The correction from HTTP to HTTPS is a 307 Internal Redirect.
  • It then downloads fonts, the logo images, and my avatar image. Without an API call, which means they shoved my profile information in the page and bundled that with the return - so they're doing actual data retrieval when you hit google.com and not just serving cached assets.
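The redirect hops can be replayed with a small loop. The response table below is hypothetical: the status codes come from the observations above, but the exact URLs and the order of the two hops are my assumption, not something read off the network tab.

```python
from urllib.parse import urljoin

# Hypothetical canned responses: URL -> (status code, Location header or None).
RESPONSES = {
    "http://google.com/": (301, "http://www.google.com/"),
    "http://www.google.com/": (307, "https://www.google.com/"),
    "https://www.google.com/": (200, None),
}

REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(url, responses, max_hops=10):
    """Follow Location headers until a non-redirect response is reached."""
    chain = []
    for _ in range(max_hops):
        status, location = responses[url]
        chain.append((status, url))
        if status not in REDIRECT_CODES or location is None:
            return chain
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")
```

`follow_redirects("http://google.com/", RESPONSES)` walks through the 301 and the 307 and stops at the 200 for the HTTPS URL.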

The Response

WinMerge comparison of Chrome and IE11 response bodies

Above is a file comparison of the IE 11 and Chrome responses - both logged out.

  • Not terribly different between IE11 and Chrome. But it means they're user-agent sniffing server-side and not client-side. Could have mentioned this in my answer.
  • Unexpectedly, the Chrome response is larger by 22kB. I wonder if it's the search-by-voice feature, which is visibly absent from IE 11. IE11 probably needs polyfills and the Chrome advertisement but it's all obfuscated and I'm not going to torture myself any further.
  • Even after I clear my cookies in Chrome, it still sends cookies on first request. It does not do that in IE 11.

Let's dig into that rendering!

First render loading of google.com

That pic above is the first screenshot Chrome will give you.

  • There aren't any async or defer attributes on the script tags, just nonce attributes. I'm learning about nonce as of this minute, and it seems to be security related. I guess they want those blocking scripts. I'm sure they fiddled around with/without async/defer at some point and decided against it.
  • Note to self: Full response is a mess of mixed JavaScript, CSS, and HTML. They aren't following any rules about keeping the three separate.

What about the question itself?

You know what? Maybe it's not that great of an interview question for a developer since the answer has so much networking involved. It's the format of the question I like, something open ended, that includes some guessing. That gives the interviewer the opportunity to follow up with questions like "How do you think TLS is established?" to see how the candidate thinks, see how creative they are, see what their limit is (how patient?).

What's your favorite interview question?

Discussion


You could go on for much longer if you also talk about hardware interrupts from the keyboard, and the handling of the WM_KEYDOWN window messages. That is, when you type "google.com" on the keyboard, where do those key presses go? How does the computer read and process them?

For every abstraction layer, there's always a deeper layer that you could explain in more detail. :)

 

That is, when you type "google.com" on the keyboard, where do those key presses go? How does the computer read and process them?

Welp, guess I know what rabbit hole I'm falling down next.

 

You can answer this question for infinitely long -

keyboard and typing physics > brain signals to make you type > psychology about why you typed > your great grandfather's marriage and the quadrillion possibilities and detail since then that made you come to that interview > Infinite ∞ stuff about the creation of the universe. 😵😵😵😵

 

As an addition, if you get into the google internal infrastructure, with their multi-layered DNS LB and service LB and front-end proxies you can talk for a few hours, at least :))

A good question indeed. I also realized that my TLS knowledge is lacking; I'll add it to my 1000-item TOREAD list.

 

I thought about including a section on browser mechanics - parse, evaluate, paint. But I didn't originally say anything about that so I deemed it to be out of scope.

 

There is a GitHub repository that deals with that question and tries to be really specific.

alex / what-happens-when

An attempt to answer the age old interview question "What happens when you type google.com into your browser and press enter?"

What happens when...

This repository is an attempt to answer the age old interview question "What happens when you type google.com into your browser's address box and press enter?"

Except instead of the usual story, we're going to try to answer this question in as much detail as possible. No skipping out on anything.

This is a collaborative process, so dig in and try to help out! There are tons of details missing, just waiting for you to add them! So send us a pull request, please!

This is all licensed under the terms of the Creative Commons Zero license.

Read this in 简体中文 (simplified Chinese), 日本語 (Japanese) and 한국어 (Korean). NOTE: these have not been reviewed by the alex/what-happens-when maintainers.

 

Thanks for your nice comment. What happens if you copy and paste the following address into your browser and press Enter? alamnr.github.io/profile.html?user...

 

I often ask this question to figure out at which layer a candidate's technical knowledge is strongest.

If the interviewee talks about the network layer longer than the others, they probably have great skill at that layer.

 

The question is so incredibly broad and can be answered in so many (correct) ways that I can't see how it would be useful

 

Ah, but it is a perfect question to learn about the candidate, what he knows, how he thinks, etc. It’s a kind of Rorschach test. Also, a good sieve for two “interesting” categories of people: “I don’t know”, and “the browser displays the google search box”.

 

The nonce attribute on a script tag is a CSP-related attribute:

developers.google.com/web/fundamen...

To use a nonce, give your script tag a nonce attribute. Its value must match one in the list of trusted sources. For example:

<script nonce=EDNnf03nceIOfn39fn3e9h3sdfa>
  //Some inline code I cant remove yet, but need to asap.
</script>

Now, add the nonce to your script-src directive appended to the nonce- keyword.

Content-Security-Policy: script-src 'nonce-EDNnf03nceIOfn39fn3e9h3sdfa'

Remember that nonces must be regenerated for every page request and they must be unguessable.

It does prevent unsafe script from being executed on the webpage if you have an XSS vulnerability. As the nonce must be present in the CSP header, even if you could inject a script tag it wouldn't be executed.
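On the server side, "regenerated for every page request and unguessable" could look something like the sketch below. The function names are hypothetical and this is not taken from the linked guide; the only real API used is Python's `secrets` module.

```python
import secrets


def new_csp_nonce() -> str:
    """Return a fresh, unguessable nonce (16 random bytes, URL-safe base64)."""
    return secrets.token_urlsafe(16)


def render_with_nonce(inline_js: str):
    """Build a matching CSP header line and script tag for one response."""
    nonce = new_csp_nonce()  # must be generated anew for every page request
    header = f"Content-Security-Policy: script-src 'nonce-{nonce}'"
    tag = f'<script nonce="{nonce}">{inline_js}</script>'
    return header, tag
```

Because the nonce changes on every call, an injected script tag cannot know the value that the current response's header will allow.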

Also note that it does not always default to HTTP and port 80, for example when the website is on an HSTS preload list. google.com isn't, but many big websites are.

If a website enforces HSTS, it will only default to HTTPS.
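As a rough sketch of how that changes the default: the browser keeps a set of hosts known to require HTTPS (preloaded, or learned from a `Strict-Transport-Security` response header) and consults it before choosing a scheme. The helper names and the `example.com` host are assumptions for illustration, and the sketch ignores expiry and subdomain matching.

```python
def parse_hsts(value: str) -> dict:
    """Parse a Strict-Transport-Security header value into a small policy dict."""
    policy = {"max_age": None, "include_subdomains": False}
    for part in value.split(";"):
        name, _, arg = part.strip().partition("=")
        name = name.strip().lower()
        if name == "max-age":
            policy["max_age"] = int(arg.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
    return policy


def default_scheme(host: str, hsts_hosts: set) -> str:
    """Pick https for known HSTS hosts, otherwise fall back to http."""
    return "https" if host.lower() in hsts_hosts else "http"
```

With `hsts_hosts = {"example.com"}`, typing "example.com" would go straight to HTTPS without the plain-HTTP round trip, while an unknown host would still start on HTTP.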

 

I like how that's nested under "fundamentals".

 

The tech equivalent of Subway's "How do you make a PB&J sandwich" interview test.

Both are obviously impossible to answer in full detail, so the question then becomes, what details do you choose to focus on?

 

Every time I see a breakdown of this question I am pleasantly surprised about where the author digs in vs skims over. I liked your approach here of writing out an answer off the top of your head and then digging in on what actually happens. I did my own take on this a while back, but I am afraid it came across waaaay more stiff :).

 

Can you try to explain to me how to use SOLID in frontend with pure JavaScript?

 

For my part, nothing. It tries to resolve the domain name, encounters my blocklist, and is redirected to the loopback address.

 

Don't forget about all the neat low level stuff that's happening even before you leave the LAN!

 

And then I would ask "how do the neurons inside your brain react on your intention to google 'cat videos'?".

 

Everyone forgets the obvious. It will break the Internet. Please don't do it. That wireless black box atop Big Ben will be subjected to bad things.

 
 

If you were not a developer, what would you like to be?

 

Can you do an "explain to a 5 year old" what happens when you type google into a browser?!

 
 

Hmm... and I thought you get to the google search. ;)

 

So what did the interviewer say about your answer?

Great read, thanks for sharing this!

 

"Yeah, that's fine, we got it. Tell me about the last time you had a difficult bug."

Honestly, seemed like he had good questions but wasn't a good interviewer.

 

There is a whole GitHub repo describing what happens in such detail that it will make any interviewer happy.

 

Good article, just a little note: from an OSI Model point of view, HTTP is placed in the Application layer, not in the Presentation one.

 

You forgot to talk about what happens in Google datacenters: the nodes that activate or deactivate depending on the load, the air conditioners, the solar panels.

You forgot to talk about the atoms, the electrons, the light, the quantum probability wave functions of the whole universe (at least in the light cone), and how all this really only happens if you look at the screen, as an observer, or the cat may still be alive (or dead).