Overview
Understanding how the web works is a good starting point for building web stacks that follow the web's own principles. This article walks through what happens when you request a specific URL in the browser. It explains the components of the URL, the communication protocols used on the web, and the server components involved in processing the request.
The Flow of the request
1. Typing the URL and pressing enter
A URL is made up of several components separated by special characters such as colons, forward slashes, and periods. In this example, the URL is composed of the scheme (https://), the domain (google.com), the path (/maps), and the resource ({long, lat}).
-
The Scheme
The scheme https:// stands for HyperText Transfer Protocol Secure. It is an extension of HTTP, an application protocol (OSI layer 7) built on TCP as the transport protocol (OSI layer 4). This scheme tells the browser to connect to the server using TLS (Transport Layer Security). TLS is the more secure successor to SSL (Secure Sockets Layer), which is now deprecated. Both are protocols that clients and servers rely on to communicate over an encrypted channel across any network. To set up the encrypted channel, the client and the server perform a handshake in which the server proves its identity with a digital certificate, commonly called a TLS/SSL certificate.
-
Domain
The component google.com is the domain name. A domain name is a human-readable address that points to a specific IP (Internet Protocol) address. Domain names make it easier for clients to access resources across the web, because an IP address is harder to memorize than a name. IP addresses come in two versions. IPv4, the fourth version of the Internet Protocol, uses 32-bit addresses written as four 8-bit segments, e.g. 8.8.8.8 and 8.8.4.4 (the addresses of Google's public DNS resolvers). IPv4 is limited to about 4.3 billion unique addresses. IPv6, the sixth version of the Internet Protocol, uses 128-bit addresses, e.g. 2001:4860:4860::8888 (also a Google public DNS resolver), which allows for roughly 3.4 × 10^38 unique addresses.
-
Path
A path, e.g. /maps, points to a resource that the user wants to access on the site. The resource might be static content, e.g. HTML, CSS, or JavaScript, or dynamically generated content, e.g. reviews.
-
Resource
The resource {long, lat} identifies a place within the maps path of Google's maps service. The resource might be static or dynamic content.
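The components described above can be pulled apart programmatically. The following is a minimal sketch using Python's standard library; the helper names split_url and address_bits are made up for this illustration, and the {long, lat} resource is left out because the article does not show its exact form in the URL.

```python
import ipaddress
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Return the scheme, domain, and path of a URL."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # e.g. "https"
        "domain": parts.hostname,  # e.g. "google.com"
        "path": parts.path,        # e.g. "/maps"
    }


def address_bits(address: str) -> int:
    """Return the address width in bits: 32 for IPv4, 128 for IPv6."""
    return ipaddress.ip_address(address).max_prefixlen


if __name__ == "__main__":
    print(split_url("https://google.com/maps"))
    print(address_bits("8.8.8.8"))
    print(address_bits("2001:4860:4860::8888"))
```

The same ip_address call accepts both versions, which makes it a convenient way to check which kind of address a string holds.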
2. Browser resolves the domain to its unique IP
When the URL is accessed, the browser first makes a DNS request to a DNS server, where the domain in the URL is resolved and an IP address is returned as the response.
-
DNS (Domain Name System)
A distributed database system that maps domain names to IP addresses, so that browsers can locate specific websites and servers can communicate with each other. It hides the complexity of IP addresses behind human-readable names.
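The lookup typically proceeds as follows: the browser and the operating system check their caches first; on a miss, a recursive resolver queries a root name server, then the .com top-level domain server, then the domain's authoritative name server, and returns the resulting IP address to the browser.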
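One way to observe the result of a lookup from code is to ask the operating system's resolver, which is roughly what a browser relies on behind the scenes. This is a minimal sketch; the function name resolve is an assumption for this example, and the individual DNS queries are handled by the OS and its configured resolver rather than by this code.

```python
import socket


def resolve(hostname: str, port: int = 443) -> list[str]:
    """Return the unique IP addresses the OS resolver reports for a host."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first item of the socket address is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example (needs network access, output depends on your resolver):
#     resolve("google.com")
```

Passing an IP literal such as "127.0.0.1" skips the DNS query entirely and simply returns that address.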
3. Browser initiates a TCP connection with the server
Transmission Control Protocol (TCP) is a communications protocol that enables reliable, ordered delivery of data between two hosts, here the client and the server. The browser uses the IP address returned by the DNS lookup to open a TCP connection using a three-way handshake: the client sends a SYN segment, the server replies with SYN-ACK, and the client answers with ACK.
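Application code never sends the handshake segments itself; the operating system does that when a program connects. The sketch below illustrates this on the loopback interface, so it runs without internet access. The function name tcp_echo_roundtrip and the tiny echo server are assumptions made for the demonstration, not part of any real web stack.

```python
import socket
import threading


def tcp_echo_roundtrip(payload: bytes) -> bytes:
    """Connect to a local echo server over TCP and return its reply."""
    # Listening socket on loopback; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()

    def serve_once() -> None:
        # accept() returns after the kernel has completed the handshake.
        conn, _peer = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))

    worker = threading.Thread(target=serve_once)
    worker.start()
    try:
        # create_connection() triggers the client side of the handshake.
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(payload)
            reply = client.recv(1024)
    finally:
        worker.join()
        server.close()
    return reply
```

Calling tcp_echo_roundtrip(b"ping") returns the same bytes, delivered over a connection that only exists once the handshake has succeeded.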
The communication channel established between the client and the server then relies on TLS, negotiated in a second handshake on top of the TCP connection, to encrypt all data exchanged between the two hosts.
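A rough sketch of how a client layers TLS over TCP with Python's ssl module follows. The function names are assumptions for this example, and requiring TLS 1.2 or newer is a choice made here, not something the article prescribes.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate verification on."""
    # The default context verifies the server certificate and its hostname.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_tls_connection(hostname: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection, then run the TLS handshake on top of it."""
    raw = socket.create_connection((hostname, port), timeout=5)
    try:
        # wrap_socket() performs the TLS handshake, including the
        # certificate check against the given hostname.
        return make_client_context().wrap_socket(raw, server_hostname=hostname)
    except Exception:
        raw.close()
        raise


# Example (needs network access):
#     with open_tls_connection("google.com") as tls:
#         print(tls.version())
```

If the certificate does not match the hostname or cannot be verified, wrap_socket raises an error and no application data is sent.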
4. Browser sends the HTTP request to the server
After the encrypted communication channel is established, the browser can send an HTTP request to the server and receive a response safely. An HTTP request is a message sent by a client to a server to request access to some resource. It is made up of a request line, request headers, and an optional message body.
-
Request line
The request line contains the method, e.g. GET, POST, PUT, or DELETE. It also contains the path, e.g. / for the root path, and the version of HTTP being used to make the request, e.g. HTTP/1.1.
-
Request headers
The request headers provide additional information such as the browser making the request, e.g. User-Agent: Chrome/68.0.3440.106, the type of content the client accepts in the response, e.g. Accept: application/json for JSON, the format of the request body when one is sent, e.g. Content-Type: application/json, or the host the request is being made to, e.g. Host: google.com.
-
Message body
The message body carries the data being sent to the server and is used with request methods such as POST and PUT.
Example of an HTTP request:
GET / HTTP/1.1
Host: google.com
User-Agent: Chrome/68.0.3440.106
Accept: text/html
Accept-Language: en-US
Accept-Encoding: gzip
Connection: keep-alive
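To make the layout of a request concrete, here is a minimal sketch that assembles the request line, headers, and optional body into the bytes sent over the wire. The function build_request and the /reviews path are hypothetical and exist only for this illustration.

```python
def build_request(method: str, path: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    if body:
        # A request with a body must tell the server how long it is.
        lines.append(f"Content-Length: {len(body)}")
    # Lines end with CRLF; an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(build_request("GET", "/", {"Host": "google.com", "Accept": "text/html"}))
```

A GET request normally has no body, so Content-Length and Content-Type only appear when a body is supplied, as with POST or PUT.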
5. Server receives request, processes the request and returns a response
The HTTP request first passes through the Web Application Firewall (WAF), a component of the web stack that protects the web application from malicious HTTP traffic by analyzing and filtering it.
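As a toy illustration of the filtering idea, the sketch below rejects request targets that match a few suspicious patterns. The rule list and the function is_allowed are invented for this example; real WAFs use far richer rule sets and inspect headers and bodies as well.

```python
import re
from urllib.parse import unquote

# Hypothetical, deliberately tiny rule set for illustration only.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\.\./"),                     # path traversal attempt
    re.compile(r"<script", re.I),             # naive cross-site scripting probe
    re.compile(r"\bunion\s+select\b", re.I),  # naive SQL injection probe
]


def is_allowed(request_target: str) -> bool:
    """Return False if the percent-decoded request target matches any rule."""
    decoded = unquote(request_target)
    return not any(p.search(decoded) for p in SUSPICIOUS_PATTERNS)
```

Decoding the target first matters, because attackers often percent-encode payloads to slip past naive string checks.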
The HTTP request then goes through a load balancer, another component of the web stack, which distributes traffic across the web server(s). The load balancer is often also where TLS/SSL is terminated. HAProxy is a widely used load balancer.
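The article does not say which distribution strategy is used; round robin is one common choice, sketched below. The class name and the backend names web-01 and web-02 are assumptions for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list endlessly, one backend per request.
        self._backends = cycle(backends)

    def next_backend(self) -> str:
        return next(self._backends)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web-01", "web-02"])
    print([balancer.next_backend() for _ in range(4)])
```

Other strategies, such as picking the server with the fewest active connections, follow the same interface with a different selection rule.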
The web server, software that serves static or dynamic content over the web, receives the request and processes it. It checks that the required headers are present and that the resource path exists. If the request is for static content, e.g. GET /index.html, the web server looks up the .html file within the application code. If the content is dynamic and requires business logic to generate, the web server passes the request to the application server. Examples of web servers include NGINX and Apache.
An application server is a software framework that handles business logic, database transactions, and communication between applications and users. The application server resolves the request for dynamic content and returns a response to the web server.
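The division of labour between the two servers can be sketched as a pair of functions. Everything here is hypothetical: the in-memory STATIC_FILES table stands in for files on disk, and the /reviews route stands in for real business logic.

```python
# Hypothetical static content, standing in for files in the application code.
STATIC_FILES = {"/index.html": "<h1>Holberton School</h1>"}


def application_server(path: str) -> tuple[int, str]:
    """Stand-in for business logic that generates dynamic content."""
    if path == "/reviews":
        return 200, '{"reviews": []}'
    return 404, "Not Found"


def web_server(method: str, path: str) -> tuple[int, str]:
    """Serve static content directly and delegate everything else."""
    if method != "GET":
        return 405, "Method Not Allowed"
    if path in STATIC_FILES:
        return 200, STATIC_FILES[path]
    # Not a static file: hand the request to the application server.
    return application_server(path)
```

In a real deployment the hand-off happens over a protocol or socket between two processes rather than a direct function call.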
The web server returns the content to the client (browser) as an HTTP response. The HTTP response is made up of the status line, response headers, and message body.
-
Status line
The status line contains the version of HTTP used, e.g. HTTP/1.1, and the status code of the response with its reason phrase, e.g. 200 OK.
-
Response headers
The response headers can include Server: Apache/2.4.46, Content-Length: 1024, or Content-Type: text/html.
-
Message body
The message body contains the content returned by the web server, e.g. an HTML page or a JSON document.
Example of an HTTP response:
```
HTTP/1.1 200 OK
Date: Fri, 12 May 2023 01:11:12 GMT
Server: nginx/1.24
Last-Modified: Thu, 11 May 2023
ETag: "0-23-4024c3a5"
Accept-Ranges: bytes
Content-Length: 16
Connection: close
Content-Type: text/html

Holberton School
```
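The three parts of a response can be separated with a few lines of code. This is a simplified sketch: the function parse_response is an assumption for this example, and it expects a complete, well-formed response with a reason phrase, which real clients cannot take for granted.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status line parts, headers, and body."""
    # An empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


if __name__ == "__main__":
    sample = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: 16\r\n"
        b"\r\n"
        b"Holberton School"
    )
    print(parse_response(sample))
```

Note that Content-Length counts the bytes of the body only, which is how the client knows where the response ends on a reused connection.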
6. Browser renders the content
The browser receives the response from the server and checks the response headers to decide how to handle the content. For example, if the Content-Type header is text/html, the browser knows it should render the body as an HTML document. If the Content-Type is application/json, the browser treats the body as JSON data, which it typically displays as text or in a built-in viewer rather than as a page. To render HTML, the browser parses the markup into the DOM (Document Object Model), a tree representation of the page that scripts can read and modify through a Web API, and then lays out and paints the page from it.
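The idea of turning markup into a tree can be illustrated with Python's built-in HTML parser. This is only a sketch of the concept, not how a browser engine is implemented: the Node and TreeBuilder classes are invented for the example, and the code assumes well-formed, properly nested markup without void elements such as a bare br tag.

```python
from html.parser import HTMLParser


class Node:
    """A minimal tree node: a tag name, a parent, and child nodes."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Build a DOM-like tree while the parser walks through the markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Climb back to the parent; assumes tags are properly nested.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as a list of indented tag names."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

Feeding a TreeBuilder the string "<html><body><h1>Hi</h1></body></html>" and calling outline on its root gives the nested structure html, body, h1 under the document node.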
Closing
Understanding the concepts outlined above helps you identify ways to improve the reliability, performance, and security of web applications.