The story of Authentication and Authorization

#webdev #softwaredevelopment #softwareengineering

When a process on one machine wants to communicate with a process running on another machine, it does so via sockets. This is inter-process communication over a network. A server process listens on a socket (the listening socket) and waits for other processes (clients) to initiate communication by sending a SYN message. The server's operating system answers each SYN and completes the TCP 3-way handshake to establish the server-client connection. Once the handshake is done, a connected socket is created at the server for that client: send and receive buffers are created in kernel memory to support persistent two-way communication with the client, and when the server process accepts the connection, a file descriptor is assigned to identify it. A file descriptor is basically a number that acts as a label for the underlying connection data (in this case the source IP, source port, destination IP, destination port, etc.).

So the client and server communicate via connected sockets. A client sends a request to the server, which is queued in the receive buffer. Once the server process makes a receive system call, the OS copies the request data from kernel memory to the server process's memory. The server then handles the request and generates a response, which the OS copies to the send buffer of the connected socket, from where it is sent to the client over the network.

This mechanism is enough for inter-process communication over a network in the case of web servers serving static websites. But application server processes (or simply applications) have complex business logic that needs to manage user-specific data in order to handle requests. Applications also need user data to be durable, far beyond the lifetime of a TCP connection, and a user may or may not use the same device every time they use the application.
These application-level features need an application-specific identity for the user and a mapping of each request to a specific user. In other words, an application needs to authenticate each request. Users can then access resources and perform actions in the application based on their roles or permissions, which is authorization. So authentication and authorization are done for each request that an application receives: authentication maps the request to a managed user (a user that already has an entry in the application's database), and authorization identifies which resources that user can access and which actions they can perform. In short, authentication asks 'who has sent this request?' and authorization asks 'what resources can this user access and what actions can this user perform within the application?'.

Registration is the process of adding new users to the application, i.e., creating and managing their profiles and entries in the application database, each with a unique id. Registration may require people to enter something that is unique to them, like their email address, phone number or government id number, usually along with a password. On successful registration, an entry for that user is created in the application database. That entry is referenced on each request to access a protected resource, which therefore requires the user to present their credentials. So the simplest way to authenticate each request would be to prompt the user for their credentials, such as a username and password, a PIN, or a secret passphrase. Once the application has verified them against the database, the user can access the requested protected resource based on their roles and permissions. But requiring the user to provide credentials for every request is repetitive and makes for a bad user experience. So we need a better way of authentication.
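To make registration and credential verification concrete, here is a minimal sketch using only the Python standard library. The in-memory `users` dictionary stands in for the application database, and the function names are hypothetical. Storing a salted hash instead of the raw password is an assumption of this sketch, not something the steps above require.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory "database": username -> (salt, password_hash)
users = {}


def hash_password(password: str, salt: bytes) -> bytes:
    # Derive a slow, salted hash so the raw password is never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(username: str, password: str) -> bool:
    """Create a user entry; returns False if the username is already taken."""
    if username in users:
        return False
    salt = os.urandom(16)
    users[username] = (salt, hash_password(password, salt))
    return True


def verify_credentials(username: str, password: str) -> bool:
    """Check the submitted credentials against the stored entry."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    # Constant-time comparison of the stored and recomputed hashes.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))
```

Calling `verify_credentials` on every single request is exactly the repetitive approach described above, which is why the next step is to hand out a token after the first successful check.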

Once the user enters their credentials and the application verifies them, the application server process can generate a temporary access token and send it to the user, to be sent along with subsequent requests for the session. When a request is received, the server validates the token in that request to allow or reject access to a specific protected resource. The token is both issued and validated by the server; the client merely holds it and sends it with requests. The token can have any structure and carry any information. JSON Web Tokens (JWTs) are a standard way to implement access tokens. They have a header, a payload (body) and a signature. The payload contains user info like the subject (username or user id), roles / permissions / authorities, and token metadata like the issued time and expiry time. The signature is computed by the server over the header and payload using a secret or private key held by the server, so any tampering with the token can be detected. The header contains information about the signing algorithm used. Because the JWT carries user info within it, validating it doesn't require a database lookup. The payload is only encoded, not encrypted, so it should not contain sensitive information. Token validation includes signature verification, extraction of user info and checking token expiry. The first request requires the user to enter credentials for authentication, which on success generates the token. The user then uses the token to authenticate subsequent requests. So it is credentials-based authentication for the first request of the user session (primary request authentication) plus token-based authentication for subsequent requests of the same user session (subsequent request authentication).
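The token lifecycle can be sketched with a hand-rolled HS256 JWT using only the Python standard library. This is an illustration under stated assumptions, not production code: the `SECRET` value, the `roles` claim name and the function names are hypothetical, and a real application would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret-change-me"  # hypothetical signing key, known only to the server


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that JWTs strip from base64url segments.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(signing_input: str) -> str:
    digest = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return b64url_encode(digest)


def issue_token(subject, roles, ttl_seconds=900, now=None):
    """Build header.payload.signature for an authenticated user."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "roles": roles, "iat": now, "exp": now + ttl_seconds}
    parts = [
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + sign(signing_input)


def validate_token(token, now=None):
    """Return the payload if the token is well-formed, ours, and unexpired; else None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature = token.split(".")
        expected = sign(header_b64 + "." + payload_b64)
        # Verify the signature before trusting anything inside the token.
        if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
            return None
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None  # wrong structure, bad base64, or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(payload, dict) or "sub" not in payload:
        return None
    exp = payload.get("exp")
    if not isinstance(exp, int) or now >= exp:
        return None  # missing expiry or already expired
    return payload
```

Note that `validate_token` never touches a database: everything it needs is in the token itself plus the server's key.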

We can further strengthen authentication by making the first request authentication a multi-step process, adding a possession-based step after the credentials-based step. This is multi-factor authentication (MFA), with 2 factors of authentication in this case. Once the credentials are verified, a one-time password (OTP) is generated and sent to the user via SMS or email. When the user enters the OTP, the first request of the session is successfully authenticated and a JWT access token is generated.
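The second factor can be sketched as follows. This is a minimal illustration using the Python standard library; the 5-minute validity window, the single-attempt policy, the in-memory `pending_otps` store and the function names are all assumptions of the sketch.

```python
import hmac
import secrets
import time

OTP_TTL_SECONDS = 300  # hypothetical validity window (5 minutes)

# Hypothetical in-memory store: username -> (code, expires_at)
pending_otps = {}


def start_second_factor(username, now=None):
    """Called after the credentials check passes; creates a 6-digit OTP."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    pending_otps[username] = (code, now + OTP_TTL_SECONDS)
    # A real application would send the code via SMS or email instead of returning it.
    return code


def finish_second_factor(username, submitted, now=None):
    """Return True only for a matching, unexpired OTP. Each OTP allows one attempt."""
    now = time.time() if now is None else now
    record = pending_otps.pop(username, None)  # pop makes the code single-use
    if record is None:
        return False
    code, expires_at = record
    matches = hmac.compare_digest(code.encode("utf-8"), submitted.encode("utf-8"))
    return now < expires_at and matches
```

Only when `finish_second_factor` returns True would the application go on to issue the access token.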

The application does not necessarily have to perform first request authentication itself. It can delegate authentication to some other server where the person making the request is already a user, so that the user doesn't have to create and remember separate credentials for every application. This is single sign-on (SSO), and trusted authorization servers like Google, GitHub, and LinkedIn are generally used to log into other applications, as most people have profiles on them. The user is presented with a list of authorization servers to choose from at login. Once the user makes a choice, the authorization server asks them to consent to sharing their data with the client application on which they made the request. Once consent is given, the authorization server (Google, LinkedIn, GitHub, etc.) generates an authorization code, which the client (the application server) then exchanges for an identity token containing user data, and the authentication is complete. This is the main principle behind OpenID Connect (OIDC).
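The first half of that flow, redirecting the user to the authorization server and reading the authorization code from the callback, can be sketched like this. The parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`) are the standard ones for the authorization code flow; the endpoint URL, client id and function names used here are hypothetical.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri):
    """Return the URL to redirect the user to, plus the state value to remember."""
    state = secrets.token_urlsafe(16)  # random value to tie the callback to this login attempt
    params = {
        "response_type": "code",          # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid email profile",  # "openid" marks this as an OIDC request
        "state": state,
    }
    return authorize_endpoint + "?" + urlencode(params), state


def extract_code(callback_url, expected_state):
    """Read the authorization code from the callback, rejecting a mismatched state."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state") != [expected_state]:
        return None
    codes = query.get("code")
    return codes[0] if codes else None
```

For example, `build_authorization_url("https://idp.example.com/authorize", "my-client", "https://app.example.com/callback")` produces the redirect URL, and the application would later pass the callback URL it receives to `extract_code` before exchanging the code for an identity token.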

API keys are another tool for authenticating requests. The user first authenticates with credentials and then generates a key for themselves, which is stored in the database against their user id. Subsequent requests must include that key (in a header or in the URL as a query parameter). The server validates the API key by querying the database.
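A minimal sketch of API key issuance and validation, using only the Python standard library. The `X-API-Key` header name is a common convention rather than a standard, the `api_keys` dictionary stands in for the database, and storing a hash of the key instead of the key itself is an assumption of this sketch.

```python
import hashlib
import secrets

# Hypothetical in-memory "database": sha256(key) hex digest -> user id
api_keys = {}


def _digest(key: str) -> str:
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def create_api_key(user_id):
    """Generate a key for an already-authenticated user and store it against their id."""
    key = secrets.token_urlsafe(32)
    api_keys[_digest(key)] = user_id
    return key  # shown to the user once; only the digest is kept


def authenticate_api_key(headers):
    """Return the user id that owns the key in the request headers, or None."""
    key = headers.get("X-API-Key")
    if not key:
        return None
    return api_keys.get(_digest(key))
```

Unlike a JWT, the key itself carries no user info, so every request costs a lookup.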

Biometric authentication is another mode of authentication. It directly maps the user's identity in the app to their real human identity. It is not universally available, as it requires scanning the eyes, fingerprints or face, which means it can only be done with dedicated hardware such as a fingerprint scanner. Biometric data is also too critical a piece of information to share with trivial applications. That's why its use is very limited. One common example is a phone lock, which offers face and fingerprint unlock options in addition to traditional password, PIN and pattern locks.

So we have authentication methods that depend on what we know (credentials-based), what we have (possession-based) and who we are (biometric) to establish our identity. And this identity is used to manage requests made to the server.

Authorization is closely tied to authentication and is used to manage user access in the application: it decides whether the authenticated user has the authority to access a protected resource or perform an action on it. It comes into effect once the user is authenticated or logged in. Just like authentication, it depends on the business logic of the application. For example, an Instagram user is allowed to view all posts from public accounts but can only view posts from a private account if they follow that account. Similarly, they can only create, edit and delete posts on their own account and no one else's. Ways of authorization can vary from application to application. Some apps have business logic that assigns each user a role that defines their access. For example, a regular user can have read and write access to their own content while having only read access to others' content, while an admin can have both read and write access to their own content and to others' content. This is simple role-based authorization, but the specifics can vary a lot, as roles can differ within groups or specific structures of the application. For instance, a WhatsApp user can have different roles in different groups (member, admin, owner) and a different role overall.
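The regular-user versus admin example, along with group-scoped roles, can be sketched as a small permission check. The role names, the `group_roles` data and the function names are hypothetical and only mirror the examples in the paragraph above.

```python
from enum import Enum


class Role(Enum):
    USER = "user"
    ADMIN = "admin"


def is_allowed(role, action, requester_id, owner_id):
    """Regular users read everything but write only their own content; admins write anything."""
    if action == "read":
        return True
    if action == "write":
        return role is Role.ADMIN or requester_id == owner_id
    return False  # unknown actions are denied by default


# Hypothetical group-scoped roles: (user id, group id) -> role within that group
group_roles = {
    ("alice", "family"): "owner",
    ("alice", "work"): "member",
}


def role_in_group(user_id, group_id):
    """The same user can hold a different role in each group, or none at all."""
    return group_roles.get((user_id, group_id))
```

Here `is_allowed(Role.USER, "write", "alice", "bob")` is denied while the same call with `Role.ADMIN` is allowed, and `role_in_group("alice", "family")` differs from `role_in_group("alice", "work")`.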

In some cases, an application needs user data that already lives on another server, like a GitHub repository or calendar events set up by the user, to name a few. The application doesn't ask the user to post that information again. Instead, it asks the server holding the data for access, and once the user gives consent, the original server shares the data with the requesting application. This is the main principle behind OAuth2. An application asks the authorization server for access to a specific resource containing user info. Once the user consents, the authorization server generates an authorization code, which the client application exchanges for an access token. The access token finally allows the client application to access the requested resource (user info) on behalf of the user. This same process can also be used for authentication, as I have already discussed above.
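The consent, code and token steps can be simulated entirely in memory. This toy sketch is deliberately simplified: the class and method names are hypothetical, the authorization server and the server holding the data are collapsed into one object, and there is no client secret, redirect or token expiry.

```python
import secrets


class ToyAuthorizationServer:
    def __init__(self, resources):
        self.resources = resources  # (user, scope) -> protected data
        self.codes = {}             # authorization code -> (client_id, user, scope)
        self.tokens = {}            # access token -> (user, scope)

    def authorize(self, client_id, user, scope, consent_given):
        """Step 1: issue an authorization code only if the user consents."""
        if not consent_given:
            return None
        code = secrets.token_urlsafe(16)
        self.codes[code] = (client_id, user, scope)
        return code

    def exchange_code(self, client_id, code):
        """Step 2: swap a code for an access token; each code works once."""
        grant = self.codes.pop(code, None)
        if grant is None or grant[0] != client_id:
            return None
        token = secrets.token_urlsafe(32)
        self.tokens[token] = (grant[1], grant[2])
        return token

    def get_resource(self, access_token, scope):
        """Step 3: return the data only for a token granted for that scope."""
        grant = self.tokens.get(access_token)
        if grant is None or grant[1] != scope:
            return None
        return self.resources.get((grant[0], scope))
```

A client would call `authorize`, then `exchange_code`, then `get_resource` with the resulting token. A token granted for one scope (say, calendar events) cannot be used to read another (say, repositories).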

Now I would like to show roughly how authentication and authorization work in a real-life application. For that, I will be using JWT-based authentication plus role-based authorization with Spring Security.

PRIMARY REQUEST AUTHENTICATION (LOGIN)
• If the request is POST /login (HTTP method=POST and endpoint=/login), then UsernamePasswordAuthenticationFilter intercepts the request (credentials-based authentication). It extracts the username and password from the request and wraps them in an authentication object, which holds the username and password (credentials for authentication), a set of roles and permissions (for authorization) and a flag initialized to false (authenticated=false).
• The filter passes this authentication object to the authentication manager, which handles the authentication process, including the username and password verification checks.
• Depending on the authentication type, the authentication manager selects a suitable authentication provider. For example: the DAO authentication provider (DaoAuthenticationProvider) in the case of credentials-based authentication.
• The authentication provider loads the user from the database by username and then checks the user-entered password against the stored password, which is hashed. So it is the authentication provider that actually carries out the username and password verification checks, via its two components: UserDetailsService and a password encoder.
• The username verification check is a database call in the case of the DAO authentication provider and is managed by UserDetailsService, which loads the user from the database by username.
• The password verification check is performed by the other component of the authentication provider, the password encoder, which hashes raw passwords and checks a raw password against a stored hash.
• Once verification succeeds, the authentication object's authenticated flag is set to true (authenticated=true) and the object is stored in the SecurityContextHolder, a temporary place that holds the authenticated user's information for the duration of the request. Post-authentication operations (business logic) fetch user info (username or roles/permissions/authorities) from there. In Spring, @AuthenticationPrincipal and @PreAuthorize use the SecurityContextHolder to get the Authentication object for authorization purposes.
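The login steps above can be modelled in a few lines. This is a Python stand-in for the roles these components play, not Spring Security's actual API: the class names only echo the Spring ones, and the PBKDF2 encoder, the `security_context` dictionary and the `login_filter` function are assumptions of the sketch.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Authentication:
    principal: str
    credentials: Optional[str]
    authorities: frozenset = frozenset()
    authenticated: bool = False


class PasswordEncoder:
    def encode(self, raw, salt=None):
        salt = os.urandom(16) if salt is None else salt
        digest = hashlib.pbkdf2_hmac("sha256", raw.encode("utf-8"), salt, 100_000)
        return salt.hex() + "$" + digest.hex()

    def matches(self, raw, encoded):
        # Re-hash the raw password with the stored salt and compare.
        salt_hex, _ = encoded.split("$")
        return hmac.compare_digest(self.encode(raw, bytes.fromhex(salt_hex)), encoded)


class UserDetailsService:
    def __init__(self, users):
        self.users = users  # username -> (encoded password, authorities)

    def load_user_by_username(self, username):
        return self.users.get(username)


class DaoProvider:
    def __init__(self, user_details_service, encoder):
        self.uds, self.encoder = user_details_service, encoder

    def supports(self, auth):
        return auth.credentials is not None

    def authenticate(self, auth):
        user = self.uds.load_user_by_username(auth.principal)    # username check
        if user is None or not self.encoder.matches(auth.credentials, user[0]):
            return None                                          # password check failed
        return Authentication(auth.principal, None, frozenset(user[1]), True)


class ProviderManager:
    def __init__(self, providers):
        self.providers = providers

    def authenticate(self, auth):
        for provider in self.providers:  # pick the provider that supports this token
            if provider.supports(auth):
                return provider.authenticate(auth)
        return None


security_context = {}  # stand-in for the per-request security context


def login_filter(method, path, body, manager):
    if method != "POST" or path != "/login":
        return None  # not a login request; this filter does nothing
    unauthenticated = Authentication(body.get("username", ""), body.get("password", ""))
    result = manager.authenticate(unauthenticated)
    if result is not None:
        security_context["authentication"] = result
    return result
```

Wiring a `ProviderManager` with one `DaoProvider` and calling `login_filter("POST", "/login", {...}, manager)` walks through the same sequence as the bullets: filter, manager, provider, user lookup, password match, context.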

The first request is now successfully authenticated (login is successful), and the application generates and returns a JWT access token, which is to be included in the Authorization header of subsequent requests. The JWT contains the roles and permissions for the user.

SUBSEQUENT REQUEST AUTHENTICATION
• JwtAuthenticationFilter (typically a custom filter written by the application) intercepts the request, extracts the JWT from the Authorization header and validates it.
• This is JWT-based authentication, and it consists of the JWT verification check.
• JWT validation includes checking whether the token has the correct structure, whether it was issued by this server, whether it contains a subject and whether it has expired.
• Once the JWT is verified, the filter builds an authentication object and stores it in the SecurityContextHolder for the duration of the entire request.
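The filter logic for subsequent requests can be sketched as below. Again this is a Python stand-in, not Spring code. The `Bearer` scheme in the Authorization header is the usual convention and is assumed here, the `roles` claim name and the `security_context` dictionary are hypothetical, and the JWT validation itself is passed in as a `decode_jwt` callable so the sketch stays focused on the filter.

```python
from typing import Callable, Optional

security_context = {}  # stand-in for the per-request security context


def jwt_authentication_filter(headers, decode_jwt: Callable[[str], Optional[dict]]):
    """Authenticate one request from its Authorization header; returns True on success."""
    security_context.clear()  # nothing carries over from a previous request

    # Expect a header of the form "Authorization: Bearer <token>".
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return False

    # decode_jwt is expected to check structure, signature, subject and expiry,
    # returning the claims on success and None on any failure.
    claims = decode_jwt(token.strip())
    if claims is None:
        return False

    security_context["authentication"] = {
        "principal": claims["sub"],
        "authorities": frozenset(claims.get("roles", [])),
        "authenticated": True,
    }
    return True
```

Because the roles travel inside the token, the filter can populate the authorities without a database call, and the authorization checks that follow read them from the context.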

In short, the application needs to assign an identity to the user and then check its privileges to function. Authentication as well as authorization can get quite complex based on the requirements of the application, as we will explore sometime later. Choosing the correct strategy depends on one's needs.
