Serverless with real-time communication (WebSockets)


This article is a deep-dive guide to building WebSockets on a Serverless architecture.

If you are not familiar with WebSockets, please first read What are WebSockets?
If you have never used Serverless, please first read What is Serverless?


Last year my team built a 100% serverless mobile application, which now helps hundreds of developers maintain happy and healthy teams.

Check out how Team Health Check helps developers

After we built the first version of the app, something felt off.
Imagine an app where each screen must call a Lambda function to fetch its data every time, and where, if someone joins your team, you must navigate out and back in to see the change.

We realized it was time for a real-time connection.

We tried the easier way (AppSync) and... the more flexible way :) (API Gateway v2). This guide presents the more sophisticated architecture, the one using API Gateway.

Difference between Serverless REST API and Serverless WebSockets.

Let's first review what a typical serverless architecture looks like, and then see how it evolves when we add the sockets layer.

Serverless using REST API

Handling REST API requests on Serverless is a straightforward process which, in the most common case, uses 4 layers, each responsible for its own functionality (diagram below).

  1. The user sends a request with an Authorization header to API Gateway.
  2. API Gateway authenticates the user with the out-of-the-box Cognito User Pool authorizer. Amazon Cognito grants (or denies) access and passes user attributes such as uuid or email to the request context.
  3. API Gateway invokes the associated Lambda function, passing along the user attributes.
  4. The Lambda function makes all necessary database requests and data transformations, and returns a response to API Gateway.
  5. API Gateway passes the response back to the user.

The same process, including authentication, happens every single time, for every request.
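To make steps 3 and 4 concrete, here is a minimal sketch of a Lambda handler reading the attributes that the Cognito authorizer placed in the request context. The event types are trimmed down to the fields used here, and the handler itself (what it returns, the 401 fallback) is a made-up example rather than code from our app.

```typescript
// Only the parts of the API Gateway (REST) proxy event that this sketch uses.
interface RestEvent {
  requestContext: {
    authorizer?: { claims?: Record<string, string> };
  };
}

interface RestResponse {
  statusCode: number;
  body: string;
}

// Hypothetical handler. In a deployed function it would be exported
// as the Lambda entry point.
async function handler(event: RestEvent): Promise<RestResponse> {
  // With a Cognito User Pool authorizer, the token claims arrive here.
  const claims = event.requestContext.authorizer?.claims;
  if (!claims || !claims.sub) {
    return { statusCode: 401, body: JSON.stringify({ message: "Unauthorized" }) };
  }

  // A real function would query the database at this point.
  const profile = { userId: claims.sub, email: claims.email ?? null };
  return { statusCode: 200, body: JSON.stringify(profile) };
}
```

Note that nothing in the handler deals with tokens: by the time it runs, API Gateway has already done the authentication.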

Easy, right?

WebSockets on Serverless

A WebSocket connection, though, is more… sophisticated :)

The user here is authenticated only once, during the initial HTTP Upgrade request. If authentication succeeds, the socket connection is established and stays open until either side closes it.

A user making subsequent requests over the WebSocket connection doesn’t provide their claims again, because the connection is stateful and the server (here API Gateway) keeps the information about the sender.

The diagram below shows what happens when devices connect to the API Gateway WebSocket API, and later when one of the users invokes the API (in this case the user is voting, and their teammates get notified about it).

There are a few interesting facts about this architecture.

We are implementing a Custom Authorizer.

API Gateway WebSocket API doesn’t offer out-of-the-box integration with Cognito User Pool. Therefore we must implement a Custom Authorizer ourselves.

The Custom Authorizer is a Lambda function that validates the JSON Web Token (JWT) with a public RSA key fetched from the authorization server where the user exists, and then grants access by returning a proper IAM policy in the response.
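A rough sketch of such an authorizer is shown here. It focuses on the IAM policy part of the response; the token check is injected as a `verifyToken` function, which is a placeholder for whatever JWT library and key lookup you use. Reading the token from a `token` query-string parameter is also an assumption of this sketch, not something the architecture requires.

```typescript
interface AuthorizerEvent {
  methodArn: string; // ARN of the $connect route being invoked
  queryStringParameters?: Record<string, string> | null;
}

interface AuthorizerResponse {
  principalId: string;
  policyDocument: {
    Version: "2012-10-17";
    Statement: { Action: string; Effect: "Allow" | "Deny"; Resource: string }[];
  };
  context?: Record<string, string>;
}

type Claims = { sub: string; email?: string };
// Placeholder: should verify the RS256 signature and expiry, and throw if invalid.
type VerifyToken = (token: string) => Promise<Claims>;

function buildPolicy(
  principalId: string,
  effect: "Allow" | "Deny",
  resource: string,
  context?: Record<string, string>
): AuthorizerResponse {
  return {
    principalId,
    policyDocument: {
      Version: "2012-10-17",
      Statement: [{ Action: "execute-api:Invoke", Effect: effect, Resource: resource }],
    },
    context,
  };
}

function createAuthorizer(verifyToken: VerifyToken) {
  return async (event: AuthorizerEvent): Promise<AuthorizerResponse> => {
    const token = event.queryStringParameters?.token;
    if (!token) return buildPolicy("anonymous", "Deny", event.methodArn);
    try {
      const claims = await verifyToken(token);
      // The context is handed to the functions that run after the authorizer.
      return buildPolicy(claims.sub, "Allow", event.methodArn, { userId: claims.sub });
    } catch {
      return buildPolicy("anonymous", "Deny", event.methodArn);
    }
  };
}
```

Injecting the verifier keeps the policy logic easy to test without a real authorization server.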

We are storing the socket id for later reference.

When access is granted, it’s our responsibility to save the socket id to a database such as DynamoDB.

In the above architecture, the socket id is saved to both ConnectionsTable and ProfilesTable.
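An onConnect function along these lines could look like the sketch below. The `Table` interface is a hypothetical stand-in for a DynamoDB client, and the attribute names (`connectionId`, `userId`) are assumptions; only the two table names come from the architecture above. It also assumes the authorizer passed a `userId` in its context.

```typescript
interface ConnectEvent {
  requestContext: {
    connectionId: string; // the socket id assigned by API Gateway
    authorizer?: { userId?: string }; // context returned by the Custom Authorizer
  };
}

// Hypothetical abstraction over a database table (for example a DynamoDB put).
interface Table {
  put(item: Record<string, string>): Promise<void>;
}

function createOnConnect(connectionsTable: Table, profilesTable: Table) {
  return async (event: ConnectEvent): Promise<{ statusCode: number }> => {
    const { connectionId, authorizer } = event.requestContext;
    const userId = authorizer?.userId;
    if (!userId) return { statusCode: 401 };

    // Look up the owner of a socket by its id ...
    await connectionsTable.put({ connectionId, userId });
    // ... and the current socket of a user by the user id.
    // (Against a real ProfilesTable this would more likely be an update.)
    await profilesTable.put({ userId, connectionId });

    return { statusCode: 200 };
  };
}
```

Storing the id in both directions is what later lets us answer "which sockets belong to this user's teammates?" without scanning a table.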

The Authorizer is executed only once.

Both the Authorizer and onConnect are Lambda functions executed only once, when the user connects to the socket API. This makes the subsequent requests to the socket even faster, since they skip the authentication step.

We can call the socket from anywhere, even outside AWS.

We can notify the socket from another service such as a Lambda function, EC2, or even from outside AWS.

We do it by sending a POST request to API Gateway's @connections route, with the id of the socket we want to notify in the path.
If we want to notify 100 sockets, we must send 100 POST requests, which can be a bit worrying, but so far it has worked well in our production app.
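The fan-out can be sketched roughly as follows. The `post` argument is a placeholder for a function that sends an already signed POST request, and `notifyAll` and its result shape are invented for this example. It relies on API Gateway answering 410 (Gone) for a socket that no longer exists, so those ids can be cleaned up from the database.

```typescript
// Placeholder for a function that signs and sends the POST request.
type SignedPost = (url: string, body: string) => Promise<{ status: number }>;

// endpoint looks like https://{api-id}.execute-api.{region}.amazonaws.com/{stage}
function connectionUrl(endpoint: string, connectionId: string): string {
  const base = endpoint.replace(/\/+$/, "");
  // Socket ids can contain characters such as "=", so encode them for the path.
  return `${base}/@connections/${encodeURIComponent(connectionId)}`;
}

interface NotifyResult {
  delivered: string[];
  stale: string[]; // sockets that are gone and can be removed from the database
  failed: string[];
}

async function notifyAll(
  endpoint: string,
  connectionIds: string[],
  payload: unknown,
  post: SignedPost
): Promise<NotifyResult> {
  const body = JSON.stringify(payload);
  // One POST per socket, all sent concurrently.
  const outcomes = await Promise.all(
    connectionIds.map(async (id) => {
      try {
        const { status } = await post(connectionUrl(endpoint, id), body);
        return { id, status };
      } catch {
        return { id, status: 0 }; // network error or similar
      }
    })
  );

  const result: NotifyResult = { delivered: [], stale: [], failed: [] };
  for (const { id, status } of outcomes) {
    if (status >= 200 && status < 300) result.delivered.push(id);
    else if (status === 410) result.stale.push(id);
    else result.failed.push(id);
  }
  return result;
}
```

Catching errors per socket means one dead connection doesn't stop the other teammates from being notified.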

We are signing every request.

Each POST request we send to the API Gateway WebSocket API's @connections route must be signed with AWS Signature Version 4. The signature assures API Gateway that we have permission to push data to the user through the socket.

Not long ago it required a lot of steps, but thanks to our recently released npm package aws-request-signer you can do it with almost no effort.
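To give a feel for what "a lot of steps" means, here is a sketch of just one of them: deriving the Signature Version 4 signing key with Node's built-in crypto module. This is not the API of aws-request-signer; the function name is made up, and a complete signer also has to build a canonical request and a string to sign before this key is used.

```typescript
import { createHmac } from "crypto";

function hmac(key: string | Buffer, data: string): Buffer {
  return createHmac("sha256", key).update(data, "utf8").digest();
}

// Derives the SigV4 signing key through a chain of HMAC-SHA256 operations.
// dateStamp is the UTC date in YYYYMMDD form; for the @connections route
// the service name is "execute-api".
function deriveSigningKey(
  secretAccessKey: string,
  dateStamp: string,
  region: string,
  service: string
): Buffer {
  const kDate = hmac("AWS4" + secretAccessKey, dateStamp);
  const kRegion = hmac(kDate, region);
  const kService = hmac(kRegion, service);
  return hmac(kService, "aws4_request");
}

// Usage with obviously fake credentials:
const signingKey = deriveSigningKey("EXAMPLE-SECRET", "20240101", "eu-west-1", "execute-api");
console.log(signingKey.length); // an HMAC-SHA256 digest is 32 bytes long
```

Because the key depends on the date, region and service, a signature made for one day or one service cannot be reused for another.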

To wrap up

We've covered the typical serverless architecture, shown how it evolves when we add real-time communication, and lastly gone through some of its most interesting highlights.

If you want to go serverless, or need to add real-time communication, feel free to text me, or hire my services so you can enjoy a fully automated serverless setup with best practices within a week.

If you disagree with any part, please leave your feedback in a comment.
Follow if you don't want to miss more deep-dive content from production apps.

Thanks for reading!
