Android WebSocket Clients for Amazon API Gateway

Jameson ・7 min read

WebSockets are a great way to achieve bi-directional communication between a mobile app and a backend. In December 2018, Amazon's API Gateway service launched serverless support for WebSockets.

Setting it all up is somewhat involved. I recommend starting with the Simple WebSockets Chat backend that AWS provides as a demo. We'll focus on how to talk to it from Android, using Square's familiar OkHttp or JetBrains' newer Ktor.

Prerequisites

  1. An active AWS account
  2. The latest version of the AWS CLI
  3. wscat, installed via npm
  4. AWS SAM CLI, installed via Homebrew (if you're on a Mac)
  5. Android Studio
  6. jq, which we'll use to filter the AWS CLI's JSON output

Setting Up a Backend

Installing the serverless demo app is pretty straightforward.

Check out the chat app:

git clone https://github.com/aws-samples/simple-websockets-chat-app.git

And deploy it to your account:

cd simple-websockets-chat-app
sam deploy --guided

After a while, that command will finish. You'll be able to see the provisioned resources by inspecting the outputs of the CloudFormation stack:

aws cloudformation describe-stacks \
    --stack-name simple-websocket-chat-app \
    --query 'Stacks[].Outputs'

It creates a DynamoDB table, three Lambda functions, an AWS IAM role, and an API Gateway with WebSockets support.

Once you've glanced over the output to see what was created, let's zero in on the WebSocket endpoint itself. We need the URI that our client will connect to:

aws cloudformation describe-stacks \
    --stack-name simple-websocket-chat-app \
    --query 'Stacks[].Outputs[]' | \
jq -r '.[]|select(.OutputKey=="WebSocketURI").OutputValue'

This should output something like:

wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod
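
Some clients, including the Ktor call used later in this post, take the host, port, and path of that URI as separate arguments. As a rough sketch of what those pieces are, the snippet below splits the URI using java.net.URI. The Endpoint class and parseEndpoint function are hypothetical names made up for illustration, and the fallback to port 443 assumes the standard default for the wss scheme.

```kotlin
import java.net.URI

// Hypothetical holder for the three pieces a WebSocket client needs.
data class Endpoint(val host: String, val port: Int, val path: String)

// Hypothetical helper: splits a ws:// or wss:// URI into host, port, and path.
fun parseEndpoint(uri: String): Endpoint {
    val parsed = URI(uri)
    // java.net.URI reports -1 when the URI has no explicit port,
    // so fall back to the default port for the scheme.
    val port = when {
        parsed.port != -1 -> parsed.port
        parsed.scheme == "wss" -> 443
        else -> 80
    }
    return Endpoint(parsed.host, port, parsed.path.ifEmpty { "/" })
}
```

For the sample URI above, this should give the execute-api hostname, port 443, and the path /Prod, which is the name of the deployed stage.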

Command Line Testing using wscat

Let's quickly test the backend to make sure it works as we expect. We'll use the wscat command-line utility. The command below opens a long-lived connection to the API Gateway and drops us into an interactive prompt where we can send and receive messages:

$ wscat -c wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod
Connected (press CTRL+C to quit)
> {"action": "sendmessage", "data": "foo"}
< foo

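The JSON format above is required by the app we deployed. The demo API routes each message on its "action" field, and "sendmessage" is the only custom route it defines. You can change the value of "data" (here, "foo"), but you can't change anything else.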

If you try to pass something else, it doesn't work:

> Hello, how are you?
< {"message": "Forbidden", "connectionId":"Y0bEuc0UIAMCIiA=", "requestId":"Y0bwuGjXIAMFmEg="}

If you open multiple terminal windows and connect them all to the endpoint, every one of them receives each valid message that is sent.
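
The envelope from the wscat session can be produced in Kotlin with a small helper. This is only a sketch: sendMessagePayload is a hypothetical name, and the hand-rolled escaping exists so the snippet runs on a plain JVM. On Android, the JSONObject calls shown later in this post do the same job with less code.

```kotlin
// Hypothetical helper: wraps user text in the envelope the demo backend expects,
// i.e. {"action":"sendmessage","data":"..."}.
fun sendMessagePayload(data: String): String {
    val escaped = buildString {
        for (c in data) {
            when (c) {
                '"' -> append("\\\"")
                '\\' -> append("\\\\")
                '\n' -> append("\\n")
                '\r' -> append("\\r")
                '\t' -> append("\\t")
                // Remaining control characters must be written as \uXXXX in JSON.
                else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
            }
        }
    }
    return """{"action":"sendmessage","data":"$escaped"}"""
}
```

Only the "data" value varies between messages; the "action" value stays fixed because that is what the backend uses to pick a route.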

Calling the API from Android via OkHttp

Now that we know the WebSocket API is working, let's start building an Android app to use as a client, instead.

WebSockets have been supported in OkHttp since 3.5, which came out all the way back in 2016.

Initializing a WebSocket client is straightforward:

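import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.WebSocket
import okhttp3.WebSocketListener

val request = Request.Builder()
    .url("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod")
    .build()
val listener = object : WebSocketListener() {
    override fun onMessage(ws: WebSocket, text: String) {
        // Called asynchronously, on an OkHttp thread, when text messages arrive
    }
}
// newWebSocket() returns immediately; the connection is made in the background
val ws = OkHttpClient()
    .newWebSocket(request, listener)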

To send a message to the API Gateway, all we have to do is:

ws.send(JSONObject()
    .put("action", "sendmessage")
    .put("data", "Hello from Android!")
    .toString())

We can add a button to our UI with ViewBinding, and fire the WebSocket message whenever we click it:

ui.button.setOnClickListener {
    ws.send(JSONObject()
        .put("action", "sendmessage")
        .put("data", "Hello from Android!")
        .toString())
}

Since all of the threading is handled inside OkHttp, there really isn't a lot more to it. Save a handle to your view binding and to your WebSocket client when you create your Activity:

class MainActivity : AppCompatActivity() {
    private lateinit var ui: ActivityMainBinding
    private lateinit var ws: WebSocket

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        ui = ActivityMainBinding.inflate(layoutInflater)
        setContentView(ui.root)
        connect() // Opens the WebSocket, as above
    }
}

Calling the API from Android via Ktor

Ktor is built with the assumption that you're using Coroutines, and managing your own scope/context. This makes it a more flexible tool, but adds some additional complexity.

The basic setup for the tool is going to look like this:

private suspend fun connect(ktor: HttpClient, u: Url) {
    ktor.wss(Get, u.host, u.port, u.encodedPath) {
        // Access to a WebSocket session
    }
}

private /* not suspend */ fun connect() {
    val url = Url("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod")
    val ktor = HttpClient(OkHttp) {
        install(WebSockets)
    }
    // lifecycleScope uses the main dispatcher by default, which any UI updates made inside the session need
    lifecycleScope.launch {
        connect(ktor, url)
    }
}

Inside the trailing closure, you have access to an instance of DefaultClientWebSocketSession. It has two important members:

  1. A ReceiveChannel named incoming, and
  2. A SendChannel named outgoing.

SendChannel and ReceiveChannel are the two ends of a Kotlin Channel, which behaves like a queue whose send and receive operations suspend rather than block.
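
To make the queue analogy concrete, here is a minimal sketch that uses a plain kotlinx.coroutines Channel with no WebSocket involved. The channelDemo function is a hypothetical name; it only assumes the kotlinx-coroutines-core library, which Ktor already depends on.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical demo: one coroutine sends, another receives, over a single Channel.
fun channelDemo(): List<String> = runBlocking {
    // A Channel with no arguments has no buffer: send() suspends until a receiver takes the element.
    val channel = Channel<String>()
    launch {
        for (word in listOf("one", "two", "three")) {
            channel.send(word)
        }
        channel.close() // tells the receiver that no more elements will arrive
    }
    val received = mutableListOf<String>()
    // Iterating suspends while the channel is empty and ends once it is closed.
    for (word in channel) {
        received += word
    }
    received
}
```

The sender only sees the SendChannel half (send, close) and the receiver only the ReceiveChannel half (iteration), which mirrors how outgoing and incoming are exposed on the session.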

It's pretty trivial to send and receive some simple messages inside the WebSocket session closure:

ktor.wss(Get, u.host, u.port, u.encodedPath) {
    // Send a message outbound
    val json = JSONObject()
        .put("action", "sendmessage")
        .put("data", "Hello from Android!")
        .toString()
    outgoing.send(Frame.Text(json))

    // Receive an inbound message
    val frame = incoming.receive()
    if (frame is Frame.Text) {
        ui.status.append(frame.readText())
    }
}

However, we're missing a bunch of functionality that we had in our OkHttp solution. Namely:

  1. We want to send and receive data at the same time, in a loop, and
  2. We want to get notifications when the connection opens and closes.

Parallel send and receive in Ktor

Our goal here is ultimately to dispatch a stream of messages to the outgoing channel, and to consume a stream of messages from the incoming channel, at the same time. The basic approach we'll use is to launch two pieces of work asynchronously, and then wait for both of them to finish:

ktor.wss(Get, u.host, u.port, u.encodedPath) {
    awaitAll(async {
        // Code that will send messages
    }, async {
        // Code that will receive messages
    })
}
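
The same awaitAll/async shape can be tried without a network connection. In the sketch below, two plain Channels stand in for the session's outgoing and incoming members, and a third coroutine plays the role of the backend by echoing messages back. The echoDemo function and the echo behavior are assumptions made for illustration, not part of Ktor or the demo backend.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.runBlocking

// Hypothetical demo: send and receive concurrently, then wait for both loops to finish.
fun echoDemo(messages: List<String>): List<String> = runBlocking {
    val outgoing = Channel<String>() // stand-in for the session's outgoing channel
    val incoming = Channel<String>() // stand-in for the session's incoming channel
    val received = mutableListOf<String>()
    awaitAll(async {
        // Code that sends messages
        messages.forEach { outgoing.send(it) }
        outgoing.close()
    }, async {
        // Fake backend: echoes everything it is sent
        for (message in outgoing) incoming.send(message)
        incoming.close()
    }, async {
        // Code that receives messages
        for (message in incoming) received += message
    })
    received
}
```

Neither loop has to finish before the other starts: each send suspends until the echo coroutine picks the message up, and the receiver drains replies as they arrive.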

We need some stream of events to trigger the outgoing messages. A button in our UI makes sense as the source, but by the time we dispatch events over the WebSocket, we'd like those button clicks to be modeled as a stream of commands.

Let's first create an extension function to map button clicks into a Flow<Unit> (credit to StackOverflow):

private fun View.clicks(): Flow<Unit> = callbackFlow {
    setOnClickListener { trySend(Unit) } // trySend replaces offer, which newer kotlinx.coroutines releases deprecate
    awaitClose { setOnClickListener(null) }
}
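
To see how this bridge behaves away from a device, the sketch below applies the same callbackFlow pattern to a stand-in class. FakeButton is hypothetical and only mimics the single-listener shape of android.view.View; clicksDemo is likewise a made-up name. It assumes a kotlinx.coroutines version that provides trySend.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.flow.take
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Hypothetical stand-in for android.view.View, so the sketch runs on a plain JVM.
class FakeButton {
    var listener: (() -> Unit)? = null
    fun click() { listener?.invoke() }
}

fun FakeButton.clicks(): Flow<Unit> = callbackFlow {
    listener = { trySend(Unit) }   // each click becomes one element of the flow
    awaitClose { listener = null } // runs when the collector finishes or is cancelled
}

// Hypothetical demo: collects two clicks and reports how many elements arrived.
fun clicksDemo(): Int = runBlocking {
    val button = FakeButton()
    val collected = async { button.clicks().take(2).toList() }
    // The listener is only registered once collection has started, so wait for it.
    while (button.listener == null) yield()
    button.click()
    button.click()
    collected.await().size
}
```

The detail worth noticing is that nothing is registered until the flow is collected, and awaitClose removes the listener again when collection ends.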

Now, we can listen to a flow of events from a button, map them into the format we need, and send them:

ui.button.clicks()
    .map { click -> "Hello from Android!" }
    .map { message -> JSONObject()
        .put("action", "sendmessage")
        .put("data", message)
        .toString()
    }
    .map { json -> Frame.Text(json) }
    .collect { outgoing.send(it) }

That will work well for the outgoing events. Now, we just need to respond to inbound events, in the second async block.

In this case, there isn't a lot of value to mapping the result. The value we receive over the socket is the contents associated with the "data" key in the messages. So for example, we might get "Hello from Android!":

incoming.consumeEach { frame ->
    if (frame is Frame.Text) {
        ui.status.append(frame.readText())
    }
}

When it's all said and done, we end up with something like this:

ktor.wss(Get, u.host, u.port, u.encodedPath) {
    awaitAll(async {
        ui.button.clicks()
            .map { click -> JSONObject()
                .put("action", "sendmessage")
                .put("data", "Hello from Android!")
                .toString()
            }
            .map { json -> Frame.Text(json) }
            .collect { outgoing.send(it) }
    }, async {
        incoming.consumeEach { frame ->
            if (frame is Frame.Text) {
                ui.status.append(frame.readText())
            }
        }
    })
}

Ktor's lifecycle events

OkHttp allowed us to override callbacks on the WebSocketListener, to get notified of various lifecycle events:

val listener = object : WebSocketListener() {
    override fun onOpen(ws: WebSocket, res: Response) {
        // when WebSocket is first opened
    }
    override fun onClosed(ws: WebSocket, code: Int, reason: String) {
        // when WebSocket is closed
    }
}

Ktor doesn't work like that. They suggest some different approaches to recover those events.

The easiest one to recover is the event when the socket opens. It's just the first thing that happens inside of the wss session closure:

ktor.wss(Get, u.host, u.port, u.encodedPath) {
    ui.status.append("Connected to $u!")
}

To get more insight into the WebSocket termination, we can expand our processing in our receive block:

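incoming.consumeEach { frame ->
    when (frame) {
        is Frame.Text -> { /* as above */ }
        is Frame.Close -> {
            // closeReason can complete with null, so avoid the !! assertion
            val reason = closeReason.await()?.message
            ui.status.append("Closing: $reason")
        }
        else -> Unit // ignore binary, ping, and pong frames
    }
}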

Catching failures is also fairly easy:

private fun connect() {
    val url = Url("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod")
    val client = HttpClient(OkHttp) {
        install(WebSockets)
    }
    // lifecycleScope uses the main dispatcher by default, which the ui.status updates need
    lifecycleScope.launch {
        try {
            connect(client, url)
        } catch (e: Throwable) {
            val message = "WebSocket failed: ${e.message}"
            ui.status.append(message)
        }
    }
}

Wrapping Up

Well, there you have it: a quick and dirty explanation of how to use OkHttp and Ktor to consume an Amazon API Gateway WebSocket API. 🥳

The code for this project is available on my GitHub page.
