Kazuki Higashiguchi

Deep dive into WebSocket opening handshake protocol with Go

In this article, I explain exactly what the WebSocket protocol does during the opening handshake. What a WebSocket server should do is explained in "Learn WebSocket handshake protocol with gorilla/websocket server".

This article focuses on what a WebSocket client does, and explains it with Go code.

WebSocket

WebSocket is a mechanism for low-cost, full-duplex communication on the Web. The protocol is standardized as RFC 6455.

The following diagram, taken from Wikipedia, describes communication between a client and a server using WebSocket.

A diagram describing a connection using WebSocket

From here on, we focus on the first step, "Handshake (HTTP Upgrade)".

WebSocket client

Next, let's learn the WebSocket protocol through a client implementation. The sample code is available in a GitHub repository.

package main

import (
    "context"
    "log"

    "github.com/gorilla/websocket"
)

func main() {
    ctx := context.Background()

    // (omit)... signal handling
    endpointUrl := "ws://localhost:12345"

    dialer := websocket.Dialer{}
    c, _, err := dialer.DialContext(ctx, endpointUrl, nil)
    if err != nil {
        log.Panicf("Dial failed: %#v\n", err)
    }
    defer c.Close()

    // Read messages sent by the server in a separate goroutine
    done := make(chan struct{})
    go func() {
        defer c.Close()
        defer close(done)

        for {
            _, message, err := c.ReadMessage()
            if err != nil {
                log.Printf("read message: %#v\n", err)
                return
            }
            log.Printf("recv: %s\n", message)
        }
    }()

    // (omit)... periodic message sending and shutdown handling
}

URI schemes "ws" and "wss"

The endpoint of the WebSocket server is specified as a URI like this.

endpointUrl := "ws://localhost:12345"

RFC 6455, which describes the WebSocket protocol, registers two URI schemes.

  • ws: WebSocket Protocol
    • A ws URI identifies a WebSocket server and resource name
  • wss: WebSocket Protocol over TLS
    • A wss URI identifies a WebSocket server and resource name and indicates that traffic over that connection is to be protected via TLS.

Some security considerations about WebSocket are described in RFC 6455 10.6. Connection Confidentiality and Integrity.

Connection confidentiality and integrity is provided by running the WebSocket Protocol over TLS (wss URIs). WebSocket implementations MUST support TLS and SHOULD employ it when communicating with their peers.

In production, use wss to ensure connection confidentiality and integrity. This sample code is only intended to run locally, so it uses the ws scheme.
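As a rough illustration of how the two schemes differ for a client, here is a minimal sketch using only the standard library. The helper name hostPortAndTLS is hypothetical and is not part of gorilla/websocket. It assumes the default ports defined in RFC 6455 (80 for ws, 443 for wss) when the URI does not give one.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// hostPortAndTLS is a hypothetical helper (not a gorilla/websocket API).
// It returns the host:port a client would connect to and whether the
// connection must be protected by TLS, based on the URI scheme.
func hostPortAndTLS(rawURL string) (hostPort string, useTLS bool, err error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", false, err
	}
	port := u.Port()
	switch u.Scheme {
	case "ws":
		if port == "" {
			port = "80" // default port for ws
		}
	case "wss":
		useTLS = true
		if port == "" {
			port = "443" // default port for wss
		}
	default:
		return "", false, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	return net.JoinHostPort(u.Hostname(), port), useTLS, nil
}

func main() {
	for _, raw := range []string{"ws://localhost:12345", "wss://example.com/chat"} {
		hostPort, useTLS, err := hostPortAndTLS(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(hostPort, useTLS)
	}
	// Prints:
	// localhost:12345 false
	// example.com:443 true
}
```

The host example.com and the path /chat are placeholders chosen for the example.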

Subprotocols

The next line initializes a Dialer struct. A Dialer contains the options for connecting to a WebSocket server.

dialer := websocket.Dialer{}

Here is the definition of the Dialer struct.

type Dialer struct {
    NetDial func(network, addr string) (net.Conn, error)
    NetDialContext func(ctx context.Context, network, addr string) (net.Conn, error)
    Proxy func(*http.Request) (*url.URL, error)
    TLSClientConfig *tls.Config
    HandshakeTimeout time.Duration
    ReadBufferSize, WriteBufferSize int
    WriteBufferPool BufferPool
    Subprotocols []string
    EnableCompression bool
    Jar http.CookieJar
}

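An important field for understanding the WebSocket protocol is Subprotocols.

A subprotocol is like a custom XML schema or doctype declaration. For example, if you are using a subprotocol named json, all data is passed as JSON. The client lists the subprotocols it would like to use in the Sec-WebSocket-Protocol request header: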

    Sec-WebSocket-Protocol: soap, wamp

By setting Subprotocols on the client's Dialer, you tell the server which subprotocols you want to use. For instance:

dialer := websocket.Dialer{
    Subprotocols: []string{"json"},
}
c, _, err := dialer.DialContext(ctx, endpointUrl, nil)
if err != nil {
    log.Panicf("Dial failed: %#v\n", err)
}
fmt.Printf("negotiated protocol: %q\n", c.Subprotocol())
// Output: negotiated protocol: ""

Conn.Subprotocol returns the negotiated subprotocol for the connection. In the example above, the WebSocket server does not support the json subprotocol, so the result is an empty string, meaning that no subprotocol is used.

A WebSocket server declares which subprotocols it supports. When it supports none of the subprotocols the client offers, the client's request is not met.
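To make the negotiation concrete, here is a minimal sketch of one possible selection policy: pick the first subprotocol offered by the client that the server also supports. The function selectSubprotocol is hypothetical and written for this illustration. It is not how gorilla/websocket is implemented, and a real server may apply a different preference order.

```go
package main

import "fmt"

// selectSubprotocol is a hypothetical helper illustrating one negotiation
// policy. It returns the first client-offered subprotocol that the server
// supports, or "" when there is no overlap (no subprotocol is used).
func selectSubprotocol(clientOffered, serverSupported []string) string {
	supported := make(map[string]bool, len(serverSupported))
	for _, p := range serverSupported {
		supported[p] = true
	}
	for _, p := range clientOffered {
		if supported[p] {
			return p
		}
	}
	return ""
}

func main() {
	// The server supports nothing, as in the example above.
	fmt.Printf("%q\n", selectSubprotocol([]string{"json"}, nil)) // ""

	// The server supports only wamp, so wamp is selected.
	fmt.Printf("%q\n", selectSubprotocol([]string{"soap", "wamp"}, []string{"wamp"})) // "wamp"
}
```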

Opening handshake

Dialer.DialContext creates a new client connection by performing the opening handshake.

dialer := websocket.Dialer{}
c, _, err := dialer.DialContext(ctx, endpointUrl, nil)
if err != nil {
    log.Panicf("Dial failed: %#v\n", err)
}
defer c.Close()

You can learn the specification of the WebSocket handshake by reading the code inside Dialer.DialContext.

First, the URI scheme is decided. In gorilla/websocket, it is determined as follows:

switch u.Scheme {
case "ws":
    u.Scheme = "http"
case "wss":
    u.Scheme = "https"
default:
    return nil, nil, errMalformedURL
}

  • ws -> http
  • wss -> https

For wss, https is used as the URI scheme of the opening handshake because the communication must run over TLS.

Second, gorilla/websocket sends an HTTP GET request built by the following code.

req := &http.Request{
    Method:     "GET",
    URL:        u,
    Proto:      "HTTP/1.1",
    ProtoMajor: 1,
    ProtoMinor: 1,
    Header:     make(http.Header),
    Host:       u.Host,
}
// (omit)...
req.Header["Upgrade"] = []string{"websocket"}
req.Header["Connection"] = []string{"Upgrade"}
req.Header["Sec-WebSocket-Key"] = []string{challengeKey}
req.Header["Sec-WebSocket-Version"] = []string{"13"}
if len(d.Subprotocols) > 0 {
    req.Header["Sec-WebSocket-Protocol"] = []string{strings.Join(d.Subprotocols, ", ")}
}

The handshake request is specified as follows:

  • The method of the request MUST be GET, as described in RFC 6455 (page 16).
  • The HTTP version MUST be at least 1.1.
  • The request MUST contain an Upgrade header field whose value MUST include the websocket keyword.
  • The request MUST contain a Connection header field whose value MUST include the Upgrade token.
  • The request MUST contain a Sec-WebSocket-Key header field whose value is a randomly selected 16-byte nonce, base64-encoded.
  • Sec-WebSocket-Version is the WebSocket protocol version the client wants to use. Registered versions are listed in the IANA WebSocket Version Number Registry. RFC 6455 is registered as version 13, so in practice the value must be 13.
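The requirements above can be sketched with the standard library alone. The helpers generateChallengeKey and newHandshakeRequest are hypothetical names for this illustration and only approximate what gorilla/websocket does internally. The sketch assumes the URL has already been converted to http or https.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"net/http"
	"strings"
)

// generateChallengeKey returns a base64-encoded 16-byte random nonce,
// the format RFC 6455 requires for the Sec-WebSocket-Key header.
func generateChallengeKey() (string, error) {
	nonce := make([]byte, 16)
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(nonce), nil
}

// newHandshakeRequest is a hypothetical helper that builds the HTTP GET
// request for the opening handshake. http.NewRequest already sets the
// protocol to HTTP/1.1.
func newHandshakeRequest(httpURL string, subprotocols []string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, httpURL, nil)
	if err != nil {
		return nil, err
	}
	key, err := generateChallengeKey()
	if err != nil {
		return nil, err
	}
	// Header.Set canonicalizes the names (for example Sec-Websocket-Key).
	// HTTP header names are case-insensitive, so this is equivalent.
	req.Header.Set("Upgrade", "websocket")
	req.Header.Set("Connection", "Upgrade")
	req.Header.Set("Sec-WebSocket-Key", key)
	req.Header.Set("Sec-WebSocket-Version", "13")
	if len(subprotocols) > 0 {
		req.Header.Set("Sec-WebSocket-Protocol", strings.Join(subprotocols, ", "))
	}
	return req, nil
}

func main() {
	req, err := newHandshakeRequest("http://localhost:12345", []string{"json"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Proto)
	for name, values := range req.Header {
		fmt.Println(name+":", strings.Join(values, ", "))
	}
}
```

The key is random, so the printed Sec-WebSocket-Key value differs on every run.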

Finally, gorilla/websocket verifies that the handshake completed correctly by validating the response.

if resp.StatusCode != 101 ||
    !strings.EqualFold(resp.Header.Get("Upgrade"), "websocket") ||
    !strings.EqualFold(resp.Header.Get("Connection"), "upgrade") ||
    resp.Header.Get("Sec-Websocket-Accept") != computeAcceptKey(challengeKey) {
    buf := make([]byte, 1024)
    n, _ := io.ReadFull(resp.Body, buf)
    resp.Body = ioutil.NopCloser(bytes.NewReader(buf[:n]))
    return nil, resp, ErrBadHandshake
}

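The handshake response from the WebSocket server MUST satisfy the following conditions, which the check above enforces:

  • The status code MUST be 101 (Switching Protocols).
  • The response MUST contain an Upgrade header field whose value matches websocket, ignoring case.
  • The response MUST contain a Connection header field whose value matches upgrade, ignoring case.
  • The response MUST contain a Sec-WebSocket-Accept header field whose value is the base64-encoded SHA-1 hash of the request's Sec-WebSocket-Key value concatenated with the fixed GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11.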
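The response check can be tried out without a server. The following is a minimal sketch using only the standard library. computeAcceptKey follows the algorithm in RFC 6455, while parseResponse and validateHandshake are hypothetical helpers written for this illustration that mirror the validation shown earlier. The sample key and expected accept value are the ones given as an example in RFC 6455.

```go
package main

import (
	"bufio"
	"crypto/sha1"
	"encoding/base64"
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// websocketGUID is the fixed GUID defined by RFC 6455 for the accept key.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// computeAcceptKey returns base64(SHA-1(challengeKey + GUID)).
func computeAcceptKey(challengeKey string) string {
	sum := sha1.Sum([]byte(challengeKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

// parseResponse is a hypothetical helper that parses a raw HTTP response.
func parseResponse(raw string) (*http.Response, error) {
	return http.ReadResponse(bufio.NewReader(strings.NewReader(raw)), nil)
}

// validateHandshake is a hypothetical helper that applies the same four
// checks as the validation code shown earlier.
func validateHandshake(resp *http.Response, challengeKey string) error {
	if resp.StatusCode != http.StatusSwitchingProtocols ||
		!strings.EqualFold(resp.Header.Get("Upgrade"), "websocket") ||
		!strings.EqualFold(resp.Header.Get("Connection"), "upgrade") ||
		resp.Header.Get("Sec-WebSocket-Accept") != computeAcceptKey(challengeKey) {
		return errors.New("bad handshake")
	}
	return nil
}

func main() {
	const key = "dGhlIHNhbXBsZSBub25jZQ==" // sample key from RFC 6455

	raw := "HTTP/1.1 101 Switching Protocols\r\n" +
		"Upgrade: websocket\r\n" +
		"Connection: Upgrade\r\n" +
		"Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=\r\n\r\n"

	resp, err := parseResponse(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(computeAcceptKey(key))        // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
	fmt.Println(validateHandshake(resp, key)) // <nil>
}
```

Because the client picks a fresh random key for every connection, a matching Sec-WebSocket-Accept value shows that the server actually processed this particular handshake request.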

Conclusion

This article explained the WebSocket handshake protocol using the gorilla/websocket client implementation.

Additional articles will cover the WebSocket protocol from the server's perspective and the data frame processing used when exchanging messages.
