Nithin Bharadwaj

Build a Scalable Real-Time WebSocket Server in Go: Complete Guide with Code Examples


Let's talk about building a chat system, a live dashboard, or a multiplayer game. The common thread is instant updates. When you click send, you expect everyone else to see the message right away. That's the world of real-time communication, and for that, we often use WebSockets.

Think of a traditional website like a series of letters. You send a request (a letter), and you wait for a response (a return letter). It's slow. A WebSocket is like a phone call. Once connected, both you and the server can talk and listen at any time, instantly. My job is to manage thousands of these simultaneous "phone calls" efficiently. That's what we're going to build.

We'll use Go because it's brilliant at handling many things at once: its lightweight goroutines let us manage thousands of connections without drowning in complexity. Let's start from the ground up.

The heart of our system is the WebSocketServer. It's the switchboard operator. Its job is to accept incoming calls, keep track of who is connected, and route messages to the right people.

type WebSocketServer struct {
    upgrader    websocket.Upgrader     // performs the HTTP -> WebSocket handshake (gorilla/websocket)
    connections *ConnectionPool        // directory of all active clients
    hubs        map[string]*MessageHub // pub/sub hubs, keyed by name
    stats       ServerStats            // atomic counters for monitoring
    config      ServerConfig           // tunables such as PingInterval
}

When a client first connects, it's just a regular HTTP request. We need to "upgrade" it to a WebSocket connection. This HandleConnection function does that handshake.

func (wss *WebSocketServer) HandleConnection(w http.ResponseWriter, r *http.Request) {
    conn, err := wss.upgrader.Upgrade(w, r, nil)
    if err != nil {
        log.Printf("Upgrade error: %v", err)
        return
    }
    // ... create and register the client (sketched after the ClientConnection struct below)
}

Once the connection is upgraded, we create a ClientConnection. This object represents that single, ongoing phone call with a user. It holds the actual connection, a unique ID, and some crucial mechanics: a channel for sending messages and a channel for knowing when to hang up.

type ClientConnection struct {
    conn       *websocket.Conn // the underlying WebSocket connection
    id         string          // unique connection ID
    userID     string          // application-level user identity
    groups     map[string]bool // groups this client has joined
    sendChan   chan []byte     // outbound messages, consumed by writePump
    closeChan  chan struct{}   // closed to signal shutdown
    lastActive time.Time       // updated on each pong, checked by the cleanup goroutine
}
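To make the elided registration step concrete, here is a minimal sketch of what it might look like; newClientID and the 256-message buffer are illustrative assumptions, not from the original.

func (wss *WebSocketServer) registerClient(conn *websocket.Conn, userID string) {
    client := &ClientConnection{
        conn:       conn,
        id:         newClientID(), // hypothetical unique-ID helper
        userID:     userID,
        groups:     make(map[string]bool),
        sendChan:   make(chan []byte, 256), // buffered, so slow clients can be skipped
        closeChan:  make(chan struct{}),
        lastActive: time.Now(),
    }
    wss.connections.Add(client)

    // One goroutine per direction: the two "pumps" described next.
    go wss.readPump(client)
    go wss.writePump(client)
}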

Why a channel for sending? This is a key pattern. We separate reading and writing into two independent loops, or "pumps." The readPump listens for messages coming from the client. The writePump listens for messages that need to be sent to the client.

This separation is vital. It prevents a slow-reading client from blocking us from sending messages to them, and vice-versa. They operate on separate tracks.

Here's the writePump. It sits in a loop, waiting for one of three things: a message to send, a timer to send a "ping," or a signal that the connection is closed.

func (wss *WebSocketServer) writePump(client *ClientConnection) {
    pingTicker := time.NewTicker(wss.config.PingInterval)
    defer pingTicker.Stop()

    for {
        select {
        case message, ok := <-client.sendChan:
            if !ok {
                // Channel closed: send a goodbye close frame and stop
                client.conn.WriteMessage(websocket.CloseMessage, []byte{})
                return
            }
            // Write the actual message to the network
            if err := client.conn.WriteMessage(websocket.TextMessage, message); err != nil {
                return // the connection is broken; stop writing
            }

        case <-pingTicker.C:
            // Send a ping to check if the client is still there
            if err := client.conn.WriteMessage(websocket.PingMessage, nil); err != nil {
                return
            }

        case <-client.closeChan:
            return // Time to stop
        }
    }
}

The readPump is simpler. It constantly listens. When a message arrives, it processes it. If it gets an error or a close message, it triggers the closeChan to tell the writePump to finish.
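The article never lists readPump itself, so here is a minimal sketch consistent with the pieces above; handleMessage is a hypothetical application-level handler, and the deferred cleanup path is an assumption about how the parts connect.

func (wss *WebSocketServer) readPump(client *ClientConnection) {
    defer func() {
        close(client.closeChan)           // tell writePump to stop
        wss.connections.Remove(client.id) // deregister; Remove also closes sendChan
        client.conn.Close()
    }()

    // The pong handler (covered in the heartbeat section below) is installed here
    client.conn.SetPongHandler(func(string) error {
        client.lastActive = time.Now()
        return nil
    })

    for {
        _, message, err := client.conn.ReadMessage()
        if err != nil {
            return // close frame, network error, or dead connection
        }
        atomic.AddUint64(&wss.stats.messagesReceived, 1)
        wss.handleMessage(client, message) // hypothetical application handler
    }
}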

Now, where do we put all these ClientConnection objects? We can't just leave them lying around. We need a directory. That's the ConnectionPool.

type ConnectionPool struct {
    mu          sync.RWMutex
    connections map[string]*ClientConnection // client ID -> client
    groups      map[string]map[string]bool   // group name -> set of client IDs
}

The pool has two main maps. One maps a Client ID to the Client object. The other is for groups: which clients are in "room-5" or "notifications"? The sync.RWMutex (Read-Write Mutex) is the bouncer at the door. It allows many goroutines to read the map at the same time (to send a message), but only one goroutine to write (to add or remove a client). This keeps things fast and safe.

Adding and removing clients is straightforward with the pool.

func (cp *ConnectionPool) Add(client *ClientConnection) {
    cp.mu.Lock()
    defer cp.mu.Unlock()
    cp.connections[client.id] = client
}

func (cp *ConnectionPool) Remove(clientID string) {
    cp.mu.Lock()
    defer cp.mu.Unlock()
    if client, exists := cp.connections[clientID]; exists {
        close(client.sendChan) // This will cause writePump to exit
        delete(cp.connections, clientID)
        // Also clean up from all groups
        for group := range client.groups {
            delete(cp.groups[group], clientID)
        }
    }
}

Notice close(client.sendChan). This is how we politely tell the writePump goroutine to stop. When we close a channel, any goroutine receiving from it gets a signal. Our writePump sees ok become false and shuts itself down. No leaks.

Let's send a message. The simplest is a broadcast: shout to everyone.

func (wss *WebSocketServer) Broadcast(message []byte) {
    wss.connections.mu.RLock() // Many goroutines can hold this lock
    defer wss.connections.mu.RUnlock()

    for _, client := range wss.connections.connections {
        select {
        case client.sendChan <- message:
            // Success! Message queued to send.
        default:
            // The client's send channel is full. They might be slow.
            // We skip them to avoid blocking the server.
        }
    }
}

We use RLock() because we're only reading the map. The select with a default case is a non-blocking send. If the client's channel buffer is full (maybe their network is slow), we don't wait. We just move on. This prevents one slow client from holding up a message for everyone else.

But broadcasting to everyone is wasteful. What if you're in a chat room? You only need messages for that room. That's where groups come in. First, a client joins a group.

func (cp *ConnectionPool) JoinGroup(clientID, group string) {
    cp.mu.Lock()
    defer cp.mu.Unlock()
    if client, exists := cp.connections[clientID]; exists {
        client.groups[group] = true
        // Ensure the group map exists, then add the client
        if cp.groups[group] == nil {
            cp.groups[group] = make(map[string]bool)
        }
        cp.groups[group][clientID] = true
    }
}

Now, sending to a group is efficient. We look up the group, get the list of member IDs, and only send to them.

func (wss *WebSocketServer) BroadcastToGroup(group string, message []byte) {
    wss.connections.mu.RLock()
    defer wss.connections.mu.RUnlock()

    groupMembers, exists := wss.connections.groups[group]
    if !exists {
        return // No one in this group
    }
    for clientID := range groupMembers {
        if client, exists := wss.connections.connections[clientID]; exists {
            select {
            case client.sendChan <- message:
            default:
                // Skip slow client
            }
        }
    }
}

Sometimes connections die without saying goodbye. The network drops. A user closes their laptop. We can't wait forever. This is where heartbeats come in. Remember the pingTicker in the writePump? It sends a Ping message periodically.

The readPump sets up a "Pong handler." When it receives a Pong (the automatic response to a Ping), it updates the connection's lastActive time.

client.conn.SetPongHandler(func(string) error {
    client.lastActive = time.Now()
    return nil
})

We can then have a cleanup goroutine that runs every minute, checking for connections that haven't been active for, say, two ping intervals. Those are stale and can be removed.

func (cm *ConnectionManager) cleanupStaleConnections() {
    cm.wss.connections.mu.Lock()
    defer cm.wss.connections.mu.Unlock()
    cutoff := time.Now().Add(-2 * cm.wss.config.PingInterval)
    for id, client := range cm.wss.connections.connections {
        if client.lastActive.Before(cutoff) {
            close(client.sendChan) // stop the writePump, as in Remove
            delete(cm.wss.connections.connections, id)
            // Also clean up from all groups, as in Remove
            for group := range client.groups {
                delete(cm.wss.connections.groups[group], id)
            }
            client.conn.Close() // forcefully close the dead connection
        }
    }
}
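The ConnectionManager wrapper used here (and again in main below) is never defined in the article; a minimal sketch, assuming only the fields the other snippets touch:

type ConnectionManager struct {
    wss        *WebSocketServer
    cleanupInt time.Duration
}

// StartCleanup runs the janitor on a fixed interval.
func (cm *ConnectionManager) StartCleanup() {
    ticker := time.NewTicker(cm.cleanupInt)
    defer ticker.Stop()
    for range ticker.C {
        cm.cleanupStaleConnections()
    }
}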

What about more complex routing? Imagine a stock ticker. Thousands of clients subscribe to "GOOG" or "TSLA." A pub/sub (publish/subscribe) MessageHub is perfect for this. It's more dynamic than pre-defined groups.

type MessageHub struct {
    mu          sync.RWMutex
    subscribers map[string]map[string]bool // channel -> set of client IDs
    messageChan chan *BroadcastMessage
}

A client subscribes to a channel:

func (mh *MessageHub) Subscribe(clientID, channel string) {
    mh.mu.Lock()
    defer mh.mu.Unlock()
    if mh.subscribers[channel] == nil {
        mh.subscribers[channel] = make(map[string]bool)
    }
    mh.subscribers[channel][clientID] = true
}

When new data arrives for "TSLA," we publish it. The hub looks up all subscribers for that channel and places a broadcast message into a central channel. Another worker goroutine would read from this channel and actually send the messages out via the ConnectionPool. This decouples receiving an event from the work of distributing it.
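The article describes the publish side without showing it, so here is one possible sketch. The BroadcastMessage shape and the run loop are assumptions based on that description.

type BroadcastMessage struct {
    Channel string
    Payload []byte
}

// Publish queues an event; the worker loop does the actual fan-out.
func (mh *MessageHub) Publish(channel string, payload []byte) {
    mh.messageChan <- &BroadcastMessage{Channel: channel, Payload: payload}
}

// run decouples receiving an event from the work of distributing it.
func (mh *MessageHub) run(pool *ConnectionPool) {
    for msg := range mh.messageChan {
        // Snapshot subscriber IDs so we never hold both locks at once
        mh.mu.RLock()
        ids := make([]string, 0, len(mh.subscribers[msg.Channel]))
        for id := range mh.subscribers[msg.Channel] {
            ids = append(ids, id)
        }
        mh.mu.RUnlock()

        pool.mu.RLock()
        for _, id := range ids {
            if client, ok := pool.connections[id]; ok {
                select {
                case client.sendChan <- msg.Payload:
                default: // skip slow client, as in Broadcast
                }
            }
        }
        pool.mu.RUnlock()
    }
}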

We also need to know how our server is doing. Are we handling the load? We use atomic counters for that. They're fast and safe for concurrent use.

type ServerStats struct {
    connectionsActive uint64
    messagesSent      uint64
    messagesReceived  uint64
    errors            uint64
}
// When a message is sent:
atomic.AddUint64(&wss.stats.messagesSent, 1)
// To read the stats:
active := atomic.LoadUint64(&wss.stats.connectionsActive)
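main() below calls a GetStats method the article never shows; a plausible sketch that takes an atomic snapshot of the counters:

// Sketch of GetStats, assumed by main(): copies each counter atomically.
func (wss *WebSocketServer) GetStats() ServerStats {
    return ServerStats{
        connectionsActive: atomic.LoadUint64(&wss.stats.connectionsActive),
        messagesSent:      atomic.LoadUint64(&wss.stats.messagesSent),
        messagesReceived:  atomic.LoadUint64(&wss.stats.messagesReceived),
        errors:            atomic.LoadUint64(&wss.stats.errors),
    }
}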

Finally, no single server can handle everything. We need to plan for scale. A LoadBalancer can sit in front of multiple WebSocket server instances. It directs new connections to different servers, often in a simple round-robin fashion.

type LoadBalancer struct {
    servers []string // e.g., ["ws1.example.com", "ws2.example.com"]
    current uint32
}

func (lb *LoadBalancer) NextServer() string {
    // Unsigned arithmetic keeps the index non-negative even after wraparound
    n := atomic.AddUint32(&lb.current, 1)
    return lb.servers[int((n-1)%uint32(len(lb.servers)))]
}

If one server fails, the load balancer stops sending new connections to it. Existing clients on that server would try to reconnect, and the load balancer would point them to a healthy one. Your application needs to be designed to handle this reconnection gracefully, perhaps by temporarily storing state in a shared database like Redis.
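On the client side, graceful reconnection might be a dial loop with backoff. A sketch using gorilla/websocket's Dialer; the backoff bounds are illustrative assumptions:

// Dial with simple exponential backoff, capped at 30 seconds.
func connectWithRetry(url string) *websocket.Conn {
    backoff := time.Second
    for {
        conn, _, err := websocket.DefaultDialer.Dial(url, nil)
        if err == nil {
            return conn
        }
        log.Printf("dial failed: %v; retrying in %v", err, backoff)
        time.Sleep(backoff)
        if backoff < 30*time.Second {
            backoff *= 2
        }
    }
}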

Putting it all together, starting the server looks like this:

func main() {
    server := NewWebSocketServer()

    // Start the connection cleanup janitor
    manager := &ConnectionManager{wss: server, cleanupInt: 1 * time.Minute}
    go manager.StartCleanup()

    // Start a stats logger: track the previous total so we log a true rate
    go func() {
        ticker := time.NewTicker(5 * time.Second)
        var lastSent uint64
        for range ticker.C {
            stats := server.GetStats()
            log.Printf("Active: %d, Msg/Sec: %d",
                stats.connectionsActive, (stats.messagesSent-lastSent)/5)
            lastSent = stats.messagesSent
        }
    }()

    // Set up the HTTP routes
    http.HandleFunc("/ws", server.HandleConnection)
    http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK) // Simple health check
    })

    log.Println("Server starting on :8080")
    log.Fatal(http.ListenAndServe(":8080", nil))
}
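One last gap: NewWebSocketServer is never shown. A minimal sketch; the buffer sizes, the 30-second ping interval, and the permissive CheckOrigin are illustrative assumptions (tighten CheckOrigin in production):

func NewWebSocketServer() *WebSocketServer {
    return &WebSocketServer{
        upgrader: websocket.Upgrader{
            ReadBufferSize:  1024,
            WriteBufferSize: 1024,
            CheckOrigin:     func(r *http.Request) bool { return true }, // accept all origins; restrict in production
        },
        connections: &ConnectionPool{
            connections: make(map[string]*ClientConnection),
            groups:      make(map[string]map[string]bool),
        },
        hubs:   make(map[string]*MessageHub),
        config: ServerConfig{PingInterval: 30 * time.Second},
    }
}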

Building this is like constructing a careful, well-organized city. The ConnectionPool is the directory of all phone lines. The readPump and writePump are the dedicated operators for each line. The MessageHub is the specialized broadcasting station. The mutexes are the traffic rules. The heartbeat and cleanup are the maintenance crew.

You start with a simple connection. Then you add management so you don't lose track. Then you add efficient routing so you don't shout in empty rooms. Then you add monitoring and resilience. Each piece addresses a real problem you'll hit as more and more users connect. The goal is a system that stays responsive and stable, whether there are ten connections or ten thousand.
