The Atomic Tap RPC That Keeps a Multiplayer Bingo Board Honest

#webdev #postgres #nextjs #javascript

Real-time games have a habit of turning simple UI events into distributed systems problems. A player taps a cell. Another tab reconnects. A stale WebSocket event arrives from the previous round. The user taps again before the first request finishes. If the server treats those as independent "claim" and "unclaim" commands, the board can drift.

I ran into this while building BingWow, a browser-based multiplayer bingo game. The product looks simple: every player has a board, taps squares, and wins by completing a row. The data model behind that simple interaction has to answer one question perfectly: after this tap, which clue ids are currently claimed by this player in this round?

The bug-prone version

The first version had separate endpoints:

POST /api/game/claim
POST /api/game/unclaim

That matched the UI state. If a square looked unclaimed, the client called claim. If it looked claimed, the client called unclaim.

The problem is that the client is not the source of truth. It can be stale by one render, one optimistic update, or one reconnect. A double tap could send two claim requests. A delayed Ably event could make the square look claimed when the database had already undone it. The API shape made illegal states easy to express.

One tap, one server decision

The fix was to collapse the command surface into a single action:

POST /api/game/tap

The endpoint does not accept "claim" or "unclaim" from the browser. It accepts only room_id, player_id, and clue_id. The database decides what the tap means inside one locked transaction.

The core RPC shape is:

SELECT * INTO v_room
FROM rooms
WHERE id = p_room_id
FOR UPDATE;

SELECT id INTO v_existing_claim_id
FROM claims
WHERE room_id = p_room_id
  AND player_id = p_player_id
  AND clue_id = p_clue_id
  AND round_number = v_room.current_round
  AND undone = false
LIMIT 1;

IF v_existing_claim_id IS NOT NULL THEN
  UPDATE claims
  SET undone = true, undone_at = NOW()
  WHERE id = v_existing_claim_id;
  v_action := 'unclaimed';
ELSE
  INSERT INTO claims (room_id, clue_id, player_id, round_number, undone)
  VALUES (p_room_id, p_clue_id, p_player_id, v_room.current_round, false)
  RETURNING id INTO v_claim_id;
  v_action := 'claimed';
END IF;

The browser no longer tells the server what state it thinks the cell is in. The database reads the current claim row and toggles it.

The response returns the whole active set

The RPC returns the complete active claim set for that player and round:

SELECT COALESCE(array_agg(clue_id), ARRAY[]::TEXT[])
INTO v_claimed_clue_ids
FROM claims
WHERE room_id = p_room_id
  AND player_id = p_player_id
  AND round_number = v_room.current_round
  AND undone = false;

That one detail matters. The API response is not just "your tap worked." It is "here is the authoritative set now." The client can replace local state instead of trying to patch from a possibly stale starting point.

On the server, the same response drives the bingo check for multiplayer bingo. The endpoint maps the player's board into positions, marks the returned clue ids as claimed, and runs a shared row-completion detector. There is no separate SQL bingo loop that can drift from the browser's single-player detector.

Round number is part of the key

The active claim uniqueness constraint includes round_number. That prevents old taps and stale events from contaminating the next round. A claim from round 1 is still auditable, but it is not active in round 2.

That rule pairs with the WebSocket side. Realtime events include the round number, and subscribers ignore events from older rounds. This is the difference between a pleasant bingo card game and a board that appears to resurrect old cells after the next round starts.

Derive grid size from the board

Another small cleanup made the RPC safer: it derives grid_size from board length instead of reading a separate column.

SQRT(jsonb_array_length(v_player.board))::INT

The board has 9, 16, or 25 cells, so the grid is 3x3, 4x4, or 5x5. Duplicating that value in a column created drift bugs. Deriving it inside the RPC means the server's win check uses the actual board the player has.

That is especially important because BingWow deliberately clamps mobile players to a smaller board while desktop players can use larger ones. The rule is product behavior, not presentation. It belongs on the server.

What this buys you

The final shape is boring in the best way:

The client sends one command: tap this clue.
The database locks the room and player.
The database toggles the active claim row.
The database returns the full active claim set.
The server checks bingo from that authoritative set.
The browser publishes realtime events after the HTTP response succeeds.

For a simple party game like online bingo calling, that may look overbuilt. It is not. The visible product is casual, but the failure mode is embarrassing: a player taps a square, the UI says one thing, and the server believes another.

The lesson I keep relearning is that multiplayer game commands should describe user intent, not presumed state. "Tap this square" is intent. "Claim this square" is a guess.

When your API accepts guesses, stale clients become data corruption machines. When the server owns the transition, the UI can be optimistic without being authoritative.

The same principle applies outside games. Shopping carts, kanban cards, calendar RSVPs, feature toggles - anywhere the user can flip state from multiple tabs or devices. Give the server the minimum intent, lock the row that owns the invariant, and return the new truth.

That is the pattern behind BingWow's real-time multiplayer bingo architecture, and it has held up far better than the two-endpoint version I started with.