Stojan Kojo

Building Poker Software: From Zero to Hero

Building poker software from "zero to hero" requires transitioning from a simple game loop to a distributed, event-driven, real-time multiplayer architecture. The journey involves mastering deterministic state machines, cryptographic RNG, low-latency networking, and rigorous security compliance.

1. Phase 1: The Core Engine (Determinism & Logic)

The foundation is a pure, deterministic game engine. No networking, no UI, just logic.

  • Architecture: Use a Functional Core, Imperative Shell pattern. The core logic (rules, hand evaluation, state transitions) must be pure functions with no side effects.
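  • State Machine: Implement a Finite State Machine (FSM) for betting rounds. Enumerate every state (Pre-flop, Flop, Turn, River) and every transition (Check, Call, Raise, Fold), and reject any action that is not legal in the current state.
  • Hand Evaluator: Integrate a Lookup Table (LUT) based evaluator (e.g., Cactus Kev's 5-card evaluator) for near constant-time ranking. In Hold'em, rank the best of the 21 five-card combinations from seven cards, or use a 7-card table. Avoid ad hoc branching logic for hand ranking on the hot path.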
  • RNG: Implement a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG) (e.g., crypto.randomBytes in Node, rand_chacha in Rust). Crucial: The seed must be derived from high-entropy sources (OS entropy, hardware RNG).
  • Testing: Write Property-Based Tests (e.g., using QuickCheck) to verify that for any sequence of legal actions, chips are conserved: the total paid out of the pots equals the total put in, and no stack goes negative.
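
As a rough illustration of the pure-function FSM idea, here is a minimal Python sketch. It is deliberately simplified to a heads-up betting round, and the names (`BettingState`, `apply_action`, `close_round`) are hypothetical, not taken from any library. Fold handling and seat tracking are left out.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto


class Street(Enum):
    PREFLOP = auto()
    FLOP = auto()
    TURN = auto()
    RIVER = auto()
    SHOWDOWN = auto()


# Explicit transition table: every street maps to exactly one successor.
NEXT_STREET = {
    Street.PREFLOP: Street.FLOP,
    Street.FLOP: Street.TURN,
    Street.TURN: Street.RIVER,
    Street.RIVER: Street.SHOWDOWN,
}


@dataclass(frozen=True)
class BettingState:
    street: Street
    pot: int
    to_call: int  # chips the acting player must add to stay in the hand


def legal_actions(state):
    """Actions the acting player may take in this state."""
    if state.street is Street.SHOWDOWN:
        return frozenset()
    if state.to_call == 0:
        return frozenset({"check", "raise"})
    return frozenset({"call", "raise"})


def apply_action(state, action, amount=0):
    """Pure transition: returns a new state and never mutates the input."""
    if action not in legal_actions(state):
        raise ValueError(f"{action!r} is illegal with to_call={state.to_call}")
    if action == "call":
        return replace(state, pot=state.pot + state.to_call, to_call=0)
    if action == "raise":
        # 'amount' is the total the raiser adds, including the call portion.
        if amount <= state.to_call:
            raise ValueError("a raise must exceed the amount to call")
        return replace(state, pot=state.pot + amount,
                       to_call=amount - state.to_call)
    return state  # check: nothing changes in this simplified model


def close_round(state):
    """Advance to the next street once all bets are matched."""
    if state.to_call != 0:
        raise ValueError("cannot close a round with an unmatched bet")
    if state.street not in NEXT_STREET:
        raise ValueError("hand is already at showdown")
    return replace(state, street=NEXT_STREET[state.street])
```

Because every function returns a fresh frozen value, the same sequence of actions always produces the same state, which is what makes replay and property-based testing straightforward.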

2. Phase 2: Real-Time Networking (The "Live" Layer)

This phase introduces latency and concurrency challenges.

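  • Protocol: Use WebSockets (the ws library, or uWebSockets.js for its C++ core) for low-latency bi-directional communication. gRPC-Web can cover server-to-client streaming, but it does not support bi-directional streaming from browsers. Avoid HTTP polling.
  • Message Format: Use Protocol Buffers (Protobuf) or MessagePack for binary serialization. JSON is fine for a prototype, but it is more verbose and slower to parse under high-frequency state updates.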
  • Server Topology:
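    • Game Servers: Stateful processes that hold each table's state machine in memory, so a table must stay pinned to one server for the life of a hand. As a rough guide, each server handles ~500-2,000 concurrent tables depending on the language and runtime (Go/C++ > Node/Java).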
    • Lobby/Matchmaking: A separate service that assigns players to Game Servers based on region, stakes, and skill.
    • State Sync: The server is the Source of Truth. Clients send intent (e.g., Action: Raise(100)), not state updates. The server validates, updates state, and broadcasts the new StateSnapshot to all clients in the room.
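  • Latency Handling: Implement Client-Side Prediction for UI responsiveness (e.g., animate the player's own chips moving as soon as they click "Raise") and Server-Side Reconciliation to correct any discrepancies. Never predict hidden information such as cards: the client should not know a card until the server reveals it.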
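
The "clients send intent, server sends state" rule can be sketched in Python as below. This is an illustration only: plain dicts stand in for Protobuf messages, only a raise is modelled, and `TableState`, `handle_intent`, and the message field names are assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TableState:
    seq: int          # snapshot version, bumped on every accepted action
    order: tuple      # player ids still in the hand, in seating order
    acting: str       # player id whose turn it is
    stacks: dict      # player id -> chips behind
    pot: int
    current_bet: int  # size of the bet a raise has to exceed


def snapshot(state):
    """Message broadcast to every client at the table."""
    return {"type": "StateSnapshot", "seq": state.seq, "acting": state.acting,
            "stacks": dict(state.stacks), "pot": state.pot,
            "current_bet": state.current_bet}


def reject(state, reason):
    """Leave the state untouched and tell the sender why."""
    return state, {"type": "Rejected", "seq": state.seq, "reason": reason}


def handle_intent(state, intent):
    """Validate one client intent and return (new_state, outgoing_message)."""
    player = intent.get("player")
    if player != state.acting:
        return reject(state, "not your turn")
    if intent.get("action") != "raise":
        return reject(state, "only raise is modelled in this sketch")
    amount = intent.get("amount")
    if not isinstance(amount, int) or amount <= state.current_bet:
        return reject(state, "raise must exceed the current bet")
    if amount > state.stacks[player]:
        return reject(state, "insufficient chips")
    # Only server-side values are used; anything else in the intent is ignored.
    stacks = {**state.stacks, player: state.stacks[player] - amount}
    next_player = state.order[(state.order.index(player) + 1) % len(state.order)]
    new_state = replace(state, seq=state.seq + 1, acting=next_player,
                        stacks=stacks, pot=state.pot + amount,
                        current_bet=amount)
    return new_state, snapshot(new_state)
```

The `seq` field gives clients a simple way to reconcile: a client discards its predicted view whenever a snapshot with a higher `seq` arrives.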

3. Phase 3: Data Persistence & Event Sourcing

Do not store the "current state" as the primary database record. Use Event Sourcing.

  • Pattern: Every action (Deal, Bet, Fold, Showdown) is an immutable event appended to a log.
  • Storage:
    • Hot Path (Redis): Store the current GameState and active Event Log for fast access and replay. Use Redis Streams for high-throughput event logging.
    • Cold Path (PostgreSQL): Store finalized HandHistories, PlayerStats, and Transactions.
  • Replayability: If a server crashes, the new instance loads the last Snapshot and replays the Event Log from Redis to reconstruct the exact state. This is critical for dispute resolution.
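
A minimal sketch of snapshot-plus-replay recovery, using in-memory Python structures in place of Redis. The event shapes (`seq`, `type`, `player`, `amount`) and the function names are assumptions for illustration.

```python
from functools import reduce


def apply_event(state, event):
    """Pure reducer: returns the state after one event, leaving the input untouched."""
    if event["seq"] != state["version"] + 1:
        # A gap or duplicate means the log is incomplete; refuse to guess.
        raise ValueError(f"expected seq {state['version'] + 1}, got {event['seq']}")
    stacks = dict(state["stacks"])
    folded = list(state["folded"])
    pot = state["pot"]
    if event["type"] == "Bet":
        stacks[event["player"]] -= event["amount"]
        pot += event["amount"]
    elif event["type"] == "Fold":
        folded.append(event["player"])
    else:
        raise ValueError(f"unknown event type {event['type']!r}")
    return {"version": event["seq"], "pot": pot,
            "stacks": stacks, "folded": folded}


def recover(snapshot, log):
    """Rebuild the current state from the last snapshot plus newer events."""
    pending = [e for e in log if e["seq"] > snapshot["version"]]
    pending.sort(key=lambda e: e["seq"])
    return reduce(apply_event, pending, snapshot)
```

Replaying the full log from an empty starting state should give the same result as replaying only the tail from a snapshot. That equality is a useful invariant to assert in tests.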

4. Phase 4: Security, Compliance & Anti-Fraud

This is where "social" poker becomes "real-money" poker.

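  • RNG Certification: The RNG algorithm and its implementation must be audited by an independent testing lab (e.g., eCOGRA, GLI) against the technical standard your regulator requires (e.g., GLI-19 for interactive gaming systems). You cannot just "use Math.random()".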
  • Anti-Collusion:
    • Graph Analysis: Run background jobs that analyze player interaction graphs. If Player A and B always fold when Player C raises, flag them.
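    • IP/Device Fingerprinting: Detect multiple accounts sharing an IP address or device fingerprint. MAC addresses are not visible to a server across the internet, so rely on browser or app-level device identifiers.
    • Chip Transfer Monitoring: Detect abnormal chip flow between accounts (e.g., "chip dumping", where one player deliberately loses chips to another). Soft play, where players avoid raising against each other, shows up in betting patterns rather than chip flow.
  • Wallet Integration: Use a Double-Entry Ledger system. Every chip movement is a debit/credit pair. Treat any stored balance field as a cache; the authoritative balance is Sum(Credits) - Sum(Debits).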
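
To make the double-entry rule concrete, here is a small in-memory Python sketch. The `Ledger` class, the account names, and the `allow_overdraft` flag for system accounts are all hypothetical; a real wallet would persist entries in a transactional database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    tx_id: str
    account: str
    debit: int = 0
    credit: int = 0


class Ledger:
    def __init__(self):
        self._entries = []  # append-only list of immutable entries

    def balance(self, account):
        """Derived balance: Sum(Credits) - Sum(Debits) for one account."""
        return sum(e.credit - e.debit
                   for e in self._entries if e.account == account)

    def total(self):
        """Across all accounts the books must always net to zero."""
        return sum(e.credit - e.debit for e in self._entries)

    def transfer(self, tx_id, source, dest, amount, allow_overdraft=False):
        """Record one chip movement as a matched debit/credit pair."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if not allow_overdraft and self.balance(source) < amount:
            raise ValueError("insufficient funds")
        self._entries.append(Entry(tx_id, source, debit=amount))
        self._entries.append(Entry(tx_id, dest, credit=amount))
```

For example, a deposit could be recorded as a transfer from a system "cashier" account (allowed to go negative) to the player, and a bet as a transfer from the player to a per-table pot account. The `total()` check then catches any entry written without its counterpart.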

5. Phase 5: Scalability & Infrastructure

  • Microservices: Decouple the Lobby, Wallet, Game Logic, and Analytics.
  • Orchestration: Use Kubernetes (K8s) with Horizontal Pod Autoscaling (HPA). Scale Game Servers based on CPU/Memory usage or custom metrics (e.g., "Tables per Node").
  • Sharding: Shard the database by TableID or Region to distribute load.
  • Monitoring: Implement Distributed Tracing (Jaeger/Zipkin) to track a hand across services. Monitor P99 Latency for WebSocket messages.

Architecture Diagram (High Level)

[Client (React/Flutter)] 
       | (WebSocket/Protobuf)
       v
[API Gateway / Load Balancer]
       |
       +---> [Lobby Service] (Matchmaking, Player Profiles)
       |
       +---> [Game Server Cluster] (State Machines, FSM, RNG) <---+
       |                         |                                |
       |                         v                                |
       |                   [Redis Cluster] (State & Events)       |
       |                         |                                |
       +---> [Wallet Service] <---+ (Double-Entry Ledger)         |
       |                         |                                |
       +---> [Analytics Service] (Event Stream Processing)        |
       |                         |                                |
       +---> [Fraud Detection] (Background Job, Graph Analysis)   |
       |
       v
[PostgreSQL] (Hand Histories, Audit Logs)

Real-World Implementation Example

Scenario: 6-max Hold'em tables on a site with 100k concurrent users.

  1. Join: Player connects to Lobby Service, gets assigned to GameServer-42 via consistent hashing.
  2. Hand Start: GameServer-42 generates a seed, draws cards (CSPRNG), emits HandStarted event.
  3. Betting: Player clicks "Raise". Client sends Raise(100). Server validates against FSM. If valid, updates state, emits ActionApplied.
  4. Side Pot: If an all-in occurs, the server builds side pots in passes, one per distinct all-in amount, capping each player's contribution at that level, and emits SidePotCreated.
  5. Showdown: Server evaluates hands (LUT), distributes pots, emits HandEnded.
  6. Persistence: Event log is flushed to Redis, then asynchronously to PostgreSQL.
  7. Audit: Fraud service consumes the event stream, updates player risk scores.
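
The side-pot step is the part of this flow that is easiest to get wrong, so here is one common way to compute it in Python. This is a sketch of a level-by-level pass over all-in amounts, which may or may not match the exact algorithm intended above. `build_pots` and its argument names are hypothetical.

```python
def build_pots(committed, folded=frozenset()):
    """Split chips into a main pot and side pots.

    committed: player id -> total chips that player put in during the hand.
    folded:    players who can no longer win anything.
    Returns a list of (amount, eligible_players), main pot first.
    """
    live_levels = sorted({chips for player, chips in committed.items()
                          if player not in folded and chips > 0})
    pots = []
    previous = 0
    for level in live_levels:
        # Each pass collects the slice of every contribution that lies
        # between the previous level and this one (folded chips included).
        amount = sum(min(chips, level) - min(chips, previous)
                     for chips in committed.values())
        eligible = frozenset(player for player, chips in committed.items()
                             if player not in folded and chips >= level)
        if pots and pots[-1][1] == eligible:
            pots[-1] = (pots[-1][0] + amount, eligible)  # same contenders: merge
        else:
            pots.append((amount, eligible))
        previous = level
    # Chips a folded player committed above the highest live level stay in play.
    leftover = sum(max(chips - previous, 0) for chips in committed.values())
    if pots and leftover:
        pots[-1] = (pots[-1][0] + leftover, pots[-1][1])
    return pots
```

With player A all-in for 50 and players B and C each in for 100, this yields a main pot of 150 that all three can win and a side pot of 100 contested only by B and C. The amounts always sum to the total chips committed, which is a natural property to test.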

Trade-offs & Decisions

  • Language: Go or Rust for Game Servers (high concurrency, low GC pressure). Node.js or Python for Lobby/Analytics (rapid development, rich ecosystem).
  • Database: Redis for speed (state), PostgreSQL for integrity (wallets/histories). Avoid data stores without ACID transactions for financial ledgers.
  • Consistency: Strong Consistency for wallet transactions (ACID). Eventual Consistency for player stats/follower counts.

FAQs

Q1: Why not use a monolithic architecture for a poker site?
Monoliths are easier to start with but become unmanageable at scale. In a monolith, a crash in the lobby code can take down the game logic with it, freezing all active hands. Microservices allow independent scaling (e.g., adding more Game Servers during peak hours without touching the Wallet service) and isolation of failures.

Q2: How do you handle the "All-In" logic if the server crashes mid-hand?
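The hand's snapshot and event log are stored in Redis (in-memory) with persistence enabled (AOF/RDB). If the process crashes, Kubernetes restarts the pod. The new instance loads the last snapshot from Redis, replays the event log to reconstruct the exact state, and resumes the hand. Keep in mind that RDB snapshots and AOF with periodic fsync can lose the most recent writes if Redis itself fails, so pick the fsync policy with that in mind. This is why Event Sourcing is mandatory for real-money poker.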

Q3: Can I use a standard web server (Nginx/Apache) for the game logic?
No. Nginx can terminate TLS, serve static content, and proxy WebSocket connections, but it does not run your game logic. Poker requires persistent, full-duplex connections (WebSockets) and in-memory state management. You need a dedicated application server (e.g., Go, Node.js, C++) that maintains the connection and state for each table.

Q4: How is the RNG certified?
You cannot just "write good code." You must submit your RNG algorithm (and its implementation) to an independent testing lab (e.g., eCOGRA, GLI, BMM Testlabs). They run statistical tests (Chi-square, Kolmogorov-Smirnov) on billions of generated numbers to check uniformity, and they review the algorithm, seeding, and implementation to assess unpredictability. Statistical tests alone cannot prove that the output is unpredictable.
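
For a feel of what such a test looks like, the Python sketch below shuffles a deck with the standard library's `secrets` module and computes a chi-square statistic on which card lands on top. It is a toy self-check under assumed names (`shuffled_deck`, `top_card_counts`, `chi_square`), not a substitute for a lab audit.

```python
import secrets


def shuffled_deck():
    """Fisher-Yates shuffle driven by the OS CSPRNG via secrets.randbelow."""
    deck = list(range(52))
    for i in range(51, 0, -1):
        j = secrets.randbelow(i + 1)  # uniform in [0, i], no modulo bias
        deck[i], deck[j] = deck[j], deck[i]
    return deck


def top_card_counts(trials):
    """How often each of the 52 cards ends up on top of the deck."""
    counts = [0] * 52
    for _ in range(trials):
        counts[shuffled_deck()[0]] += 1
    return counts


def chi_square(observed, expected):
    """Pearson chi-square statistic against a uniform expectation."""
    return sum((o - expected) ** 2 / expected for o in observed)
```

A run such as `chi_square(top_card_counts(52_000), expected=1000)` gives a statistic to compare against the chi-square critical value for 51 degrees of freedom at your chosen significance level. A single position is only one of many properties worth checking.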

Q5: What is the biggest security risk in poker software?
Race Conditions in the wallet system. If two requests try to deduct chips simultaneously, a poorly designed system might let both succeed against the same starting balance, so a player spends chips they no longer have. This is solved using Optimistic Locking (version numbers) or Pessimistic Locking (database row locks) in the wallet service, ensuring only one transaction can modify a balance at a time.
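
Optimistic locking with a version number can be sketched as follows. The example uses Python's built-in `sqlite3` with an in-memory database purely so it is self-contained. The table layout and function names are assumptions, and the `balance` column stands in for a cached balance in front of a ledger.

```python
import sqlite3


def open_wallet_db():
    """In-memory database with one versioned wallet row per player."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wallet ("
        " player TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL,"
        " version INTEGER NOT NULL)"
    )
    return conn


def read_wallet(conn, player):
    """Return (balance, version) as seen right now."""
    return conn.execute(
        "SELECT balance, version FROM wallet WHERE player = ?", (player,)
    ).fetchone()


def deduct(conn, player, amount, seen_balance, seen_version):
    """Apply the deduction only if the row is unchanged since it was read."""
    if seen_balance < amount:
        return False
    cursor = conn.execute(
        "UPDATE wallet SET balance = ?, version = ?"
        " WHERE player = ? AND version = ?",
        (seen_balance - amount, seen_version + 1, player, seen_version),
    )
    conn.commit()
    # rowcount is 0 when another writer bumped the version first.
    return cursor.rowcount == 1
```

If two requests both read `(100, 1)` and each tries to deduct 80, the first update succeeds and moves the row to version 2. The second matches no rows and returns `False`, so the caller must re-read the wallet and retry or fail.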
