Building a board game AI is a fantastic way to dive into Reinforcement Learning and search algorithms. But once you've built your AI, a new problem arises: How do you actually test it against other algorithms? If you write a classic Minimax agent in Java and an AlphaZero model in Python, how do you make them fight?
To solve this, I built a two-part ecosystem:
- Gomoku Battle: A cross-language, cross-system arena for AI agents.
- AlphaZero Board Games: A lightweight, readable AlphaZero implementation trained to dominate the arena.
Here is a deep dive into the architecture of both projects and how they work together.
🏟️ Part 1: The Arena (Gomoku Battle)
GitHub: zhixiangli/gomoku-battle
The goal of gomoku-battle was to create a pluggable, language-agnostic referee system. I wanted to be able to write an AI in any language, plug it into the arena, and watch it play in a UI.
The Architecture
I built the platform using Java, splitting it into specialized modules:
- gomoku-battle-core: Handles the board state, win/loss rule checking, and pattern utilities.
- gomoku-battle-dashboard: A JavaFX-based UI that provides real-time visualizations of the matches.
- gomoku-battle-console: The referee that manages the game loop and handles inter-process communication (IPC).
Cross-Language Communication via stdio
The secret sauce of gomoku-battle is how it talks to the agents. Instead of forcing agents to implement a specific language interface, the console spawns each AI as a separate subprocess.
Communication happens purely over standard input/output (stdin / stdout) using JSON.
When it's an agent's turn, the referee sends the board state as a JSON string containing the SGF (Smart Game Format) sequence:
{"command":"NEXT_BLACK","rows":15,"columns":15,"chessboard":"B[96];W[a5];B[a4];W[95]"}
The agent processes the state, calculates the best move, and simply prints its decision to stdout:
{"rowIndex":3,"columnIndex":10}
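To make the protocol concrete, here is a minimal sketch of what an agent's side of the conversation could look like in Python. The one-request-per-line loop and the fixed placeholder move are assumptions for illustration; a real agent would parse the SGF sequence in `chessboard` and run a search.

```python
import json
import sys


def choose_move(state):
    """Placeholder policy: always return the same move.

    A real agent would decode state["chessboard"] (the SGF move
    sequence) and search for the best reply.
    """
    return {"rowIndex": 3, "columnIndex": 10}


def main():
    # Assumed loop shape: the referee writes one JSON request per line
    # to the agent's stdin, and reads one JSON reply per line back.
    for line in sys.stdin:
        state = json.loads(line)
        move = choose_move(state)
        # Flush so the referee sees the reply immediately instead of
        # waiting on a buffered pipe.
        print(json.dumps(move), flush=True)


if __name__ == "__main__":
    main()
```

Because the contract is just "JSON in, JSON out", the same skeleton works in any language with a JSON library.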
This design completely decouples the AI logic from the game engine. You can configure your agents in a simple battle.properties file, pointing the engine to a Java .jar or a Python script using uv run.
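As a sketch of what that configuration might look like, here is a hypothetical `battle.properties` pairing two agents. The key names follow the `player.<color>.cmd` / `player.<color>.alias` pattern; the paths and commands are illustrative, not copied from the repo.

```properties
# Hypothetical battle.properties: Alpha-Beta baseline (Java) vs. a custom
# Python agent. Paths and flags are illustrative.
player.black.cmd=java -jar gomoku-battle-alphabetasearch/target/alphabetasearch.jar
player.black.alias=AlphaBeta
player.white.cmd=uv run python my_agent.py
player.white.alias=MyAgent
```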
To provide a baseline, I included gomoku-battle-alphabetasearch, a classical AI using Alpha-Beta Pruning. But I wanted something stronger.
🧠 Part 2: The Brain (AlphaZero Board Games)
GitHub: zhixiangli/alphazero-board-games
To beat the Alpha-Beta baseline, I implemented the algorithm that conquered Go and Chess: AlphaZero.
Many AlphaZero repositories out there are either overly complex, tied heavily to one specific game, or require massive compute just to see a working demo. I built alphazero-board-games to be clean, modular, and instantly playable.
The Implementation Details
The project is built with Python 3.12+ and uses a modular architecture:
- Shared Core (alphazero/): This is the heart of the engine. It contains the abstract Game API, the Monte Carlo Tree Search (MCTS) implementation, the Neural Network definitions (Residual Policy/Value networks), and the self-play Reinforcement Learning loop.
- Game Presets: I implemented the specific rules for gomoku_9_9, gomoku_15_15, and connect4.
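The point of the abstract Game API is that MCTS and the training loop never touch game-specific rules. A hypothetical sketch of that interface (the method names here are my own, not necessarily the repo's) might look like:

```python
from abc import ABC, abstractmethod


class Game(ABC):
    """Hypothetical shape of the abstract game interface: everything the
    search and training loop need, with no Gomoku- or Connect4-specific
    logic. Each preset (gomoku_9_9, connect4, ...) would subclass this."""

    @abstractmethod
    def initial_state(self):
        """Return the empty-board starting state."""

    @abstractmethod
    def legal_moves(self, state):
        """Return the moves playable from this state."""

    @abstractmethod
    def next_state(self, state, move):
        """Return the state after playing `move` (states are immutable)."""

    @abstractmethod
    def winner(self, state):
        """Return +1/-1 for a decided game, 0 for a draw, None if ongoing."""
```

Adding a new game then means implementing one subclass, while MCTS, the networks, and self-play are reused unchanged.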
Instead of traditional rollouts, the MCTS in this project traverses the tree until it reaches a leaf node, then queries the Residual Network. The network outputs two things:
- Policy ($p$): A probability distribution over possible moves (where to look).
- Value ($v$): An evaluation of the current board state in the range $[-1, 1]$, from certain loss to certain win.
These predictions guide the MCTS to focus only on promising branches, vastly reducing the search space compared to traditional Alpha-Beta pruning.
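The mechanism that balances the two signals is the standard AlphaZero PUCT selection rule: each child is scored by its mean value $Q$ plus an exploration bonus proportional to the network's prior. A minimal sketch (the function name and `c_puct` constant are my own choices, not the repo's):

```python
import math


def puct_score(parent_visits, child_visits, child_value_sum, prior, c_puct=1.5):
    """PUCT = Q + U: mean value (exploitation) plus a prior-weighted
    exploration bonus that decays as the child accumulates visits."""
    q = child_value_sum / child_visits if child_visits > 0 else 0.0
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u


# Toy example: move A has been visited often with a decent mean value;
# move B is unvisited but strongly favored by the policy network, so the
# exploration term pushes MCTS to try it next.
parent_visits = 50
score_a = puct_score(parent_visits, child_visits=40, child_value_sum=20.0, prior=0.3)
score_b = puct_score(parent_visits, child_visits=0, child_value_sum=0.0, prior=0.7)
```

The `prior` term is what lets the policy head prune the search in practice: low-prior moves barely get explored, so most simulations are spent on the handful of moves the network already considers plausible.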
"Batteries Included"
To make the repo developer-friendly, I included pretrained checkpoints in the data/ directories. You don't need to spend hours training a model to see it work. You can just clone the repo and immediately play against the AI in your terminal using the included stdio_play.py scripts:
uv run python -m gomoku_15_15.stdio_play --human-color W --simulation-num 400
And if you do want to train your own models, the training loop is highly configurable right from the CLI!
⚔️ The Clash: Alpha-Beta vs. AlphaZero
Because of the decoupled design, hooking the Python AlphaZero model into the Java Gomoku arena takes just a couple of lines in battle.properties:
player.white.cmd=uv run --project alphazero-board-games python gomoku-battle-alphazero/alphazero_adapter.py --simulation-num=5000
player.white.alias=AlphaZero
When you run the battle, the JavaFX dashboard gives you a real-time visualization of the deep learning model outsmarting the classical search algorithm.
The AlphaZero agent evaluates far fewer positions than the Alpha-Beta agent, but because its neural network has developed an intuition for spatial patterns and influence, its moves are markedly more strategic.
🚀 Try It Yourself!
I built these projects to be hacked on, learned from, and extended.
- Want to practice implementing Minimax, Monte Carlo, or a custom heuristic? Fork Gomoku Battle, write a quick script in your favorite language, and see if it can beat my baseline.
- Want to learn how AlphaZero actually works under the hood, or train an AI to play Connect4? Check out AlphaZero Board Games.
If you find the projects interesting or helpful for learning AI and system design, I'd love it if you gave them a ⭐️ on GitHub! Let me know in the comments if you have any questions about the MCTS implementation or the JavaFX integration! Happy coding! 👨‍💻 ♟️