Monte Carlo Tree Search (MCTS) Tutorial – Complete Guide with Examples

Coding Lad — Thu, 19 Jun 2025 21:19:45 +0000

Monte Carlo Tree Search (MCTS) Tutorial – Complete Guide with Examples

Master Monte Carlo Tree Search (MCTS) with this beginner-friendly guide. Learn the theory, step-by-step algorithm, and practical examples used in AI and game-playing systems.

🧠 What is MCTS?

Monte Carlo Tree Search (MCTS) is a powerful decision-making algorithm used in games and simulations. It balances:

Exploration: trying new or unknown moves
Exploitation: selecting moves that are known to work well

⚙️ Four Steps of MCTS

1. Selection

Traverse the tree from the root using the UCB1 formula to select the most promising child:


UCB1(S\_i) = (t\_i / n\_i) + C \* sqrt(ln(N) / n\_i)

Where:

t_i: total value of node S_i
n_i: visit count of node S_i
N: total visits of the parent
C: exploration constant (usually √2 or 2)

2. Expansion

Once a node is selected, generate one or more child nodes (if any unvisited children exist).

3. Rollout (Simulation)

Run a random simulation (or using a default policy) from the newly expanded node until a terminal state is reached.

4. Backpropagation

Propagate the result of the simulation up the tree, updating value and visit count for each node on the path.

🔁 This 4-step process is repeated X times (number of simulations/iterations).

🧩 MCTS Example – Step-by-Step

Given a snapshot of an MCTS tree, we perform 2 rollouts with return values 15 and 75.

💡 Always pick the right node when there's a tie/random choice.

▶️ Rollout 1 (return value = 15)

Step 1: Selection

Compute UCB1 for S1 and S2:

S1: t=60, n=2
S2: t=18, n=1
Parent (S0): n=3, ln(3) ≈ 1.0986, C = 2

UCB1(S1) ≈ 30 + 2 * sqrt(1.0986/2) ≈ 31.48  
UCB1(S2) ≈ 18 + 2 * sqrt(1.0986/1) ≈ 20.09

✅ Choose S1

Step 2: Expansion

S1 has an unvisited child S3. We expand the tree by adding S3.

Step 3: Rollout

Run a random simulation from S3 → terminal value = 15.

Step 4: Backpropagation

Update values:

S3: t=15, n=1 S1: t=75, n=3 S0: t=98, n=4

▶️ Rollout 2 (return value = 75)

Step 1: Selection from S0

S1: t=75, n=3 → UCB1 ≈ 26.36 S2: t=18, n=1 → UCB1 ≈ 20.35

✅ Choose S1

Step 2: Selection within S1

S3: t=15, n=1 → UCB1 ≈ 17.09 S4: t=40, n=1 → UCB1 ≈ 42.09

✅ Choose S4

Step 3: Expansion

S4 has an unvisited child → expand to S6.

Step 4: Rollout

Simulation from S6 returns 75.

Step 5: Backpropagation

S6: t=75, n=1 S4: t=115, n=2 S1: t=150, n=4 S0: t=173, n=5

📖 Original blog with diagrams + YouTube video here:
👉 Monte Carlo Tree Search (MCTS) Guide

✍️ Written by Dipto Mistry — Full-stack developer & AI enthusiast.

DEV Community: Coding Lad

Monte Carlo Tree Search (MCTS) Tutorial – Complete Guide with Examples

Monte Carlo Tree Search (MCTS) Tutorial – Complete Guide with Examples

🧠 What is MCTS?

⚙️ Four Steps of MCTS

1. Selection

2. Expansion

3. Rollout (Simulation)

4. Backpropagation

🧩 MCTS Example – Step-by-Step

▶️ Rollout 1 (return value = 15)

Step 1: Selection

Step 2: Expansion

Step 3: Rollout

Step 4: Backpropagation

▶️ Rollout 2 (return value = 75)

Step 1: Selection from S0

Step 2: Selection within S1

Step 3: Expansion

Step 4: Rollout

Step 5: Backpropagation