Monte Carlo Tree Search (MCTS) Tutorial – Complete Guide with Examples
Master Monte Carlo Tree Search (MCTS) with this beginner-friendly guide. Learn the theory, step-by-step algorithm, and practical examples used in AI and game-playing systems.
🧠 What is MCTS?
Monte Carlo Tree Search (MCTS) is a powerful decision-making algorithm used in games and simulations. It balances:
- Exploration: trying new or unknown moves
- Exploitation: selecting moves that are known to work well
⚙️ Four Steps of MCTS
1. Selection
Traverse the tree from the root using the UCB1 formula to select the most promising child:
UCB1(S\_i) = (t\_i / n\_i) + C \* sqrt(ln(N) / n\_i)
Where:
-
t_i: total value of nodeS_i -
n_i: visit count of nodeS_i -
N: total visits of the parent -
C: exploration constant (usually √2 or 2)
2. Expansion
Once a node is selected, generate one or more child nodes (if any unvisited children exist).
3. Rollout (Simulation)
Run a random simulation (or using a default policy) from the newly expanded node until a terminal state is reached.
4. Backpropagation
Propagate the result of the simulation up the tree, updating value and visit count for each node on the path.
🔁 This 4-step process is repeated X times (number of simulations/iterations).
🧩 MCTS Example – Step-by-Step
Given a snapshot of an MCTS tree, we perform 2 rollouts with return values 15 and 75.
💡 Always pick the right node when there's a tie/random choice.
▶️ Rollout 1 (return value = 15)
Step 1: Selection
Compute UCB1 for S1 and S2:
S1: t=60, n=2
S2: t=18, n=1
Parent (S0): n=3, ln(3) ≈ 1.0986, C = 2
UCB1(S1) ≈ 30 + 2 * sqrt(1.0986/2) ≈ 31.48
UCB1(S2) ≈ 18 + 2 * sqrt(1.0986/1) ≈ 20.09
`
✅ Choose S1
Step 2: Expansion
S1 has an unvisited child S3. We expand the tree by adding S3.
Step 3: Rollout
Run a random simulation from S3 → terminal value = 15.
Step 4: Backpropagation
Update values:
S3: t=15, n=1
S1: t=75, n=3
S0: t=98, n=4
▶️ Rollout 2 (return value = 75)
Step 1: Selection from S0
S1: t=75, n=3 → UCB1 ≈ 26.36
S2: t=18, n=1 → UCB1 ≈ 20.35
✅ Choose S1
Step 2: Selection within S1
S3: t=15, n=1 → UCB1 ≈ 17.09
S4: t=40, n=1 → UCB1 ≈ 42.09
✅ Choose S4
Step 3: Expansion
S4 has an unvisited child → expand to S6.
Step 4: Rollout
Simulation from S6 returns 75.
Step 5: Backpropagation
S6: t=75, n=1
S4: t=115, n=2
S1: t=150, n=4
S0: t=173, n=5
📖 Original blog with diagrams + YouTube video here:
👉 Monte Carlo Tree Search (MCTS) Guide
✍️ Written by Dipto Mistry — Full-stack developer & AI enthusiast.

Top comments (0)