Pascal Thormeier

Posted on Jan 8, 2022 • Edited on Jan 11, 2022

Self-built Tic-tac-toe AIs vs Human - The ultimate showdown in five rounds, from dumbed down to highly sophisticated 🥊🏆

#node #javascript #algorithms #tutorial

Happy new year! We start this year with an epic showdown. Today, we'll teach a computer to play Tic-tac-toe with us by looking at different approaches from a dumbed down version of everything to a highly sophisticated AI. I'll play five rounds of 50 games each against the computer and see who's the ultimate Tic-tac-toe champion!

Let's get ready to rumble!

Tic-tac-what?

Most people have played Tic-tac-toe, or Noughts and crosses (is there a trademark? I don't know, hopefully not), at least once in their life. It's played by two players on a 3 by 3 grid. The goal is to get 3 of your own symbols (either O or X, hence "Noughts and crosses") either on the diagonals, a row or a column. The turn-based game starts out with an empty board where the first player can set their symbol (usually X) on any cell they like, the second player continues with an O on any empty field, then another X can be placed on any empty field, and so on.

A quick example of a win for X could look like this:

 X |   | X
---+---+---
   | O | X 
---+---+---
 O | O | X

The last move (center row, right column, or 2/1 in zero-indexed X/Y coordinates with top-left being 0/0) was the winning move here. Notice that there would've been two moves to win for X, either 2/1 or 1/0. The player just happened to chose 2/1 for whatever reason.

Each cell can have one of three values, either empty, O or X. One could argue that therefore there are 3^9 = 19683 different possible game states. But that's actually a gross overestimate. These around 19k states include boards with all X's, three X's and one O, etc., so lots of boards that are technically against the rules. A comment on Stackoverflow for the question on how to create a list of all unique Tic-tac-toe boards sets the number of possible game states at 5477, around 3.5 times less. A lot more manageable.

Coding out the game's rules

Since the majority of this post will be about teaching a machine to beat a human player, let's not spend too much time coding the actual game.

In order to determine if a move is valid, we need to answer these questions:

Was the game already won?
Was the game a draw?
Is it actually the turn of the player that want's to currently do a move?
Are the coordinates the player wants to play on part of the field?
Is the field the player wants to play on already occupied?

The board will be a simple array of arrays of strings we can do all these checks on. We start with a utility function to count the amount of a given symbol on a board:

const countSymbolsInBoard = (board, symbol) => board
  .reduce((previousRowCount, currentRow) => {
    return previousRowCount + currentRow
      .filter(v => v === symbol).length
  }, 0)

Next, we add a function to copy a board:

const copyBoard = board => [
  [board[0][0], board[0][1], board[0][2]],
  [board[1][0], board[1][1], board[1][2]],
  [board[2][0], board[2][1], board[2][2]],
]

Then we'll check if a given board is a draw:

// game.js
const symbolX = 'X'
const symbolO = 'O'

export const isDraw = (board) => board.flat().every(v => v === symbolO || v === symbolX)

And a function to check if a board was won by a given symbol with a hardcoded list of possible coordinates:

// game.js
export const winningCoordinates = [
  [
    [0, 0], [0, 1], [0, 2],
  ],
  [
    [1, 0], [1, 1], [1, 2],
  ],
  [
    [2, 0], [2, 1], [2, 2],
  ],
  [
    [0, 0], [1, 0], [2, 0],
  ],
  [
    [0, 1], [1, 1], [2, 1],
  ],
  [
    [0, 2], [1, 2], [2, 2],
  ],
  [
    [0, 0], [1, 1], [2, 2],
  ],
  [
    [2, 0], [1, 1], [0, 2],
  ]
]

export const hasWon = (currentBoard, isX) => {
  const checkingSymbol = isX ? symbolX : symbolO

  for (let coordTriple of winningCoordinates) {
    const symbolTriple = coordTriple.map(coords => currentBoard[coords[0]][coords[1]])
    if (symbolTriple.every(v => v === checkingSymbol)) {
      return true
    }
  }

  return false
}

Awesome. Now let's create the function that actually does the move:

// game.js
export const doTurn = (currentBoard, isX, x, y) => {
  if (isDraw(currentBoard)) {
    throw new Error('Cannot move on board that is a draw')
  }

  if (hasWon(currentBoard, true) || hasWon(currentBoard, false)) {
    throw new Error('Cannot move on board that was already won by someone')
  }

  if (x > 2 || y > 2) {
    throw new Error(`Coordinates out of bounds: ${x}/${y}`)
  }

  if (currentBoard[y][x] === symbolX || currentBoard[y][x] === symbolO) {
    throw new Error(`Illegal move: ${x}/${y} is already occupied`)
  }

  const numberOFXs = countSymbolsInBoard(currentBoard, symbolX)
  const numberOFOs = countSymbolsInBoard(currentBoard, symbolO)

  if ((isX && numberOFXs > numberOFOs) || (!isX && numberOFOs > numberOFXs)) {
    throw new Error(`Illegal move, it would be ${(isX ? 'O' : 'X')}s turn`)
  }

  const newBoard = copyBoard(currentBoard)

  newBoard[y][x] = isX ? symbolX : symbolO

  return newBoard
}

Almost there. Now we'll need some way to actually play this. We'll use the command line for this

// playCli.js
import { doTurn, hasWon, isDraw } from './game.js'
import { createInterface } from 'readline'

const playGame = async () => {
  let isX = true

  let board = [
    ['', '', ''],
    ['', '', ''],
    ['', '', ''],
  ]

  const rl = createInterface({
    input: process.stdin,
    output: process.stdout
  })

  const getInput = question => new Promise(resolve => {
    rl.question(question, resolve)
  })

  while (!hasWon(board, true) && !hasWon(board, false) && !isDraw(board)) {
    console.table(board)
    console.log(`${isX ? 'X' : 'O'}s turn!\n`)

    const x = Number(await getInput('X coordinate: '))
    const y = Number(await getInput('Y coordinate: '))
    try {
      board = doTurn(board, isX, x, y)

      isX = !isX
    } catch (e) {
      console.warn(e.message)
    }
  }

  console.table(board)
  console.log('----------')
  console.log(isDraw(board) ? 'Draw!' : hasWon(board, true) ? 'X has won!' : 'Y has won!')

  process.exit(0)
}

playGame()

This should create a two-player version of the game. Let's give it a try:

Nice. Now we can add the machine to that.

First machine strategy: Randomness

First, the machine will simply generate a bunch of random numbers as its turn:

// machineRandom.js
export const getRandomNumber = (min, max) => Math.floor(
  Math.random() * (max - min + 1)
) + min

We'll let the human player start and then take turns on who gets to play. The human player is always X, the machine is always O. Adjust the playCli.js a bit to add the machine:

// playCli.js

// ...
    let x = 0
    let y = 0

    if (isX) {
      x = Number(await getInput('X coordinate: '))
      y = Number(await getInput('Y coordinate: '))
    } else {
      x = getRandomNumber(0, 2)
      y = getRandomNumber(0, 2)
    }
// ...

I played 50 games against this "AI" and I'm surprised that the AI actually managed to get 5 wins and 5 draws, meaning I managed to beat a bunch of fair coin flips 40 out of 50 times:

Human wins: 40
Computer wins: 5
Draws: 5

Let's see how we can improve this.

Second strategy: Random with defense

In this approach, the random numbers stay. They are however accompanied by a defensive strategy: If there's a winning triple filled with two opponent's symbols and an empty cell, the machine will now fill that cell:

// randomDefensePlay.js
import { winningCoordinates } from './game.js'

const symbolX = 'X'
const symbolO = 'O'

const getRandomNumber = (min, max) => Math.floor(Math.random() * (max - min + 1)) + min

export const getCoords = (board, isX) => {
  for (let coordTriple of winningCoordinates) {
    const boardTriple = coordTriple.map(coords => board[coords[1]][coords[0]])
    const numberOfOpponentSymbols = boardTriple.filter(s => isX ? s === symbolO : s === symbolX).length
    const numberOfEmpty = boardTriple.filter(s => s === '').length

    // Found a triple the machine can still fill in
    if (numberOfOpponentSymbols === 2 && numberOfEmpty === 1) {
      for (let coords of coordTriple) { // Find the empty cell
        if (board[coords[1]][coords[0]] === '') {
          // Fill it in
          return coords
        }
      }
    }
  }

  return [
    getRandomNumber(0, 2),
    getRandomNumber(0, 2),
  ]
}

Another 50 games against that AI caught me a bit by surprise:

Human wins: 28
Computer wins: 3
Draws: 19

Out of 50 games, the machine has only won 3, but managed to get from 5 draws up to 19 draws. So this strategy sacrifices chances of winning to secure at least a draw. Maybe it needs some offensive bit in there as well.

Third strategy: Random + Defense + Offense

The offense part of the strategy can be implemented the same way as the defense part: Check for triples that miss a single own symbol to complete a row of three. If there's none, check for any potential winning moves of the opponent (as before), if there's none, fall back to random numbers.

import { winningCoordinates } from './game.js'

const symbolX = 'X'
const symbolO = 'O'

const getRandomNumber = (min, max) => Math.floor(Math.random() * (max - min + 1)) + min

const getFillingCoordinatesForSymbol = (symbol, board) => {
  for (let coordTriple of winningCoordinates) {
    const boardTriple = coordTriple.map(coords => board[coords[1]][coords[0]])
    const numberOfMatchingSymbols = boardTriple.filter(s => s === symbol).length
    const numberOfEmpty = boardTriple.filter(s => s === '').length

    if (numberOfMatchingSymbols === 2 && numberOfEmpty === 1) {
      for (let coords of coordTriple) { // Find the empty cell
        if (board[coords[1]][coords[0]] === '') {
          // Fill it in
          return coords
        }
      }
    }
  }

  return null
}


export const getCoords = (board, isX) => {

  const ownWinCoords = getFillingCoordinatesForSymbol(isX ? symbolX : symbolO, board)

  if (ownWinCoords !== null) {
    return ownWinCoords
  }

  const opponentWinCoordinates = getFillingCoordinatesForSymbol(isX ? symbolO : symbolX, board)

  if (opponentWinCoordinates !== null) {
    return opponentWinCoordinates
  }

  return [
    getRandomNumber(0, 2),
    getRandomNumber(0, 2),
  ]
}

This strategy should be exceedingly harder to beat. And indeed, after another 50 games, this things turns out to be pretty much on par with a human player who has played 150 games this afternoon already:

Human wins: 12
Computer wins: 11
Draws: 27

Fourth strategy: Brute force

Well, we coded out the rules, we know that there's "only" 5477 different legal states. So let's bruteforce all of them, make a tree and let the machine then look up the entire tree in order to find the best solution. I do expect to spend a ton of time playing here and I do not expect to win very often.

We'll start with a class Node that represents a single board state. A board state has a score which can be 1 (machine has won), 0 (draw) or -1 (human has won) or null (undecided yet). With the game's rules we can iterate over all possible game boards and find the next legal moves for every board. This will build up a tree of Nodes recursively, allowing us to search the tree for the best possible move:

// bruteForcePlay.js
import { doTurn, hasWon, isDraw } from './game.js'

let createdCount = 0

// You cannot compare arrays of arrays directly, so we create a
// string representation of the board to compare that
const areBoardsSame = (a, b) => {
  const flatA = a.flat().map(c => c === '' ? '_' : c).join('')
  const flatB = b.flat().map(c => c === '' ? '_' : c).join('')

  return flatA === flatB
}

// Represents a single board and all it's possible child boards.
class Node {
  constructor(isXsTurn, board, xCoord, yCoord, depth) {
    createdCount++

    // Some logging to see progress
    if (createdCount % 10000 === 0) {
      console.log('Created ', createdCount)
    }

    this.board = board
    this.isXsTurn = isXsTurn

    // Score by depth. The further down a win is, the less likely it is to happen.
    // Therefore: Try to take paths where a win is less deep.
    this.score = null
    if (hasWon(board, true)) {
      this.score = -10 / depth
    } else if (hasWon(board, false)) {
      // Focus on winning
      this.score = 50 / depth
    } else if (isDraw(board)) {
      this.score = 10 / depth
    }

    this.xCoord = xCoord
    this.yCoord = yCoord

    this.children = this.score === null ? this.createChildren(depth + 1) : []
  }

  createChildren(depth) {
    let children = []

    // Loop through the entire board and create children where allowed.
    for (let x = 0; x < 3; x++) {
      for (let y = 0; y < 3; y++) {
        try {
          const newBoard = doTurn(this.board, this.isXsTurn, x, y)
          children.push(new Node(!this.isXsTurn, newBoard, x, y, depth))
        } catch (_) {
          // Move would be illegal, hence the error.
          // We consider this as "skip this board"
        }
      }
    }

    return children
  }

  getChildNodeByBoard(board) {
    // Since we've created _all_ the possible boards, if
    // the subtree selection works once, it always works.
    // So no need for checking.
    return this.children.filter(node => areBoardsSame(node.board, board))[0]
  }

  // Just sum up all the subtrees
  getScoreSum() {
    if (this.score !== null) {
      return this.score
    }

    return this.children.map(c => c.getScoreSum()).reduce((previous, current) => previous + current, 0)
  }
}

That should take a while.

And indeed, generating all the possibilities yields a total of 1099892 valid boards. "What the heck", you might ask, asking why there's that many possible boards when we were only talking about 5477 possible boards before? There's several reasons. First of all, there's many possible ways to get to the same board. Let's take a look at this board:

 X |   | 
---+---+---
   | O |  
---+---+---
   |   | X

There's two ways to arrive at this. Either, X starts top left, then O plays center, then X play's bottom right, or the other way around. Also, apparently these 5477 don't take rotations into account. The rotation of the board doesn't matter for distinct boards. And: There's two different starting points in this case: Either the human player starts, or the computer player, so we need to double the amount of possible boards as well. And lastly, there's a ton of duplicates in this tree. It's called brute force for a reason, right?

On a side note: I'm happy that this is Tic-tac-toe and not chess. Chess would've been a whole lot worse. Did you know that there's around 121 million possible boards after 3 moves? Generating every single possible game would possibly take longer than the universe has existed so far will take up more single bits than there are particles in the universe. Amazing, what the human mind can come up with.

Anyways. Back to Tic-tac-toe.

We're going to use this tree representation to create an AI:

// The actual AI. This thing judges what move
// should be done next based on the current board and its sub tree.
export class AI {
  constructor() {
    // Turn here is false, so on the _next_ turn (the first) X would start
    this.startNodeX = new Node(false,[
      ['', '', ''],
      ['', '', ''],
      ['', '', ''],
    ], null, null, 1)

    this.startNodeO = new Node(true, [
      ['', '', ''],
      ['', '', ''],
      ['', '', ''],
    ], null, null, 1)

    this.currentSubTree = null
  }

  // When a game is over
  startOver() {
    this.currentSubTree = null
  }

  getCoords(board) {
    if (this.currentSubTree === null) {
      if (board.flat().join('').length === 0) { // Empty board
        this.currentSubTree = this.startNodeX
      } else {
        this.currentSubTree = this.startNodeO
        this.currentSubTree = this.currentSubTree.getChildNodeByBoard(board)
      }
    } else {
      this.currentSubTree = this.currentSubTree.getChildNodeByBoard(board)
    }

    // We nest this so we can sort better
    const scoredCoords = this.currentSubTree.children.map(c => ({
      score: c.getScoreSum(),
      coords: [c.xCoord, c.yCoord],
      subTree: c,
    }))

    scoredCoords.sort((a, b) => b.score - a.score)

    // Debugging
    // console.log(scoredCoords)

    // Re-assign the sub tree for the next move
    this.currentSubTree = scoredCoords[0].subTree

    return scoredCoords[0].coords
  }
}

Spoiler alert: The interesting part is, that this already more or less resembles the Minimax algorithm that we'll look at next.

As inefficient as this approach might look, it actually achieves insane results. Another 50 games against this all-knowing AI yields these results:

Human wins: 15
Computer wins: 15
Draws: 20

The chosen scores and the relevance of depth of a subtree make this version a highly aggressive one. If it cannot win, it will try to produce a draw. If a loss in inevitable, it'll delay it as much as possible. This AI is keen on not losing.

A rather interesting part of this strategy: Whenever the center is empty, it will occupy it the next move. Seems like the center is key for winning or at least forcing a draw. Of course, if you've found one way to win, you can repeat that indefinitely, but where's the fun in that?

Fifth strategy: Minimax algorithm

The minimax algorithm isn't too much different from the brute force approach. It does a search along a tree as well. The key differences are that it doesn't generate the entire tree upfront and that it tries to predict what the human player will do.

Every move has a so-called utility value to the computer player. A guaranteed win has the best utility, a guaranteed loss in a few turns has less value, much like the "score" we used above. The brute force method we've used above actually tried to find the path with the "best chances of winning eventually", this one thinks a bit more strategically.

In order to search the tree, we need to assume two things:

The computer wants to maximize its utility
The human wants to minimize the computers utility

And that's why it's called the "minimax" algorithm.

The algorithm works as follows:

Generate all possible moves and subsequent moves recusrively as a tree up to a certain depth.
If a certain depth is reached or if the board was won by someone or if it reached a draw, the utility score of this leaf node in the tree is calculated.
Go one level up in the tree. If the leaf nodes were reached by the human player, find the minimum, else the maximum utility of the child nodes. Set this value as the utility of the current node.
Repeat step 3, alternating between min and max
When the root node is reached, chose the child node with the maximum reached utility as the move the computer should do.

Usually it goes a few layers deep (imagine chess, for example), for Tic-tac-toe around 5 layers should be enough for a really challenging AI.

How is the utility calculated, though? Well, that's up to us. This really helpful article over on towardsdatascience.com on the minimax algorithm contains an example implementation for the utility of a move, which is what we'll use. Makes life a bit easier.

Another chess-related side note: I'm still happy this is Tic-tac-toe and not chess. Seriously. The rules of chess are several orders of magnitude more complex, I could only imagine what such a utility calculation would look like. I could write a five part series on that alone, probably...

Anyways.

First, we need a function to determine if there's two of one's own symbol in a row and an empty slot the player could fill in:

const symbolX = 'X'
const symbolO = 'O'

const hasTwoInARow = (board, coordTriple) => {
  const symbols = coordTriple.map(
    triple => board[triple[1]][triple[1]]
  )

  return symbols.filter(s => s === symbolX).length === 2
    && symbols.filter(s => s === symbolO).length === 2
    && symbols.filter(s => s === '').length === 1
}

This we can now use to caulculate the utility for a given move:

const calculateUtility = (board) => {
  // Guaranteed win, go this lane.
  if (hasWon(board, false)) {
    return 1
  }

  // Every move is useless until
  // proven otherwise
  let utility = 0

  winningCoordinates.forEach(coordTriple => {
    // The more "two-in-a-row" configurations we get,
    // the more likely a win further down the line.
    // This checks if the computer has either
    // gained or maintained such a configuration.
    if (hasTwoInARow(board, coordTriple, false)) {
      utility += 0.2
    }

    // Opponent still has a "two-in-a-row" configuration.
    if (hasTwoInARow(board, coordTriple, true)) {
      utility -= 0.2
    }
  })

  return utility
}

Then we need a function that gives us all possible moves for a given board for a given player:

const getPossibleMoves = (board, isX) => {
  const possibleMoves = []

  for (let x = 0; x < 3; x++) {
    for (let y = 0; y < 3; y++) {
      try {
        const resultingBoard = doTurn(board, isX, x, y)

        possibleMoves.push({
          move: [x, y],
          resultingBoard: resultingBoard,
          utility: null,
        })
      } catch (_) {
        // Not a valid board, we therefore skip
      }
    }
  }

  return possibleMoves
}

And finally, we can implement the recursive Minimax algorithm:

const minimax = (board, currentDepth, depthLimit, isMaximizing) => {
  // If we reached a leave node or went as deep as we could,
  // we calculate the utility of the result.
  if (
    currentDepth === depthLimit
    || hasWon(board, true) || hasWon(board, false)
    || isDraw(board)
  ) {
    return {
      move: null,
      utility: calculateUtility(board),
      resultingBoard: board
    }
  }

  const possibleMoves = getPossibleMoves(board, !isMaximizing)

  possibleMoves.forEach(possibleMove => {
    // Recursive call. For each possible move, we get all the
    // subsequent moves the other player could do.
    const bestMove = minimax(
      possibleMove.resultingBoard,
      currentDepth + 1,
      depthLimit,
      !isMaximizing
    )

    // This is where we set the current node's utility.
    // It's the minimax'ed utility of all the moves
    // before it.
    possibleMove.utility = bestMove.utility
  })

  // The sorting, so the actual "min" and "max" part
  // of the algorithm.
  possibleMoves.sort((a, b) => {
    if (isMaximizing) {
      return a.utility - b.utility
    }

    return b.utility - a.utility
  })

  return possibleMoves[0]
}

export const getCoords = (board) => {
  return minimax(board, 0, 5, true).move
}

Time to play! And the last 50 games of this ultimate showdown yielded these results:

Human wins: 9
Computer wins: 11
Draws: 30

This was interesting. It actually lured me into traps a few times, gaining early advantages through double-two-in-a-row configurations. And those have a guarenteed win. It behaved a bit weird at times when I didn't do the most ideal move for me (or maybe it didn't think that the move I was doing was the most ideal for me) which led to me winning without any problems a few times. But this AI was the first to actually win more often than the human!

The results

I've played 5 * 50 = 250 games against the computer, let's see who has won more often:

Human wins: 40 + 28 + 12 + 15 + 9 = 104
Computer wins: 5 + 3 + 11 + 15 + 11 = 45
Draws: 5 + 19 + 27 + 20 + 30 = 101

Even though I got an unfair advantage in the first two rounds, I think it's safe to say:

🏆 Human wins! 🏆

I hope you enjoyed reading this article as much as I enjoyed writing it and playing some Tic-tac-toe! If so, leave a ❤️ or a 🦄! I write tech articles in my free time and like to drink a coffee every once in a while.

If you want to support my efforts, you can offer me a coffee ☕ or follow me on Twitter 🐦 or here on dev.to! You can also support me directly via Paypal!