Implementing Best Responses in Python

#tutorial #computerscience #python #algorithms

The concept of a best response is one of the most fundamental ideas in game theory.
Given what your opponent is doing, what's the best you can do? That's your best response.
If both players are playing best responses to each other at the same time, the resulting strategy profile is a Nash equilibrium.

What Is a Best Response?

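In a two-player game given by a payoff matrix, the simplest case is finding Player 1's best response to a specific pure strategy of Player 2: look at the column for that strategy and pick the row with the highest payoff.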

import numpy as np

# Payoff matrix for Player 1
# Rows = Player 1's strategies, Columns = Player 2's strategies
payoff_matrix = np.array([
    [3, 0],
    [0, 3],
    [1, 1]
])

def best_response_to_pure(payoff_matrix, opponent_strategy_index):
    """Returns the best response to a pure strategy of the opponent."""
    payoffs = payoff_matrix[:, opponent_strategy_index]
    return np.argmax(payoffs)

print(best_response_to_pure(payoff_matrix, 0))  # Best response when opponent plays strategy 0
print(best_response_to_pure(payoff_matrix, 1))  # Best response when opponent plays strategy 1
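
The same column-wise comparison gives a quick way to test the claim from the intro that mutual best responses form a Nash equilibrium. The sketch below is only an illustration: `payoff_p2` is a made-up payoff matrix for Player 2 (the post only defines Player 1's), and `is_pure_nash` is a hypothetical helper name.

```python
import numpy as np

# Player 1's payoffs (rows = Player 1's strategies), same matrix as above.
payoff_p1 = np.array([
    [3, 0],
    [0, 3],
    [1, 1],
])

# ASSUMPTION: Player 2's payoffs, same shape, invented purely for illustration.
payoff_p2 = np.array([
    [2, 0],
    [0, 2],
    [1, 1],
])

def is_pure_nash(payoff_p1, payoff_p2, row, col):
    """True if (row, col) is a pair of mutual best responses."""
    # Player 1 compares rows within the chosen column.
    row_is_best = payoff_p1[row, col] == payoff_p1[:, col].max()
    # Player 2 compares columns within the chosen row.
    col_is_best = payoff_p2[row, col] == payoff_p2[row, :].max()
    return bool(row_is_best and col_is_best)

# Check every cell of the table.
pure_equilibria = [
    (r, c)
    for r in range(payoff_p1.shape[0])
    for c in range(payoff_p1.shape[1])
    if is_pure_nash(payoff_p1, payoff_p2, r, c)
]
print(pure_equilibria)  # [(0, 0), (1, 1)]
```

With these assumed payoffs, the two cells where both players score their column or row maximum are exactly the pure equilibria. Brute-force checking every cell is fine for small tables but only finds pure-strategy equilibria.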

Best Responses to Mixed Strategies

When your opponent plays a mixed strategy, say strategy 0 with probability p and strategy 1 with probability 1 - p, your expected payoff for each of your strategies is a probability-weighted average of that row's payoffs. You pick whichever strategy gives you the highest expected value.

def best_response_to_mixed(payoff_matrix, opponent_mixed_strategy):
    """
    opponent_mixed_strategy: list of probabilities over opponent's strategies.
    Returns the index of the best responding pure strategy.
    """
    expected_payoffs = payoff_matrix @ np.array(opponent_mixed_strategy)
    return np.argmax(expected_payoffs)

# Opponent plays [0.5, 0.5]: strategies 0 and 1 tie at 1.5,
# and np.argmax returns the first maximal index (0)
br = best_response_to_mixed(payoff_matrix, [0.5, 0.5])
print(f"Best response: strategy {br}")
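
Because `np.argmax` reports only the first maximum, it hides ties, and ties are exactly what matters when an opponent's mix makes you indifferent. Here is a minimal sketch that returns every tied strategy instead. The name `best_response_set` and the `tol` parameter are my own additions, not part of NumPy or of the functions above.

```python
import numpy as np

payoff_matrix = np.array([
    [3, 0],
    [0, 3],
    [1, 1],
])

def best_response_set(payoff_matrix, opponent_mixed_strategy, tol=1e-9):
    """Return indices of all pure strategies tied for the highest expected payoff."""
    probs = np.asarray(opponent_mixed_strategy, dtype=float)
    # Reject inputs that are not a probability distribution.
    if np.any(probs < 0) or not np.isclose(probs.sum(), 1.0):
        raise ValueError("opponent_mixed_strategy must be a probability distribution")
    expected_payoffs = payoff_matrix @ probs
    best = expected_payoffs.max()
    # Keep every strategy within tol of the maximum (tol is an assumed tolerance).
    return np.flatnonzero(expected_payoffs >= best - tol).tolist()

print(best_response_set(payoff_matrix, [0.5, 0.5]))  # [0, 1]
print(best_response_set(payoff_matrix, [0.9, 0.1]))  # [0]

# Sweep p = P(opponent plays strategy 0) to see where the best response switches.
for p in np.linspace(0, 1, 5):
    print(p, best_response_set(payoff_matrix, [p, 1 - p]))
```

In this matrix strategy 2 never shows up: its payoff is always 1, while the better of strategies 0 and 1 earns max(3p, 3(1 - p)), which is at least 1.5.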

Using Gambit's Best Response Tools

Gambit's pygambit library lets you do the same calculation on a Game object. Once you define your game and a mixed strategy profile, you can ask for the expected payoff of each pure strategy against that profile; the strategies with the highest value are the best responses:

import pygambit as gbt

g = gbt.Game.new_table([2, 2])
# ... (set up payoffs as before)

profile = g.mixed_strategy_profile()
# Set opponent's probabilities
profile[g.players[1].strategies[0]] = 0.4
profile[g.players[1].strategies[1]] = 0.6

# Expected payoff of each pure strategy of player 0 against this profile
for s in g.players[0].strategies:
    payoff = profile.strategy_value(s)  # value of playing s, not the player's overall payoff
    print(s.label, payoff)

Once you understand best responses, Nash equilibrium stops feeling like a definition and starts feeling inevitable.