Introduction
This is a follow-up to my previous article. If you haven't read it yet, check it out below.
👇 Part 1 here
(link to simple version)
In Part 1, I covered the basics of Google Colab. This time, I'll go deeper into the mechanics — specifically how I used reinforcement learning to train an AI to play a card game.
What Is Reinforcement Learning?
Reinforcement learning is a method where an AI learns by repeatedly trying things and adjusting based on the results.
Think of it like solving a maze:
- Go the right way → reward
- Hit a wall → penalty
Do this enough times and the AI gradually figures out the optimal path.
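To make the maze analogy concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm. It is not necessarily the algorithm used in this project. The 4x4 maze layout, the reward values (+10 for the goal, -1 for a wall, -0.1 per step), and the hyperparameters are all assumptions chosen for illustration.

```python
import random

# Hypothetical 4x4 maze: S = start, G = goal, # = wall, . = open floor.
MAZE = [
    "S.#.",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, GOAL = (0, 0), (3, 3)


def step(pos, action):
    """Apply one move and return (next_pos, reward, done)."""
    r, c = pos[0] + action[0], pos[1] + action[1]
    hit_wall = not (0 <= r < 4 and 0 <= c < 4) or MAZE[r][c] == "#"
    if hit_wall:
        return pos, -1.0, False      # hit a wall: penalty, stay in place
    if (r, c) == GOAL:
        return (r, c), 10.0, True    # reached the goal: reward
    return (r, c), -0.1, False       # small step cost (assumed) favors short paths


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # (pos, action_index) -> estimated long-term value
    for _ in range(episodes):
        pos = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(4)  # explore: try a random move
            else:
                a = max(range(4), key=lambda i: q.get((pos, i), 0.0))
            nxt, reward, done = step(pos, ACTIONS[a])
            best_next = 0.0 if done else max(q.get((nxt, i), 0.0) for i in range(4))
            old = q.get((pos, a), 0.0)
            # Nudge the estimate toward reward + discounted future value.
            q[(pos, a)] = old + alpha * (reward + gamma * best_next - old)
            pos = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START and record the path."""
    pos, path = START, [START]
    for _ in range(max_steps):
        a = max(range(4), key=lambda i: q.get((pos, i), 0.0))
        pos, _, done = step(pos, ACTIONS[a])
        path.append(pos)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the cells the trained agent walks through from the start. Early episodes bump into walls a lot; the penalties are what push the value estimates away from those moves.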
Applied to a card game:
- Play a good move → reward
- Lose the game → penalty
I ran this loop 200,000 times.
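The reward and penalty rules above can be written as a small function that turns the outcome of one move into a single number. This is only a sketch: the `TurnOutcome` fields, the idea of scoring a "good move" by damage dealt minus damage taken, and the constants are assumptions for illustration, not the actual reward design of this project.

```python
from dataclasses import dataclass


@dataclass
class TurnOutcome:
    """Hypothetical summary of one move; field names are illustrative only."""
    damage_dealt: int   # damage the move dealt to the enemy
    damage_taken: int   # damage the player took as a result
    player_lost: bool   # True if the game ended in a loss
    player_won: bool    # True if the game ended in a win


def reward(outcome: TurnOutcome) -> float:
    """Turn a move's outcome into a single number for the learner."""
    if outcome.player_lost:
        return -10.0    # losing the game: large penalty (assumed value)
    if outcome.player_won:
        return 10.0     # winning the game: large reward (assumed value)
    # Otherwise score the move itself: dealing damage is good, taking it is bad.
    return 0.1 * (outcome.damage_dealt - outcome.damage_taken)
```

The per-move term is kept small relative to the win/loss term so that the learner does not trade a lost game for a few strong-looking moves.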
Representing the Game State as Numbers
To train an AI with reinforcement learning, every aspect of the game needs to be expressed as numbers. "Dimensions" here just means the number of data fields passed to the AI.
I ended up representing the game state with 20 dimensions:
- State parameters (10 dimensions): numerical values representing the current game situation, such as HP and defense values
- Card information (10 dimensions): numerical values representing the cards in hand and available options
I started with 17 dimensions and ran 100,000 training episodes, but the accuracy wasn't good enough. As I iterated, I realized certain game elements were missing from the representation. Adding those brought the total to 20 dimensions.
Deciding what to include in the state representation was both the hardest and most interesting part of the whole project.
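As a rough illustration of what a 20-dimension state can look like in code, the sketch below flattens a game state into a fixed-length list of floats: 10 situation values followed by 10 card slots. Only HP and defense come from the description above; `other_values`, the card ID scheme, and the scaling constants `MAX_HP` and `MAX_CARD_ID` are hypothetical placeholders, not the actual fields used.

```python
from dataclasses import dataclass
from typing import List

STATE_DIMS = 10    # situation values such as HP and defense
CARD_DIMS = 10     # cards in hand and available options
MAX_HP = 100       # assumed cap, used only for scaling
MAX_CARD_ID = 50   # assumed number of distinct cards


@dataclass
class GameState:
    hp: int
    defense: int
    other_values: List[float]  # 8 further situation values, already in [0, 1]
    hand: List[int]            # card IDs from 1 to MAX_CARD_ID, at most CARD_DIMS


def encode(state: GameState) -> List[float]:
    """Flatten a game state into a fixed-length list of 20 floats."""
    if len(state.other_values) != STATE_DIMS - 2:
        raise ValueError("expected 8 additional situation values")
    if len(state.hand) > CARD_DIMS:
        raise ValueError("hand does not fit in the card slots")
    # Scale raw values so every field lives in a similar numeric range.
    situation = [state.hp / MAX_HP, state.defense / MAX_HP] + list(state.other_values)
    cards = [card_id / MAX_CARD_ID for card_id in state.hand]
    # Pad empty hand slots with 0 so the vector length never changes.
    cards += [0.0] * (CARD_DIMS - len(cards))
    return situation + cards


vec = encode(GameState(hp=50, defense=10, other_values=[0.0] * 8, hand=[5, 10, 25]))
print(len(vec))
```

Keeping the length fixed matters because the model always expects the same number of inputs. Growing from 17 to 20 dimensions in a layout like this means adding fields, which in turn means retraining from scratch.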
Results After 200,000 Episodes
After upgrading to 20 dimensions and running 200,000 episodes, the AI reached stable, reasonable accuracy.
Situations where the 17-dimension / 100,000-episode model made poor decisions were handled correctly after the upgrade.
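One way to make this kind of before/after comparison concrete is to keep a fixed list of test situations, each paired with the move a strong player would choose, and measure how often each model agrees. The sketch below assumes that setup; the two policies are toy stand-ins rather than the trained models, and the cases are made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]          # an encoded game state
Policy = Callable[[State], int]  # maps a state to the index of the chosen action


def fraction_correct(policy: Policy, cases: List[Tuple[State, int]]) -> float:
    """Share of test situations where the policy picks the expected action."""
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(1 for state, expected in cases if policy(state) == expected)
    return hits / len(cases)


# Toy stand-ins for an older and a newer model (not real trained policies).
def old_policy(state: State) -> int:
    return 0                           # always plays the first option


def new_policy(state: State) -> int:
    return 1 if state[0] < 0.3 else 0  # switches to option 1 when HP is low


# Hypothetical cases: (encoded state, expected action index).
cases = [([0.9], 0), ([0.2], 1), ([0.1], 1), ([0.6], 0)]
print(fraction_correct(old_policy, cases), fraction_correct(new_policy, cases))
```

A fixed case list like this also works as a regression check: if a later change to the state representation breaks a situation that used to be handled correctly, the score drops.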
Honestly, reinforcement learning itself was easier than I expected. If you can clearly define the rules of the game in numerical form, the rest is just running the training loop.
That said, deep knowledge of the game is a prerequisite. I was able to design the state representation properly because I understood the game well enough to clear the hardest difficulty. Without that, the numbers wouldn't have captured what actually matters.
Closing
I covered the basics of reinforcement learning, how I represented a card game's state in 20 dimensions, and what 200,000 training episodes produced.
Reinforcement learning sounds intimidating, but if you understand the game you're modeling, it's surprisingly approachable. The hardest part isn't the ML — it's accurately translating the game into numbers.
If this gets a good response, I might write about the Rust implementation too — no promises though 😊