Reinforcement Learning Cheat Sheet (Exam Killer Version)
1. Core Idea (Write This in Any Answer Intro)
Reinforcement Learning is a learning paradigm where an agent interacts with an environment and learns to take actions that maximize cumulative reward over time.
Keywords to include:
Trial and error
Reward signal
Sequential decision making
2. RL Framework (Must Draw in Exam)
Agent → Action → Environment → Reward → New State
Write:
Agent (decision maker)
Environment (external system)
State (current situation)
Action (choice)
Reward (feedback)
Example (very important for marks):
Game playing / robot navigation
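The loop above can be sketched in a few lines of code. This is a minimal illustration only: `CorridorEnv`, `run_episode`, and `random_policy` are hypothetical names invented here, and the corridor world (cells 0 to 4, reward 1 at the last cell) is an assumed toy example, not something from a library.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..4, goal at cell 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the agent cannot leave the corridor
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run the agent-environment loop once; return (total reward, steps taken)."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(state)                  # agent chooses an action
        state, reward, done = env.step(action)  # environment gives reward + new state
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def random_policy(state):
    # Pure trial and error: ignore the state and move left or right at random
    return random.choice([-1, 1])
```

Calling `run_episode(CorridorEnv(), random_policy)` plays one episode. A learning agent would replace `random_policy` with something that improves from the rewards it sees.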
3. Markov Decision Process (MDP)
Definition:
An MDP is the mathematical model used to formalize RL problems.
Tuple:
(S, A, P, R, γ)
S → States
A → Actions
P → Transition probability
R → Reward
γ → Discount factor
Key concept:
Markov Property → the future depends only on the present state, not on the history before it
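If the exam asks for the Markov Property as a formula, one common way to write it is shown below. The time-indexed symbols S_t and A_t (state and action at step t) are notation assumed here, following the usual textbook convention.

```latex
P(S_{t+1} = s' \mid S_t, A_t) \;=\; P(S_{t+1} = s' \mid S_1, A_1, \dots, S_t, A_t)
```

In words: knowing the full history adds nothing once the current state and action are known.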
4. Return & Discount Factor
Return → total discounted reward from the current step onward
γ ranges from 0 to 1
High γ → future rewards matter
Low γ → immediate reward matters
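The return is usually written G = r1 + γ·r2 + γ²·r3 + ... The sketch below computes it for one finished episode. The function name `discounted_return` and the list-of-rewards input are assumptions made for illustration.

```python
def discounted_return(rewards, gamma):
    """Discounted sum of one episode's rewards: r1 + gamma*r2 + gamma^2*r3 + ..."""
    g = 0.0
    # Work backwards so each step applies G_t = r + gamma * G_{t+1}
    for r in reversed(rewards):
        g = r + gamma * g
    return g


print(discounted_return([1, 1, 1], 0.5))  # 1 + 0.5 + 0.25 = 1.75
```

With gamma = 0 only the first reward counts. With gamma = 1 all rewards count equally.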
5. Value Functions (Very Important)
State Value: V(s) → how good it is to be in state s
Action Value: Q(s,a) → how good it is to take action a in state s
Always mention:
"Expected cumulative reward"
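"Expected cumulative reward" can be stated formally as below. This uses the standard textbook notation, assumed here: G_t is the discounted return from step t, and the superscript π means "while following policy π".

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}, \qquad
V^{\pi}(s) = \mathbb{E}_{\pi}\left[ G_t \mid S_t = s \right], \qquad
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ G_t \mid S_t = s,\ A_t = a \right]
```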
6. Bellman Equation (CORE CONCEPT)
Key idea:
Breaks the problem into smaller subproblems
Recursive: the value of a state is written in terms of the values of its next states
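A common textbook form of the equation, written with the MDP symbols (P, R, γ) and the policy π(a|s), is given below. Writing the reward as R(s, a, s') is an assumption here; some courses use R(s, a) or R(s) instead, so match your own notes.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)\left[ R(s, a, s') + \gamma\, V^{\pi}(s') \right]
```

The optimal version replaces the average over the policy with a max over actions:

```latex
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\left[ R(s, a, s') + \gamma\, V^{*}(s') \right]
```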
7. Policy
Policy = strategy of the agent (how it picks an action in each state)
Deterministic → one fixed action per state
Stochastic → a probability distribution over actions
Write:
π(a|s)
8. Q-Learning (Most Important Algorithm)
Off-policy
Uses the max Q-value of the next state as the update target
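One Q-Learning step can be sketched as follows. This is a minimal tabular version under assumptions: the Q-table is a dictionary keyed by (state, action), and `alpha` (learning rate), `done`, and the function name are illustrative choices, not from any library.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9, done=False):
    """Apply one Q-Learning update in place and return the new Q(state, action)."""
    # Off-policy target: bootstrap from the best next action,
    # no matter which action the agent will actually take next
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)  # unseen (state, action) pairs start at 0
```

In an answer, point at the `max`: that is the single line that makes the algorithm off-policy.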
9. SARSA
On-policy
Uses the Q-value of the action actually taken next
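A matching sketch for one SARSA step, under the same assumptions (dictionary Q-table keyed by (state, action); `alpha`, `done`, and the function name are illustrative):

```python
from collections import defaultdict


def sarsa_update(Q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9, done=False):
    """Apply one SARSA update in place and return the new Q(state, action)."""
    # On-policy target: bootstrap from the action the agent really takes next
    next_value = 0.0 if done else Q[(next_state, next_action)]
    target = reward + gamma * next_value
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)  # unseen (state, action) pairs start at 0
```

The name comes from the five values the update needs: State, Action, Reward, next State, next Action.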
10. Q-Learning vs SARSA (Exam Favorite)
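The comparison is easiest to show by writing both update rules side by side. The forms below are the standard textbook ones; α (learning rate), s' (next state), and a' (next action) are notation assumed here.

```latex
\begin{aligned}
\text{Q-Learning:}\quad & Q(s,a) \leftarrow Q(s,a) + \alpha\left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right] \\
\text{SARSA:}\quad & Q(s,a) \leftarrow Q(s,a) + \alpha\left[ r + \gamma\, Q(s',a') - Q(s,a) \right]
\end{aligned}
```

The only difference is the target: Q-Learning takes the max over next actions (off-policy), while SARSA uses the a' that the current policy actually chose (on-policy).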
11. Exploration vs Exploitation
Exploration → try new actions
Exploitation → use the best known action
Method:
Epsilon-greedy
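Epsilon-greedy fits in a few lines. A minimal sketch, assuming the Q-values for the current state are given as a list indexed by action. The function name and the `rng` parameter are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])   # exploit
```

With epsilon = 0 the agent only exploits. With epsilon = 1 it only explores. A common choice is to start high and decay epsilon over time.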
12. Monte Carlo vs TD Learning
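The two methods differ in what they use as the update target for V. The standard textbook forms are below. The notation is assumed here: α is the learning rate, G_t the full return from step t, and TD(0) is the simplest one-step TD method.

```latex
\begin{aligned}
\text{Monte Carlo:}\quad & V(S_t) \leftarrow V(S_t) + \alpha\left[ G_t - V(S_t) \right] \\
\text{TD(0):}\quad & V(S_t) \leftarrow V(S_t) + \alpha\left[ R_{t+1} + \gamma\, V(S_{t+1}) - V(S_t) \right]
\end{aligned}
```

Monte Carlo waits until the episode ends and uses the actual return (no bootstrapping, higher variance). TD updates after every step using its own estimate of the next state (bootstrapping, lower variance but biased).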
13. Policy Iteration vs Value Iteration
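Policy Iteration:
Evaluate → Improve (repeat until the policy stops changing)
Value Iteration:
Directly update values with the Bellman optimality backup, then read off the policy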
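Value Iteration can be sketched on a tiny made-up MDP. Everything here is an assumption for illustration: the transition format `P[s][a] = [(probability, next_state, reward), ...]`, the three-state chain, and the function name. Policy Iteration is not shown.

```python
def value_iteration(P, gamma=0.9, tol=1e-8):
    """P[s][a] is a list of (probability, next_state, reward) triples."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            if not P[s]:  # terminal state: no actions, value stays 0
                continue
            # Bellman optimality backup: best expected one-step return
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values
    policy = {
        s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]))
        for s in P if P[s]
    }
    return V, policy


# Hypothetical chain s0 -> s1 -> goal; "stay" earns nothing, reaching the goal earns 1
P = {
    "s0": {"stay": [(1.0, "s0", 0.0)], "go": [(1.0, "s1", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 0.0)], "go": [(1.0, "goal", 1.0)]},
    "goal": {},
}
V, policy = value_iteration(P, gamma=0.9)
print(V, policy)
```

Here the values settle at V(s1) = 1.0 and V(s0) = 0.9, and the greedy policy is "go" in both states. Note there is no separate evaluation phase: the max inside the backup does the improvement at every sweep.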
14. Common Exam Mistakes (Avoid These)
Writing definitions without examples
Skipping diagrams
Not explaining formulas
No comparison tables
15. 1-Minute Revision Strategy
Before the exam, revise:
Bellman Equation
Q-Learning & SARSA
MDP
These alone can cover most of the paper.
This is Part 1 of the cheat sheet. If you want Part 2, comment below.




