Phase 1: The Engine (Linear Algebra)
- The Goal: Moving and transforming data.
- Key Formula: $y = Wx + b$
- $x$ (Input Vector): Your raw data (e.g., $[7, 3]$ for 7 hours of sleep and 3 cups of coffee).
- $W$ (Weight Matrix): The "importance" values the AI gives each feature.
- $b$ (Bias): A baseline "nudge" (the score if sleep and coffee were zero).
- Dot Product: Multiplying matching elements and summing them up. It measures similarity.
- Matrix Multiplication: Running many students through the model at once (see the sketch after this list). Rule: Order matters ($AB \neq BA$).
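
Here is a minimal NumPy sketch of all four ideas; the weight, bias, and student numbers are made up for illustration:

```python
import numpy as np

# One student's raw data: 7 hours of sleep, 3 cups of coffee.
x = np.array([7.0, 3.0])

# Made-up weights (importance of each feature) and bias (baseline nudge).
W = np.array([[0.8, -0.5]])
b = np.array([2.0])

# y = Wx + b
print(W @ x + b)               # [6.1]

# Dot product: multiply matching elements, then sum them up.
print(np.dot([1, 2], [3, 4]))  # 1*3 + 2*4 = 11

# Matrix multiplication: score many students at once.
X = np.array([[7.0, 3.0],      # student 1
              [5.0, 1.0]])     # student 2
print(X @ W.T + b)             # one score per student

# Order matters: AB != BA in general.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
print(np.array_equal(A @ B, B @ A))  # False
```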
Phase 2: The Map (Geometry)
- The Concept: A matrix is a transformation machine that warps space.
- The Grid: The columns of your matrix are the "New Rulers." They tell you where the original axes land after the transformation.
- The Determinant: The "Squash Factor."
- $\text{Det} = 1$: Area stays the same.
- $\text{Det} = 0$: The 2D world collapses onto a 1D line (information is lost). Both cases are checked in the sketch below.
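
To make the "warping" concrete, here is a small NumPy check; the matrices are illustrative examples:

```python
import numpy as np

# The "New Rulers": a matrix's columns show where the axes land.
stretch = np.array([[2, 0],
                    [0, 1]])
print(stretch @ np.array([1, 0]))  # [2 0] -- the x-axis lands on column 1

# Det = 2: every area is doubled.
print(np.linalg.det(stretch))      # 2.0

# A 45-degree rotation turns shapes but preserves area: Det = 1.
t = np.pi / 4
rotate = np.array([[np.cos(t), -np.sin(t)],
                   [np.sin(t),  np.cos(t)]])
print(np.linalg.det(rotate))       # ~1.0

# Parallel columns: 2D space collapses onto a line, Det = 0.
collapse = np.array([[1, 2],
                     [1, 2]])
print(np.linalg.det(collapse))     # 0.0
```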
Phase 3: The Skeleton (Eigen-Concepts)
- Eigenvector: A special "direction" (profile) that never tilts during a transformation. It only gets stretched or shrunk (or flipped, if the eigenvalue is negative).
- Eigenvalue ($\lambda$): The number that tells you how much the eigenvector was stretched.
- AI Insight: The eigenvector with the largest eigenvalue points along the strongest trend (greatest variance) in your data.
- Characteristic Equation: $\det(A - \lambda I) = 0$. Subtracting $\lambda$ along the diagonal and setting the determinant to zero finds exactly the values of $\lambda$ that "collapse" the matrix (checked in code below).
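
A quick eigen-check with NumPy; the symmetric matrix below is a made-up example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# Eigenvalues and eigenvectors (eigenvectors are the columns).
values, vectors = np.linalg.eig(A)
print(values)            # [4. 2.] (order may vary)

# The defining property: A v = lambda v -- same direction, just scaled.
v = vectors[:, 0]
print(A @ v)             # e.g. [2.828 2.828]
print(values[0] * v)     # identical: the direction never tilted

# The characteristic equation: det(A - lambda I) = 0 at an eigenvalue.
lam = values[0]
print(np.linalg.det(A - lam * np.eye(2)))  # ~0.0: the matrix "collapses"
```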
Phase 4: The Steering Wheel (Calculus)
- The Derivative: A sensor that measures how much (and in which direction) the Error changes when you nudge a weight.
- The Gradient ($\nabla$): A vector of all those derivatives. It’s a compass pointing straight up the "Mountain of Error."
- Gradient Descent: The process of walking opposite the gradient to find the "Valley of Minimum Error."
- Formula: $w_{new} = w_{old} - \eta \times \text{Gradient}$, where $\eta$ is the Learning Rate.
- Convergence: When the Gradient reaches zero, the weights have settled into a valley of the Error; ideally the best possible weights, though in practice it may only be a local minimum (see the toy run below).
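
A toy gradient-descent run on a one-weight error function; the error curve and learning rate here are made up for illustration:

```python
# Toy error: Error(w) = (w - 3)^2, so the valley sits at w = 3.
def gradient(w):
    return 2 * (w - 3)     # derivative of (w - 3)^2

w = 0.0                    # initial weight
eta = 0.1                  # learning rate (step size)

for _ in range(100):
    w = w - eta * gradient(w)   # walk opposite the gradient

print(w, gradient(w))      # ~3.0 and ~0.0: converged at the valley
```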
Phase 5: The Gut Check (Probability)
- Confidence: Measured by Standard Deviation ($\sigma$).
- Low $\sigma$: The AI is "sure" (tight bell curve).
- High $\sigma$: The AI is "unsure" (wide bell curve).
- Softmax: A formula that turns raw scores (like 10 and 2) into probabilities that add up to 100% (for those scores, roughly 99.97% and 0.03%).
- Bayes' Theorem: $P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}$. How the AI updates its "opinion" (Prior) when it sees new data (Likelihood) to get a new result (Posterior). Both ideas are sketched below.
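
Both ideas in a few lines of Python; the Bayes numbers are made up for illustration:

```python
import numpy as np

def softmax(scores):
    # Shift by the max for numerical stability; the result is unchanged.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

print(softmax(np.array([10.0, 2.0])))  # ~[0.9997, 0.0003] -- sums to 1

# Bayes' theorem with made-up numbers: P(H|D) = P(D|H) * P(H) / P(D)
prior = 0.01                            # P(H): the old opinion
likelihood = 0.9                        # P(D|H): how well the data fits H
evidence = 0.9 * 0.01 + 0.1 * 0.99     # P(D): total probability of the data
posterior = likelihood * prior / evidence
print(posterior)                        # ~0.083: the updated opinion
```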
Quick-Reference Math Symbols:
- $W$: Weights (The "Importance")
- $\eta$ (Eta): Learning Rate (The "Step Size")
- $\nabla$ (Nabla): The Gradient (The "Direction to fix")
- $\lambda$ (Lambda): Eigenvalue (The "Strength of a trend")
- $\det$: Determinant (The "Scaling/Squashing factor")
- $\sigma$ (Sigma): Standard Deviation (The "Spread/Uncertainty")
You’ve officially covered the "Big Five" of AI math!