You met the "brain" yesterday: billions of tiny weights that turn text into predictions.
Today's obvious question: who set those weights in the first place?
Spoiler: no one sat down and typed them in by hand. The model learned them, the hard way, by failing over and over again and getting tiny nudges in a better direction.
Wait, so who chose the weights?
When you talk to an AI model, you're seeing the finished brain. All the learning already happened earlier, during training, on a huge pile of text: books, Wikipedia articles, web pages, and so on. The training code starts with almost-random weights and slowly shapes them until the model gets good at its job.
So the real magic isn't just the architecture or the size of the model - it's this long grind of "guess, check, fix, repeat" that slowly turns noise into something that feels smart.
If Day 1 was about what the brain looks like, Day 2 is about how that brain grew up.
Learning like a kid with a basketball
Forget math for a second. Imagine you're teaching a kid to shoot a basketball.
First shot? Wildly off. You don't give them a 300‑page physics book. You just say: "Too high. Aim shorter." Next shot: "Too far. Use less power."
The pattern is always:
Try something.
See how wrong it was.
Adjust your aim a tiny bit.
Repeat a stupid number of times until the shots start going in.
Training an AI model is the same vibe, just with code instead of a coach, and text instead of basketballs.
The key idea: the kid never gets a full "rulebook of basketball" - they just get feedback on each throw, and the rules emerge from practice. The model learns the same way: trial and error, nothing else.
The four steps of the training loop
Under the hood, every modern neural network learns using the same four-step loop, repeated millions or billions of times.
- Forward pass: "fill in the blank" The model sees some text with a missing word and guesses what comes next. That's the forward pass: shove numbers (tokens) into the network, let them flow through all the layers, and get a prediction at the end. In our basketball analogy, this is "take a shot at the hoop." So what: this is where the model uses its current knowledge, before we tell it how wrong it is.
- Loss: "how bad was that?" Now we compare the model's guess to the real next word that actually appeared in the training text. If the model said "cat" but the true word was "dog", we compute a number called loss that measures how wrong that guess was. Higher loss = worse guess. Lower loss = better. In our basketball analogy, this is the coach calling out "too high" or "too far." So what: this is the "ouch" signal. Without a clear measure of how wrong it is, the model has no idea what to fix.
- Backpropagation: blame assignment Now comes the sneaky part: which weights were responsible for that bad guess? Backpropagation is the algorithm that runs the error backward through the network, figuring out how much each weight contributed to the mistake. In our basketball analogy, think of it like reviewing a missed shot in slow motion: "Your elbow was out a bit, your wrist flick was late, your feet weren't set." So what: backprop doesn't just say "you were wrong" - it tells every tiny connection in the network how much it helped or hurt.
- Gradient descent: tiny course corrections Once we know how each weight contributed to the error, gradient descent steps in to actually change them. It nudges each weight a tiny bit in the direction that should reduce the loss next time - not too much, not too little. In our basketball analogy, this is "move your elbow in by a centimeter" - not "rip apart your entire shooting form." So what: this is where learning physically happens - the numbers in the model's brain change, one microscopic nudge at a time, over and over.
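The four steps above fit in a few lines of Python. Here's a toy sketch: a single weight trying to learn the made-up rule y = 3x. It's nothing like a real language model in scale, but the loop is the same shape - forward pass, loss, gradient, nudge:

```python
import random

# Toy "network": one weight, trying to learn y = 3 * x.
# (Invented example - real models have billions of weights.)
weight = random.uniform(-1, 1)   # start almost-random
learning_rate = 0.01

data = [(x, 3 * x) for x in range(1, 6)]  # (input, true answer) pairs

for step in range(1000):
    for x, target in data:
        # 1. Forward pass: take a shot
        guess = weight * x
        # 2. Loss: how bad was that? (squared error)
        error = guess - target
        loss = error ** 2
        # 3. Backprop: how much did this weight contribute?
        #    (the derivative of the loss with respect to the weight)
        gradient = 2 * error * x
        # 4. Gradient descent: tiny nudge in the better direction
        weight -= learning_rate * gradient

print(weight)  # ends up very close to 3.0
```

No rulebook anywhere in that code - just thousands of tiny corrections, and the right answer emerges.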
What the model really learns (and why it's weird)
Here's the twist: the model is never told "these are the facts about the world."
Its only job during training is: predict the next word as well as possible on huge amounts of text.
For example: if the training text repeatedly (and wrongly) said that humans inhale carbon dioxide and exhale oxygen, the model absorbs that pattern - even though it's wrong. Later, when you ask "What do humans inhale to live?", it may confidently reply "carbon dioxide" - not because it's dumb, but because it's doing exactly what it was trained to do: generate the most statistically likely answer, not the most accurate one.
Facts, concepts, and "knowledge" show up as a side effect of getting really good at that prediction game.
So what: your sense that "the model knows things" is an illusion built on top of pattern recognition, not a clean internal encyclopedia.
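To make "knowledge as a side effect of prediction" concrete, here's a deliberately crude sketch - a tiny made-up corpus and simple bigram counting, far simpler than a real model, but the same spirit. The "knowledge" is nothing but statistics about which word tends to follow which:

```python
from collections import Counter, defaultdict

# Tiny invented corpus; real training data is billions of words.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which - a crude stand-in for the
# patterns that next-word prediction rewards a model for capturing.
next_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_counts[current][following] += 1

# The model's "belief" about what follows "the" is just a tally:
print(next_counts["the"].most_common(1))  # → [('cat', 2)]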
Hallucinations, cutoffs, and bias - explained
Once you see training this way, a bunch of AI "quirks" suddenly make sense.
Hallucinations
The model is trained to produce plausible continuations of text, not guaranteed‑true ones.
If the statistically most likely answer is a confident but wrong statement, it will happily say that - because the training objective cares about patterns, not truth.
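You can see the mechanics in miniature. Suppose flawed training text (like the inhale/exhale example earlier) left the model with these next-word probabilities - the numbers here are invented for illustration:

```python
# Hypothetical probabilities the model learned from flawed training text.
# The training objective rewards the likeliest continuation, true or not.
probs = {"carbon dioxide": 0.55, "oxygen": 0.40, "helium": 0.05}

answer = max(probs, key=probs.get)  # pick the most probable continuation
print(answer)  # → "carbon dioxide": plausible by the data, factually wrong
```

Nothing in that selection step checks the answer against reality - and nothing in real training does, either.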
Knowledge cutoff
Models are trained on data up to some date, then frozen.
Anything that happened after that isn't in the training text, so the model can only guess based on older patterns - which is why different models talk about their "knowledge cutoff."
However, this is starting to be less of a problem in practice. Many modern chatbots now sit on top of the base model and add a second step: they go out to live data sources (like the web, your docs, or company databases), pull in fresh information, and feed that into the model as extra context before it answers anything. The underlying brain still has a cutoff, but the overall system feels much more up‑to‑date because it's constantly grounding itself in real‑time information instead of relying only on what it saw during training.
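Here's a rough sketch of that second step, often called retrieval-augmented generation. Both `search_web` and `ask_model` are hypothetical stubs I've made up for illustration - the point is the shape of the flow, not a real API:

```python
def search_web(query: str) -> str:
    # In a real system this would hit a search API or a database;
    # here it's a stub returning canned text.
    return "Fresh article text about " + query

def ask_model(prompt: str) -> str:
    # Stand-in for a call to the underlying language model.
    return "Answer based on: " + prompt

def answer_with_fresh_context(question: str) -> str:
    context = search_web(question)  # step 1: pull in live information
    prompt = (
        "Use the context below to answer the question.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )
    return ask_model(prompt)        # step 2: model answers with fresh facts in view

print(answer_with_fresh_context("today's weather in Paris"))
```

The model's weights never change at answer time - the system just makes sure the relevant facts are sitting in the prompt.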
Bias
Training data is scraped from the real world - which means it comes with all our social and cultural biases baked in.
The model learns those patterns too, unless people work very hard to filter and fine‑tune it afterwards.
So what: if you remember only one thing, make it this - the model is a mirror of its training data and objective. Change those, and you change its behavior.
What's coming on Day 3
Today we stayed at the "how learning works" level - the practice, the feedback, the tiny nudges to weights.
Tomorrow we'll open the brain up and look at what really happens inside your AI when it pauses to "think" before answering you.
We'll get into layers, neurons, and connections - and why "neural network architecture" is the reason some models feel smarter than others.
Think of Day 2 as "how the kid trained for their basketball games," and Day 3 as "what the kid thinks while playing."
See you there.
What blew your mind most? Drop a comment!

