Welcome back to Day 3 of AI From Scratch.
So far, we’ve basically met the brain and watched it train.
On Day 1, we saw how AI stores “knowledge” as weights and uses them to predict the next word in a sentence.
On Day 2, we followed the training story, like a kid practicing basketball: try a shot, see how wrong it is, adjust the form, repeat a million times.
Today we’re asking a new question:
When you ask an AI something and it pauses for a second… what’s actually happening in that exact pause?
An AI's answer is just a chain of word predictions
That “thinking” moment is just your question flowing through layers of neurons, triggering little reactions, and ending in a chain of word predictions.
So what’s happening between your question and its answer?
When you type a question, the model doesn’t see a neat English sentence.
First, it chops your text into tokens — small chunks like words or pieces of words. Those tokens are then turned into numbers and pushed into the model’s brain.
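Here's a toy sketch of that first step. Real models use learned subword vocabularies (like BPE) with tens of thousands of entries; the tiny vocabulary below is invented just to show the idea of text becoming numbers.

```python
# Toy tokenizer: chop text into chunks and map each chunk to a number.
# Real models use learned subword vocabularies (e.g. BPE); this tiny
# hand-made vocabulary is purely illustrative.
vocab = {"what": 0, "is": 1, "the": 2, "weather": 3, "in": 4, "paris": 5, "?": 6}

def tokenize(text):
    # Lowercase, split the question mark off as its own chunk, look up each piece.
    words = text.lower().replace("?", " ?").split()
    return [vocab[w] for w in words]

print(tokenize("What is the weather in Paris?"))
# → [0, 1, 2, 3, 4, 5, 6] — this list of numbers is what actually enters the model.
```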
From there, those numbers travel through layers in the network.
Each layer looks at the numbers, reacts a bit, and passes them on to the next layer.
So what this means for you: that “thinking” delay isn’t the AI meditating; it’s your sentence running through a long tunnel of tiny reactions before the model spits out the next word.
Think of it like an assembly line for meaning
Imagine a factory assembly line.
At the start, you drop in raw metal. Every station bends, drills, paints, or checks something. By the end, you’ve got a finished car. No single station understands the whole job at once; each one just does its little piece and passes things forward.
A neural network works the same way.
Your tokens go into the first layer, get slightly transformed, then move to the next layer, and so on. Stack enough of these, and you’ve turned raw text into something that feels like understanding.
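The assembly line can be sketched in a few lines. A real layer does a big matrix multiply over thousands of dimensions; here each "station" is shrunk to one made-up weight and bias per layer, just to show numbers being transformed and passed forward.

```python
# Assembly-line sketch: each "layer" transforms the numbers a bit and
# hands them to the next one. Weights and biases here are invented.
def layer(values, weight, bias):
    # A real layer does a matrix multiply; here it's one multiply per value.
    # max(0, ...) keeps only strong signals (more on this below).
    return [max(0.0, v * weight + bias) for v in values]

tokens_as_numbers = [0.5, 1.0, -0.5]  # pretend output of the tokenizer
x = tokens_as_numbers
for weight, bias in [(2.0, 0.1), (0.5, -0.2), (1.5, 0.0)]:  # three stations
    x = layer(x, weight, bias)
print(x)  # the numbers that come out the far end of the line
```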
So what this means for you: when an AI answer feels smart, it’s not because there’s one genius node inside — it’s because thousands of tiny, dumb steps are wired together in a clever order.
Neurons: tiny light bulbs that notice patterns
Inside each layer live neurons — tiny units that light up for certain patterns.
One neuron might quietly specialize in “sad tone,” another in “locations,” another in “legal-ish language.”
Each neuron takes the incoming numbers, looks at how strong they are, and decides:
“Do I stay dim, or do I light up for this?”
If it sees the pattern it cares about, it glows more and sends a stronger signal on to the next layer.
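A single light-bulb neuron is small enough to write out in full. The weights below are made up; in a real model they're learned during training, and nobody hand-labels a neuron as "sad tone" — that's just a way to picture what a learned pattern detector does.

```python
# One "light bulb" neuron: a weighted sum of its inputs, then a decision
# about how brightly to fire. Weights and inputs are invented.
def neuron(inputs, weights, threshold=0.0):
    signal = sum(i * w for i, w in zip(inputs, weights))
    # Stay dim (0.0) for weak signals, glow brighter for strong ones.
    return max(0.0, signal - threshold)

# Pretend these learned weights make the neuron care about a "sad tone" pattern.
sad_tone_weights = [0.9, -0.3, 0.7]
print(neuron([1.0, 0.2, 0.8], sad_tone_weights))  # pattern present: glows
print(neuron([0.0, 1.0, 0.1], sad_tone_weights))  # pattern absent: stays dim (0.0)
```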
So what this means for you: when you ask a question, you’re basically lighting up a custom constellation of neurons: a unique pattern of tiny bulbs flickering on and off that represents “what the model thinks you’re asking.”
Layers: from raw words to “oh, I get what you mean”
Different layers care about different kinds of things.
Early layers mostly pick up low‑level stuff: is this a question, a statement, a list? Are there names, dates, places here?
Middle layers start combining that: “question + about time + about sports → probably asking for a match schedule.”
Later layers work with more abstract ideas: “they’re comparing two tools,” “they want a step‑by‑step,” “this sounds like they’re asking ‘why’, not ‘how’.”
So what this means for you: the deeper you go into the network, the less it cares about raw words and the more it’s dealing with your intent. By the time it answers, it’s replying to the idea behind your words, not just the letters you typed.
Activations: how the brain decides what to ignore
If every neuron fired all the time, the model would just see noise.
So each neuron uses an activation rule, basically: “Is this signal strong enough for me to care?”
You can picture it like a dimmer switch:
Weak signal? The neuron stays mostly dark.
Strong signal? It brightens and says, “This matters, push me forward.”
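The most common version of that dimmer switch is an activation function called ReLU: keep positive signals as-is, zero out everything else. One line of code:

```python
# ReLU, the most common "dimmer switch": keep strong (positive)
# signals, silence the rest.
def relu(signal):
    return max(0.0, signal)

signals = [-2.0, -0.1, 0.0, 0.3, 1.8]
print([relu(s) for s in signals])  # → [0.0, 0.0, 0.0, 0.3, 1.8]
```

Notice how the weak and negative signals simply vanish — that's the model "quietly fading out" what it doesn't care about.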
This is how the model can tell the difference between “river bank” and “open a bank account”: different neurons light up in each case, because the surrounding words give different vibes.
So what this means for you: under the hood, the AI is constantly highlighting important bits of your question and quietly fading out the rest, so its answer is shaped by what it thinks really matters.
The answer is just a rapid‑fire chain of bets
All of this (layers, neurons, activations) is just setup for one job: picking the next token.
After your question flows through all the layers, the model ends up with a rich internal state that says, “Given everything so far, which token is most likely next?”
It then builds a list of possible next tokens with probabilities, kind of like:
“Maybe ‘the’ (20%), ‘it’ (15%), ‘they’ (10%), definitely not ‘Bangalore’ (0.01%).”
It samples one, adds it to the text, and then repeats the whole process with the fresh, updated context to pick the next word, and the next, and the next.
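That betting step can be sketched directly. The raw scores ("logits") and candidate tokens below are invented to roughly match the percentages above; the `softmax` function that turns scores into probabilities is the real mechanism models use.

```python
import math
import random

# Turn the model's raw scores ("logits") into probabilities — this is softmax,
# the real final step in a language model. The scores themselves are invented.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["the", "it", "they", "Bangalore"]
logits = [2.0, 1.7, 1.3, -5.0]  # "Bangalore" gets a very low score

probs = softmax(logits)
random.seed(0)  # fixed seed so the "bet" is repeatable
next_token = random.choices(tokens, weights=probs, k=1)[0]

print(list(zip(tokens, [round(p, 3) for p in probs])), "->", next_token)
```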
So what this means for you: that long, smooth paragraph the AI gives you is literally just a chain of word bets, guided by all those internal strong signals and layers reacting in the background.
Why the answer can feel brilliant… or confidently wrong
When the training data is good and the internal pattern detectors are well‑shaped, those word bets line up into responses that feel thoughtful and on point.
That’s when you get the “wow, it really understands me” moment.
But remember from Day 2: the model doesn’t have a truth button.
If its learned patterns point toward a wrong but plausible sentence, it will happily say that too; that’s a hallucination.
So what this means for you: the same machinery that makes answers feel coherent also makes wrong answers sound extremely confident.
Smooth text doesn’t guarantee true text; it just means the internal chain of reactions is doing what it was trained to do.
The mental picture to keep from Day 3
If you had to compress today into one picture, use this:
Your question → chopped into tokens, turned into numbers
Numbers → pushed through an assembly line of layers
Inside each layer → neurons (light bulbs) fire for patterns
Activations → decide what to amplify and what to ignore
Final state → used to bet on the next word, over and over
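The whole picture above can be squeezed into one toy loop. The "model" here is just a hand-written lookup table of probabilities (a real model computes those tables with all the layers and neurons we just walked through), and it always takes the most likely bet, a simplification known as greedy decoding:

```python
# Day 3 in one toy loop: a fake "model" maps the context so far to a
# probability table, and we bet on the next word, over and over.
# The tables are invented; a real model computes them with its layers.
fake_model = {
    ("the",): {"cat": 0.6, "dog": 0.4},
    ("the", "cat"): {"sat": 0.7, "ran": 0.3},
    ("the", "cat", "sat"): {"down": 0.9, "up": 0.1},
}

context = ["the"]
for _ in range(3):
    table = fake_model[tuple(context)]
    # Greedy decoding: always take the single most likely next word.
    context.append(max(table, key=table.get))

print(" ".join(context))  # → the cat sat down
```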
So what this means for you: an AI model isn’t a mystical brain. It’s a giant, carefully wired machine that turns your words into internal reactions and then into a stream of word predictions. Very fast, very organized, but still just a chain of reactions.
What’s coming on Day 4: the one trick that changed everything
Today, we treated the AI like a brain made of layers and light bulbs, reacting in sequence.
But there’s one idea that supercharged all of this and made modern AI chatbots, code assistants, and image generators actually feel useful: a way for the model to pay attention to different parts of your input at the same time.
Tomorrow, in Day 4, “The One Idea That Made Modern AI Possible,” we’ll unpack that trick in plain language.
What blew your mind most? Drop a comment!
