DEV Community

Maulik Sompura

The Hidden Role of Probability in Large Language Models

Have you ever wondered how an LLM actually works? That question sparked my curiosity and led me into some deep research. Allow me to share some of my thoughts on this latest trend.

Most people believe large language models like GPT-4 and Claude "understand" language and give the best answer through intelligence.

But the truth is that every word these models generate is a mathematical gamble: a calculated probability distribution over thousands of possible next tokens.

In this post we will go beyond the usual "transformers and attention" explanation and explore how probability is the real hero behind everything an LLM does, from creativity to hallucinations.

What's really happening inside an LLM?

When you prompt a model with a sentence, it doesn't just look up an answer. Instead, it:

1. Converts the prompt into tokens (like "Hello", ",", "how", "are", "you")
2. Passes those tokens through the layers of a neural network
3. Produces a list of logits: raw scores for each possible next token
4. Applies a softmax function to convert those scores into a probability distribution
5. Samples or selects the next token based on that distribution

This process repeats one token at a time.
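The five steps above can be sketched in a few lines of Python. This is only a toy illustration, not a real model: the vocabulary and logit values are made up, and a real LLM scores tens of thousands of candidate tokens at every step.

```python
import numpy as np

def softmax(logits):
    """Turn raw logits into a probability distribution that sums to 1."""
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exp / exp.sum()

# Hypothetical logits for five candidate next tokens (step 3).
vocab = ["mat", "floor", "roof", "table", "car"]
logits = np.array([4.0, 2.7, 1.9, 1.2, 0.0])

# Step 4: softmax converts raw scores into probabilities.
probs = softmax(logits)

# Step 5: sample the next token according to that distribution.
rng = np.random.default_rng(0)
next_token = rng.choice(vocab, p=probs)
```

Run in a loop, appending each sampled token back onto the prompt, and you get the one-token-at-a-time generation described above.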

How LLMs Choose Words: It's All Probabilities

Here's a simple example.

"The cat sat on the _____"

It might internally assign probabilities like this:

Token -> Probability
mat -> 0.64
floor -> 0.17
roof -> 0.08
table -> 0.04
car -> 0.01

(Figure: probability distribution over the candidate next tokens)

It might choose "mat" 64% of the time, but with temperature adjustment or top-k sampling, it could choose "floor" or even "roof" to keep things creative.
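Here is a rough sketch of how temperature and top-k reshape the example distribution from the table above. The numbers are the illustrative ones from this post, and the helper names are my own, not a library API:

```python
import numpy as np

vocab = ["mat", "floor", "roof", "table", "car"]
probs = np.array([0.64, 0.17, 0.08, 0.04, 0.01])
probs = probs / probs.sum()  # renormalize (the rounded table sums to 0.94)

def apply_temperature(p, temperature):
    """T < 1 sharpens the distribution; T > 1 flattens it."""
    scaled = np.log(p) / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

def top_k(p, k):
    """Zero out all but the k most probable tokens, then renormalize."""
    cutoff = np.sort(p)[-k]
    masked = np.where(p >= cutoff, p, 0.0)
    return masked / masked.sum()

sharp = apply_temperature(probs, 0.5)  # "mat" dominates even more
flat = apply_temperature(probs, 2.0)   # rarer tokens get a bigger share
k2 = top_k(probs, 2)                   # only "mat" and "floor" survive
```

Sampling from `flat` makes "roof"-style surprises more likely; sampling from `sharp` or `k2` makes the output more predictable.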

Why does this matter?

  • LLMs don't know facts; they predict what's most probable.

  • This is why they sometimes hallucinate: the most likely token just sounds right, even if it is wrong.

  • Tools like temperature, top-k, and top-p control this randomness.

  • Even prompt engineering is really just guiding the probability space.
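To make the third bullet concrete, here is a minimal sketch of top-p (nucleus) sampling on the same made-up distribution: it keeps the smallest set of tokens whose cumulative probability reaches the threshold, and discards the rest.

```python
import numpy as np

vocab = ["mat", "floor", "roof", "table", "car"]
probs = np.array([0.64, 0.17, 0.08, 0.04, 0.01])
probs = probs / probs.sum()  # renormalize the rounded example values

def top_p(p, threshold):
    """Keep the smallest 'nucleus' of tokens whose cumulative
    probability reaches the threshold, then renormalize."""
    order = np.argsort(p)[::-1]           # indices, most probable first
    cumulative = np.cumsum(p[order])
    cutoff = np.searchsorted(cumulative, threshold) + 1
    masked = np.zeros_like(p)
    masked[order[:cutoff]] = p[order[:cutoff]]
    return masked / masked.sum()

# With these numbers, "mat" + "floor" + "roof" form the 0.90 nucleus;
# "table" and "car" are dropped before sampling.
nucleus = top_p(probs, 0.90)
```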

Takeaway

The next time ChatGPT feels smart, remember: it is not reasoning like a human. It's rolling a weighted die, one token at a time, and the die is shaped by your input, the training data, and probability.

Top comments (7)

Daniel McFarland

So what are the chances that we can improve current technology enough to reach AGI if current technology has zero "understanding" of anything it's generating? Maybe it doesn't matter (to them) as long as Sam Altman and his buddies are pulling in billions in investments along the way.

I fail to see how AI can be as intelligent as a human without another breakthrough technology that's exponentially better than LLMs. As far as I'm concerned, what we have now are "AI simulators" that are good enough for a lot of people. But they will never become AGI no matter how much money we throw at them.

Maulik Sompura

Hey, first of all, thank you for sharing your perspective. I really appreciate it. From your comment it seems you are referring to artificial general intelligence (AGI), and I agree with many of your points. When we talk about human-level intelligence, we cannot ignore the role of consciousness: our ability to be self-aware, to feel, to reflect on our existence. That's something current AI models fundamentally lack, and that absence is exactly what makes them feel limited or "not truly intelligent".

However, we should not overlook the fact that these models mimic certain aspects of human intelligence remarkably well. Even without understanding in the human sense, they demonstrate surprising abilities in language, reasoning, and generalization, which is itself quite groundbreaking.

AGI might require an entirely new breakthrough beyond LLMs, no doubt. But perhaps what we are witnessing now is the foundation: not the final destination, but a crucial step forward.

Thanks again for the thoughtful comment. Conversations like this are where the real learning happens.

Daniel McFarland

I do believe LLMs are an amazing piece of technology. Just not as amazing as people are saying, and definitely not a precursor to superintelligence or even general intelligence. Probability, no matter how accurate it is, is not intelligence.

Deividas Strole

Great explanation! I love how you highlight the role of probability in LLMs—it’s a fresh take beyond the usual 'transformers' talk. The simple example really brings it home. Thanks for sharing your insights!

Maulik Sompura

Thank you so much!!

Michael Liang

Great for a better understanding of LLMs!

Maulik Sompura

Thank you!!