DEV Community

Maulik Sompura

The Hidden Role of Probability in Large Language Models

Have you ever wondered how an LLM actually works? That question sparked my curiosity and led me into some deep research. Allow me to share some of my thoughts on this latest trend.

Most people believe large language models like GPT-4 and Claude "understand" language and give the best answer through intelligence.

But the truth is that every word these models generate is a mathematical gamble: a calculated probability distribution over thousands of possible next tokens.

In this post we will go beyond the usual "transformers and attention" explanation and explore how probability is the real hero behind everything an LLM does, from creativity to hallucinations.

What's really happening inside an LLM?

When you prompt a model with a sentence, it doesn't just look up an answer. Instead, it:

1. Converts the prompt into tokens (like "Hello", ",", "how", "are", "you")
2. Passes those tokens through the layers of a neural network
3. Produces a list of logits: raw scores for each possible next token
4. Applies a softmax function to convert those scores into a probability distribution
5. Samples or selects the next token based on that distribution

This process repeats one token at a time.
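The five steps above can be sketched in a few lines of Python. This is only a toy illustration, not a real model: the vocabulary and logit values are made up, and a real LLM scores tens of thousands of candidate tokens at every step.

```python
import numpy as np

def softmax(logits):
    """Turn raw logits into a probability distribution that sums to 1."""
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exp / exp.sum()

# Hypothetical logits for five candidate next tokens (step 3).
vocab = ["mat", "floor", "roof", "table", "car"]
logits = np.array([4.0, 2.7, 1.9, 1.2, 0.0])

# Step 4: softmax converts raw scores into probabilities.
probs = softmax(logits)

# Step 5: sample the next token according to that distribution.
rng = np.random.default_rng(0)
next_token = rng.choice(vocab, p=probs)
```

Run in a loop, appending each sampled token back onto the prompt, and you get the one-token-at-a-time generation described above.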

How LLMs Choose Words: It's All Probabilities

Here's a simple example.

"The cat sat on the _____"

It might internally assign probabilities like this:

Token -> Probability
mat -> 0.64
floor -> 0.17
roof -> 0.08
table -> 0.04
car -> 0.01

(Figure: probability distribution over the candidate next tokens)

It might choose "mat" 64% of the time, but with temperature adjustment or top-k sampling, it could choose "floor" or even "roof" to keep things creative.
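Here is a rough sketch of how temperature and top-k reshape the example distribution from the table above. The numbers are the illustrative ones from this post, and the helper names are my own, not a library API:

```python
import numpy as np

vocab = ["mat", "floor", "roof", "table", "car"]
probs = np.array([0.64, 0.17, 0.08, 0.04, 0.01])
probs = probs / probs.sum()  # renormalize (the rounded table sums to 0.94)

def apply_temperature(p, temperature):
    """T < 1 sharpens the distribution; T > 1 flattens it."""
    scaled = np.log(p) / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

def top_k(p, k):
    """Zero out all but the k most probable tokens, then renormalize."""
    cutoff = np.sort(p)[-k]
    masked = np.where(p >= cutoff, p, 0.0)
    return masked / masked.sum()

sharp = apply_temperature(probs, 0.5)  # "mat" dominates even more
flat = apply_temperature(probs, 2.0)   # rarer tokens get a bigger share
k2 = top_k(probs, 2)                   # only "mat" and "floor" survive
```

Sampling from `flat` makes "roof"-style surprises more likely; sampling from `sharp` or `k2` makes the output more predictable.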

Why does this matter?

  • LLMs don't know facts; they predict what's most probable.

  • This is why they sometimes hallucinate: the most likely token just sounds right, even if it is wrong.

  • Tools like temperature, top-k, and top-p control this randomness.

  • Even prompt engineering is really just guiding the probability space.
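To make the third bullet concrete, here is a minimal sketch of top-p (nucleus) sampling on the same made-up distribution: it keeps the smallest set of tokens whose cumulative probability reaches the threshold, and discards the rest.

```python
import numpy as np

vocab = ["mat", "floor", "roof", "table", "car"]
probs = np.array([0.64, 0.17, 0.08, 0.04, 0.01])
probs = probs / probs.sum()  # renormalize the rounded example values

def top_p(p, threshold):
    """Keep the smallest 'nucleus' of tokens whose cumulative
    probability reaches the threshold, then renormalize."""
    order = np.argsort(p)[::-1]           # indices, most probable first
    cumulative = np.cumsum(p[order])
    cutoff = np.searchsorted(cumulative, threshold) + 1
    masked = np.zeros_like(p)
    masked[order[:cutoff]] = p[order[:cutoff]]
    return masked / masked.sum()

# With these numbers, "mat" + "floor" + "roof" form the 0.90 nucleus;
# "table" and "car" are dropped before sampling.
nucleus = top_p(probs, 0.90)
```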

Takeaway

The next time ChatGPT feels smart, remember: it is not reasoning like a human. It's rolling a weighted die, one token at a time, and the die is shaped by your input, the training data, and probability.

Top comments (7)

Daniel McFarland

So what are the chances that we can improve current technology enough to reach AGI if current technology has zero "understanding" of anything it's generating? Maybe it doesn't matter (to them) as long as Sam Altman and his buddies are pulling in billions in investments along the way.

I fail to see how AI can be as intelligent as a human without another breakthrough technology that's exponentially better than LLMs. As far as I'm concerned, what we have now are "AI simulators" that are good enough for a lot of people. But they will never become AGI no matter how much money we throw at them.

Maulik Sompura

Hey, first of all, thank you for sharing your perspective. I really appreciate it. From your comment it seems you are referring to artificial general intelligence (AGI), and I agree with many of your points. When we talk about human-level intelligence, we cannot ignore the role of consciousness: our ability to be self-aware, to feel, to reflect on our existence. That's something current AI models fundamentally lack, and that absence is exactly what makes them feel limited or "not truly intelligent".

However, we should not overlook the fact that these models mimic certain aspects of human intelligence remarkably well. Even without understanding in the human sense, they demonstrate surprising abilities in language, reasoning, and generalization, which is itself quite groundbreaking.

AGI might require an entirely new breakthrough beyond LLMs, no doubt. But perhaps what we are witnessing now is the foundation: not the final destination, but a crucial step forward.

Thanks again for the thoughtful comment. Conversations like this are where the real learning happens.

Daniel McFarland

I do believe LLMs are an amazing piece of technology. Just not as amazing as people are saying, and definitely not a precursor to superintelligence or even general intelligence. Probability, no matter how accurate it is, is not intelligence.

Deividas Strole

Great explanation! I love how you highlight the role of probability in LLMs—it’s a fresh take beyond the usual 'transformers' talk. The simple example really brings it home. Thanks for sharing your insights!

Maulik Sompura

Thank you so much!!

Michael Liang

Great for a better understanding of LLMs!

Maulik Sompura

Thank you!!