DEV Community

CHANTSZCHEUK

My Journey Learning Artificial Intelligence - Day 1

The term "Artificial Intelligence" (AI) has become increasingly popular in recent years. Whether in the tech industry, the business world, or everyday life, its presence is ubiquitous. As a student currently studying AI, I would like to share some experiences and key points from this journey.

Foundation Models versus LLMs

The term Foundation Model was coined by Stanford researchers to describe AI models that meet certain criteria, such as:

They are trained using unsupervised or self-supervised learning, meaning they are trained on unlabeled, multi-modal data and do not require human annotation or labeling for their training process.
They are very large models: very deep neural networks with billions of parameters.
They are normally intended to serve as a ‘foundation’ for other models, meaning they can be used as a starting point for other models built on top of them, typically through fine-tuning.
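The fine-tuning idea in the last point can be sketched in plain Python: treat the foundation model as a frozen feature extractor and train only a small task-specific head on top of it. Everything here (the toy feature function, the tiny dataset, the learning rate) is invented purely for illustration, not a real training setup.

```python
# Toy illustration of fine-tuning: reuse a frozen "foundation" feature
# extractor and train only a small head on top of it.

def foundation_features(x):
    # Stand-in for a large pretrained model: maps raw input to features.
    # "Frozen" means its internals are not updated during fine-tuning.
    return [x, x * x]

# Trainable head: a linear layer with two weights.
w = [0.0, 0.0]

def head(features):
    return sum(wi * fi for wi, fi in zip(w, features))

# Fine-tune the head with plain gradient descent on a tiny dataset
# whose targets follow y = 2*x + x**2.
data = [(1.0, 3.0), (2.0, 8.0), (3.0, 15.0)]
lr = 0.01
for _ in range(2000):
    for x, y in data:
        feats = foundation_features(x)
        err = head(feats) - y
        for i in range(len(w)):
            w[i] -= lr * err * feats[i]

print(w)  # the weights approach [2.0, 1.0]
```

Only the two head weights are ever updated; the "foundation" part is reused as-is, which is exactly what makes fine-tuning cheap compared with training from scratch.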


Large Language Models (LLMs)

Based on the Transformer architecture, trained on vast amounts of unlabeled text data.
Capable of generating grammatically correct and creative text responses.
Operate on numerical tokens rather than raw text, which is far more efficient for computation.
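The last bullet can be made concrete with a toy word-level tokenizer. Real LLM tokenizers (e.g. BPE) split text into subwords rather than whole words, so this is only a minimal sketch of the text-to-numbers idea:

```python
# Toy word-level tokenizer: maps text to numerical token IDs,
# since language models operate on numbers rather than raw characters.
vocab = {}

def tokenize(text):
    ids = []
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)  # assign the next free ID
        ids.append(vocab[word])
    return ids

print(tokenize("the cat sat on the mat"))  # → [0, 1, 2, 3, 0, 4]
```

Note that the repeated word "the" maps to the same ID both times: the model sees token IDs, not spellings.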


How Large Language Models Work

  1. Tokenization: the input text is split into pieces and converted into numerical tokens.
  2. Prediction: the model predicts the next output token from the input tokens.
  3. Selection: an output token is chosen from the predicted probability distribution, introducing randomness for creativity.
  4. Tuning: model parameters such as temperature control the level of randomness.
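Steps 2-4 can be sketched together in a few lines. The logits below are made-up scores standing in for a real model's output; the temperature-scaled softmax and sampling, however, are the standard mechanism:

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Turn raw scores into probabilities with a temperature-scaled
    softmax, then sample one token index from that distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Randomly pick an index, weighted by its probability.
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]  # pretend model scores for 3 candidate tokens
print(sample_next_token(logits, temperature=0.2))  # usually picks index 0
print(sample_next_token(logits, temperature=2.0))  # much more random
```

At low temperature the distribution sharpens toward the highest-scoring token (more deterministic output); at high temperature it flattens, so less likely tokens get picked more often (more "creative" output).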

Thanks to this YouTube video by 3Blue1Brown, which helped me a lot.
