Over the past two years, everyone has heard of ChatGPT, and many of you have probably used it. The technology that lets you have human-like conversations with a chatbot is called an LLM (Large Language Model).
LLMs can understand as well as generate human language. They are trained on massive text datasets, often hundreds of gigabytes or more. To put that into perspective, 1 gigabyte holds roughly 178 million words, so imagine how many words an LLM sees during training. A single LLM can be used for various text-based applications like chat, summarization, code generation, and more.
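If you're curious where a figure like 178 million comes from, here is a quick back-of-envelope check, assuming an average English word takes about 6 bytes (roughly 5 letters plus a space, which is an assumption, not an exact constant):

```python
# Back-of-envelope: how many words fit in 1 gigabyte of plain text?
BYTES_PER_GB = 10**9       # decimal definition of a gigabyte
AVG_BYTES_PER_WORD = 6     # assumed: ~5 letters + 1 space in ASCII/UTF-8

words_per_gb = BYTES_PER_GB / AVG_BYTES_PER_WORD
print(f"~{words_per_gb / 1e6:.0f} million words per gigabyte")
# Prints ~167 million words, the same ballpark as the figure above;
# the exact number depends on the corpus and the encoding.
```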
When these datasets are fed to an LLM, it learns the language as well as the patterns within it. Once trained, the model can be given an input and will generate a response. This input is commonly known as a prompt, and the way your prompt is structured shapes the behaviour of the language model.
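As a concrete example, here is a minimal sketch of prompting a small open model with the Hugging Face `transformers` library (assuming `transformers` and a backend like PyTorch are installed; `gpt2` is just a small, freely available example model, not ChatGPT):

```python
from transformers import pipeline

# Load a small pre-trained language model for text generation.
generator = pipeline("text-generation", model="gpt2")

# The prompt: how you phrase it shapes what the model produces.
prompt = "Explain what a large language model is in one sentence:"

# Generate a continuation of the prompt.
result = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(result[0]["generated_text"])
```

Try changing the wording of the prompt and running it again: even small changes to the phrasing can noticeably change the output, which is exactly the behaviour described above.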
An LLM is made of:
- Data: the massive corpus of text that the LLM is fed
- Architecture: For GPT, the architecture is known as a transformer, which enables handling sequences of data and understanding the context of each word in a sentence
- Training: The architecture is then trained on the large amount of data fed to the LLM. During this process, the language model learns to predict the next word in a sequence, over and over, until it can generate coherent text (see the sketch after this list).
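To make "predict the next word" concrete, here is a toy sketch of the idea using simple bigram counts. Real LLMs learn these probabilities with a transformer trained on billions of examples, but the objective, guessing the next word from the words before it, is the same in spirit:

```python
from collections import Counter, defaultdict

# A tiny "training corpus" (real LLMs use hundreds of gigabytes of text).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (bigram counts).
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word):
    """Return the most likely next word seen after `word` in the corpus."""
    if word not in next_word_counts:
        return None
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat" (it follows "the" twice in the corpus)
```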
In conclusion, LLMs are machine learning models trained on huge amounts of text data. They can be customised to build powerful applications that generate or review code, answer questions about a provided document, write articles, and so on.