How Are Generative AI Models Trained? 🏋️‍♂️
Generative AI models like GPT are trained in two main stages:
1. Unsupervised Pretraining 📚:
- The model is fed massive amounts of unlabeled text (GPT-3's pretraining pipeline, for example, started from roughly 45 TB of raw text before filtering).
- It learns patterns, language structure, grammar, semantics, and general knowledge by predicting the next word/token in a sequence, with no labeled data required (see the first sketch after this list).
- Pretraining is where the model's parameters get their learned values; GPT-3, for example, has 175 billion of them.
2. Supervised Fine-Tuning 🎯:
- After pretraining, the model is fine-tuned on smaller, labeled datasets for specific tasks (e.g., text summarization, language translation, sentiment analysis).
- Fine-tuning makes the model's outputs more accurate and task-relevant (see the second sketch below).
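To make the pretraining objective concrete, here is a minimal sketch of next-token prediction in PyTorch. The tiny model, vocabulary size, and random tokens are illustrative stand-ins (nothing here is GPT's actual architecture); the point is that the "labels" are just the input shifted by one token, so no human labeling is needed:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32  # toy sizes, not GPT's
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),  # token id -> vector
    nn.Linear(embed_dim, vocab_size),     # vector -> score for every possible next token
)

tokens = torch.randint(0, vocab_size, (1, 16))   # one unlabeled token sequence
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # "labels" = inputs shifted by one

logits = model(inputs)  # shape: (1, 15, vocab_size)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # gradients nudge the parameters toward better next-token guesses
print(f"next-token loss: {loss.item():.3f}")
```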
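And a minimal sketch of supervised fine-tuning, reusing the idea above: a pretrained backbone (here just the toy embedding) gets a new task-specific head trained on labeled examples. The sentiment labels and the mean-pooling choice are assumptions for illustration, not GPT's actual fine-tuning recipe:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, num_classes = 100, 32, 2
backbone = nn.Embedding(vocab_size, embed_dim)  # stand-in for pretrained weights
head = nn.Linear(embed_dim, num_classes)        # new, task-specific layer

tokens = torch.randint(0, vocab_size, (8, 16))  # 8 labeled examples (hypothetical)
labels = torch.randint(0, num_classes, (8,))    # e.g., 0 = negative, 1 = positive

features = backbone(tokens).mean(dim=1)         # crude pooling over token vectors
loss = nn.functional.cross_entropy(head(features), labels)
loss.backward()  # now human-provided labels drive the update
print(f"fine-tuning loss: {loss.item():.3f}")
```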
📝 Stay tuned in this learning journey for more on generative AI! I'd love to discuss this topic further. Special thanks to Guvi for the course!
Next up: RAG. Stay tuned!