AI Glossary: A to Z
An A-to-Z glossary of AI terms, created with help from AI itself. Because in 2026, the best way to study AI is apparently to ask AI itself. 🤣
Written for beginners and practitioners alike. Each term includes a plain English definition and a real-world example.
Quick Navigation
A · B · C · D · E · F · G · H · I · J · K · L · M · N · O · P · Q · R · S · T · U · V · W · X · Y · Z
↑ Back to top
A
| Term | Definition | Example |
| --- | --- | --- |
| Agent (AI Agent) | An AI system that perceives its environment, makes decisions, and takes autonomous actions to achieve a goal | A coding agent that writes, runs, and debugs its own code without human intervention |
| AGI (Artificial General Intelligence) | A hypothetical AI that can match or exceed human-level intelligence across any task; it does not yet exist | Often cited as a long-term goal by companies like OpenAI and DeepMind |
| AI (Artificial Intelligence) | The field of computer science focused on building machines that can perform tasks normally requiring human intelligence | ChatGPT writing an essay, an algorithm detecting cancer in X-rays |
| AI Ethics | The principles and practices for developing and deploying AI in ways that are fair, transparent, and safe | Auditing a hiring algorithm to ensure it doesn't discriminate by gender or race |
| AI Safety | The field dedicated to ensuring AI systems remain reliable, controllable, and beneficial as they grow more capable | Research into preventing AI from pursuing goals that harm people |
| Alignment | The challenge of ensuring an AI system's goals and behaviour match what its designers and users actually intend | Preventing a powerful AI from optimising for a metric in a way that causes unintended harm |
| Annotation | The process of labelling raw data so it can be used to train supervised learning models | Humans drawing bounding boxes around cars in images to train a self-driving model |
| API (Application Programming Interface) | A defined interface that lets software systems communicate with each other | Calling the OpenAI API to add GPT-powered responses to your own application |
| API Key | A private authentication token that identifies you when making API requests | Supplying your secret key to your code (ideally through an environment variable rather than hard-coding it) so it has permission to use the Claude or OpenAI API |
| Attention Mechanism | The component of a transformer that lets a model focus on the most relevant parts of the input when producing each output | A model knowing that "it" in "The cat sat because it was tired" refers to the cat |
| Augmented Intelligence | Using AI to enhance human decision-making rather than replace it entirely | A radiologist using AI to flag suspicious areas in a scan, then making the final call |
| AutoML | Automated Machine Learning: tools that automatically select models, tune hyperparameters, and build pipelines | Google AutoML letting non-experts build a custom image classifier without coding |
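To make the API Key entry above more concrete, here is a minimal sketch of reading a key from an environment variable instead of pasting it into source code. The variable name `EXAMPLE_API_KEY` and the helpers `load_api_key` and `mask` are hypothetical names chosen for illustration; each provider documents its own conventions.

```python
import os


def load_api_key(var_name: str = "EXAMPLE_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    EXAMPLE_API_KEY is a made-up name for this sketch, not a real provider's variable.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


def mask(key: str) -> str:
    """Hide all but the last four characters so the key can appear in logs."""
    return "*" * max(len(key) - 4, 0) + key[-4:]
```

Keeping the key out of the code means it is less likely to end up in version control or in a shared screenshot.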
↑ Back to top
B
| Term | Definition | Example |
| --- | --- | --- |
| Backpropagation | The algorithm used to train neural networks by calculating how much each parameter contributed to the error and adjusting accordingly | How a neural network "learns" by working backwards from its mistakes to fix its weights |
| Batch Size | The number of training examples processed together before the model's weights are updated | A batch size of 64 means the model updates after every 64 training samples |
| Benchmark | A standardised test used to measure and compare AI model performance | MMLU (Massive Multitask Language Understanding) for broad knowledge and HumanEval for coding ability |
| Bias (Data Bias) | Systematic unfairness in AI outputs caused by skewed or unrepresentative training data | A facial recognition system that performs poorly on darker skin tones because training data was mostly light-skinned faces |
| BLEU Score | A metric used to evaluate the quality of AI-generated text by comparing it to human reference text | Measuring how close a machine translation is to a professional human translation |
| Bot | A software program that performs automated tasks, often simulating human interaction | A customer service chatbot that answers FAQs on a website 24/7 |
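The Batch Size entry above can be illustrated with a small sketch that splits a dataset into batches and counts how many weight updates one pass over the data would involve. The helper names `iterate_batches` and `updates_per_epoch` and the figure of 200 examples are assumptions made for this example.

```python
import math


def iterate_batches(examples, batch_size):
    """Yield consecutive slices of `examples`, each at most `batch_size` long."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def updates_per_epoch(num_examples, batch_size):
    """Number of weight updates in one full pass (the last batch may be smaller)."""
    return math.ceil(num_examples / batch_size)


samples = list(range(200))  # stand-in for 200 training examples
batches = list(iterate_batches(samples, batch_size=64))
print([len(batch) for batch in batches])  # [64, 64, 64, 8]
print(updates_per_epoch(200, 64))         # 4
```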
↑ Back to top
C
| Term | Definition | Example |
| --- | --- | --- |
| Chain-of-Thought Prompting | A technique that encourages an AI to reason step by step before giving a final answer, improving accuracy on complex tasks | Adding "Think step by step" to a maths problem prompt often improves the model's answer |
| Chatbot | A software application that simulates conversation with users, typically powered by an LLM or rule-based system | ChatGPT, customer support bots, virtual assistants on bank websites |
| Classification | A machine learning task where a model predicts which category an input belongs to | Labelling emails as spam or not spam; detecting whether a tumour is malignant or benign |
| Clustering | Grouping similar data points together without predefined labels, used in unsupervised learning | Segmenting customers into groups based on purchasing behaviour |
| CNN (Convolutional Neural Network) | A type of neural network designed specifically for processing grid-like data such as images | Used in face recognition, medical imaging, and object detection |
| Computer Vision | The field of AI focused on enabling machines to interpret and understand visual information | A self-driving car detecting pedestrians; a quality control camera spotting defects |
| Context Window | The maximum amount of text an AI model can process and retain in a single interaction | A model with a 200,000-token context window can read roughly 150,000 words at once |
| Copilot | An AI assistant integrated into a tool or workflow to help users complete tasks more efficiently | GitHub Copilot suggesting code completions as a developer types |
| Cross-Validation | A technique for evaluating how well a model generalises by training and testing it on different subsets of the data | Splitting data into 5 "folds" and rotating which one is the test set each time |
| CUDA | A parallel computing platform by NVIDIA that enables GPUs to be used for AI training and inference | Many large AI models are trained using CUDA on NVIDIA GPUs |
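As a rough illustration of the Cross-Validation entry above, the sketch below splits example indices into folds and rotates which fold serves as the test set. The function name `k_fold_splits` is hypothetical, and the sketch skips shuffling, which real workflows usually apply first.

```python
def k_fold_splits(num_examples, k=5):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    indices = list(range(num_examples))
    fold_size, remainder = divmod(num_examples, k)
    start = 0
    for fold in range(k):
        # Spread any leftover examples over the first `remainder` folds.
        end = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:end]
        train = indices[:start] + indices[end:]
        yield train, test
        start = end


for train, test in k_fold_splits(10, k=5):
    print("test:", test, "train size:", len(train))
```

Every example ends up in the test set exactly once, so the averaged score uses all of the data without ever testing on something the model was trained on in that round.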
↑ Back to top
D
| Term | Definition | Example |
| --- | --- | --- |
| Data Augmentation | Artificially expanding a training dataset by creating modified versions of existing data | Flipping, rotating, and cropping images to give a computer vision model more variety |
| Data Pipeline | An automated workflow that collects, processes, and delivers data for AI training or inference | A system that ingests raw sensor data, cleans it, and feeds it to a fraud detection model |
| Dataset | A structured collection of data used to train or evaluate an AI model | ImageNet, a dataset of 14 million labelled images used to train and benchmark vision models |
| Deep Learning | An advanced form of machine learning that uses multi-layered neural networks to learn complex patterns | Powering speech recognition, image generation, and language understanding |
| Deepfake | AI-generated media (video, audio, or images) that realistically depicts someone saying or doing something they never did | Synthetic video of a public figure making a false statement |
| Deployment | The process of making a trained AI model available for use in a real-world product or system | Releasing a trained customer churn model into a company's CRM platform |
| Diffusion Model | A type of generative AI that learns to create data by reversing a gradual process of adding noise | Stable Diffusion and DALL·E use diffusion models to generate images from text prompts |
| Distillation | A technique where a smaller "student" model is trained to mimic the behaviour of a larger "teacher" model, reducing size and cost | Creating a lightweight model for mobile devices by distilling a large cloud-based model |
↑ Back to top
E
| Term | Definition | Example |
| --- | --- | --- |
| Edge AI | Running AI models directly on a local device rather than sending data to the cloud | A smart security camera that detects intruders locally without needing an internet connection |
| Embeddings | Numerical vector representations of text (or other data) that capture semantic meaning and relationships | Words with similar meanings have embeddings that are close together in vector space |
| Epoch | One complete pass through the entire training dataset during model training | Training for 10 epochs means the model has seen every training example 10 times |
| Ensemble Learning | Combining the outputs of multiple models (for example by averaging or voting) to get better predictions than any single model | Random Forests, which combine hundreds of decision trees to make more accurate predictions |
| Evaluation Metrics | Measurements used to assess how well an AI model is performing | Accuracy, precision, recall, F1 score, and BLEU score |
| Explainable AI (XAI) | AI systems designed so their reasoning and decisions can be understood and audited by humans | A loan-rejection system that shows which factors (income, debt ratio) drove the decision |
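The Evaluation Metrics entry above names accuracy, precision, recall, and F1 score. A minimal sketch of how they could be computed for a binary classifier follows; the function name `classification_metrics` and the toy labels are invented for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy data: 1 = spam, 0 = not spam.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
print(classification_metrics(y_true, y_pred))
```

In this toy case there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall both come out at 0.75.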
↑ Back to top
F
| Term | Definition | Example |
| --- | --- | --- |
| Feature | An individual measurable property used as input to a machine learning model | In a house-price model: square footage, number of bedrooms, and location are features |
| Feature Engineering | The process of selecting, transforming, or creating input variables to improve model performance | Combining "day of week" and "time of day" into a single "rush hour" feature for a traffic model |
| Few-Shot Prompting | Giving an AI a small number of examples in the prompt before asking it to complete a task | Showing 3 example customer reviews before asking the model to classify a new one |
| Fine-Tuning | Further training a pre-trained model on a specific, smaller dataset to specialise its behaviour | Training a general LLM on legal documents to create a legal research assistant |
| Foundation Model | A large AI model trained on broad, general data that can be adapted to many downstream tasks | GPT-4, Claude, and Gemini are all foundation models |
| Function Calling | A feature that allows an LLM to trigger external tools or APIs as part of generating a response | An AI assistant calling a weather API to answer "Should I bring an umbrella tomorrow?" |
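To show what the Few-Shot Prompting entry above looks like in practice, here is a small sketch that assembles a prompt from labelled examples. The function name `build_few_shot_prompt`, the "Review:" and "Sentiment:" layout, and the sample reviews are all assumptions for this illustration; no particular model API is implied.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble a prompt: instruction, labelled examples, then the unlabelled input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {new_input}")
    lines.append("Sentiment:")  # left open for the model to complete
    return "\n".join(lines)


examples = [
    ("Arrived quickly and works perfectly.", "positive"),
    ("Broke after two days.", "negative"),
    ("Does the job, nothing special.", "neutral"),
]
prompt = build_few_shot_prompt(
    "Classify each review as positive, negative, or neutral.",
    examples,
    "The battery life is fantastic.",
)
print(prompt)
```

Passing an empty `examples` list produces a prompt with the instruction and the new input only, which is the zero-shot case.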
↑ Back to top
G
| Term | Definition | Example |
| --- | --- | --- |
| GAN (Generative Adversarial Network) | A model architecture where two networks, a generator and a discriminator, compete to produce increasingly realistic outputs | Used to generate photorealistic synthetic faces or artistic images |
| Generative AI | AI that can create new content (text, images, audio, video, or code) rather than just analysing existing data | ChatGPT writing an article; Midjourney generating artwork |
| GPU (Graphics Processing Unit) | Specialised hardware with thousands of cores that dramatically accelerates AI training and inference | NVIDIA A100 and H100 GPUs are widely used for training large AI models |
| Gradient Descent | The core optimisation algorithm that iteratively adjusts a model's weights to minimise prediction error during training | The mathematical engine behind how nearly every neural network learns |
| Guardrails | Constraints or filters applied to an AI system to prevent it from producing harmful, offensive, or off-topic outputs | A customer service bot that refuses to discuss competitors or give legal advice |
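The Gradient Descent entry above can be sketched on a one-parameter toy problem: minimising (w - 3)^2, whose gradient is 2(w - 3). The function names and the chosen learning rates are assumptions for illustration; real training applies the same update to millions or billions of weights.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=50):
    """Repeatedly step against the gradient to reduce the error."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


def toy_gradient(w):
    """Gradient of the toy loss (w - 3)^2, which is smallest at w = 3."""
    return 2 * (w - 3)


w_final = gradient_descent(toy_gradient, start=0.0, learning_rate=0.1)
w_diverged = gradient_descent(toy_gradient, start=0.0, learning_rate=1.1)
print(round(w_final, 3))        # close to 3
print(abs(w_diverged - 3) > 3)  # True: a step size this large overshoots more each time
```

The second call shows why the step size matters: with a learning rate that is too high, each update lands further from the minimum than the last.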
↑ Back to top
H
| Term | Definition | Example |
| --- | --- | --- |
| Hallucination | When an AI model confidently generates information that is factually incorrect or entirely fabricated | An AI citing a scientific paper that doesn't exist, with a realistic-looking author and journal |
| Hugging Face | A popular open-source platform for sharing, discovering, and running AI models and datasets | Often called "the GitHub of AI"; thousands of models are freely available there |
| Human-in-the-Loop (HITL) | A system design where a human reviews or approves AI decisions before they take effect | A doctor reviewing an AI-flagged medical scan before acting on the recommendation |
| Hyperparameter | A configuration value set before training begins that controls how the model learns, not what it learns | Learning rate, batch size, and number of layers are all hyperparameters |
↑ Back to top
I
| Term | Definition | Example |
| --- | --- | --- |
| Image Recognition | AI's ability to identify and classify objects, people, or scenes within images | Google Photos automatically tagging people and places in your photo library |
| Inference | The process of using a trained AI model to generate predictions or outputs on new, unseen inputs | Every time you send a message to ChatGPT, it runs inference |
| Interpretability | The degree to which humans can understand why an AI model made a specific decision | Being able to explain why a credit scoring model rejected an application |
↑ Back to top
J
| Term | Definition | Example |
| --- | --- | --- |
| Jailbreak | A technique used to trick an AI model into bypassing its safety rules or guidelines | A roleplaying prompt designed to make an AI ignore its ethical restrictions |
| JSON Mode | A setting in some LLM APIs that forces the model to return responses in valid JSON format | Useful when building apps that need to parse AI output programmatically |
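The JSON Mode entry above mentions parsing AI output programmatically. A hedged sketch of that parsing step, using only Python's standard `json` module, might look like this. The function name `parse_model_json`, the key names, and the sample response string are hypothetical; no specific LLM API is being called.

```python
import json


def parse_model_json(raw_text, required_keys):
    """Parse a model response as a JSON object and check the expected keys exist."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as err:
        raise ValueError(f"Model output was not valid JSON: {err}") from err
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    return data


# A made-up response, standing in for what a model might return in JSON mode.
raw = '{"sentiment": "positive", "confidence": 0.92}'
result = parse_model_json(raw, ["sentiment", "confidence"])
print(result["sentiment"])  # positive
```

Valid JSON is not the same as the JSON you expected, which is why the sketch also checks the keys.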
↑ Back to top
K
| Term | Definition | Example |
| --- | --- | --- |
| Knowledge Base | A structured repository of information that an AI can query to answer questions or complete tasks | A company's internal FAQ documents connected to a RAG-powered support chatbot |
| Knowledge Graph | A network of entities and the relationships between them, used to represent and query structured knowledge | Google's Knowledge Graph connecting "Albert Einstein" to "physicist", "Germany", and "Theory of Relativity" |
| Knowledge Distillation | Training a smaller model to replicate the performance of a larger one by learning from its outputs | Creating a fast, lightweight model for edge deployment by mimicking a large cloud model |
↑ Back to top
L
| Term | Definition | Example |
| --- | --- | --- |
| Label | The correct answer or category assigned to a training example in supervised learning | In a spam dataset, each email is labelled "spam" or "not spam" |
| Latency | The delay between sending a request to an AI model and receiving its response | A model with low latency feels instant; high latency feels slow and frustrating |
| Large Language Model (LLM) | An AI model trained on vast amounts of text data, capable of generating, summarising, and reasoning about language | GPT-4, Claude, Gemini, and Llama are all LLMs |
| Latent Space | The compressed internal representation a model learns, where similar concepts are encoded close together | In image generation models, nearby points in latent space produce visually similar images |
| Learning Rate | A hyperparameter that controls how large a step the model takes when updating its weights during training | Too high and the model overshoots; too low and it trains too slowly |
| LLMOps | The set of practices and tools for deploying, monitoring, and maintaining LLMs in production | Managing prompt versions, monitoring for drift, and evaluating model outputs at scale |
| LoRA (Low-Rank Adaptation) | A parameter-efficient fine-tuning technique that trains small low-rank matrices added alongside a model's weights while the original weights stay frozen | Fine-tuning a large model on a custom dataset using a fraction of the compute cost |
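To give a feel for why LoRA (see the entry above) is cheap, the sketch below compares the number of trainable values in one full weight matrix with the number in its low-rank update. It assumes the usual formulation in which a `d_out × d_in` matrix is adjusted by the product of a `d_out × rank` matrix and a `rank × d_in` matrix; the layer size of 4096 and rank of 8 are example figures, not taken from any specific model.

```python
def lora_parameter_counts(d_out, d_in, rank):
    """Return (full_matrix_params, lora_params) for one weight matrix."""
    full = d_out * d_in            # every entry of the original matrix
    lora = rank * (d_out + d_in)   # two thin matrices: d_out x rank and rank x d_in
    return full, lora


full, lora = lora_parameter_counts(d_out=4096, d_in=4096, rank=8)
print(full)                   # 16777216
print(lora)                   # 65536
print(f"{lora / full:.2%}")   # 0.39%
```

Under these assumptions, fine-tuning touches well under 1% as many values for that layer as updating the full matrix would.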
↑ Back to top
M
| Term | Definition | Example |
| --- | --- | --- |
| Machine Learning (ML) | A branch of AI where systems learn patterns from data rather than being explicitly programmed with rules | A spam filter that improves over time by learning from emails users mark as spam |
| MCP (Model Context Protocol) | An open standard created by Anthropic that allows AI models to connect to external tools, databases, and services in a consistent and secure way; think of it as a universal plug for AI integrations | Connecting Claude to your GitHub repo, Google Drive, or a SQL database so it can read, write, and act on real data |
| Model | A trained AI system that maps inputs to outputs based on what it learned from data | A trained neural network that predicts tomorrow's stock price from historical data |
| Model Card | A document published alongside an AI model describing its purpose, training data, capabilities, and limitations | Hugging Face model cards provide transparency about what a model can and can't do |
| Model Collapse | A phenomenon where AI models trained on AI-generated data degrade in quality over time | A concern as the internet fills with AI-generated content used to train future models |
| Multimodal AI | AI that can process and generate multiple types of content (text, images, audio, and video) together | GPT-4o accepting an image and a question, then answering about the image in text |
↑ Back to top
N
| Term | Definition | Example |
| --- | --- | --- |
| Natural Language Processing (NLP) | The field of AI focused on enabling machines to understand, interpret, and generate human language | Machine translation, sentiment analysis, chatbots, and text summarisation |
| Neural Network | A computational model loosely inspired by the structure of the human brain, made up of layers of interconnected nodes | The underlying architecture used by most modern AI systems |
| NLP Pipeline | A sequence of processing steps applied to text data, from raw input to final output | Tokenisation → embedding → classification → output |
| Node | An individual computational unit in a neural network that receives inputs, applies a function, and passes an output | A large neural network combines millions of nodes, linked by billions of weights |
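The Node entry above describes a unit that receives inputs, applies a function, and passes an output. A minimal sketch of one such unit follows, assuming the common design of a weighted sum plus bias fed through a sigmoid activation; the function names and sample numbers are illustrative.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(total)


# 1.0 * 0.5 + 2.0 * (-0.25) + 0.0 = 0.0, and sigmoid(0.0) = 0.5
print(node_output([1.0, 2.0], [0.5, -0.25], bias=0.0))  # 0.5
```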
↑ Back to top
O
| Term | Definition | Example |
| --- | --- | --- |
| Object Detection | A computer vision task that identifies what objects are in an image and where they are located | A self-driving car identifying pedestrians, traffic lights, and other vehicles in real time |
| Ontology | A formal representation of concepts and the relationships between them within a specific domain | A medical ontology defining how "disease", "symptom", and "treatment" relate to each other |
| Open Source Model | An AI model whose weights and/or code are publicly available for anyone to use, modify, and distribute | Meta's Llama models, Mistral, and Stable Diffusion |
| Overfitting | When a model learns the training data too precisely, including its noise, and fails to generalise to new data | A model that scores 99% on training data but only 60% on real-world data |
↑ Back to top
P
| Term | Definition | Example |
| --- | --- | --- |
| Parameter | An internal numerical value a model learns during training that shapes how it processes and generates outputs | GPT-4 is estimated (though not officially confirmed) to have over a trillion parameters |
| Pre-training | The initial large-scale training phase where a model learns from a massive general dataset before specialisation | Training an LLM on hundreds of billions of words from the internet and books |
| Precision | The percentage of positive predictions that were actually correct | Of all emails the model flagged as spam, what percentage were truly spam? |
| Prompt | The instruction, question, or input you give to an AI model to guide its response | "Summarise this article in three bullet points for a non-technical audience" |
| Prompt Engineering | The practice of designing and refining prompts to get better, more reliable outputs from AI models | Using structured formatting, role assignment, and examples to improve response quality |
| Prompt Injection | An attack where malicious instructions hidden in content the AI reads attempt to hijack its behaviour | A webpage containing invisible text instructing a browsing AI to leak your personal data |
↑ Back to top
Q
| Term | Definition | Example |
| --- | --- | --- |
| Quantisation | A technique that reduces a model's memory usage by representing its weights with lower numerical precision, making it faster and cheaper to run | Running a compressed Llama model on a laptop instead of a high-end server |
| Query | The input or question sent to an AI model or database to retrieve information | "What are the side effects of ibuprofen?" sent to a medical AI system |
↑ Back to top
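As a sketch of the Quantisation entry in section Q above, the code below maps floating-point weights to 8-bit integers with a single shared scale, then maps them back. This is one simple scheme (symmetric, per-list scaling) picked for illustration; production tools use a variety of more refined methods, and the function names and sample weights here are made up.

```python
def quantise_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantised = [round(w / scale) for w in weights]
    return quantised, scale


def dequantise(quantised, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantised]


weights = [0.12, -0.5, 0.33, 1.27]
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
print(q)  # small integers, each storable in a single byte
print(max(abs(a - b) for a, b in zip(weights, restored)))  # rounding error introduced
```

Each weight now needs one byte instead of the four used by a 32-bit float, at the cost of a small rounding error.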
R
| Term | Definition | Example |
| --- | --- | --- |
| RAG (Retrieval-Augmented Generation) | A technique that combines real-time document retrieval with AI generation, reducing hallucination and keeping responses current | A chatbot that searches your company's knowledge base before answering a support question |
| Recall | The percentage of actual positives that the model successfully identified | Of all actual fraud cases, what percentage did the model correctly flag? |
| Recommendation System | An AI system that predicts and surfaces content or products a user is likely to want, based on past behaviour | Netflix's "Because you watched" suggestions; Spotify's Discover Weekly playlist |
| Red Teaming | Deliberately attempting to break or manipulate an AI system to discover safety vulnerabilities before release | Researchers probing a model with adversarial prompts to expose harmful outputs |
| Regression | A machine learning task where the model predicts a continuous numerical value | Predicting a house's sale price based on size, location, and age |
| Reinforcement Learning (RL) | Training a model through a system of rewards and penalties, so it learns to maximise cumulative reward | AlphaGo learning to play Go by playing millions of games and receiving rewards for winning |
| RLHF (Reinforcement Learning from Human Feedback) | A training technique where humans rate AI outputs, and the model learns to produce outputs humans prefer | A technique used to help align ChatGPT and Claude to be helpful, harmless, and honest |
| RNN (Recurrent Neural Network) | A neural network designed for sequential data, where outputs feed back as inputs; largely replaced by transformers | Used in early speech recognition and text generation before transformers dominated |
↑ Back to top
S
| Term | Definition | Example |
| --- | --- | --- |
| Semantic Search | Search that understands the meaning and intent behind a query rather than matching exact keywords | Searching "how to fix a broken bone" and getting results about fracture treatment, not carpentry |
| Sentiment Analysis | AI that determines the emotional tone (positive, negative, or neutral) of a piece of text | Automatically classifying thousands of customer reviews to measure product satisfaction |
| Speech Recognition | AI that converts spoken audio into written text | Apple's Siri, Google Voice, and OpenAI's Whisper model |
| Supervised Learning | A training approach where the model learns from labelled input-output pairs | Training a model on thousands of (email, spam/not spam) pairs so it can classify new emails |
| Synthetic Data | Artificially generated data used to train or test models when real data is scarce, costly, or sensitive | Generating fake patient records to train a healthcare AI with less risk to patient privacy |
| System Prompt | A hidden set of instructions given to an AI before the user conversation begins, used to shape its behaviour and persona | A company using a system prompt to make Claude respond only about their products |
↑ Back to top
T
| Term | Definition | Example |
| --- | --- | --- |
| Temperature | A setting that controls how predictable or creative an AI's outputs are: low is focused and deterministic, high is varied and creative | Set temperature low for factual Q&A; set it high for creative brainstorming |
| Text-to-Image | AI that generates images from a natural language description | DALL·E, Midjourney, and Stable Diffusion generating artwork from a text prompt |
| Text-to-Speech (TTS) | AI that converts written text into natural-sounding spoken audio | ElevenLabs generating a realistic voice clone from a few seconds of audio |
| Token | The basic unit of text an LLM processes, roughly a word or part of a word | "Artificial" might be split into "Art", "ific", "ial": three tokens |
| Top-p Sampling | A setting that controls output variety by limiting the pool of next-word candidates to a cumulative probability threshold | Often tuned alongside temperature to balance quality and creativity |
| TPU (Tensor Processing Unit) | Hardware designed specifically to accelerate AI workloads, developed by Google | Used to train Google's Gemini and other large models |
| Training | The process of exposing a model to data and adjusting its weights to minimise prediction error | Training GPT-4 required thousands of GPUs running for months |
| Transfer Learning | Reusing a model trained on one task as the starting point for a new but related task | Adapting a model trained on English text to work with French by fine-tuning on French data |
| Transformer | An attention-based neural network architecture that is the backbone of virtually all modern LLMs | GPT, Claude, Gemini, and Llama are all transformer-based models |
| TTS | See Text-to-Speech | |
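The Temperature and Top-p Sampling entries above can be sketched together. The code assumes the common formulation in which raw scores (logits) are divided by the temperature before a softmax, and top-p keeps the smallest set of highest-probability candidates whose combined probability reaches the threshold. The function names, logits, and probabilities are invented for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the most likely candidates until their total reaches top_p, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print(cold[0] > hot[0])  # True: low temperature concentrates probability on the top choice
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.75))  # only candidates 0 and 1 remain
```

A sampler would then draw the next token from the filtered, renormalised probabilities.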
↑ Back to top
U
| Term | Definition | Example |
| --- | --- | --- |
| Underfitting | When a model is too simple to capture the underlying patterns in the data, resulting in poor performance | A linear model trying to predict stock prices: too simple for the complexity of the problem |
| Unsupervised Learning | Training a model on unlabelled data so it discovers its own patterns and structure | Grouping news articles into topic clusters without being told what the topics are |
↑ Back to top
V
| Term | Definition | Example |
| --- | --- | --- |
| Validation Set | A portion of data held back from training, used to tune the model and catch overfitting before final evaluation | Monitoring validation loss during training to decide when to stop |
| Vector | A list of numbers that represents data (like a word or image) in a mathematical space | The word "king" might be represented as a vector of 768 numbers in an embedding model |
| Vector Database | A database that stores and indexes embeddings (vectors) so AI can retrieve semantically relevant information quickly | Pinecone, Weaviate, and Chroma are popular vector databases used in RAG systems |
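To illustrate the Vector and Vector Database entries above, here is a toy sketch of the core lookup: comparing a query vector against stored vectors by cosine similarity and returning the closest match. The three-number vectors are invented purely for readability (real embeddings have hundreds of dimensions), and the plain dictionary stands in for a real vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query, store):
    """Return the name of the stored vector most similar to the query."""
    return max(store, key=lambda name: cosine_similarity(query, store[name]))


# Made-up 3-dimensional "embeddings" for illustration only.
store = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.05, 0.95],
}
print(cosine_similarity(store["king"], store["queen"]))   # high: related words
print(cosine_similarity(store["king"], store["banana"]))  # much lower: unrelated words
print(nearest([0.88, 0.85, 0.1], store))
```

A real vector database does the same comparison but uses indexing structures so it does not have to check every stored vector.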
↑ Back to top
W
| Term | Definition | Example |
| --- | --- | --- |
| Weight | A numerical parameter inside a neural network that is adjusted during training to reduce error | A model with 70 billion parameters has 70 billion weights stored in memory |
| Weight Decay | A regularisation technique that penalises large weights during training to prevent overfitting | Commonly used alongside dropout to keep models from memorising training data |
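The Weight Decay entry above can be sketched as a small change to a plain gradient step: each weight is also pulled slightly towards zero. The code assumes the simple form in which the penalty term `weight_decay * w` is added to the gradient; the function name `sgd_step` and the numbers are illustrative, and optimisers in real libraries differ in the details.

```python
def sgd_step(weights, gradients, learning_rate=0.1, weight_decay=0.0):
    """One gradient step; weight_decay adds a pull towards zero proportional to each weight."""
    return [
        w - learning_rate * (g + weight_decay * w)
        for w, g in zip(weights, gradients)
    ]


# With zero gradient, only the decay term acts, shrinking each weight by 5% here.
updated = sgd_step([1.0, -2.0], [0.0, 0.0], learning_rate=0.1, weight_decay=0.5)
print(updated)
```

Because larger weights are pulled harder, the model is nudged towards smaller, simpler solutions unless the data gives a strong reason to keep a weight large.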
↑ Back to top
X
| Term | Definition | Example |
| --- | --- | --- |
| XAI (Explainable AI) | AI systems and techniques designed to make model decisions interpretable and understandable to humans | A credit scoring model that explains: "Rejected due to high debt-to-income ratio and short credit history" |
↑ Back to top
Y
| Term | Definition | Example |
| --- | --- | --- |
| YAML | A human-readable data format commonly used to write configuration files for AI tools and ML pipelines | Writing a training configuration file for a machine learning experiment |
| YOLO (You Only Look Once) | A real-time object detection algorithm known for its speed and efficiency | Detecting and tracking multiple objects in a live video feed at 60 frames per second |
↑ Back to top
Z
| Term | Definition | Example |
| --- | --- | --- |
| Zero-Shot Learning | A model's ability to perform a task it was never explicitly trained on, by generalising from related knowledge | Asking GPT-4 to translate from a language it rarely saw during training, without any translation-specific fine-tuning |
| Zero-Shot Prompting | Giving an AI a task with no examples, relying entirely on its pre-trained knowledge | "Classify this review as positive or negative: 'The food was amazing!'" with no examples given |
This glossary covers 100+ terms across the full AI landscape. Bookmark it, share it, and revisit it as you grow.