Originally published on Securityelites - AI Red Team Education, the canonical, fully updated version of this article.
AI BASICS FOR BEGINNERS · FREE
Day 2 of 5 · 40% complete
Every time you click "like" on a video, watch something all the way to the end, or skip a song after 5 seconds, you're teaching an AI. Your clicks are training data. The AI is watching what you do and learning your patterns. It gets better at predicting what you want because of you.
That's how AI learns: from examples. And the more examples it gets, the better it gets. But here's the really interesting part: what an AI learns completely determines what it can do, what it can't do, and how it can fail.
Today I'm taking you inside the learning process. You'll see exactly how an AI goes from knowing nothing to being scarily good at its job. And you'll see why that same learning process can be tampered with, which is one of the sneakiest ways to break an AI.
What You'll Learn in Day 2
- How AI goes from zero knowledge to scary-good
- What "training data" is and why it's everything
- Three ways AI can learn, with real-life examples of each
- What "model weights" are (in plain English, no maths)
- How someone could secretly corrupt an AI by messing with its training
25 min read · 3 exercises · Browser helpful for exercises
Before You Start:
- Completed Day 1: What Is Artificial Intelligence?
- Remember: AI learns from examples and makes guesses about new things
- Remember: AI matches patterns; it doesn't truly understand anything
How Does AI Learn? (Day 2 of 5)
- Training Data: The Fuel That Powers AI
- Three Ways AI Can Learn
- What Actually Happens During Training
- What Are Model Weights? (No Maths, Promise)
- Three Ways Learning Can Go Wrong
- The Sneaky Attack: Poisoning the Training Data
- Questions and Answers
Yesterday we covered what AI is. Today we go one level deeper, into the learning process itself. This is where things get really interesting. The adversarial machine learning attacks you'll learn about later all trace back to how training works. The same goes for the LLM hacking series. Let's build the foundation.
Training Data: The Fuel That Powers AI
Training data is the collection of examples an AI learns from. It's the most important ingredient. An AI is only as good as what you show it. Bad examples → bad AI. Sneaky examples → dangerous AI. Amazing examples → amazing AI.
Let me make this concrete. Imagine you want to teach an AI to tell the difference between dogs and cats in photos.
You need training data: thousands (or millions) of photos. Each photo needs a label: "dog" or "cat." The AI looks at the photo and the label, over and over, millions of times. It figures out what patterns separate dogs from cats. Pointy ears vs floppy ears. Long whiskers vs short ones. Different eye shapes. Fur patterns. Body proportions. The AI learns all of this without you ever telling it what to look for.
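To make the idea of learning from labelled examples concrete, here is a minimal sketch in Python. It is not how a real image model works: the two features (ear pointiness and snout length) and all the numbers are made up for illustration, and the "learning" is just averaging the examples for each label. A real system would discover its own features from the pixels.

```python
# Hypothetical hand-made features standing in for what a real model would
# extract from pixels: (ear_pointiness, snout_length), each from 0 to 1.
training_data = [
    ((0.90, 0.20), "cat"),
    ((0.80, 0.30), "cat"),
    ((0.85, 0.25), "cat"),
    ((0.30, 0.80), "dog"),
    ((0.20, 0.70), "dog"),
    ((0.40, 0.90), "dog"),
]


def train(examples):
    """Average the features of each label (a nearest-centroid model)."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose average example is closest to the new photo."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training_data)
print(predict(model, (0.75, 0.30)))  # a new, unlabelled photo; prints "cat"
```

Notice that nobody wrote a rule like "pointy ears means cat." The model only saw examples with labels, and the rule fell out of the data.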
Training data has three things that really matter:
Lots of it. More examples = better patterns found. An AI trained on 100 photos of cats is going to make lots of mistakes. An AI trained on 100 million photos is going to be very, very good. ChatGPT was trained on more text than any human could read in thousands of lifetimes.
Good variety. If you only show the AI photos of orange cats, it'll be confused by black cats, white cats, and kittens. The training examples need to include all the different versions of the thing you want it to recognise. This is called "diversity." When training data isn't diverse, the AI develops blind spots: things it fails on predictably.
Correct labels. Every example needs the right answer attached. If you accidentally label 10% of your cat photos as "dog," the AI learns the wrong patterns from those examples. Wrong labels = AI learns wrong things = AI makes wrong predictions.
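The blind-spot effect can be shown with a toy example. This is a sketch under made-up assumptions: the features (ear pointiness and fur darkness) and every number are hypothetical, and the model is a simple "closest average" classifier rather than a real neural network. In the narrow dataset every cat happens to be light-coloured and every dog dark, so the model quietly learns "dark fur means dog."

```python
def train(examples):
    """Average the features of each label (a nearest-centroid model)."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose average example is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical features: (ear_pointiness, fur_darkness), each from 0 to 1.
narrow_data = [
    ((0.90, 0.20), "cat"), ((0.80, 0.30), "cat"), ((0.85, 0.25), "cat"),
    ((0.60, 0.90), "dog"), ((0.50, 0.80), "dog"), ((0.55, 0.85), "dog"),
]

# The same photos plus dark cats and light dogs.
diverse_data = narrow_data + [
    ((0.90, 0.90), "cat"), ((0.80, 0.85), "cat"), ((0.85, 0.95), "cat"),
    ((0.50, 0.20), "dog"), ((0.60, 0.25), "dog"), ((0.55, 0.15), "dog"),
]

black_cat = (0.85, 0.90)  # pointy ears, very dark fur

print(predict(train(narrow_data), black_cat))   # prints "dog" (the blind spot)
print(predict(train(diverse_data), black_cat))  # prints "cat"
```

With the narrow data, fur colour looks like the best clue, so the black cat is misclassified every time. Once both colours appear under both labels, fur colour stops being useful and the model falls back on ear shape.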
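Here is a small sketch of how a single wrong label can turn into a wrong prediction. It assumes a deliberately simple model (1-nearest-neighbour, which copies the label of the most similar training example) and hypothetical features and numbers. Real models average over many examples and are usually more tolerant of a few bad labels, but the direction of the effect is the same.

```python
def nearest_label(examples, features):
    """1-nearest-neighbour: copy the label of the most similar example."""
    def squared_distance(example):
        return sum((a - b) ** 2 for a, b in zip(features, example[0]))
    return min(examples, key=squared_distance)[1]


# Hypothetical features: (ear_pointiness, snout_length), each from 0 to 1.
clean = [
    ((0.90, 0.20), "cat"),
    ((0.80, 0.30), "cat"),
    ((0.70, 0.35), "cat"),
    ((0.30, 0.80), "dog"),
    ((0.20, 0.70), "dog"),
]

# Same photos, but one cat was accidentally labelled "dog".
noisy = [
    (features, "dog") if features == (0.70, 0.35) else (features, label)
    for features, label in clean
]

new_photo = (0.72, 0.33)  # a cat that looks a lot like the mislabelled one

print(nearest_label(clean, new_photo))  # prints "cat"
print(nearest_label(noisy, new_photo))  # prints "dog"
```

The model did nothing wrong here. It faithfully learned what the labels told it, and the labels were wrong.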
TRAINING DATA QUALITY: WHAT MATTERS
- VOLUME: More examples → AI sees more patterns → gets smarter
- VARIETY: One-sided data → AI develops blind spots it fails on every time
- LABELS: Wrong labels → AI learns the wrong lesson from those examples
- POISONED: Someone sneaks in bad examples → AI learns to behave wrong on purpose
Figure: The four quality levels of training data. The first three are quality problems that happen by accident. The last one (red) is an attack: someone doing it on purpose. We'll cover that in Section 6.
Three Ways AI Can Learn
Not all AI learns the same way. There are three main styles of learning. Each one is used for different jobs, and each one can be attacked differently. I think of them like three different ways a student could study for a test.
This article was originally written and published by the Securityelites - AI Red Team Education team. For more cybersecurity tutorials, ethical hacking guides, and CTF walk-throughs, visit Securityelites - AI Red Team Education.
