Breaking text into puzzle pieces
Day 54 of 149
Full deep-dive with code examples
The Puzzle Pieces
You have a sentence: "I love pizza!"
AI doesn't see words. It sees tokens:
["I", " love", " pizza", "!"]
Each piece is a token. AI processes one piece at a time!
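Real models use learned subword tokenizers, but a toy regex sketch can reproduce the split shown above (the pattern here is just for illustration, not how production tokenizers work):

```python
import re

def toy_tokenize(text):
    """Toy tokenizer: words (with their leading space) and punctuation.
    Real models use learned subword tokenizers (e.g., BPE), not a regex."""
    return re.findall(r"\s?\w+|\s?[^\w\s]", text)

print(toy_tokenize("I love pizza!"))  # ['I', ' love', ' pizza', '!']
```

Notice the leading spaces: many real tokenizers attach the space to the start of the following token, just like in the example above.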
Why Not Just Words?
"unbelievable" = 1 word but might be:
["un", "believ", "able"] = 3 tokens
This helps AI:
- Handle new/rare words
- Work with any language
- Understand word parts
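The subword idea can be sketched with a greedy longest-match split against a tiny made-up vocabulary (real BPE tokenizers learn their vocabulary from data; this is a simplified stand-in):

```python
def segment(word, vocab):
    """Greedy longest-match subword split (a simplified stand-in for BPE)."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown character: fall back to 1 char
            i += 1
    return pieces

vocab = {"un", "believ", "able"}  # hypothetical vocabulary
print(segment("unbelievable", vocab))  # ['un', 'believ', 'able']
```

Because any unknown word can be broken down into known pieces (or single characters in the worst case), the model never hits a word it simply can't represent.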
Token Sizes Vary
| Text | Approximate Tokens |
|---|---|
| "Hello" | 1 |
| "Hello world" | 2 |
| "ChatGPT is cool!" | ~5 |
| 1 page of text | ~500 |
| Average book | ~100,000 |
Rule of thumb: 1 token ≈ 4 characters in English
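That rule of thumb turns into a one-line estimator (a ballpark only; real counts depend on the specific model's tokenizer):

```python
def estimate_tokens(text):
    """Rough token estimate using the ~4 characters-per-token rule of thumb.
    Real counts depend on the model's tokenizer; this is only a ballpark."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello"))     # 1
print(estimate_tokens("x" * 2000))  # 500 -- roughly one page of text
```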
Why Tokens Matter
AI has a token limit!
Different models support different context-window sizes (ranging from a few thousand tokens to much larger windows).
Your question + context + answer usually need to fit in that window.
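The budgeting above is simple arithmetic: reserve room for the answer before sending the prompt. A minimal sketch, assuming a hypothetical 8,192-token window:

```python
def fits_in_window(prompt_tokens, max_answer_tokens, window_size):
    """Check whether the prompt plus reserved answer space fits the window."""
    return prompt_tokens + max_answer_tokens <= window_size

WINDOW = 8192  # hypothetical context-window size; varies by model
print(fits_in_window(7000, 1000, WINDOW))  # True:  8000 <= 8192
print(fits_in_window(7000, 2000, WINDOW))  # False: 9000 >  8192
```

If the check fails, you either trim the context or shorten the expected answer.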
In One Sentence
Tokens are the small chunks (roughly words or word-parts) that AI models use to read and generate text.
Enjoying these? Follow for daily ELI5 explanations!
Making complex tech concepts simple, one day at a time.