Organizing data to reduce redundancy
Day 125 of 149
👉 Full deep-dive with code examples
The Address Book Analogy
Bad address book:
- "John Smith, 123 Main St, Sydney NSW 2000"
- "John Smith, 123 Main St, Sydney NSW 2000" (repeated!)
- What if John moves? Update every entry!
Good address book:
- Contact list: John Smith → Contact ID: 1
- Address list: Contact ID: 1 → 123 Main St, Sydney
- One place to update!
Normalization organizes data to avoid repetition!
The Problems It Solves
Data repetition:
- Same information in many places
- Wastes storage space
- Easy to have mismatches
Update anomalies:
- Change one place, forget another
- Data becomes inconsistent
Deletion anomalies:
- Delete one thing, accidentally lose other info
How It Works
Split data into related tables:
Before (unnormalized):
| OrderID | Customer | CustomerEmail | Product |
| 1 | Alice | alice@mail.com | Laptop |
| 2 | Alice | alice@mail.com | Mouse |
Alice's email repeated!
After (normalized):
Customers: | CustomerID | Name | Email |
| 1 | Alice | alice@mail.com |
Orders: | OrderID | CustomerID | Product |
| 1 | 1 | Laptop |
| 2 | 1 | Mouse |
Email stored once, linked by ID.
Benefits
- No redundancy → Data stored once
- Consistency → One source of truth
- Easier updates → Change in one place
- Less storage → No duplicate data
The Trade-off
More tables = more joins (slower queries sometimes).
Balance: Normalize for correctness, denormalize for performance when needed.
In One Sentence
Database Normalization organizes data into tables that minimize repetition, ensuring data is stored once and stays consistent.
🔗 Enjoying these? Follow for daily ELI5 explanations!
Making complex tech concepts simple, one day at a time.
Top comments (0)