DEV Community

Discussion on: ELI5: What is a database normalization?

Collapse
ikirker profile image
Ian Kirker • Edited

School Records

So!

You're 5. You're enrolled at pre-school. They take your name, your address, your date-of-birth, your parents, and your parents' phone numbers, and they stick them in a big table, called "Contacts and Info".


  • Student
  • Address
  • DOB
  • Parent 1
  • Parent 2
  • Parent Phone 1
  • Parent Phone 2

Your parents also donate some money each month to pay for extra-curricular outings and other things, so they take the payment info for that, in another big table called "Donors".


  • Parent
  • Address
  • Credit Card Info
  • Phone Number

The Break-Up

At some point, your parents split up for some reason, and one of them starts seeing someone else. Which is fine, but now you have 3 parents the school can call, and they only have two places for parent info.

So, instead, they decide to split up the first big table:


  • Student
  • DOB

  • Student
  • Parent
  • Parent Phone
  • Parent Address

So now they can store as many parents as people have.

The Work Phone

One of your parents starts to work on a military base. They're not allowed to keep their normal phone on them all the time, because it's not TOP SECRET-worthy and doesn't shoot lasers, but they are allowed to give the school the phone number of their awesome laser-shooting phone as well. So now the school needs to store two phone numbers for one person.

They could make two rows in the table with the addresses, but then they'd be storing the address twice. If they wanted to change the address for someone in the future, they'd have to make very sure they updated both copies, or else they wouldn't know where to send things.

They also look at the table for Donors, where they're storing phone numbers as well, (so they'd have the same problem) and decide they can make this simpler:


  • Student
  • DOB

  • Student
  • Parent

  • Parent
  • Address

  • Parent
  • Phone Number

  • Parent
  • Credit Card Info

Like this, they can store any number of phone numbers, addresses, or credit cards per parent, and they'll only ever need to change the information in one place.

What'll now be a pain is changing someone's name, so they might also give people a number internally and use those instead:


  • ID
  • Name

  • Student ID
  • DOB

  • Student ID
  • Parent ID

  • Parent ID
  • Address

  • Parent ID
  • Phone Number

  • Parent ID
  • Credit Card Info

The Cost

The cost of doing all of this is that if you want to look up, for example, what address corresponds to which phone number, you have to look up the Parent ID for both, joining those tables together. You've also had to store these ID numbers which you didn't have before, but hopefully you've saved more storage in not copying other information.

Normalisation

This process of breaking down your data into exact minimal relationships and storing those separately is database normalisation. There are various numbered normal forms which describe how much you've done this, satisfying various constraints and conditions.

NB also: in some databases, the speed cost of joining all this data back together is not what you want, and storing all the data for every row is better!

Edits: Argh, no Markdown table support!

Collapse
gladuz profile image
Jamshid Tursunboyev Author

I love your school example! Great analogy.