loading...

Effective Database Design: Part 3

adammc331 profile image Adam McNeilly Updated on ・2 min read

This is the third post in a series about database design and normalization. In our last post, we discussed the first normal form and took an initial step toward proper database design. We learned that we should have a primary key, atomic values, and no repeating groups. We also learned how to separate our entities to avoid redundancy.

Now we can go a step further and make sure we abide by the second normal form.

Second Normal Form

A table is in 2NF if it is in 1NF, and there are no partial dependencies. This means that non-prime columns (columns that are not part of the key, or possible keys) depend entirely on the primary key.

In a table that has a single primary key column, this will always hold true, so we don't need to stress about it. If we have a composite key, though, we could break this rule.

For example, let's add a teacher column to our student_classes table from the last part. Let's say that Ms. Smith teaches math, Ms. Jackson teaches programming, and Ms. James teaches history.

student_id class_id teacher
1 1 Smith
1 2 Jackson
2 3 James

The teacher column is determined only by the class_id, not by the entire primary identifier of (student_id, class_id).

Problem

Before we discuss a solution, let's look at the problems this could cause. Consider if Prince also took programming:

student_id class_id teacher
1 1 Smith
1 2 Jackson
2 3 James
2 2 Jackson

Notice any familiar problems? We have data redundancy issues. Twice we see that Ms. Jackson teaches programming, which also indicates we could be at risk of an update anomaly if I only changed one of those rows.

Solution

Since the teacher column is dependent only on the class_id column, we should group those entities together.

Let's put the teacher column in the classes table:

id class_name teacher
1 Math Smith
2 Programming Jackson
3 History James

We've got a lot to keep track of in our three tables so far, so we'll just look at a high level diagram. Shout out to dbdiagram.io for making a helpful tool:

The 2NF was a quick one to explain, but now we've got a pretty well designed database. Let's recap everything up to the 2NF:

  • Each table has a primary identifier.
  • Each column only has atomic values.
  • No tables have repeating groups.
  • There are no partial dependencies.

We're in a good spot now that we know how to ensure these four things. In the next post, we'll continue on this normalization adventure to comply with the third normal form.

Posted on by:

adammc331 profile

Adam McNeilly

@adammc331

Android developer living in Brooklyn who's passionate about traveling, puns, and mobile development.

Discussion

markdown guide
 

Second lol
I just finish part two of this series, and can't wait for more xD

 

Good Job Adam! My 3month database design college class broken down into 5mins read per post... awesome.

 

These techniques have been around like 50 years and still apply. It shows how brilliant the relational model is.

 

Absolutely, and I think normalization makes database design fun! It starts to feel more like a puzzle, and figuring out how to fit the pieces together can be really fascinating! Especially as you get further along the normal forms.

 

This series is really helpful and easy to digest. Kudos for great work! Keep it up!

 
 
 

Can’t wait for more! Loving your database design explanations. Berry clear and concise 👌🏽

 

Thank you! Next post coming tomorrow. :)