DEV Community

Adam McNeilly
Adam McNeilly

Posted on • Updated on

Effective Database Design: Part 3

This is the third post in a series about database design and normalization. In our last post, we discussed the first normal form and took an initial step toward proper database design. We learned that we should have a primary key, atomic values, and no repeating groups. We also learned how to separate our entities to avoid redundancy.

Now we can go a step further and make sure we abide by the second normal form.

Second Normal Form

A table is in 2NF if it is in 1NF, and there are no partial dependencies. This means that non-prime columns (columns that are not part of the key, or possible keys) depend entirely on the primary key.

In a table that has a single primary key column, this will always hold true, so we don't need to stress about it. If we have a composite key, though, we could break this rule.

For example, let's add a teacher column to our student_classes table from the last part. Let's say that Ms. Smith teaches math, Ms. Jackson teaches programming, and Ms. James teaches history.

student_id class_id teacher
1 1 Smith
1 2 Jackson
2 3 James

The teacher column is determined only by the class_id, not by the entire primary identifier of (student_id, class_id).

Problem

Before we discuss a solution, let's look at the problems this could cause. Consider if Prince also took programming:

student_id class_id teacher
1 1 Smith
1 2 Jackson
2 3 James
2 2 Jackson

Notice any familiar problems? We have data redundancy issues. Twice we see that Ms. Jackson teaches programming, which also indicates we could be at risk of an update anomaly if I only changed one of those rows.

Solution

Since the teacher column is dependent only on the class_id column, we should group those entities together.

Let's put the teacher column in the classes table:

id class_name teacher
1 Math Smith
2 Programming Jackson
3 History James

We've got a lot to keep track of in our three tables so far, so we'll just look at a high level diagram. Shout out to dbdiagram.io for making a helpful tool:

The 2NF was a quick one to explain, but now we've got a pretty well designed database. Let's recap everything up to the 2NF:

  • Each table has a primary identifier.
  • Each column only has atomic values.
  • No tables have repeating groups.
  • There are no partial dependencies.

We're in a good spot now that we know how to ensure these four things. In the next post, we'll continue on this normalization adventure to comply with the third normal form.

Latest comments (10)

Collapse
 
adepojubosun profile image
Adepoju Bosun

Good Job Adam! My 3month database design college class broken down into 5mins read per post... awesome.

Collapse
 
lluismf profile image
Lluís Josep Martínez

These techniques have been around like 50 years and still apply. It shows how brilliant the relational model is.

Collapse
 
adammc331 profile image
Adam McNeilly

Absolutely, and I think normalization makes database design fun! It starts to feel more like a puzzle, and figuring out how to fit the pieces together can be really fascinating! Especially as you get further along the normal forms.

Collapse
 
rajaasyraf profile image
Raja Asyraf

This series is really helpful and easy to digest. Kudos for great work! Keep it up!

Collapse
 
adammc331 profile image
Adam McNeilly

Thank you so much!

Collapse
 
dannida profile image
dannida

nice piece again....

Collapse
 
alfredlucero profile image
Alfred Lucero

Can’t wait for more! Loving your database design explanations. Berry clear and concise 👌🏽

Collapse
 
adammc331 profile image
Adam McNeilly

Thank you! Next post coming tomorrow. :)

Collapse
 
iglesk profile image
Omar AlSabek

Second lol
I just finish part two of this series, and can't wait for more xD

Some comments may only be visible to logged-in visitors. Sign in to view all comments.