Abhay Singh Kathayat

Understanding Normal Forms in Database Design: A Comprehensive Guide

Different Normal Forms in Database Design

In database design, normalization is the process of organizing data to minimize redundancy and undesirable dependencies, which improves data integrity. It involves splitting large tables into smaller, more manageable ones and linking them through keys. The goal is a schema that avoids insertion, update, and deletion anomalies.

The different normal forms represent specific levels of normalization. Each normal form builds upon the previous one and has its own set of rules. Below is an explanation of the most common normal forms:


1. First Normal Form (1NF)

1NF is the most basic level of normalization. It requires that every column hold atomic (indivisible) values and that the table contain no repeating groups.

  • Rules of 1NF:
    1. Each table cell should contain a single value (atomicity).
    2. Each record (row) must be unique.
    3. Each column should contain values of a single type (e.g., all integers, all strings).
    4. No repeating groups of columns or multiple values in a single column.

Example of 1NF:

Before 1NF:

| OrderID | Products | Quantities |
| --- | --- | --- |
| 1 | Apple, Banana | 2, 3 |
| 2 | Orange | 5 |

After converting to 1NF:

| OrderID | Product | Quantity |
| --- | --- | --- |
| 1 | Apple | 2 |
| 1 | Banana | 3 |
| 2 | Orange | 5 |
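
The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name `to_1nf` and the tuple layout are assumptions made for the example, not part of any library.

```python
def to_1nf(rows):
    """Split comma-separated product/quantity lists into one row per product."""
    atomic_rows = []
    for order_id, products, quantities in rows:
        names = [p.strip() for p in products.split(",")]
        counts = [int(q) for q in quantities.split(",")]
        if len(names) != len(counts):
            raise ValueError(f"order {order_id}: product and quantity lists differ in length")
        for name, count in zip(names, counts):
            atomic_rows.append((order_id, name, count))
    return atomic_rows


# The "Before 1NF" table, with multi-valued cells stored as strings.
unnormalized = [(1, "Apple, Banana", "2, 3"), (2, "Orange", "5")]
order_lines = to_1nf(unnormalized)
```

Each resulting tuple holds exactly one product and one quantity, which is what the atomicity rule asks for.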

2. Second Normal Form (2NF)

2NF builds on 1NF by eliminating partial dependencies. A partial dependency occurs when a non-prime attribute (a column that is not part of any candidate key) depends on only part of a composite key rather than on the whole key. A table whose key is a single column therefore has no partial dependencies. To achieve 2NF, the table must first meet the requirements of 1NF.

  • Rules of 2NF:
    1. The table must be in 1NF.
    2. Every non-prime attribute must be fully functionally dependent on the entire primary key (eliminate partial dependencies).

Example of 2NF:

Before 2NF (Partial Dependency):

| OrderID | Product | CustomerName | Price |
| --- | --- | --- | --- |
| 1 | Apple | John | 10 |
| 1 | Banana | John | 5 |
| 2 | Orange | Jane | 8 |

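Here, the primary key is (OrderID, Product). CustomerName depends only on OrderID, not on the full key, so it is a partial dependency. Price is assumed to be the price charged for that product on that order, so it depends on the whole key. To remove the partial dependency, we split the table.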

After 2NF:
Tables:

  • Orders (OrderID, CustomerName)
  • OrderDetails (OrderID, Product, Price)

Orders table:

| OrderID | CustomerName |
| --- | --- |
| 1 | John |
| 2 | Jane |

OrderDetails table:

| OrderID | Product | Price |
| --- | --- | --- |
| 1 | Apple | 10 |
| 1 | Banana | 5 |
| 2 | Orange | 8 |
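
One way to spot a candidate partial dependency is to test whether part of the key already determines a column in the sample data. The helper below is a minimal sketch; `fd_holds` is a hypothetical name introduced for this example. Sample rows can only refute a dependency, they cannot prove that it holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# The "Before 2NF" table with key (OrderID, Product).
orders = [
    {"OrderID": 1, "Product": "Apple", "CustomerName": "John", "Price": 10},
    {"OrderID": 1, "Product": "Banana", "CustomerName": "John", "Price": 5},
    {"OrderID": 2, "Product": "Orange", "CustomerName": "Jane", "Price": 8},
]

# CustomerName is determined by OrderID alone: a partial dependency.
customer_by_order = fd_holds(orders, ["OrderID"], ["CustomerName"])

# Price is not determined by OrderID alone (order 1 has prices 10 and 5).
price_by_order = fd_holds(orders, ["OrderID"], ["Price"])
```

In this data `customer_by_order` is true and `price_by_order` is false, which matches the decision to move CustomerName into the Orders table and keep Price in OrderDetails.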

3. Third Normal Form (3NF)

3NF builds on 2NF and addresses transitive dependencies, which occur when a non-prime attribute depends on the key only through another non-prime attribute. Every non-prime attribute should depend directly on the primary key. A table is in 3NF if it is in 2NF and all transitive dependencies are removed.

  • Rules of 3NF:
    1. The table must be in 2NF.
    2. No non-prime attribute should depend on another non-prime attribute (remove transitive dependencies).

Example of 3NF:

Before 3NF (Transitive Dependency):

| OrderID | Product | Category | Supplier |
| --- | --- | --- | --- |
| 1 | Apple | Fruit | XYZ |
| 2 | Carrot | Vegetable | ABC |

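Here, Supplier depends on Category, not directly on OrderID, so it depends on the key only transitively. To resolve this, we split the table. (The example treats Category as a property of the order; if Category were determined by Product, that dependency would need its own table as well.)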

After 3NF:
Tables:

  • Orders (OrderID, Product, Category)
  • Category (Category, Supplier)

Orders table:

| OrderID | Product | Category |
| --- | --- | --- |
| 1 | Apple | Fruit |
| 2 | Carrot | Vegetable |

Category table:

| Category | Supplier |
| --- | --- |
| Fruit | XYZ |
| Vegetable | ABC |
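
The two 3NF tables can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types and the replacement supplier name `'NEW'` are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (
    Category TEXT PRIMARY KEY,
    Supplier TEXT NOT NULL
);
CREATE TABLE Orders (
    OrderID  INTEGER PRIMARY KEY,
    Product  TEXT NOT NULL,
    Category TEXT NOT NULL REFERENCES Category(Category)
);
INSERT INTO Category VALUES ('Fruit', 'XYZ'), ('Vegetable', 'ABC');
INSERT INTO Orders VALUES (1, 'Apple', 'Fruit'), (2, 'Carrot', 'Vegetable');
""")

# The supplier of a category is stored once, so changing it touches one row.
conn.execute("UPDATE Category SET Supplier = 'NEW' WHERE Category = 'Fruit'")

# A join rebuilds the original four-column view when it is needed.
rows = conn.execute("""
    SELECT o.OrderID, o.Product, o.Category, c.Supplier
    FROM Orders AS o
    JOIN Category AS c ON c.Category = o.Category
    ORDER BY o.OrderID
""").fetchall()
```

In the single-table design, the same supplier change would have to be repeated on every order row of that category, which is where update anomalies come from.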

4. Boyce-Codd Normal Form (BCNF)

BCNF is a stricter version of 3NF. A table is in BCNF if:

  • It is in 3NF.
  • For every non-trivial functional dependency X → Y, X must be a superkey (a set of columns that contains a candidate key).

In simpler terms, BCNF covers tables that are in 3NF but still contain a dependency whose determinant is not a key.

  • Rules of BCNF:
    1. The table must be in 3NF.
    2. Every determinant of a non-trivial functional dependency must be a superkey.

Example of BCNF:

Before BCNF:

| CourseID | Instructor | Room |
| --- | --- | --- |
| 101 | Dr. Smith | A1 |
| 102 | Dr. Smith | A1 |
| 101 | Dr. Johnson | A2 |

Here, assume each instructor always teaches in the same room, so Instructor determines Room. Instructor alone is not a candidate key (the key is (CourseID, Instructor)), which violates BCNF. To achieve BCNF, we separate the dependencies into different tables.

After BCNF:
Tables:

  • Courses (CourseID, Instructor)
  • Rooms (Instructor, Room)

Courses table:

| CourseID | Instructor |
| --- | --- |
| 101 | Dr. Smith |
| 102 | Dr. Smith |
| 101 | Dr. Johnson |

Rooms table:

| Instructor | Room |
| --- | --- |
| Dr. Smith | A1 |
| Dr. Johnson | A2 |
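
The BCNF rule can be checked mechanically by computing attribute closures. The sketch below assumes the functional dependencies are supplied by hand as (left-hand side, right-hand side) pairs; the function names `closure` and `bcnf_violations` are hypothetical and chosen for this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial dependencies whose left-hand side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        is_trivial = set(rhs) <= set(lhs)
        is_superkey = set(relation) <= closure(lhs, fds)
        if not is_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Original table: the key determines Room, and so does Instructor alone.
schedule = ("CourseID", "Instructor", "Room")
schedule_fds = [
    (("CourseID", "Instructor"), ("Room",)),
    (("Instructor",), ("Room",)),
]
before = bcnf_violations(schedule, schedule_fds)

# Rooms table after the split: Instructor is now the key.
rooms = ("Instructor", "Room")
rooms_fds = [(("Instructor",), ("Room",))]
after = bcnf_violations(rooms, rooms_fds)
```

Before the split, the check reports Instructor → Room as a violation; for the Rooms table it reports none, because Instructor determines every column of that table.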

5. Fourth Normal Form (4NF)

4NF addresses multi-valued dependencies, which occur when one attribute determines a set of values of another attribute, independently of the remaining attributes in the table. A table is in 4NF if:

  • It is in BCNF.
  • It has no non-trivial multi-valued dependencies, except those whose determinant is a superkey.

Example of 4NF:

Before 4NF (Multi-valued Dependency):

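| StudentID | Subject | Hobby |
| --- | --- | --- |
| 1 | Math | Painting |
| 1 | Math | Cycling |
| 1 | Science | Painting |
| 1 | Science | Cycling |

Because a student's subjects and hobbies are independent of each other, a single table has to store every subject and hobby combination. That repetition is the redundancy 4NF removes.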

After 4NF:
Tables:

  • Students (StudentID, Subject)
  • StudentsHobbies (StudentID, Hobby)

Students table:

| StudentID | Subject |
| --- | --- |
| 1 | Math |
| 1 | Science |

StudentsHobbies table:

| StudentID | Hobby |
| --- | --- |
| 1 | Painting |
| 1 | Cycling |
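
To see why the split saves space, the two 4NF tables can be joined back on StudentID. This is a small sketch using plain Python sets; the variable names are chosen for the example only.

```python
# The two 4NF tables from the example above.
student_subjects = {(1, "Math"), (1, "Science")}
student_hobbies = {(1, "Painting"), (1, "Cycling")}

# Natural join on StudentID: pair every subject with every hobby of the same student.
combined = {
    (student, subject, hobby)
    for student, subject in student_subjects
    for other_student, hobby in student_hobbies
    if student == other_student
}

rows_in_single_table = len(combined)
rows_in_split_tables = len(student_subjects) + len(student_hobbies)
```

With two subjects and two hobbies both designs need four rows, but the single table grows as subjects times hobbies while the split tables grow as subjects plus hobbies. A student with 5 subjects and 4 hobbies would need 20 rows in one table and 9 across two.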

Conclusion

In database design, normalization is a fundamental process for organizing data efficiently. The normal forms covered here (1NF, 2NF, 3NF, BCNF, and 4NF) reduce redundancy, protect data integrity, and keep the schema easy to manage. Each normal form builds on the previous one by eliminating a specific type of dependency or anomaly. While normalization improves data quality, it must be balanced against performance: extra joins have a cost, and selective denormalization is sometimes the right optimization.

Hi, I'm Abhay Singh Kathayat!
I am a full-stack developer with expertise in both front-end and back-end technologies. I work with a variety of programming languages and frameworks to build efficient, scalable, and user-friendly applications.
Feel free to reach out to me at my business email: kaashshorts28@gmail.com.