DEV Community

Cover image for EXPLORING NORMALIZATION
Ramya
Ramya

Posted on

EXPLORING NORMALIZATION

Database normalization is crucial for eliminating redundancy and preventing anomalies such as insertion, update, and deletion issues. In this guide, we’ll walk through a sample base table, identify its anomalies, and normalize it step by step into 1NF, 2NF, and 3NF using SQL CREATE TABLE statements.
📌 Base Table

Consider a base table that contains student-course-instructor information:

🚨 Anomalies in the Table

Insertion Anomaly: Cannot add a new course until a student enrolls.

Update Anomaly: Changing an instructor’s email requires updating multiple rows.

Deletion Anomaly: Deleting the last student enrolled in a course removes course and instructor information too.

✅ Step 1: Convert to 1NF (First Normal Form)

Goal: Eliminate repeating groups and ensure all attributes have atomic values.

CREATE TABLE StudentCourses_1NF (
StudentID INT,
StudentName VARCHAR(100),
CourseID VARCHAR(10),
CourseName VARCHAR(100),
Instructor VARCHAR(100),
InstructorEmail VARCHAR(100)
);

✅ Step 2: Convert to 2NF (Second Normal Form)

Goal: Remove partial dependency (attributes depending only on part of a composite key).
Separate the table into Students, Courses, and Enrollments:

CREATE TABLE Students (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(100)
);

CREATE TABLE Courses (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(100),
Instructor VARCHAR(100),
InstructorEmail VARCHAR(100)
);

CREATE TABLE Enrollments (
StudentID INT,
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

✅ Step 3: Convert to 3NF (Third Normal Form)

Goal: Remove transitive dependency (e.g., InstructorEmail depends on Instructor, not CourseID).
We create a separate Instructors table:

CREATE TABLE Instructors (
InstructorID INT PRIMARY KEY,
InstructorName VARCHAR(100),
InstructorEmail VARCHAR(100)
);

CREATE TABLE Courses (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(100),
InstructorID INT,
FOREIGN KEY (InstructorID) REFERENCES Instructors(InstructorID)
);
📝 Insert Sample Data

INSERT INTO Students VALUES (1, 'Alice'), (2, 'Bob');

INSERT INTO Instructors VALUES
(1, 'Dr. Smith', 'smith@uni.edu'),
(2, 'Dr. Lee', 'lee@uni.edu'),
(3, 'Dr. Clark', 'clark@uni.edu');

INSERT INTO Courses VALUES
('C101', 'DBMS', 1),
('C102', 'Networks', 2),
('C103', 'AI', 3);

INSERT INTO Enrollments VALUES
(1, 'C101'),
(2, 'C102'),
(1, 'C103');
🔗 Query with JOINS

To list all students along with their courses and instructors:

SELECT s.StudentName, c.CourseName, i.InstructorName
FROM Enrollments e
JOIN Students s ON e.StudentID = s.StudentID
JOIN Courses c ON e.CourseID = c.CourseID
JOIN Instructors i ON c.InstructorID = i.InstructorID;

✅ Key Takeaways

  1. Normalization reduces redundancy: Properly structured tables save storage and prevent duplicate data.

  2. Avoid anomalies: Normalization prevents insertion, update, and deletion anomalies.

  3. Step-wise approach:
    1NF: Ensures atomic values and removes repeating groups.

2NF: Removes partial dependency by separating data into multiple tables.

3NF: Eliminates transitive dependency to ensure each attribute depends only on the primary key.

  1. Improved data integrity: Normalized databases are easier to maintain and update.

  2. Joins become powerful: Once normalized, SQL joins can easily combine data across tables for meaningful queries.

Top comments (0)