🧠 Introduction
Database Normalization helps remove data redundancy and anomalies like insertion, update, and deletion problems.
In this tutorial, we’ll normalize a sample table step by step — from 1NF → 2NF → 3NF — and implement it in SQL.
🏁 Step 1: Base Table
Let’s start with an unnormalized table.
📋 Base Table: Student_Course_Info
⚠️ Step 1A: Identify Anomalies
Type of Anomaly Description
Insertion Anomaly You can’t add a new course unless at least one student enrolls.
Update Anomaly If Dr. Ravi’s phone number changes, it must be updated in multiple rows.
Deletion Anomaly If all students drop “OS,” information about instructor Dr. Priya is lost.
🧮 Step 2: Convert to 1NF (First Normal Form)
Rule: Remove repeating groups and ensure atomic (single-valued) attributes.
✅ Each cell should contain a single value.
✅ Each record should be unique.
🧱 Table: Student_Course_1NF
CREATE TABLE Student_Course_1NF (
Student_ID INT,
Student_Name VARCHAR(50),
Course_ID VARCHAR(10),
Course_Name VARCHAR(50),
Instructor VARCHAR(50),
Instructor_Phone VARCHAR(15),
PRIMARY KEY (Student_ID, Course_ID)
);
Insert Sample Data
INSERT INTO Student_Course_1NF VALUES
(101, 'Arun Kumar', 'C01', 'DBMS', 'Dr. Ravi', '9876543210'),
(101, 'Arun Kumar', 'C02', 'OS', 'Dr. Priya', '9876500001'),
(102, 'Meena Devi', 'C01', 'DBMS', 'Dr. Ravi', '9876543210'),
(103, 'Vishal', 'C03', 'CN', 'Dr. Karthik', '9876522222');
✅ Data is atomic, but redundancy still exists (Instructor and Course details repeat).
🧩 Step 3: Convert to 2NF (Second Normal Form)
Rule: Remove partial dependency — all non-key attributes must depend on the whole composite key, not part of it.
Here:
Course_Name, Instructor, and Instructor_Phone depend only on Course_ID
Student_Name depends on Student_ID
👉 So we’ll split into three tables:
🧱 Table 1: Students
CREATE TABLE Students (
Student_ID INT PRIMARY KEY,
Student_Name VARCHAR(50)
);
🧱 Table 2: Courses
CREATE TABLE Courses (
Course_ID VARCHAR(10) PRIMARY KEY,
Course_Name VARCHAR(50),
Instructor VARCHAR(50),
Instructor_Phone VARCHAR(15)
);
🧱 Table 3: Enrollments
CREATE TABLE Enrollments (
Student_ID INT,
Course_ID VARCHAR(10),
PRIMARY KEY (Student_ID, Course_ID),
FOREIGN KEY (Student_ID) REFERENCES Students(Student_ID),
FOREIGN KEY (Course_ID) REFERENCES Courses(Course_ID)
);
Insert Data
INSERT INTO Students VALUES
(101, 'Arun Kumar'),
(102, 'Meena Devi'),
(103, 'Vishal');
INSERT INTO Courses VALUES
('C01', 'DBMS', 'Dr. Ravi', '9876543210'),
('C02', 'OS', 'Dr. Priya', '9876500001'),
('C03', 'CN', 'Dr. Karthik', '9876522222');
INSERT INTO Enrollments VALUES
(101, 'C01'),
(101, 'C02'),
(102, 'C01'),
(103, 'C03');
✅ Redundancy reduced.
❌ But instructor info still repeats (if multiple courses by the same instructor).
🧠 Step 4: Convert to 3NF (Third Normal Form)
Rule: Remove transitive dependencies — non-key attributes should not depend on other non-key attributes.
Here:
Instructor_Phone depends on Instructor, not directly on Course_ID.
👉 Split Instructor details into a new table.
🧱 Table 1: Students
CREATE TABLE Students (
Student_ID INT PRIMARY KEY,
Student_Name VARCHAR(50)
);
🧱 Table 2: Instructors
CREATE TABLE Instructors (
Instructor_ID INT PRIMARY KEY AUTO_INCREMENT,
Instructor_Name VARCHAR(50),
Instructor_Phone VARCHAR(15)
);
🧱 Table 3: Courses
CREATE TABLE Courses (
Course_ID VARCHAR(10) PRIMARY KEY,
Course_Name VARCHAR(50),
Instructor_ID INT,
FOREIGN KEY (Instructor_ID) REFERENCES Instructors(Instructor_ID)
);
🧱 Table 4: Enrollments
CREATE TABLE Enrollments (
Student_ID INT,
Course_ID VARCHAR(10),
PRIMARY KEY (Student_ID, Course_ID),
FOREIGN KEY (Student_ID) REFERENCES Students(Student_ID),
FOREIGN KEY (Course_ID) REFERENCES Courses(Course_ID)
);
Insert Data
INSERT INTO Students VALUES
(101, 'Arun Kumar'),
(102, 'Meena Devi'),
(103, 'Vishal');
INSERT INTO Instructors (Instructor_Name, Instructor_Phone) VALUES
('Dr. Ravi', '9876543210'),
('Dr. Priya', '9876500001'),
('Dr. Karthik', '9876522222');
INSERT INTO Courses VALUES
('C01', 'DBMS', 1),
('C02', 'OS', 2),
('C03', 'CN', 3);
INSERT INTO Enrollments VALUES
(101, 'C01'),
(101, 'C02'),
(102, 'C01'),
(103, 'C03');
🔍 Step 5: JOIN Query — View Normalized Data Together
To list all students with their courses and instructors, we’ll join all tables.
SELECT
s.Student_Name,
c.Course_Name,
i.Instructor_Name
FROM Enrollments e
JOIN Students s ON e.Student_ID = s.Student_ID
JOIN Courses c ON e.Course_ID = c.Course_ID
JOIN Instructors i ON c.Instructor_ID = i.Instructor_ID;
✅ Result Example:
Student_Name Course_Name Instructor_Name
Arun Kumar DBMS Dr. Ravi
Arun Kumar OS Dr. Priya
Meena Devi DBMS Dr. Ravi
Vishal CN Dr. Karthik
✅ Conclusion
Through 1NF, 2NF, and 3NF, we’ve:
Eliminated redundancy
Avoided anomalies
Structured the database logically
This normalization not only improves data integrity but also simplifies updates and queries.
Top comments (0)