🎯 Objective
Database normalization is one of the most important concepts in database design. It ensures that data is stored efficiently, redundancy is minimized, and data integrity is maintained.
To understand 1NF, 2NF, and 3NF, identify data anomalies, and implement normalization using SQL with proper CREATE TABLE, INSERT, and JOIN queries
By the end, you’ll understand not only how to normalize a table, but why it matters.
🧱 Base Table — The Starting Point
Let’s begin with a simple (but flawed) table design that stores students, their enrolled courses, and instructor details:
At first glance, this table might look fine — it gives us all the details in one place. But let’s look deeper.
⚠️ Data Anomalies
This table has the following anomalies:
- Insertion anomaly: A new course can’t be added unless at least one student is enrolled.
- Update anomaly: If the instructor’s phone number changes, it must be updated in multiple rows.
- Deletion anomaly: If all students of a course are removed, information about that course and instructor is lost.
We apply Normalization — a step-by-step process of structuring the database.
🧱 Step 1: Transforming to 1NF (First Normal Form)
✅ Rule: Every column should contain indivisible (atomic) values — no groups or lists within a single field.
Our original table already meets this rule since each column stores one value per cell.
However, let’s formally define this table in SQL to establish a proper structure.
CREATE TABLE StudentCourse_1NF (
StudentID VARCHAR(10),
StudentName VARCHAR(50),
CourseID VARCHAR(10),
CourseName VARCHAR(50),
Instructor VARCHAR(50),
InstructorPhone VARCHAR(15)
);
➕ Insert Sample Data:
INSERT INTO StudentCourse_1NF VALUES
('S01', 'Arjun', 'C101', 'DBMS', 'Dr. Kumar', '9876543210'),
('S01', 'Arjun', 'C102', 'Data Mining', 'Dr. Mehta', '9123456780'),
('S02', 'Priya', 'C101', 'DBMS', 'Dr. Kumar', '9876543210'),
('S03', 'Kiran', 'C103', 'AI', 'Dr. Rao', '9988776655');
💡 Observation:
Although the data follows 1NF, there’s still repeated information — the same course and instructor details appear multiple times.
🧩 Step 2: Converting to 2NF (Second Normal Form)
✅ Rule: Remove partial dependencies — every non-key attribute must depend on the entire primary key.
In our current structure, a composite key could be (StudentID, CourseID).
But attributes like CourseName, Instructor, and InstructorPhone depend only on CourseID.
This means the table breaks the rule for 2NF.
To correct this, we’ll divide the data into three distinct tables: Student, Course, and Enrollment.
🧮 Student Table
Holds all student-specific details.
CREATE TABLE Student (
StudentID VARCHAR(10) PRIMARY KEY,
StudentName VARCHAR(50)
);
INSERT INTO Student VALUES
('S01', 'Arjun'),
('S02', 'Priya'),
('S03', 'Kiran');
📘 Course Table
Contains information about each course and its instructor.
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
Instructor VARCHAR(50),
InstructorPhone VARCHAR(15)
);
INSERT INTO Course VALUES
('C101', 'DBMS', 'Dr. Kumar', '9876543210'),
('C102', 'Data Mining', 'Dr. Mehta', '9123456780'),
('C103', 'AI', 'Dr. Rao', '9988776655');
🧾 Enrollment Table
Links students to the courses they have enrolled in.
CREATE TABLE Enrollment (
StudentID VARCHAR(10),
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
INSERT INTO Enrollment VALUES
('S01', 'C101'),
('S01', 'C102'),
('S02', 'C101'),
('S03', 'C103');
✅ Result:
Now, each table only includes data that depends on its own key.
We’ve removed redundant information between students and courses — though one more dependency still exists between Instructor and InstructorPhone.
🧩 Step 3: Moving to 3NF (Third Normal Form)
✅ Rule: Eliminate transitive dependencies — non-key fields shouldn’t depend on other non-key fields.
Here, InstructorPhone depends on Instructor, not on CourseID.
That’s a transitive dependency, and it violates 3NF.
To fix this, we’ll create a new table just for instructor details.
👨🏫 Instructor Table
CREATE TABLE Instructor (
InstructorID VARCHAR(10) PRIMARY KEY,
InstructorName VARCHAR(50),
InstructorPhone VARCHAR(15)
);
INSERT INTO Instructor VALUES
('I01', 'Dr. Kumar', '9876543210'),
('I02', 'Dr. Mehta', '9123456780'),
('I03', 'Dr. Rao', '9988776655');
📘** Updated Course Table (After 3NF)**
CREATE TABLE Course_3NF (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
InstructorID VARCHAR(10),
FOREIGN KEY (InstructorID) REFERENCES Instructor(InstructorID)
);
INSERT INTO Course_3NF VALUES
('C101', 'DBMS', 'I01'),
('C102', 'Data Mining', 'I02'),
('C103', 'AI', 'I03');
✅ Now:
Every column in every table depends directly on its primary key — there’s no indirect dependency left.
The schema is now fully normalized and easy to maintain.
🧮 Step 4: Retrieve Data Using JOINs
Let’s merge all the tables to display complete student-course-instructor details.
SELECT s.StudentName, c.CourseName, i.InstructorName, i.InstructorPhone
FROM Enrollment e
JOIN Student s ON e.StudentID = s.StudentID
JOIN Course_3NF c ON e.CourseID = c.CourseID
JOIN Instructor i ON c.InstructorID = i.InstructorID;
✅ This result neatly connects students, their courses, and instructors — all using efficient, normalized tables.
💡 Final Thoughts
Normalization isn’t just a database theory — it’s the foundation for creating efficient, accurate, and scalable database systems.
- By systematically applying 1NF → 2NF → 3NF,
- Removed unnecessary duplication
- Solved insertion, update, and deletion problems
- Organized data for easier queries and maintenance
This Student–Course example clearly demonstrates how normalization transforms messy data into a clean, reliable, and well-structured database.
Top comments (1)
Nicely explained 👌