DATA NORMALIZATION
Data Normalization is a process in database design used to organize data to reduce redundancy and improve data integrity. It involves structuring a database in such a way that:
Each piece of information is stored only once.
Dependencies between data are logical and minimal.
Anomalies (insertion, update, deletion) are avoided.
WHAT WE DO
To understand 1NF, 2NF, and 3NF and implement them using SQL CREATE TABLE statements.
BASE TABLE
IDENTIFYING ANOMALIES
1.Insertion Anomaly
You cannot add a new course (e.g., “C104 - Cloud Computing”) unless a student enrolls in it.
Similarly, if a new instructor is assigned to a course but no student has registered yet, you cannot record that instructor’s details.
Example:
You want to add Dr. Sen teaching C104 (Cloud Computing), but since no student has enrolled, you can’t insert this record because StudentID is mandatory.
2.Update Anomaly
Instructor or course details are repeated in multiple rows.
If the instructor’s phone number changes, you must update it in all rows where the instructor appears.
Example:
If Dr. Kumar’s phone number changes, you need to update it in two rows (for Arjun and Priya).
If you miss one, the table becomes inconsistent.
3.Deletion Anomaly
If a student drops all courses, their record is deleted — and you might also lose information about a course or instructor.
Example:
If Priya (S02) withdraws, and she’s the only student taking DBMS, then deleting her record will also remove all information about DBMS and Dr. Kumar.
Convert to 1NF (First Normal Form)
Rule:
No repeating groups or arrays.
Each cell contains atomic values.
The given table is already in 1NF (all values are atomic).
CREATE TABLE StudentCourse (
StudentID VARCHAR(10),
StudentName VARCHAR(50),
CourseID VARCHAR(10),
CourseName VARCHAR(50),
Instructor VARCHAR(50),
InstructorPhone VARCHAR(15)
);
Convert to 2NF (Second Normal Form)
Rule:
Must be in 1NF.
No partial dependency — non-key attributes depend on the entire primary key.
Problem:
In 1NF, composite key = (StudentID, CourseID).
Attributes like StudentName depend only on StudentID, not on the full key.
Attributes like CourseName, Instructor, InstructorPhone depend only on CourseID.
Solution: Split into 3 tables
Student
Course
Enrollment (joins students and courses)
CREATE TABLE Student (
StudentID VARCHAR(10) PRIMARY KEY,
StudentName VARCHAR(50)
);
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
Instructor VARCHAR(50),
InstructorPhone VARCHAR(15)
);
CREATE TABLE Enrollment (
StudentID VARCHAR(10),
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
Convert to 3NF (Third Normal Form)
Rule:
Must be in 2NF.
No transitive dependencies (non-key attribute depends on another non-key attribute).
Problem:
In the Course table, InstructorPhone depends on Instructor, not directly on CourseID.
Solution:
Split Instructor details into a separate table.
CREATE TABLE Student (
StudentID VARCHAR(10) PRIMARY KEY,
StudentName VARCHAR(50)
);
CREATE TABLE Instructor (
InstructorID VARCHAR(10) PRIMARY KEY,
InstructorName VARCHAR(50),
InstructorPhone VARCHAR(15)
);
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
InstructorID VARCHAR(10),
FOREIGN KEY (InstructorID) REFERENCES Instructor(InstructorID)
);
CREATE TABLE Enrollment (
StudentID VARCHAR(10),
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
Insert Sample Data
-- ===============================
-- INSERT DATA INTO 3NF TABLES
-- ===============================
-- Instructors
INSERT INTO Instructor VALUES ('I01', 'Dr. Kumar', '9876543210');
INSERT INTO Instructor VALUES ('I02', 'Dr. Mehta', '9123456780');
INSERT INTO Instructor VALUES ('I03', 'Dr. Rao', '9988776655');
-- Students
INSERT INTO Studet VALUES ('S01', 'Arjun');
INSERT INTO Studet VALUES ('S02', 'Priya');
INSERT INTO Studet VALUES ('S03', 'Kiran');
-- Enrollmment
INSERT INTO Enrollmment VALUES ('S01', 'C101');
INSERT INTO Enrollmment VALUES ('S01', 'C102');
INSERT INTO Enrollmment VALUES ('S02', 'C101');
INSERT INTO Enrollmment VALUES ('S03', 'C103');
CONCLUSION
Database normalization organizes data to reduce redundancy and maintain integrity.
1NF (First Normal Form): Ensures that each column contains atomic values and eliminates repeating groups.
2NF (Second Normal Form): Removes partial dependencies by making sure all non-key attributes depend on the entire primary key.
3NF (Third Normal Form): Eliminates transitive dependencies, so non-key attributes depend only on the primary key.
Normalization helps prevent insertion, update, and deletion anomalies, making the database more consistent, efficient, and easier to manage
Top comments (0)