Introduction
Database normalization is an essential part of designing efficient, scalable, and maintainable databases. In this post, we will focus mainly on 1NF ,2NF and 3NF using a simple student-course example shown below and implement each stage using SQL CREATE TABLE statements.
Anomalies in table above
- Insertion anomaly: You can’t insert a new course unless a student enrolls in it.
- Update anomaly: If Prof. Smith changes name, you need to update multiple rows.
3.Deletion anomaly: If Charlie drops all courses, info about Chemistry and Prof. Lee might be lost.
First Normal Form (1NF)
Eliminate repeating groups and ensure atomicity.
In our case, the base table is already atomic but we will enforce a primary key.
CREATE TABLE StudentCourse_1NF (
StudentID INT,
StudentName VARCHAR(50),
CourseID VARCHAR(10),
CourseName VARCHAR(50),
InstructorName VARCHAR(50),
PRIMARY KEY (StudentID, CourseID)
);
Second Normal Form (2NF)
Eliminate partial dependencies.
Only non-key attributes should depend on the full primary key.
STUDENT TABLE
CREATE TABLE Student (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(50)
);
COURSE TABLE
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
InstructorName VARCHAR(50)
);
RELATIONSHIP BETWEEN THE TWO
CREATE TABLE StudentCourse (
StudentID INT,
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
Third Normal Form (3NF)
Eliminate transitive dependencies.
Attributes should only depend on the primary key, not on other non-key attributes.
Instructor table
CREATE TABLE Instructor (
InstructorID INT PRIMARY KEY,
InstructorName VARCHAR(50)
);
Updated course table
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
InstructorID INT,
FOREIGN KEY (InstructorID) REFERENCES Instructor(InstructorID)
);
Student table
CREATE TABLE Student (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(50)
);
Course table
CREATE TABLE StudentCourse (
StudentID INT,
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
Inserting values into the tables
Inserting into student table
INSERT INTO Student (StudentID, StudentName) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie');
Inserting into instructors table
INSERT INTO Instructor (InstructorID, InstructorName) VALUES
(1, 'Prof. Smith'),
(2, 'Prof. Johnson'),
(3, 'Prof. Lee');
Inserting into course table
INSERT INTO Course (CourseID, CourseName, InstructorID) VALUES
('C101', 'Math', 1),
('C102', 'Physics', 2),
('C103', 'Chemistry', 3);
Enrolment
INSERT INTO StudentCourse (StudentID, CourseID) VALUES
(1, 'C101'),
(1, 'C102'),
(2, 'C101'),
(3, 'C103');
JOINS QUERY
SELECT s.StudentName, c.CourseName, i.InstructorName
FROM StudentCourse sc
JOIN Student s ON sc.StudentID = s.StudentID
JOIN Course c ON sc.CourseID = c.CourseID
JOIN Instructor i ON c.InstructorID = i.InstructorID;
Conclusion
By applying the principles of 1NF, 2NF, and 3NF, we transform a flat, repetitive table into a set of well-structured, interrelated tables that ensure data integrity and ease of maintenance.Through normalization, we not only improve the consistency and accuracy of data but also make our database scalable and easier to query. The process may seem complex at first, but breaking it down step-by-step, as we did with the student-course example, makes it straightforward and practical.
Mastering normalization empowers you to design robust databases that stand the test of time, reduce errors, and improve performance.
Top comments (0)