When working with databases, one of the biggest challenges is data redundancy and anomalies (insertion, update, deletion problems).
This is where Normalization comes in โ a process of organizing data into efficient structures.
๐ Step 1: Identifying Anomalies
This table has multiple issues:
1)Insertion anomaly โ Canโt add a new course unless a student enrolls.
2)Update anomaly โ If Dr. Kumarโs phone number changes, we must update it in multiple rows.
3)Deletion anomalyโ If the last student in a course is removed, we lose course and instructor details too.
Clearly, normalization is needed.
โ 1NF (First Normal Form)
Rule:1 Data must be atomic, no repeating groups.
Our base table is already in 1NF, since each column has atomic values.
But redundancy still exists.
SQL (1NF structure):
CREATE TABLE StudentCourse1NF (
StudentID VARCHAR(10),
StudentName VARCHAR(50),
CourseID VARCHAR(10),
CourseName VARCHAR(50),
Instructor VARCHAR(50),
InstructorPhone VARCHAR(15),
PRIMARY KEY (StudentID, CourseID)
);
โ
2NF (Second Normal Form)
Rule: Must be in 1NF + no partial dependency.
Problem in 1NF:
- StudentName depends only on StudentID.
- CourseName,
Instructor
, and InstructorPhone depend only onCourseID
.
๐ Solution: Split into Student, Course, and Enrollment tables.
SQL for 2NF:
CREATE TABLE Student (
StudentID VARCHAR(10) PRIMARY KEY,
StudentName VARCHAR(50)
);
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
Instructor VARCHAR(50),
InstructorPhone VARCHAR(15)
);
CREATE TABLE Enrollment (
StudentID VARCHAR(10),
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
Rule: Must be in 2NF + no transitive dependency.
Problem in 2NF:
In Course
, InstructorPhone
depends on Instructor
, not directly on CourseID
.
๐ Solution: Create a separate Instructor table.
SQL for 3NF:
CREATE TABLE Student (
StudentID VARCHAR(10) PRIMARY KEY,
StudentName VARCHAR(50)
);
CREATE TABLE Instructor (
InstructorID VARCHAR(10) PRIMARY KEY,
InstructorName VARCHAR(50),
InstructorPhone VARCHAR(15)
);
CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
InstructorID VARCHAR(10),
FOREIGN KEY (InstructorID) REFERENCES Instructor(InstructorID)
);
CREATE TABLE Enrollment (
StudentID VARCHAR(10),
CourseID VARCHAR(10),
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
๐ Inserting Sample Data
-- Students
INSERT INTO Student VALUES ('S01', 'Arjun');
INSERT INTO Student VALUES ('S02', 'Priya');
INSERT INTO Student VALUES ('S03', 'Kiran');
-- Instructors
INSERT INTO Instructor VALUES ('I01', 'Dr. Kumar', '9876543210');
INSERT INTO Instructor VALUES ('I02', 'Dr. Mehta', '9123456780');
INSERT INTO Instructor VALUES ('I03', 'Dr. Rao', '9988776655');
-- Courses
INSERT INTO Course VALUES ('C101', 'DBMS', 'I01');
INSERT INTO Course VALUES ('C102', 'Data Mining', 'I02');
INSERT INTO Course VALUES ('C103', 'AI', 'I03');
-- Enrollment
INSERT INTO Enrollment VALUES ('S01', 'C101');
INSERT INTO Enrollment VALUES ('S01', 'C102');
INSERT INTO Enrollment VALUES ('S02', 'C101');
INSERT INTO Enrollment VALUES ('S03', 'C103');
๐ Query: Students with Courses and Instructors
SELECT s.StudentName, c.CourseName, i.InstructorName
FROM Enrollment e
JOIN Student s ON e.StudentID = s.StudentID
JOIN Course c ON e.CourseID = c.CourseID
JOIN Instructor i ON c.InstructorID = i.InstructorID;
๐ฏ Key Takeaways
- 1NF removes repeating groups.
- 2NF removes partial dependency.
- 3NF removes transitive dependency.
- Normalization reduces redundancy and prevents anomalies.
By splitting large tables into smaller, well-structured ones, we ensure cleaner data, better performance, and easier maintenance.
Top comments (1)
big thanks for this summary