- Data redundancy is the repetition of the same data in multiple places, which increases the size of the database. It also causes problems when inserting, deleting, and updating data.
- Insertion anomaly: redundant data must be added with every new row. For example, every time we add a new employee, we have to repeat the department's data (Dept., Head of Dept., Phone Number). If we add 100 more employees, the department data is repeated in each of those 100 rows.
- Deletion anomaly: deleting one piece of data unintentionally removes related data. For instance, when we delete an employee row, we also delete the department data stored in that row. Once we delete the last employee row of a department, we lose that department's data entirely.
- Update anomaly: a single change must be applied to many rows. For example, when the head of the finance department resigns, we need to record the new department head, so we have to update every row where Dept. is finance. If any row is missed, the data becomes inconsistent.
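The update and deletion anomalies above can be reproduced with a small sketch using Python's built-in `sqlite3` module. The table name `employee_flat`, its column names, and all sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical denormalized table: department details repeat on every employee row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_flat (
        emp_id     INTEGER PRIMARY KEY,
        emp_name   TEXT,
        dept       TEXT,
        dept_head  TEXT,
        dept_phone TEXT
    )
""")
rows = [
    (1, "Alice", "Finance", "Mr. Tan", "555-0101"),
    (2, "Bob",   "Finance", "Mr. Tan", "555-0101"),  # same department data again
    (3, "Carol", "IT",      "Ms. Lee", "555-0102"),
]
conn.executemany("INSERT INTO employee_flat VALUES (?, ?, ?, ?, ?)", rows)

# Update anomaly: a new Finance head has to be written to every Finance row.
cur = conn.execute(
    "UPDATE employee_flat SET dept_head = 'Ms. Wong' WHERE dept = 'Finance'"
)
finance_rows_updated = cur.rowcount

# Deletion anomaly: deleting the only IT employee also deletes all IT department data.
conn.execute("DELETE FROM employee_flat WHERE emp_id = 3")
it_rows_left = conn.execute(
    "SELECT COUNT(*) FROM employee_flat WHERE dept = 'IT'"
).fetchone()[0]
```

Here `finance_rows_updated` counts how many rows one logical change had to touch, and `it_rows_left` shows that no trace of the IT department remains after its last employee is deleted.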
Normalization is a technique for organizing data into multiple related tables to minimize data redundancy.
We split the Employee table above into two new tables: the Employee table and the Dept. table.
- The Employee table will contain the employees' information.
- The Dept. table will contain the departments' information (Dept., Head of Dept., Phone Number).
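One possible way to express this split is sketched below with `sqlite3`. The table and column names (`dept`, `employee`, `emp_id`, and so on) are assumptions made for the example, and using the department name as the key simply follows the text's choice of Dept. as the linking value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.executescript("""
    -- Department data is stored once per department.
    CREATE TABLE dept (
        dept       TEXT PRIMARY KEY,
        dept_head  TEXT,
        dept_phone TEXT
    );
    -- Each employee row keeps only a reference to its department.
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept     TEXT REFERENCES dept(dept)
    );
""")

table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

The `dept` column in `employee` is the only department information repeated per employee; the head and phone number live solely in the `dept` table.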
Solving Data Anomalies
- When we insert a new employee, we do not eliminate data redundancy completely. However, we minimize it by storing only the department the employee works in.
- When we delete an employee's data, the department data is not affected.
- When a department gets a new head, we only need to update the Dept. table. The Employee table is not affected.
- Normalization resolves these problems by dividing the data into separate, independent entities and relating them using a key or unique name (in this case, Dept.). Less redundancy means fewer problems when inserting, deleting, and updating data.
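The points above can be checked with a short, self-contained sketch that runs an insert, a delete, and an update against two related tables. As before, the schema and sample values are hypothetical, not taken from the original Employee table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept TEXT PRIMARY KEY, dept_head TEXT, dept_phone TEXT);
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept     TEXT REFERENCES dept(dept)
    );
    INSERT INTO dept VALUES ('Finance', 'Mr. Tan', '555-0101'),
                            ('IT',      'Ms. Lee', '555-0102');
    INSERT INTO employee VALUES (1, 'Alice', 'Finance'),
                                (2, 'Bob',   'Finance'),
                                (3, 'Carol', 'IT');
""")

# Insert: a new employee repeats only the department key, not head or phone.
conn.execute("INSERT INTO employee VALUES (4, 'Dave', 'Finance')")

# Delete: removing the only IT employee leaves the IT department row intact.
conn.execute("DELETE FROM employee WHERE emp_id = 3")
it_dept_rows = conn.execute(
    "SELECT COUNT(*) FROM dept WHERE dept = 'IT'"
).fetchone()[0]

# Update: a new Finance head is a change to exactly one row in the dept table.
cur = conn.execute("UPDATE dept SET dept_head = 'Ms. Wong' WHERE dept = 'Finance'")
dept_rows_updated = cur.rowcount

# A join on the shared key brings employee and department data back together.
heads = conn.execute("""
    SELECT e.emp_name, d.dept_head
    FROM employee AS e
    JOIN dept AS d ON e.dept = d.dept
    ORDER BY e.emp_id
""").fetchall()
```

After the single-row update, every Finance employee returned by the join sees the new head, because the head is stored in one place only.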
- There are three basic normal forms in data normalization:
- 1st Normal Form.
- 2nd Normal Form.
- 3rd Normal Form.
- BCNF (Boyce-Codd Normal Form) is a more advanced normal form, a stricter version of the 3rd Normal Form.