Databases: Important Theoretical Concepts

#sql #postgres #database

Introduction

In the previous episode, I discussed the important SQL Statements/Keywords that are useful for day-to-day SQL use. In today's episode, I will try to explain few important Database Concepts that will boost our understanding even further.

Transactions

Transaction is a fundamental concept of all database systems. It bundles multiple steps into a single, all-or-nothing operation. The intermediate states between the steps are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction from completing, then none of the steps affects the database at all.

Transactions can be further tuned by BEGIN, COMMIT and ROLLBACK operations.

BEGIN

Each transaction is explicitly initiated using BEGIN TRAN.

COMMIT

By using COMMIT TRAN, a transaction is closed explicitly and modifications performed by the transaction are made permanent.

ROLLBACK

By using ROLLBACK TRAN, a transaction is explicitly closed and any modifications made by the transaction is discarded.

Each Transaction should follow ACID rules.

ACID

ACID stands for Atomicity, Consistency, Isolation and Durability. This describes a set of properties that apply to data transactions. For a reliable database, all these four attributes should be achieved.

Atomicity

Atomicity is also known as the "All or nothing rule". It means that either the entire transaction takes place at once or doesn’t happen at all. There aren't any partial transactions. Hence it is known as the "All or nothing rule".

Consistency

It says that the database should be consistent before and after the transaction (Data types, triggers, constraints, etc). It refers to the correctness of a database.

Isolation

This property says that the transactions occur independently without interference. This property ensures that the execution of transactions concurrently will result in a state that is equivalent to a state achieved these were executed serially in some order.

Durability

It says that once the transaction is saved or committed, it can’t be “lost”. Data won’t go missing after a power outage or system crash.

LOCKING MECHANISM

Locking is designed to allow SQL Server to work seamlessly in a multi-user environment. Locking is the way that SQL Server manages transaction concurrency.

While objects are locked, SQL Server will prevent other transactions from making any change of data stored in objects affected by the imposed lock. Once the lock is released by committing the changes or by rolling back changes to the initial state, other transactions will be allowed to make required data changes.

CAP THEOREM

CAP Theorem (aka Eric Brewer Theorem) states that we cannot build a general data store that is continually available, sequentially consistent and tolerant to any partition failures. We can only achieve 2 features out of these 3. A combination of 2 must be chosen and this is usually the deciding factor in what technology is used.

Consistency - All the servers in the system will have the same data so users will get the same copy regardless of which server answers their request.
Availability - The system will always respond to a request (even if it's not the latest data or consistent across the system or just a message saying the system isn't working).
Partition Tolerance - The system continues to operate as a whole even if individual servers fail or can't be reached.

TRIGGERS

Trigger: A trigger is a stored procedure in a database that automatically invokes whenever a special event in the database occurs. For example, a trigger can be invoked when a row is inserted into a specified table or when certain table columns are being updated.

Triggers can be run BEFORE or AFTER the triggering statement.

SQL NORMALIZATION

It is the process of structuring a database, following a series of Normal Forms to reduce data redundancy and improve data integrity.

Without Normalization in SQL, we may face many issues such as

Insertion anomaly: It occurs when we cannot insert data to the table without the presence of another attribute
Update anomaly: It is a data inconsistency that results from data redundancy and a partial update of data.
Deletion Anomaly: It occurs when certain attributes are lost because of the deletion of other attributes.

Some examples of Normal Forms are 1NF, 2NF, 3NF, BCNF, etc.

DATABASE ISOLATION LEVELS

Database Isolation levels control the degree of locking that occurs when selecting data. This is to prevent reads and writes of temporary, aborted, or otherwise incorrect data written by concurrent transactions. For many database applications, the majority of database transactions can be constructed using isolation level priority rules, reducing the locking overhead for the system.

These are the priority rules for isolation levels.