Tejeswararao123

Posted on Feb 22, 2023

SQL Concepts

ACID

ACID refers to a set of principles that ensure reliable data transactions
These are the properties of a transaction
Transaction is a logical group of one or more SQL statements

Atomicity:

Either all or none are applied to a database Example: In a banking system for a money transfer two operations have to be done like debit and credit in this scenario two operations or none should be done

Consistency:

Transactions always leave the database in a consistent state

Isolation:

When multiple transactions are being simultaneously executed and in parallel in a database system, then the carrying out and execution of the transaction would occur as if it is the only transaction that exists in the system.
None of the transactions would affect any other transaction’s existence.

Durability:

Once a transaction is committed, its changes are permanent and will survive any subsequent system failures, even power loss or crashes.

CAP Theorem: It states that in a distributed system, it is impossible to simultaneously achieve Consistency, Availability, and partition tolerance

According to the above statement we have to give up something to gain something

Consistency:

Every read operation from the database returns the most recent write or an error. In other words, all nodes in the system see the same data at the same time.

Availability:

Every request to the database receives a response, without guaranteeing that it contains the most recent version of the data. In other words, the system is always up and running, even in the face of network partitions, hardware failures, or other forms of degradation.

Partition tolerance:

The system continues to operate even when network partitions occur, which means that some nodes in the system are unable to communicate with others.

Examples

Trade-off between consistency and availability. Choosing to prioritize consistency may mean sacrificing some level of availability, while prioritizing availability may mean sacrificing some level of consistency.
Another example of a trade-off in database management systems is between performance and data durability. Often, improving performance can come at the cost of reducing data durability. This is because some optimizations, such as buffering or caching data in memory, may increase performance but also increase the risk of data loss in the event of a system failure.

Joins:

JOIN is a clause to combine rows from two or more tables, based on a related column between them.

Types of joins:

Natural join:

NATURAL JOIN combines the tables based on the common columns.

SELECT *
FROM table1
   NATURAL JOIN table2;

Inner Join:

An inner join returns only the rows that have matching values in both tables.

SELECT column1, column2, ...
FROM table1
INNER JOIN table2
ON table1.column = table2.column;

CROSS JOIN:

In CROSS JOIN, each row from the first table is combined with all rows in the second table. Cross Join is also called CARTESIAN JOIN

SELECT *
FROM table1
   CROSS JOIN table2;

Full Join:

FULL JOIN or FULL OUTER JOIN is the result of both RIGHT JOIN and LEFT JOIN

SELECT *
FROM table1
   FULL JOIN table2
ON c1 = c2;

Cross Join

Left join:

Returns each row in the left table, and matched rows from the right table are combined. If there is no match, NULL values are assigned to the right half of the rows in the temporary table.

SELECT *
FROM table1
   LEFT JOIN table2 
ON table1.c1 = tabl2.c2;

Right join:

Returns each row in the right table, and matched rows from the left table are combined. If there is no match, NULL values are assigned to the left half of the rows in the temporary table.

Syntax

SELECT *
FROM table1
   RIGHT JOIN table2
ON table1.c1 = table2.c2;

Aggregations, Filters in queries:

These are the function that we can apply after grouping data.

sum():

Calculates the sum of columns for a group of rows or entire data Example

SELECT category, SUM(sales) FROM products GROUP BY category;

Avg():

Calculates the average value of a given column for a group of rows or to entire data Example

SELECT AVG(salary) from EMPLOYEES

count():

Counts the number of rows in a table or the number of rows in a group of rows

SELECT customer, COUNT(order_id) FROM orders GROUP BY customer;

Min():

Returns minimum value of a given column for entire data or grouped data

SELECT department, MIN(salary) FROM employees GROUP BY department;

Max():

Returns the maximum value of a given column for entire data or grouped data

SELECT department, Max(salary) FROM employees GROUP BY department;

Filters:

We use a where clause and having clause to filter rows in a table

Where clause is put before grouping and having clause is written after grouping
Inside these clauses we insert conditions Example:

select batsman,sum(runs) as runs from batsman_runs
where team='rcb'
group by batsman
having runs>2000;

Normalization:

Normalization is a process by which database and table structures are created or modified to address inefficiencies/complexities related to data storage, data modification, and querying database tables ### Normalization can be achieved by following techniques #### 1NF Databases: A 1NF database is an atomic database. In this case, atomic means that each cell contains one value and each row is unique

// This would be a non-atomic table
| Title           | Length     | Type |
|-----------------|------------|------|
| Example A, B, C | 125 inches | B, C |
| Example A, B, C | 125 inches | B, C |
// This would be an atomic table
| ID | Title     | Length     | Type |
|----|-----------|------------|------|
| 1  | Example A | 125 inches | B    |
| 2  | Example A | 125 inches | C    |

Second normal form:

In a table every non-key value should be fully dependent on a primary key Violating the second normal form
2NF is intended to eliminate redundancy and anomalies that can occur when data is stored in a single table.

product_id | supplier_id | product_name | supplier_name | supplier_location
-----------+-------------+--------------+---------------+-----------------
1          | 1           | Widget A     | Supplier X    | Location 1
2          | 1           | Widget B     | Supplier X    | Location 1
3          | 2           | Widget C     | Supplier Y    | Location 2
4          | 2           | Widget D     | Supplier Y    | Location 2

To normalize the above table to 2nd nf modify the above table like below

Table: products
product_id | product_name
-----------+--------------
1          | Widget A
2          | Widget B
3          | Widget C
4          | Widget D

Table: suppliers
supplier_id | supplier_name | supplier_location
------------+---------------+------------------
1           | Supplier X    | Location 1
2           | Supplier Y    | Location 2

Third normal form:

This ensures that every non-key column in a table is not dependent on any other non-key column
Each column in a table must depend only on the primary key and not on any other columns in the table.
3NF is intended to eliminate data redundancy and anomalies that can occur when data is stored in a single table. Violating 3rd normal form:

book_id | author_name | author_email | book_title | book_isbn | book_publisher | publisher_location
--------+-------------+--------------+------------+-----------+----------------+--------------------
1       | John Smith  | john@xyz.com | Book A     | 123456789 | Publisher X    | New York
2       | John Smith  | john@xyz.com | Book B     | 987654321 | Publisher Y    | London
3       | Jane Doe    | jane@xyz.com | Book C     | 111111111 | Publisher Z    | San Francisco

To normalize the above do following

Table: books
book_id | book_title | book_isbn
--------+------------+-----------
1       | Book A     | 123456789
2       | Book B     | 987654321
3       | Book C     | 111111111

Table: authors
author_id | author_name | author_email
----------+-------------+--------------
1         | John Smith  | john@xyz.com
2         | Jane Doe    | jane@xyz.com

Table: publishers
publisher_id | publisher_name | publisher_location
-------------+----------------+--------------------
1            | Publisher X    | New York
2            | Publisher Y    | London
3            | Publisher Z    | San Francisco

Use foreign keys to establish a relationship between tables Indexes ### Boyce-Codd Normal Form (BCNF):
It ensures every non-key column in a table should be functionally dependent only on the primary key and not on any other columns in the table ### Violating the above technique

student_id | student_name | course_id | course_name | instructor_name | instructor_email
-----------+--------------+-----------+-------------+----------------+------------------
1          | John Smith   | 1         | Course A    | Jane Doe       | jane@xyz.com
1          | John Smith   | 2         | Course B    | John Doe       | john@xyz.com
2          | Jane Doe     | 2         | Course B    | John Doe       | john@xyz.com
3          | Bob Johnson  | 3         | Course C    | Jane Doe       | jane@xyz.com

To normalize the above data split the table into three

Table: students
student_id | student_name
-----------+--------------
1          | John Smith
2          | Jane Doe
3          | Bob Johnson

Table: courses
course_id | course_name
----------+------------
1         | Course A
2         | Course B
3         | Course C

Table: instructors
instructor_id | instructor_name | instructor_email
--------------+----------------+------------------
1             | Jane Doe       | jane@xyz.com
2             | John Doe       | john@xyz.com

To establish a relationship we can create another table that will hold all details of the enrollment

Table: enrollment
student_id | course_id | instructor_id
-----------+-----------+--------------
1          | 1         | 1
1          | 2         | 2
2          | 2         | 2
3          | 3         | 1

Transactions:

Nothing but a Set of Queries that are treated as a single unit
Transactions follow the principles of acid Acid properties are explained above
In PostgreSQL, transactions are started with the BEGIN statement and are ended with either the COMMIT statement to apply the changes made within the transaction Example

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 123;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 456;
COMMIT;

In the above, if both UPDATE statements execute successfully, the COMMIT statement applies the changes to the database. If an error occurs during the transaction, the ROLLBACK statement will undo the changes made within the transaction, leaving the database in its original state.

Locking mechanism:

Locks are used to restrict users from access to the database, modifying data to ensure that transactions do not interfere with each other
Locking ensures that each transaction can access the data it needs without interfering with other transactions that may be accessing the same data. Type of locking mechanism

1. Shared Locks:

Shared locks allow multiple transactions to read a resource simultaneously without interfering with each other. A shared lock allows other transactions to acquire shared locks but prevents them from acquiring exclusive locks.
Example:

BEGIN;
SELECT * FROM accounts WHERE account_id = 5 FOR SHARE;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 5;
COMMIT;

The SELECT statement acquires a shared lock on the row with account_id = 5 using the FOR SHARE clause
This ensures that other transactions can also acquire shared locks on this row, but prevents them from acquiring exclusive locks until the current transaction has been completed. ### 2.Exclusive Locks: Exclusive locks are used to ensure that only one transaction at a time can modify a resource. An exclusive lock prevents other transactions from acquiring shared or exclusive locks on the same resource. Example:

BEGIN;
SELECT * FROM accounts WHERE account_id = 3 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 3;
COMMIT;

The SELECT statement acquires an exclusive lock on the row with account_id = 3 using the FOR UPDATE clause
This ensures that no other transactions can modify this row until the current transaction has been completed.

3.Row-level Locks:

Row-level locks are used to lock individual rows of data. Row-level locks are useful in situations where multiple transactions need to modify different rows of the same table at the same time.
Example

BEGIN;
SELECT * FROM accounts WHERE account_id = 673 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 673;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 906;
COMMIT;

The first UPDATE statement acquires a row-level lock on the row with account_id = 673
This ensures that no other transactions can modify this row until the current transaction has been completed ### 4. Table-level Locks: Table-level locks are used to lock an entire table. Table-level locks are useful in situations where a transaction needs to modify all rows of a table at the same time Example

BEGIN;
LOCK TABLE accounts IN EXCLUSIVE MODE;
UPDATE accounts SET balance = 0;
COMMIT;

5.Advisory Locks:

Advisory locks are user-defined locks that are not enforced by PostgreSQL itself

These are useful in situations where a user needs to enforce locking based on some application-specific logic. Example:

BEGIN;
SELECT pg_advisory_xact_lock(1234);
UPDATE accounts SET balance = balance - 100 WHERE account_id = 56;
COMMIT;

The SELECT statement acquires an advisory lock with key 1234
This ensures that no other transactions can acquire a lock with the same key until the current transaction has been completed.

Database Isolation Levels:

Database isolation defines the degree to which a transaction must be isolated from the data changes made by another transaction (although in reality there may be a large number of transactions running concurrently). The overall goal is to prevent reads and writes of temporary, aborted, or otherwise incorrect data written by concurrent transactions.

Types of isolation levels

Read uncommitted
Read committed
Repeatable read
Serializable

1. Read uncommitted:

This kind of isolation level is used when we want even the uncommitted values of the rows. All updates and inserts that aren't even committed should be reflected in our transaction. This type of isolation level is commonly used in booking systems where we should be able to see these changes when another transaction tries to update a seat's availability even though that transaction is not committed.

2. Read committed:

It has 2 properties

No dirty reads:
This means values updated through a transaction will only be visible after the completion of a transaction
No Dirty writes:
Whenever a transaction is running on the database and updating particular values those values will not be overwritten by other transactions.
Next transaction will only update values after the completion of the previous transaction #### Implementation of reading committed
Row level locking for write operations
Multiple versions of the doc #### 3. Repeatable read:
Transactions started and not committed will not be visible
Transactions executed later will not be visible

4. Serializable isolation level:

This is the strongest of all isolation levels and guarantees pure isolation. No other transaction can read and write values until that transaction type is executed in the database. This isolation level solves all the problems encountered in all 3 isolation levels discussed previously, but since this only allows one query to be run on the database, it limits the scalability of the system.

Triggers:

Triggers are SQL codes that are automatically executed in response to certain events on a particular table

These are used to maintain the integrity of data #### Syntax

create trigger trigger_name
(before|after)[Insert | Update |Delete] 
on [Table name]
[For each row | For each column]
[Trigger body]

Nested triggers in SQL are the actions that are automatically executed when a certain database operation is performed #### operations in trigger
Drop: To remove a trigger from the database

Drop trigger trigger_name

Displaying triggers in a database

SHOW TRIGGERS 
IN database_name;

Insert Before inserting triggers:
Used to update or validate record values before they are saved to the database

CREATE TRIGGER calculate
before insert 
ON STUDENT 
FOR EACH ROW
set new.marks=new.marks+100;

After inserting the trigger:

Used to access field values that are set by the system and to effect changes in other records Example:

CREATE TRIGGER total_mark
after insert 
ON STUDENT 
FOR EACH ROW
insert into final_mark values(new.marks);

SQL Concepts

ACID

Atomicity:

Consistency:

Isolation:

Durability:

CAP Theorem: It states that in a distributed system, it is impossible to simultaneously achieve Consistency, Availability, and partition tolerance

Consistency:

Availability:

Partition tolerance:

Joins:

Types of joins:

Natural join:

Inner Join:

CROSS JOIN:

Full Join:

Left join:

Right join:

Aggregations, Filters in queries:

sum():

Avg():

count():

Min():

Max():

Filters:

Normalization:

Second normal form:

Third normal form:

Transactions:

Locking mechanism:

1. Shared Locks:

3.Row-level Locks:

5.Advisory Locks:

Database Isolation Levels:

Types of isolation levels

1. Read uncommitted:

2. Read committed:

4. Serializable isolation level:

Triggers:

References

Read next

Monte Carlo Laser Localization

A Comprehensive Guide to Building a Single Sign-On (SSO) Gateway with Node.js

RandomVerticalFlip in PyTorch

The Startup Ep:1 -The dream